26 May, 2022

What is Chaos Engineering? (and Why You Need It)

Bang!! All of a sudden, one of your production servers goes down. What happens next? How does your system respond? Fortunately, the outage was only in your test environment, at least this time. It was introduced randomly, but deliberately, by chaos engineering.

To understand the need for chaos testing, it’s worth thinking about a few questions:

  • Do you know what would happen if an outage hit your production system?
  • Are you vulnerable to failures or attacks in your live infrastructure?
  • How would you respond to CPU failures, network failures, software service failures, etc.?

How Chaos Testing Helps 

Incidents like those mentioned above have often been seen as too difficult, too expensive, or too low a priority to test. If you did test attacks and failures, you were in the minority, and you typically had one shot to get it right.

Chaos engineering allows you to throw a spanner in the works easily, repeatedly and affordably.

Chaos engineering identifies weaknesses before they become outages. It proactively tests how a system responds under stress and infrastructure failure, so you can find and fix problems before they end up in the news.

What is Chaos Engineering?

Chaos engineering introduces failures into a distributed computing system to test its resilience.

Also known as chaos testing, chaos engineering was popularised by Netflix when they introduced ‘Chaos Monkey’ to test their redundant architecture. If you are running customer-facing systems, chaos testing is a useful and straightforward process. Of course, you need to have the right tools and testing processes in place.

It is worth noting that chaos engineering is essentially a modern take on ‘resilience testing’, which has been around the tech industry since the year dot.

Usually, with software testing you are assessing your solution’s ability to carry out your business processes. However, you assume that the underlying architecture is fully operational.

With chaos engineering, you still assess your business processes, but failures are deliberately introduced into the underlying architecture.

Incidents tested by chaos engineering include:

  • Severed network connections
  • Server outages
  • Software component crashes
  • Component degradation
  • And many more
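As a rough illustration of deliberately introducing a failure beneath a business process, here is a minimal Python sketch. It is not how any particular chaos tool works. The names `with_chaos`, `InjectedFailure`, `lookup_price` and `price_with_fallback` are hypothetical and exist only for this example.

```python
import random


class InjectedFailure(Exception):
    """Raised in place of a real call to simulate a component outage."""


def with_chaos(func, failure_rate, rng):
    """Wrap func so that a fraction of calls fail on purpose.

    failure_rate is the probability (0.0 to 1.0) that a call raises
    InjectedFailure instead of running func.
    """
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFailure(f"injected failure in {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


def lookup_price(item):
    """Hypothetical stand-in for a call to a downstream service."""
    return {"apple": 1.20, "pear": 0.90}[item]


def price_with_fallback(fetch, item, default=0.0):
    """Business logic under test: degrade gracefully if the dependency fails."""
    try:
        return fetch(item)
    except InjectedFailure:
        return default


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so the same run can be repeated
    flaky_lookup = with_chaos(lookup_price, failure_rate=0.3, rng=rng)
    results = [price_with_fallback(flaky_lookup, "apple") for _ in range(10)]
    print(results)
```

Seeding the random generator keeps the failures random but repeatable, so a weakness found in one run can be reproduced in the next.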

Why is Chaos Engineering Important?

We told you earlier about Netflix’s involvement. Since then, chaos testing has been adopted by Google, Microsoft, Amazon, LinkedIn, Facebook, and many others. Beyond tech companies, finance and retail organisations are also seeing significant benefits.

There is often an underlying fear with software systems, an elephant in the room. What happens if part of your system goes down? How resilient are your systems?

Most systems nowadays are heavily interconnected, with multiple customer access points. If even a small component were to fall over, it could lead to much larger issues.

A lot of companies choose to bury their heads in the sand, ignoring these potential issues in the hope that they won’t happen. Unfortunately, though, they do happen.

Chaos engineering directly addresses these issues.

When you randomly introduce failures into your systems, you can:

  • Understand how failures affect your solution
  • Test your redundant systems (if you have any)
  • Identify components that need additional resilience
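To make the redundancy point concrete, the sketch below takes one node of a small in-memory "cluster" offline at random and checks that requests are still served. It is a toy model under stated assumptions: `Node`, `send_with_failover` and `take_down_random_node` are hypothetical names, and a real experiment would target actual infrastructure rather than Python objects.

```python
import random


class Node:
    """Toy stand-in for one redundant server."""

    def __init__(self, name):
        self.name = name
        self.up = True

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


def send_with_failover(nodes, request):
    """Try each node in order and return the first successful response."""
    for node in nodes:
        try:
            return node.handle(request)
        except ConnectionError:
            continue
    raise RuntimeError("no healthy nodes available")


def take_down_random_node(nodes, rng):
    """The chaos step: pick one node at random and mark it as down."""
    victim = rng.choice(nodes)
    victim.up = False
    return victim


if __name__ == "__main__":
    cluster = [Node("web-1"), Node("web-2")]
    victim = take_down_random_node(cluster, random.Random())
    print(f"took down {victim.name}")
    print(send_with_failover(cluster, "GET /checkout"))
```

If the request still succeeds with a node down, the redundancy is doing its job. If it raises, you have found a component that needs additional resilience.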

How Can I Get Started with Chaos Testing?

As part of its ongoing development, Micro Focus LoadRunner Professional (LRP) now integrates with Gremlin, one of the leading chaos engineering tools, so users can introduce chaos testing during performance testing.

LoadRunner Professional uses integrated Gremlin APIs to orchestrate chaos testing. When you add a Gremlin scenario to your test, LoadRunner sends a request to Gremlin to execute that predefined scenario.

You can then compare how your solution responds during an attack against how it performs normally.
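The comparison itself is simple arithmetic. The following Python sketch assumes you have exported, for each run, a list of `(response_time_ms, succeeded)` samples. That data shape and the names `RunSummary`, `summarise` and `compare` are assumptions made for illustration, not part of LoadRunner or Gremlin.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class RunSummary:
    avg_response_ms: float
    success_rate: float  # fraction of transactions that succeeded, 0.0 to 1.0


def summarise(samples):
    """Summarise a run given (response_time_ms, succeeded) samples."""
    times = [t for t, _ in samples]
    outcomes = [ok for _, ok in samples]
    return RunSummary(
        avg_response_ms=mean(times),
        success_rate=sum(outcomes) / len(outcomes),
    )


def compare(baseline, attack):
    """Report how much worse the attack run was than the baseline run."""
    slowdown = attack.avg_response_ms - baseline.avg_response_ms
    return {
        "response_time_change_pct": slowdown / baseline.avg_response_ms * 100,
        "success_rate_drop": baseline.success_rate - attack.success_rate,
    }


if __name__ == "__main__":
    # Made-up sample values, purely to show the calculation.
    normal_run = summarise([(100, True), (100, True)])
    attack_run = summarise([(150, True), (250, False)])
    print(compare(normal_run, attack_run))
```

A large change in either figure points at the component under attack as a candidate for extra resilience.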

Visit the Micro Focus LoadRunner Help Center to learn how to incorporate Gremlin attacks into your LoadRunner Professional scenarios.

How Does Gremlin Chaos Engineering Work?

Gremlin is one of the foremost chaos engineering tools and helps you test how your system responds under stress. By incorporating Gremlin attacks into your LRP load tests, you can understand how unexpected failures will impact your infrastructure and applications.

Once a Gremlin disruption event has been added to your LRP scenario, it will affect your chosen component for a specified duration.

For example, you could choose to run a CPU attack event for 5 minutes, starting after half an hour. When you run this scenario script, at the 30-minute mark, Gremlin will attack the chosen CPU.  
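The timing in that example can be modelled in a few lines of Python. This is only a sketch of the schedule, not the LoadRunner or Gremlin configuration format. `ScheduledAttack` and its fields are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScheduledAttack:
    """Hypothetical description of one disruption event in a load test."""
    target: str
    start_after_s: int  # seconds from the start of the load test
    duration_s: int     # how long the disruption lasts

    def phase(self, elapsed_s):
        """Label a point in the test as 'baseline', 'attack' or 'recovery'."""
        if elapsed_s < self.start_after_s:
            return "baseline"
        if elapsed_s < self.start_after_s + self.duration_s:
            return "attack"
        return "recovery"


# The example from the text: a 5-minute CPU attack starting after 30 minutes.
cpu_attack = ScheduledAttack(target="cpu", start_after_s=30 * 60, duration_s=5 * 60)

if __name__ == "__main__":
    for minute in (10, 30, 34, 35, 50):
        print(minute, cpu_attack.phase(minute * 60))
```

Labelling each measurement with its phase makes it easy to separate normal behaviour, behaviour under attack, and how quickly the system recovers afterwards.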

Gremlin allows you to define virtual attacks on different elements of your solution. You can perform high-level attacks on systems such as databases or web servers. You can also perform more specific attacks on components such as the CPU, disk, and memory.

Attacks will generally impact the regular workflow, limiting response or reducing performance, so that the web server works more slowly than usual and there are fewer successful transactions.

By using chaos engineering during performance testing, you add value to the process and learn about how your systems will cope with failures or attacks. Chaos engineering allows you to establish how resilient you are when something goes… bang!

To learn more about Chaos Engineering, contact Calleo today

by Stephen Davis

Stephen Davis is the founder of Calleo Software, an OpenText (formerly Micro Focus) Gold Partner. His passion is to help test professionals improve the efficiency and effectiveness of software testing.
