26 May, 2022

What is Chaos Engineering? (and Why You Need It)

Bang!! All of a sudden, one of your production servers goes down. What happens next? How does your system respond? Fortunately, the outage was only in your test environment, at least this time. It was introduced randomly, but deliberately, by chaos engineering.

To understand the need for chaos testing, it’s worth thinking about a few questions:

  • Do you know what would happen if your production system suffered an outage?
  • Are you vulnerable to failures or attacks in your live infrastructure?
  • How do you respond to CPU failures, network failures, software service failures, etc.?

How Chaos Testing Helps 

Incidents like those mentioned above have often been seen as too difficult, too expensive, or too low a priority to test. If you did test attacks and failures, you were in the minority, and you typically had one shot to get it right.

Chaos engineering allows you to throw a spanner in the works. Easily, repeatedly and affordably.

Chaos engineering identifies weaknesses before they become outages. It proactively tests how a system responds under stress and infrastructure failure, so you can fix those weaknesses before they end up in the news.

What is Chaos Engineering?

Chaos engineering introduces failures into a distributed computing system to test its resilience.

Also known as chaos testing, chaos engineering was popularised by Netflix when they introduced ‘Chaos Monkey’ to test their redundant architecture. If you are running customer-facing systems, chaos testing is a useful and straightforward process. Of course, you need to have the right tools and testing processes in place.

It is worth noting that chaos engineering is essentially a modern take on ‘resilience testing’, which has been around the tech industry since the year dot.

Usually, with software testing you are assessing your solution’s ability to carry out your business processes. However, you assume that the underlying architecture is fully operational.

With chaos engineering, you still assess your business processes, but failures are deliberately introduced into the underlying architecture.

Incidents tested by chaos engineering include:

  • Severed network connections
  • Server outages
  • Software component crashes
  • Component degradation
  • And many more
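
To make the idea of deliberately introducing a failure more concrete, here is a minimal Python sketch. It is an illustration only: the inventory service, the product page function and the fallback message are all hypothetical names invented for this example, and a real chaos tool would break the actual infrastructure rather than swap in a failing class.

```python
class InventoryService:
    """Hypothetical downstream dependency, used only for this illustration."""

    def stock_level(self, item: str) -> int:
        return 42


class FaultyInventoryService(InventoryService):
    """Same interface, but every call fails as if the server were down."""

    def stock_level(self, item: str) -> int:
        raise ConnectionError("injected failure: inventory server unreachable")


def product_page(item: str, inventory: InventoryService) -> str:
    """Business process under test: render a line of a product page."""
    try:
        return f"{item}: {inventory.stock_level(item)} in stock"
    except ConnectionError:
        # Degrade gracefully instead of failing the whole page.
        return f"{item}: stock level temporarily unavailable"


if __name__ == "__main__":
    # Normal run: the underlying architecture is fully operational.
    print(product_page("widget", InventoryService()))
    # Chaos run: the same business process with a failure injected underneath.
    print(product_page("widget", FaultyInventoryService()))
```

The business process is exercised in both runs; the only thing that changes is the health of the component underneath it.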

Why is Chaos Engineering Important?

We mentioned Netflix’s involvement earlier. Since then, chaos testing has been adopted by Google, Microsoft, Amazon, LinkedIn, Facebook, and many others. Beyond tech companies, finance and retail organisations are also seeing significant benefits.

There is often an underlying fear with software systems, an elephant in the room. What happens if part of your system goes down? How resilient are your systems?

Most systems nowadays are heavily interconnected, with multiple customer access points. If even a small component were to fall over, it could lead to much larger issues.

A lot of companies choose to bury their heads in the sand, ignoring these potential issues in the hope that they won’t happen. Unfortunately, though, they do happen.

Chaos engineering directly addresses these issues.

When you randomly introduce failures into your systems, you can:

  • Understand how failures affect your solution
  • Test your redundant systems (if you have any)
  • Identify components that need additional resilience
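
As a rough sketch of what random failure injection against redundant systems might look like, the Python below terminates randomly chosen servers from a pool and checks whether enough remain. The server names, the `run_experiment` helper and the availability rule are assumptions made up for this example, not part of any real chaos tool.

```python
import random


def run_experiment(replicas: list[str], kills: int, seed: int) -> list[str]:
    """Randomly terminate `kills` replicas and return the survivors."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    victims = set(rng.sample(replicas, kills))
    return [r for r in replicas if r not in victims]


def is_available(survivors: list[str], minimum: int = 1) -> bool:
    """The service counts as available while enough replicas remain."""
    return len(survivors) >= minimum


if __name__ == "__main__":
    pool = ["web-1", "web-2", "web-3"]  # hypothetical server names
    for seed in range(3):
        survivors = run_experiment(pool, kills=2, seed=seed)
        print(seed, survivors, is_available(survivors))
```

Seeding the random generator is a deliberate choice here: the failures are still random, but any run that exposes a weakness can be repeated exactly.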

How Can I Get Started with Chaos Testing?

As part of its ongoing development, Micro Focus LoadRunner Professional (LRP) now integrates with Gremlin, one of the leading chaos engineering tools, so users can introduce chaos testing during performance testing.

LoadRunner Professional uses integrated Gremlin APIs to orchestrate chaos testing. Once you have added Gremlin scenarios to a test, LoadRunner sends a request to Gremlin to execute each predefined scenario.

You can then compare how your solution responds during an attack against how it performs normally.
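
One simple way to think about that comparison is sketched below in Python. It assumes each test run has been exported as a list of (response time in seconds, succeeded) pairs; that format, the function names and the sample figures are all invented for illustration and are not how LoadRunner or Gremlin report results.

```python
from statistics import mean

Sample = tuple[float, bool]  # (response time in seconds, request succeeded)


def summarise(samples: list[Sample]) -> dict[str, float]:
    """Reduce one test run to a mean response time and a success rate."""
    times = [t for t, _ in samples]
    successes = sum(1 for _, ok in samples if ok)
    return {
        "mean_response_s": mean(times),
        "success_rate": successes / len(samples),
    }


def compare(baseline: list[Sample], attack: list[Sample]) -> dict[str, float]:
    """Express the attack run relative to the normal (baseline) run."""
    b, a = summarise(baseline), summarise(attack)
    return {
        "slowdown_factor": a["mean_response_s"] / b["mean_response_s"],
        "success_rate_drop": b["success_rate"] - a["success_rate"],
    }


if __name__ == "__main__":
    # Made-up sample data, purely to show the shape of the comparison.
    baseline = [(0.25, True), (0.25, True), (0.25, True), (0.25, True)]
    attack = [(0.5, True), (1.0, True), (0.75, False), (0.75, False)]
    print(compare(baseline, attack))
```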

Visit the Micro Focus LoadRunner Help Center to learn how to incorporate Gremlin attacks into your LoadRunner Professional scenarios.

How Does Gremlin Chaos Engineering Work?

Gremlin is one of the foremost chaos engineering tools and helps you test how your system responds under stress. By incorporating Gremlin attacks into your LRP load tests, you can understand how unexpected failures will impact your infrastructure and applications.

Once a Gremlin disruption event has been added to your LRP scenario, it will affect your chosen component for a specified duration.

For example, you could run a CPU attack event for 5 minutes, starting half an hour into the test. When you run this scenario, Gremlin will attack the chosen CPU at the 30-minute mark.
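
The timing in that example can be modelled with a few lines of Python. This is a sketch only: `DisruptionEvent` is a hypothetical stand-in for a scheduled attack, not a LoadRunner or Gremlin class, and it simply answers whether the attack would be running at a given point in the test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisruptionEvent:
    """Hypothetical stand-in for a disruption event in a load test scenario."""

    name: str
    start_after_s: int  # seconds into the test before the attack begins
    duration_s: int     # how long the attack lasts, in seconds

    def is_active(self, elapsed_s: int) -> bool:
        """True while the attack window covers the given elapsed time."""
        end_s = self.start_after_s + self.duration_s
        return self.start_after_s <= elapsed_s < end_s


# The example from the text: a 5-minute CPU attack starting after 30 minutes.
cpu_attack = DisruptionEvent("cpu", start_after_s=30 * 60, duration_s=5 * 60)

if __name__ == "__main__":
    for minute in (0, 29, 30, 34, 35, 60):
        state = "attack running" if cpu_attack.is_active(minute * 60) else "normal"
        print(f"minute {minute:>2}: {state}")
```

Knowing exactly when the attack window opens and closes is what lets you line up any dip in your performance results with the disruption that caused it.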

Gremlin allows you to define virtual attacks on different elements of your solution. You can perform high-level attacks on systems such as databases or web servers. You can also perform more specific attacks on components such as the CPU, disk, memory, etc.

Attacks will generally impact the regular workflow, limiting response or reducing performance, so that the web server works more slowly than usual and there are fewer successful transactions.

By using chaos engineering during performance testing, you add value to the process and learn about how your systems will cope with failures or attacks. Chaos engineering allows you to establish how resilient you are when something goes… bang!

To learn more about Chaos Engineering, contact Calleo today

by Stephen Davis

Stephen Davis is the founder of Calleo Software, an OpenText (formerly Micro Focus) Gold Partner. His passion is to help test professionals improve the efficiency and effectiveness of software testing.
