As software developers, we rely on a variety of tools and processes to ensure that our code is reliable and performs well. We write unit tests to validate the behavior of individual components, and we use staging environments to simulate real-world situations and catch issues before deployment. However, despite our best efforts, things can still go wrong once the code reaches production.

There are countless factors that can affect the performance and reliability of software in production, including production-specific configurations, environment variables, and unexpected interactions with other systems. No amount of unit tests or staging environments can completely eliminate these types of issues. Even code that has been thoroughly tested and appears to be working perfectly in a staging environment can fail in unexpected ways when it is deployed to production.

That's why it's important to have a plan in place for handling those types of issues. Testing in production drastically increases confidence when releasing software since you are literally testing the real thing, the actual feature that your end users are going to be using. In the following sections, we'll explore some strategies and key aspects of testing directly in production for improving reliability and minimizing risk.

The benefits of testing in production

Testing in production refers to the practice of deploying code changes and testing them in a live production environment, rather than in a staging or testing environment. There are several benefits to this approach:

Improved reliability: Testing in production allows you to catch and fix issues that may not have been identified in a staging or testing environment. This typically happens when dealing with payments, electronic signatures, or any other external system that might behave differently in a staging environment. It is often referred to as "integration hell": each part works individually, but everything falls apart when you try to plug the parts together.
Better user experience: Testing in production allows you to test changes with real users, which can provide valuable insights into how the changes are impacting the user experience. This can help you make more informed decisions about which changes to keep and which to roll back.
More realistic testing: Testing in production allows you to test changes in a more realistic environment, as the production environment typically has more data, traffic, and complexity than a staging or testing environment.

Overall, these benefits generally outweigh the cost of adding this practice to your processes.

Testing only on a few users

Testing in production does not mean blindly deploying changes to all users. Instead, it involves handpicking a small group of users to test changes with, typically people from your own team or a set of trusted beta testers with whom you have a strong relationship. This way, if any issues or unexpected behaviors arise, they will only affect a limited number of users rather than the entire user base.

As your confidence grows, you can gradually introduce the new feature to more users. Typically at this stage, you want to pick users at random based on a rollout percentage: 10% → 20% → 50% of your users for instance. Splitting users is actually pretty hard to do correctly:

A given user should always be served the same "version" of your app to avoid flickering. This is called stickiness.
The number of users who see a given version should follow your rollout percentage, and this percentage can change over time.
Users should be truly randomized: you don't want the same users to see all of your new features first. This part is often poorly executed if you decide to build this process in-house instead of using a turnkey solution like Tggl. If the split is not truly randomized, any data that you collect is worthless and will likely lead you to poor decisions.
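The three requirements above can be illustrated with a minimal sketch based on deterministic hashing. This is only one possible approach under stated assumptions, not how Tggl or any other tool works internally, and the names `bucket`, `is_enabled`, and `new-checkout` are hypothetical. Hashing the flag key together with the user ID gives stickiness (the same input always lands in the same bucket), comparing the bucket to the rollout percentage keeps the share of users close to that percentage, and including the flag key in the hash means each feature gets its own independent split.

```python
import hashlib


def bucket(flag_key: str, user_id: str) -> float:
    """Map a (flag, user) pair to a stable value in [0, 100)."""
    # Hashing the flag key together with the user ID gives each flag its own
    # independent split, so the same users are not always first in line.
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it to 10,000 slots,
    # which gives a resolution of 0.01%.
    slot = int.from_bytes(digest[:8], "big") % 10_000
    return slot / 100


def is_enabled(flag_key: str, user_id: str, rollout_percent: float) -> bool:
    """Return True if the user falls inside the current rollout percentage."""
    return bucket(flag_key, user_id) < rollout_percent


if __name__ == "__main__":
    # Hypothetical user IDs, used only to show how the share evolves.
    users = [f"user-{i}" for i in range(10_000)]
    for percent in (10, 20, 50):
        share = sum(is_enabled("new-checkout", u, percent) for u in users) / len(users)
        print(f"{percent}% rollout -> {share:.1%} of users see the feature")
```

Because a user's bucket never changes, raising the percentage from 10 to 20 only adds users: everyone who already saw the feature keeps seeing it. A production-grade split has more to handle (anonymous visitors, changing identifiers, analytics), which is part of why an existing tool is usually the safer choice.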

Once you have determined that the updated version is functioning properly and has had a positive impact on key performance indicators, you can roll out the feature to all users. At this stage, you may choose to retain the split testing code as a precautionary measure in case you need to revert the change, or you can remove any split testing code that is no longer needed.

Tools for testing in production

Being able to display a different version of your app to different users is key. Ideally, you should be able to switch between versions of your app instantly, without a code change or a deployment cycle. This allows you to quickly and easily test and evaluate changes, roll back, or make rapid adjustments as needed. This is where feature flagging tools really come in handy.
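To make the idea concrete, here is a minimal sketch of a flag whose behavior is driven by configuration that lives outside the code. Everything in it is an assumption made for illustration: the `FlagStore` class, the JSON layout, and the `new-checkout` flag are hypothetical and do not reflect the API of Tggl or any other tool. In a real setup the configuration would be fetched from the flagging service rather than hard-coded as a string.

```python
import json

# Hypothetical configuration, standing in for what a flagging service would
# return. Changing this document changes behavior without a deployment.
RAW_CONFIG = """
{
  "new-checkout": {
    "enabled": true,
    "allowed_users": ["alice", "bob"]
  }
}
"""


class FlagStore:
    """Holds the latest flag configuration and answers flag checks."""

    def __init__(self, raw_config: str) -> None:
        self._flags = json.loads(raw_config)

    def reload(self, raw_config: str) -> None:
        # Swap in a new configuration at runtime, for example after polling
        # the flagging service.
        self._flags = json.loads(raw_config)

    def is_enabled(self, flag_key: str, user_id: str) -> bool:
        flag = self._flags.get(flag_key)
        # Unknown or disabled flags fall back to the existing behavior.
        if flag is None or not flag.get("enabled", False):
            return False
        allowed = flag.get("allowed_users")
        # No allowlist means the flag is on for everyone.
        return allowed is None or user_id in allowed


if __name__ == "__main__":
    store = FlagStore(RAW_CONFIG)
    for user in ("alice", "carol"):
        version = "new" if store.is_enabled("new-checkout", user) else "current"
        print(f"{user} sees the {version} checkout")
```

Rolling back then amounts to publishing a configuration with `"enabled": false` and calling `reload`, with no code change involved.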

There are many tools available that can assist with this task, each offering a unique set of features and benefits. While we recommend using Tggl as it is the best compromise between ease of use and feature set, other options are also capable of handling basic requirements. To choose the right tool for your needs, you may want to consider:

Pricing: this can vary quite significantly between solutions.
Ease of use: this is a major differentiator. Feature flagging tools are meant to be used by many different profiles: tech, product, and sometimes even sales. Many solutions have user interfaces that are hard to use, so make sure to take this into consideration based on your situation.
API and SDKs: some tools require a lot of manual work to get started. Have a look at the documentation to estimate the effort required on your part.

In most cases, implementing a feature flagging solution should take a single developer only a few hours. Check out this article that walks you through the first steps of implementing feature flags in your workflow.

It is usually not recommended to implement your own solution. It might be tempting to write split conditions in your code, but it defeats the purpose of having an external tool: you still need a developer to make changes, and you are still bound to slow deployment cycles. And while it might seem straightforward on the surface, correctly splitting users, with stickiness, in a way that is statistically sound for your data is actually pretty hard.

Conclusion

In conclusion, testing in production can be a powerful tool for ensuring the quality and reliability of your software releases. By carefully planning and implementing testing strategies that take place in a live production environment, you can catch and fix bugs before they reach your entire user base, leading to more stable releases. While testing in production does come with its own set of challenges and risks, following best practices and using the right tools and techniques lets you mitigate these risks and benefit from a more efficient and reliable software development process.

If you haven't already, give it a try, see what your team thinks, and enjoy bug-free releases!

The power of testing in production

How to achieve bug free releases
