Reka Gay

Engineering Fellow

Reka is an Engineering Fellow at Featurespace. Having joined back in 2016, she was involved with many aspects and functions of the company from the early days. Reka originally joined as a Software Development Engineer in Test (SDET), focusing on the test infrastructure but soon became responsible for a number of internal tools throughout the years which earned her the title of Head of System Tools Team. With a degree in Computer Engineering, her career started as a college lecturer, went onto procurement, and then testing before settling on software development. She loves exercise and nutrition and used to be a masters squash player.

1 Why would you have internal tooling?

With so many tools commercially available, why would any company allocate resources to building tools internally? The Performance Tool highlights the many advantages of having such a team. The problem it addresses exists across a variety of departments but has different use cases and application scenarios. This means that every team would like to use such a tool, but each in a slightly different way. Without centralized management of the solution, each team member would end up spending a lot of time finding a tool that suits only their specific need. They would invest a lot of time in installation and in resolving network and security issues, and would most often end up realizing that an off-the-shelf tool is unlikely to work in exactly the way they would like it to.

I like using everyday scenarios as analogies and in this case let’s think about the need to buy new clothes. Sure, clothing shops are widely available, and we can send everyone out individually to buy what they need (or what they think they need – more on this later). This means a lot of people would be spending a lot of time searching for the right supplier. If they are online shopping, they would need to select a product online and upon delivery, may find that it just doesn’t suit what they were looking for.

Alternatively, we could get a tailor, who would take the time to identify the individual’s style preferences, take their measurements and deliver a garment made to that person’s exact requirements. The process might take longer for the tailor, but their work won’t interfere with the general workload, and the total time spent will be much less than if everyone went searching for what they need on their own. As a bonus, the result is a product that matches exactly what the person wanted!

It is important to point out that not all problems require tailor-made solutions – which is the second function of an internal tools team: to analyze whether a third-party tool would satisfy the criteria and suit the needs, then move to implement the solution. They would already have the know-how around installation, network setup, and security procedures so the production team asking for the tool wouldn’t have to reinvent the wheel and waste time.

2 First step: identifying the problem

The problem definition was simple: our performance team was overloaded. Individual performance scenarios waited a long time to be tested, causing huge delays between implementation and verification, as well as delays in reporting and root cause analysis. This meant we needed to redesign our performance testing process, but after a closer look we found that the issue wasn’t performance testing itself, which we had solved a long time ago. What we needed to do was to utilize our in-house testing framework by adding some complex thread handling and providing our performance team with a library that allows them to focus on scenarios, rather than infrastructure.

Or did we?

Broken down to its smallest parts, performance testing ARIC™ Risk Hub is pretty simple: our system consumes events, so all we have to do is ensure a stable, controlled load on our inputs and measure the effects. Sounds easy enough, especially as there is a multitude of systems that offer just that. So, why is it not widely done on a regular basis? To answer this question, we cast a broad net across multiple departments within Featurespace, asking: “How would you want to interact with a performance testing tool?”
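To make "a stable, controlled load" concrete, here is a minimal sketch of a constant-rate load loop. It is written in Python purely for brevity and is not taken from our framework: `send_event`, `run_constant_load` and `summarize` are hypothetical names, and a real test would push events over the network rather than call a local function.

```python
import time
from statistics import quantiles
from typing import Callable, Dict, List


def run_constant_load(
    send_event: Callable[[int], None],
    rate_per_second: float,
    total_events: int,
    clock: Callable[[], float] = time.perf_counter,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Send events at a fixed rate and return per-event latencies in seconds."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    interval = 1.0 / rate_per_second
    start = clock()
    latencies: List[float] = []
    for i in range(total_events):
        # Schedule against the start time, not the previous send, so one
        # slow send does not push back every later event and silently
        # lower the offered load.
        delay = (start + i * interval) - clock()
        if delay > 0:
            sleep(delay)
        sent_at = clock()
        send_event(i)  # hypothetical: push one event into the system under test
        latencies.append(clock() - sent_at)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Reduce raw latencies to a few percentiles (needs at least two samples)."""
    cuts = quantiles(latencies, n=100, method="inclusive")  # 99 cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98], "max": max(latencies)}
```

The clock and sleep functions are injectable so the pacing logic can be checked without waiting in real time. Measuring the effect on a distributed system would involve far more than send latency; the percentiles here only stand in for "measurement".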

The question took us on a very exciting journey from trying to resolve a testing issue, to ensuring that we’re providing the correct input, gathering all the necessary data prior to starting the test and most importantly: making sure that we allow our users to configure our large, distributed system simply and reliably, and then deploy it to flexible infrastructure. These are repetitive tasks that are error-prone when done manually but lend themselves to automation.

2.1 Our system requirements

After talking to colleagues from a variety of departments we realized that the question is not “how to test” – that is already covered – but rather “what are we testing”. This, in our case, boils down to ensuring that all possible sources for input data (locations for the installer, configuration files, system initialization etc.) are easily discoverable within the performance testing environment. Then comes the exciting question of deployment. To have a realistic scenario, ARIC Risk Hub has to be installed on large (>20 nodes) clusters. Our in-house virtualization environment requires on-premises physical hardware to run on, meaning that when multiple teams would like to run continuous performance testing, we would very quickly run out of rack space.

This analysis identified the two major issues:

To install ARIC Risk Hub, we need to gather a variety of different packages that all reside on different systems and are difficult to acquire without thorough knowledge of our build pipelines and repositories.
Once all necessary packages are gathered, we still have a limitation of physical hardware available, greatly restricting the number of performance tests that can be run at a time.

3 Second step: build an MVP

Why bother with an MVP (Minimum Viable Product, sometimes loosely referred to as a Proof of Concept), I hear you ask?

There is usually a huge difference between what people think they want and what they actually need. This means that even after the most thorough problem analysis, all we can build is what our users thought they wanted at the time. Once a tool is built, we always find that some requirements fade while others emerge. Going back to the analogy of using a tailor, this would be something like wanting a pocket on the left side of your trousers only to realize that once worn, it really needs to be on the right side. If we spent time with some very elaborate pocket to start with, we might find that placing it on the other side is more complicated than we would like. So, it’s always best to deliver the bare minimum first, then gather feedback to identify how it is being used and what areas need more effort.

3.1 The technology stack

The requirement for an easily accessible, simple-to-use tool means that we need to build a front-end. The Tools Team develop in React using the Carbon Design System, which provides most of the components we need out of the box and pushes us to adhere to good design principles. Coding in JavaScript is always a minefield, especially from the perspective of someone coming from a C#/Java background.

Client-side capabilities have improved hugely in the past few years, so there is a lot more we can do in the browser today. This means we regularly have to weigh up whether a particular piece of functionality can be done in the front-end and whether our application needs a backend at all. Even when we do decide that a backend is required, we could always go “serverless” and deploy our backend logic into the cloud. In our case, due to the heavy orchestration needs and reliance on a variety of subsystems, a backend was essential. As our company profile is Java-based, the only sensible choice here is Spring Boot.

The last piece of the puzzle here is the virtualization environment: where are we going to deploy our System Under Test (SUT)? We can utilize our existing OpenStack infrastructure, but given its limitations we also need a more sustainable solution that can flex with company growth. The answer to this is deploying to the cloud: we don’t have to buy the hardware for our virtualization, we just have to “lease” it from a cloud provider for the duration of the test. We can ensure that these test machines are deployed into a virtual private cloud without internet access, requiring an office VPN connection to be reachable, adding security and peace of mind. This solution still has its challenges, though, such as finding an easy and flexible way to provision the infrastructure.

3.2 The front-end

At the start of the process, the main issue we wanted to address was the different sources of packages we needed to gather to start testing. Without going into detail about what these sources are, the aim was to allow users to remain within the performance tool while gathering all the necessary parts, as well as to guide them through the process and eliminate as many mistakes as possible. For this purpose, we built a wizard-like structure that guides the user through the process, providing autocomplete-based discovery of each part and completely hiding the underlying systems. The last challenge here was providing feedback once the setup was done and the performance test phase had started. Our initial thought was to implement this using WebSockets, but a WebSocket connection is bi-directional and we only need to display server-side progress, so we decided on using the EventSource API (Server-Sent Events) instead. This is built into every modern browser and allows us to “subscribe” to events emitted by the server, much like an RSS feed.
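For readers who have not met Server-Sent Events before, what the browser's EventSource consumes is a plain-text stream served as `text/event-stream`, with each message made of `field: value` lines and terminated by a blank line. The sketch below (in Python, for illustration only) serializes one progress message in that format; the `format_sse` helper and the payload fields `phase` and `percent` are assumptions, not our actual message schema.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one message in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    if event is not None:
        # A named event is delivered to listeners registered for that name.
        lines.append(f"event: {event}")
    # json.dumps escapes newlines, so the payload always fits on one data line.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line marks the end of the message.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    # Hypothetical progress payload for the deploy phase.
    print(format_sse({"phase": "deploy", "percent": 40}, event="progress", event_id="7"), end="")
```

On the client side, an `EventSource` listener registered for the `progress` event would receive the JSON string in `event.data`. Because the stream flows one way only, there is no client-to-server channel to secure or keep in sync, which is the property that made it a good fit for progress display.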

3.3 The backend

The first part of the backend is straightforward: we needed to provide some endpoints for the front-end, allowing us to hide the potentially complex discovery of packages. Then, we need to keep track of the scenarios a user has already set up and run, which requires a database and allows a user to quickly re-use their previous work. The most complex part, however, is the “work” orchestration. Once the user has gone through the setup parts in the UI, the backend has all the necessary information to start the tests. It has to gather all the input packages (download phase), prepare the virtual environment and install the SUT (deploy phase), and run the tests (run phase). All the while it reports progress to the front-end, which it does via a Server-Sent Events emitter.
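The three phases and the progress reporting can be sketched as a small pipeline. This is an illustrative Python outline rather than our Spring Boot code: `Phase`, `ProgressEvent` and `run_scenario` are hypothetical names, each phase is reduced to a callable, and stopping at the first failed phase is an assumption about the desired behaviour.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class Phase(Enum):
    DOWNLOAD = "download"  # gather the input packages
    DEPLOY = "deploy"      # prepare the environment and install the SUT
    RUN = "run"            # execute the performance tests


@dataclass
class ProgressEvent:
    phase: Phase
    status: str  # "started", "finished" or "failed"
    detail: str = ""


def run_scenario(
    steps: Dict[Phase, Callable[[], None]],
    report: Callable[[ProgressEvent], None],
) -> bool:
    """Run the phases in order, reporting progress; stop at the first failure."""
    for phase in (Phase.DOWNLOAD, Phase.DEPLOY, Phase.RUN):
        report(ProgressEvent(phase, "started"))
        try:
            steps[phase]()
        except Exception as exc:
            # Surface the failure to the UI instead of deploying or
            # testing on top of a broken earlier phase.
            report(ProgressEvent(phase, "failed", str(exc)))
            return False
        report(ProgressEvent(phase, "finished"))
    return True
```

Passing `report` in as a callback keeps the orchestration independent of the transport: in a sketch like this it could append to a list in a test, or write to an event stream in a running service.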

3.4 The infrastructure

We already had an ingenious way of describing our infrastructural needs for a particular deployment: through a YAML file, describing the number of nodes and what each of them is responsible for. The last step here is to make this virtualization-agnostic – the same YAML file should be applicable to both an AWS EC2 deployment and our internal one. This requires a mapping layer implemented in the backend that will allow the user to define their infrastructure needs without having to think about whether this will be deployed locally or into the cloud.
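As a rough illustration of such a mapping layer, the Python sketch below takes a provider-neutral description (shown as the dictionary a YAML parser would produce) and expands it into per-machine requests for either target. Everything specific here is an assumption: the field names `role`, `count` and `size`, the role names, the `plan` function, and the size-to-machine-type tables are all invented for the example and do not reflect our real file format or sizing.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical, already-parsed form of the YAML infrastructure description.
SPEC = {
    "nodes": [
        {"role": "engine", "count": 3, "size": "large"},
        {"role": "database", "count": 2, "size": "medium"},
    ]
}

# Illustrative lookup tables: neutral size -> provider-specific machine type.
SIZE_MAP: Dict[str, Dict[str, str]] = {
    "aws": {"medium": "m5.large", "large": "m5.2xlarge"},
    "openstack": {"medium": "m1.medium", "large": "m1.large"},
}


@dataclass(frozen=True)
class MachineRequest:
    provider: str
    name: str
    role: str
    machine_type: str


def plan(spec: dict, provider: str) -> List[MachineRequest]:
    """Expand a provider-neutral spec into one request per machine."""
    if provider not in SIZE_MAP:
        raise ValueError(f"unknown provider: {provider}")
    sizes = SIZE_MAP[provider]
    requests = []
    for node in spec["nodes"]:
        for index in range(node["count"]):
            requests.append(
                MachineRequest(
                    provider=provider,
                    name=f"{node['role']}-{index}",
                    role=node["role"],
                    machine_type=sizes[node["size"]],
                )
            )
    return requests
```

Calling `plan(SPEC, "aws")` and `plan(SPEC, "openstack")` yields the same five node names and roles, differing only in machine type, which is the point of keeping the description itself free of provider details.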

4 Last step: measure success

Once the MVP was built and deployed, we had to measure the uptake and the added value of the tool, and gather additional requirements for a full version. We can rely on user feedback of course, but evidence suggests that “happy” users are usually less vocal than unhappy ones. So, it’s essential to add some analytics to the system, along with an easy way to provide feedback. We also scheduled sessions with users from across our target audience to have a structured conversation about how the tool is being used, trying to identify additional requirements or issues that need fixing as we move forward.

Ultimately, we’re expecting this tool to speed up development by providing our teams with:

A quick and easy way to measure the effect of their work themselves,
A means of aiding delivery through continuous testing of customer architecture,
A way for the support team to identify the source of a performance issue by easily running customer scenarios in-house, and
Even a way for data science to improve model performance and catch potential issues early.

Easy to use performance testing of our distributed systems