Make Integration Testing Fun Again

Posted on May 13, 2022

At Hippo Insurance, we built an integration testing framework and infrastructure that makes writing integration tests a no-brainer, super productive, and fun.

Why integration tests?

Unit tests and TDD are amazing. Seriously. However, they are just not enough. We have a service-oriented architecture that runs about 50 services in production, spanning Frontend, Gateway, Backend, Workers, Queues, and more. Writing unit tests for each of these is simply not enough, and it does not cover you in case of a regression caused by a configuration change (or anything else, for that matter).

I realize the statement above may seem controversial. However, we saw this as a problem we needed to solve.

Why are integration tests hard?

Integration tests are notoriously hard to write. They are also notoriously hard to keep in line with the services they test. As a result, your integration tests decay over time: new ones are still good, but old ones quickly go out of date.

Writing an integration test also typically looks nothing like writing real service interactions. You need to hand-roll things like HTTP calls to the service, expectations, and so on.

Another challenge is configuration management. You are testing service interactions with configuration you set up just for testing, but what about the configuration you have in prod, in staging, or in a new feature?

How are we solving this?

Before I dive into the details, here are the highlights:

  1. We use the same configuration management system as the services themselves
  2. We run against a real environment, a local environment, or staging
  3. We use the real compiled clients for each service, generated from the same OpenAPI definitions

Same config system

We use a configuration management system for our services. That same system, with the same variables, generates your testing configuration.

For example, here’s how you configure a connection to our billing-service:

billing_service_v1:
  base_url: "https://billing-service.{{ env }}.{{ internal_domain }}/source/{{ service_name }}"
  username: "billing"

And here’s how you configure it for testing:

billing_service_v1:
  base_url: "https://billing-service.{{ env }}.{{ internal_domain }}/source/{{ service_name }}"
  username: "billing"

Notice anything different? No? Right. There is no difference!

When this configuration runs against an environment, it configures the service; when it runs locally (for an integration test), it spits out a configuration file that looks exactly like what our internal libraries expect.
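
To make the "no difference" point concrete, here is a minimal sketch (in TypeScript, not our actual tooling) of how the same template can be rendered for a deployed environment and for a local test run. Only the template string comes from the example above; the renderConfig function, the environment values, and example.internal are illustrative assumptions.

type TemplateVars = { env: string; internal_domain: string; service_name: string };

// Replace {{ var }} placeholders in a template string with concrete values.
function renderConfig(template: string, vars: TemplateVars): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match, key: string) => vars[key as keyof TemplateVars]);
}

const baseUrlTemplate =
  'https://billing-service.{{ env }}.{{ internal_domain }}/source/{{ service_name }}';

// Deployed environment: the template becomes the service's real configuration.
const stagingUrl = renderConfig(baseUrlTemplate, {
  env: 'staging',
  internal_domain: 'example.internal',
  service_name: 'claims-worker',
});

// Local integration test: same template, different values, same resulting shape.
const testUrl = renderConfig(baseUrlTemplate, {
  env: 'dev',
  internal_domain: 'example.internal',
  service_name: 'integration-tests',
});

console.log(stagingUrl, testUrl);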

Real Environment

We run our integration tests against a real environment that looks exactly like production. It has databases, queues, services, workers, everything.

You can run it against staging, dev, or your own real environment in the cloud. I have written before about ephemeral dev environments; we will not get into that in this post.

You can obviously run it against a Docker Compose setup with LocalStack, but mostly we consider this accidental complexity. It’s too much effort to set up in 99.9% of cases.

Running it against a real environment gives us much higher confidence that the system will behave the way the tests say it will. You can also run it against a branch, of course.
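
As a hedged sketch, this is roughly how a test run might choose its target environment. The TEST_ENV variable name, the domain scheme, and the service_name value are assumptions for illustration, not our real setup.

// Hypothetical sketch: the same test suite pointed at different real environments.
const targetEnv = process.env.TEST_ENV ?? 'staging'; // e.g. 'staging', 'dev', or an ephemeral env name

// Feed the same template variables the services use into the test configuration.
const templateVars = {
  env: targetEnv,
  internal_domain: `${targetEnv}.example.internal`, // assumed domain scheme
  service_name: 'integration-tests',
};

console.log(`Running integration tests against ${templateVars.env}`);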

Same Clients

Each one of our services generates a client. Automatically!! The client is based on the OpenAPI definition of the service, and every time you release the service, you also release a new client.

Our tests use those clients to communicate with the services.

Here’s some sample code from one of our tests:

const { id: lockboxFileId } = await billingService.lockboxFileCreate({ params });

This is the same code you will find in our workers and services when they call one another. You can run your integration tests with various versions of the client, against various environments as stated above, giving you a very clear picture of how your REAL code will behave.

Let’s say your client now expects a different parameter: your tests will break without you changing a single line in them, exactly as they should. If you wrote custom code to interact with the service, that would not be the case.
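
For context, here is a hedged sketch of what a full test might look like around the line above, assuming a Jest-style runner. The import path, the params payload, and the assertion are illustrative assumptions; billingService.lockboxFileCreate is the real call shown earlier.

import { billingService } from './clients'; // generated client, wired with the rendered test config (assumed path)

describe('billing-service lockbox files', () => {
  it('creates a lockbox file', async () => {
    const params = { fileName: 'sample.lbx' }; // hypothetical payload
    const { id: lockboxFileId } = await billingService.lockboxFileCreate({ params });
    expect(lockboxFileId).toBeDefined();
  });
});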

Validation

To measure our success, we set the following goal: “Writing an integration test that interacts with 5 services and 5 workers should take 10 minutes or less to set up.”

We have achieved this goal. You can set up an integration test that really exercises complex flows in your system quite easily.
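
To give a feel for what such a flow looks like, here is a purely illustrative sketch of a test spanning two services and an asynchronous worker. The policyService client, the method names, the payloads, and the waitFor helper are all assumptions, not our real APIs.

import { billingService, policyService } from './clients'; // generated clients (assumed path)

// Poll until an async condition is met, since workers process in the background.
async function waitFor(check: () => Promise<boolean>, timeoutMs = 30_000, intervalMs = 1_000): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) return;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Condition not met within timeout');
}

it('bills a newly created policy end to end', async () => {
  // Create a policy through one service...
  const { id: policyId } = await policyService.policyCreate({ params: { product: 'home' } });

  // ...then trigger billing through another, using the same generated clients.
  const { id: invoiceId } = await billingService.invoiceCreate({ params: { policyId } });

  // Workers settle the invoice asynchronously, so poll for the final state.
  await waitFor(async () => {
    const invoice = await billingService.invoiceGet({ id: invoiceId });
    return invoice.status === 'issued';
  });
});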

Questions

If you have any questions or comments, feel free to ping me on Twitter and ask. I would love to discuss.