A/B and multivariate testing are key parts of our product development process. From the brainstorming phase to a feature’s launch, our data team collaborates with product managers, designers, and developers to ensure new features are tested and their effects are understood.
Given how important experiments are to the Kinja team’s data-driven culture, we’ve spent the last year further integrating our testing infrastructure into the Kinja platform. The improvements include removing certain dependencies on our code base, writing consolidated command line tools to assist in creating experiments and pulling results, and improving how users are placed in test variants. You can read more about how our testing infrastructure works in this post.
In addition, we’ve spent the last several months building a user interface for setting up, monitoring, and stopping A/B tests. We started after noticing that other companies had built similar in-house tools to manage their own testing platforms. In this post, I’ll discuss why we built the dashboard and where we hope to take it (we hope to explain some of the technical details in future posts).
Why build a dashboard for A/B testing?
Currently, A/B tests are managed by the data engineering team using various command line tools that bring together the API endpoints underpinning our testing infrastructure (these include Google Content Experiments, Google Ad Manager, Fastly, and Kinja itself). We have processes for alerting developers and non-technical stakeholders when an experiment starts and ends, but otherwise, the process can be something of a black box to those outside our team.
The UI accomplishes a few goals, including:
A more accessible, reliable system
Our user interface allows technical and non-technical users to see how an experiment is set up, what percentage of traffic is exposed to the changes, and which experiments, if any, are currently running. And because our API abstracts away the individual third-party APIs and follows the same flow every time a test is set up, no one needs to worry about a step being skipped or done incorrectly.
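As a rough illustration of what "the same flow every time" can look like, here is a minimal Python sketch that runs a fixed, ordered list of setup steps and reports exactly where a failure happened. Everything in it is an assumption made for illustration: the `run_setup` function, the step names, and the dictionary keys are hypothetical, the steps are placeholders that call no real third-party API, and this is not the code behind our API.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (name, function) pair; the function mutates the experiment dict.
Step = Tuple[str, Callable[[Dict], None]]


def run_setup(experiment: Dict, steps: List[Step]) -> List[str]:
    """Run every step in order and return the names of the completed steps.

    Stops at the first failing step and reports which steps already ran,
    so a skipped or broken step cannot go unnoticed.
    """
    completed: List[str] = []
    for name, step in steps:
        try:
            step(experiment)
        except Exception as exc:
            raise RuntimeError(
                f"setup failed at step '{name}' after completing {completed}"
            ) from exc
        completed.append(name)
    return completed


# Hypothetical placeholder steps. A real service would call out to the
# relevant third-party systems here instead of setting a flag on a dict.
def register_experiment(experiment: Dict) -> None:
    experiment["registered"] = True


def target_feature_flags(experiment: Dict) -> None:
    experiment["flags_targeted"] = True


def push_ad_key_values(experiment: Dict) -> None:
    experiment["ad_key_values_pushed"] = True


# The order is defined once, so every experiment goes through the same flow.
SETUP_STEPS: List[Step] = [
    ("register_experiment", register_experiment),
    ("target_feature_flags", target_feature_flags),
    ("push_ad_key_values", push_ad_key_values),
]
```

Calling `run_setup({"name": "example"}, SETUP_STEPS)` runs all three placeholder steps in order. Keeping the step list in one place is the design choice that matters here: adding a step means adding it to the list, not remembering to run another command.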
Prior to the creation of our experiments pages, we had a number of systems in place to monitor the steps of setting up an A/B test. These primarily relied on notifications in Slack, which would fire, for example, when someone targeted a Kinja feature flag to a specific variant of the test (we use feature flags because we’re a continuous integration shop). However, some steps like pushing key-values for ad slots to Google Ad Manager were not as readily recorded.
With the new experiments dashboard, those on the Kinja team can easily see whether a test is running and pause it if something goes awry. While anyone could have paused experiments in the past, doing so took more steps for more complicated setups, such as exposing a test to only certain sites or running multiple experiments simultaneously.
Accessing results from the past
We run many A/B tests on the Kinja team, and it can be easy to forget which tests were run several months or years ago. While we had a record of every test in Google Content Experiments, seeing that list required knowing where to find it. And even if you knew which test we had run, the results of that test weren’t immediately available. Our experiments dashboard will now serve as the one-stop destination for users looking to see what we’ve done in the past and how changes performed.
Whenever we’ve finished analyzing the results for a test, we send an email with the analysis to the entire tech team. The downside of this method is that if someone joins the company, they won’t have access to all previous tests without asking someone to forward the results. Now we can link to experiment results (and other external links like tasks on our project management board) on the page.
Setting the groundwork for the future
Currently, all analysis tasks are still done manually by the data engineering team. However, in the future we could integrate things like basic metrics and add functionality as we adapt our testing strategy. Because we have a robust API underpinning the UI, building new features that are accessible by all of product and engineering should be much easier.
For example, this project involved creating the first Python microservice that runs on Kinja; typically, our engineers have written all of our services in Scala. By laying the foundation here, our Python-based team will be able to iterate on and add to the work we’ve built.
How does it work?
Each experiment object contains metadata about the test, including the feature flags to enable for each experiment variant, who set up the test, and timestamps. Our API, internally known as kinja-experiments, manages the different aspects of setting up and ending experiments based on the object’s state (i.e., whether it is a draft, actively running, or stopped). As experiments move between states, the kinja-experiments system works through our logic and reaches out to the services (like Fastly and the Kinja features service) to set up and tear down the pieces of the test as necessary.
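To make the state-driven flow more concrete, here is a minimal Python sketch of an experiment object and its allowed state transitions. The class, field, and state names are hypothetical and chosen for illustration; the real kinja-experiments code may be structured quite differently, and the calls out to services like Fastly are only indicated by a comment.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, List


class State(Enum):
    DRAFT = "draft"
    RUNNING = "running"
    STOPPED = "stopped"


# Hypothetical transition table: a draft can be started and a running
# experiment can be stopped. Nothing else is allowed in this sketch.
ALLOWED_TRANSITIONS = {
    State.DRAFT: {State.RUNNING},
    State.RUNNING: {State.STOPPED},
    State.STOPPED: set(),
}


@dataclass
class Experiment:
    name: str
    created_by: str
    # Maps a variant name to the feature flags enabled for that variant.
    flags_by_variant: Dict[str, List[str]]
    state: State = State.DRAFT
    # Records when the experiment entered each state.
    timestamps: Dict[str, datetime] = field(default_factory=dict)

    def transition(self, new_state: State) -> None:
        """Move to new_state, rejecting transitions the table does not allow."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot move from {self.state.value} to {new_state.value}"
            )
        # A real service would set up or tear down external resources here,
        # depending on which transition is taking place.
        self.state = new_state
        self.timestamps[new_state.value] = datetime.now(timezone.utc)
```

For example, an experiment created with `Experiment(name="nav-test", created_by="data-team", flags_by_variant={"control": [], "variant_1": ["new_nav"]})` starts as a draft, and calling `transition(State.RUNNING)` followed by `transition(State.STOPPED)` walks it through its lifecycle while recording a timestamp for each state. Trying to restart a stopped experiment raises a `ValueError` in this sketch.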
Within the UI, you create a draft experiment on the homepage and add experiment information such as a name, the number of variants, and the feature flags to enable per variant. You can either save the experiment for later or start it if you’re ready to begin the test.
The homepage also lists all previous experiments and any tests scheduled for the future. In a later version, users will be able to configure which of our sites an A/B test can appear on (we have that functionality in the API, but alas that feature didn’t make it into the first version of the dashboard).
What’s on deck next?
We started using the A/B testing user interface in the last few weeks for all our current experiments, giving us an opportunity to resolve bugs and discover pain points to address in the future.
The team also has a backlog of ideas to implement, including the site selector mentioned above. In addition to that, we hope to add some sort of sorting functionality and maybe someday search! We’ll document improvements to the system as we make them, so be sure to look out for future blog posts about this project.
For now though, the A/B testing dashboard we’ve built is the latest improvement to our line of data tools. It wouldn’t have been possible without the amazing work of several people, including Allison Wentz, Josh Holbrook, and Victor Amos, as well as the lovely designers on the Kinja team!