Fast validation with A/B testing in Optimize
Sergio Mosquera
A bit of context
In the first quarter of 2022, the Growth Enablement team split into two sub-teams:
- The Labs team (a fast-paced team with a clear mission): The purpose of this team is to find and validate potential opportunities for growth, and to build a basic PoC as fast as possible. Each of these opportunities comes with one or more hypotheses about the improvement we expect in our funnel once the change is implemented, e.g. increase organic traffic by 5%, or raise the number of demos requested per month by about a hundred. Once a hypothesis is validated and the opportunity has demonstrated potential, this information is passed to the Enablement team to refine the implementation.
- The Enablement team (the muscle): This team is in charge of delivering the opportunities validated by Labs in the best possible way. They are also responsible for defining and validating the hypotheses (and, of course, implementing a solution) for long-term opportunities that cannot be validated fast enough, or for necessary changes whose impact is very hard to measure, e.g. opening a new channel (like our HR Job Portal) or migrating our public pages from Ruby + Haml to Next.js.
As a result, we needed a tool to help the Labs team work faster and validate our hypotheses as reliably as possible. And the task is made more complex by the fact that hypothesis validation requires both technical and non-technical expertise 🙂
Hypotheses Validation Steps
In the previous section we talked about validating hypotheses fast, but we didn’t actually discuss “the how”. We didn’t have to reinvent the wheel here; just like many other companies, we keep it simple:
- Find potential improvements.
- Define the hypotheses and expected metrics based on these potential improvements.
- Define a set of experiments (variants) to validate the different selected approaches.
- Do an A/B test with some variants to find the best approach.
- Check if the results match our expectations and make a decision based on the approach that best suits our needs.
As seen above, in five steps we can validate whether a solution works, and then decide if it is worth launching.
In this article, we will focus on step number 4 and explain how we carry out A/B testing: how we create experiments with Google Optimize, measure them with Google Analytics, and determine whether they are successful or not with Google (just kidding) 😅
A/B testing in Factorial
Our expectations are sometimes right and sometimes wrong; however, we are proud of both outcomes because we always learn something new 💪. Below, I’ll share two examples of opportunities we’ve worked on: one whose hypothesis was correct and another where it was wrong.
Test different Headlines - Success ✅
We have had the same headline for about two years, and we are sure we could have a huge impact if we changed it to one that better explains Factorial’s value proposition.
With the above statement in mind, the Labs team proposed different variants according to the tone of the headline: a human-centered (informal) one, a business-oriented one, and a practical one.
After carrying out tests in 9 countries, we found that in 6 of them the change was a complete success! We also gained very important insights about the different countries, something we will take into account in future experiments. The following fragment is extracted directly from the outcome analysis:
Improve “Thank you” pages - Failure ❌
When a user downloads a freebie or requests a demo, we were missing the opportunity to nurture them with more content, such as showing other recommendations or promoting “Signup”. After doing some research and discussing it with the rest of the team, we decided to carry out a small action that we believed could have quite an impact: reducing the text on the thank-you page.
This was a good opportunity for fast validation. The premise is simple: check whether reducing friction has an impact on the funnel (it usually does, but not always...). In the US, the conversion rate is much higher than in other countries (around 10%), and the main difference we perceived between the pages was the amount of text. With this in mind, we thought that reducing the amount of text on the “thank you” pages might increase the conversion rate a lot (from 5% to 10%).
We implemented these changes and measured the results for a month (evaluation periods are usually shorter, but in this case some results were inconclusive, so we added a few more days). We discovered that our hypothesis was wrong and, just as with a successful opportunity, we gained some insights, in this case about how the amount of text performs very differently across markets.
Again, the screenshot below is extracted directly from the outcome analysis 👇
Google Optimize + Analytics
I believe the process is now clearer: we derive hypotheses from growth opportunities, we run experiments, and we measure the results. With this in mind, we can explain in more detail how we measure these experiments and their variants:
Google Marketing Platform offers a set of tools that are super useful for gathering all kinds of user-related metrics, visualizing them in different ways, and testing custom solutions without having to work directly on code. For instance, in Google Analytics you can check predefined metrics like page visits during a period of time, the bounce rate of different landing pages, insights about the devices or browsers users navigate with, and so on... But you can also receive custom events from your application or define custom goals (e.g. a user landing on a specific page, or an event of type “Signup Complete” being received), which are better ways to get actionable insights related to your funnel metrics.
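For context, sending such a custom event from the application is usually a one-liner with gtag.js. The snippet below is a minimal sketch; the event name "signup_complete" and its parameters are illustrative, not our actual tracking schema.

```ts
// Minimal sketch of reporting a custom event to Google Analytics via gtag.js.
// Assumes the gtag snippet is already loaded on the page; the event name and
// parameters below are illustrative only.
declare const gtag: (...args: unknown[]) => void;

export function trackSignupComplete(): void {
  gtag("event", "signup_complete", {
    event_category: "funnel",
    event_label: "public_site",
  });
}
```

In Analytics you can then define a goal that counts a conversion whenever this event is received.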
Google Optimize, on the other hand, is used to test variants of some experiment within a segment of users. The configuration of the experiments is very straightforward, and allows anyone (with or without technical knowledge) to set up a new experiment on a website. In addition to this, Optimize is connected directly with Analytics to get insights about which variant performs better according to some objective (predefined metric or custom goal).
As mentioned above, it is quite easy to configure a new experiment:
- First of all, you need to choose what type of experiment you want to run (the next section covers this in more detail).
- Define the different variants and how your audience will be split among them.
- Define the pages on which the experiment will run (it can be more than one).
- Last but not least, you have to choose an objective for the experiment, so that Optimize can provide insights about it across all the variants.
And that is all you need to start a new experiment, simple and clean 😃. After this, you just need to wait patiently until you gather enough user interactions to draw conclusions. The next section explains the options we use at Factorial for A/B testing with Optimize. There are more options than these, but we haven’t used the others, so we’ll focus on the ones we have 😉
Different A/B Test options in Optimize
Optimize can be used to modify the UI of your website directly, but that is not the only way to do A/B testing, and this section explains the options we have experimented with. Along with each option, you can find a list of the pros and cons we have found so far. I have to admit that, even though none of them is perfect, all of them are very useful depending on the situation. There is no silver bullet for A/B testing in our case, and we are pragmatic, so we don’t mind using one option or another depending on the opportunity:
Direct changes from Optimize Editor
Pros:
- It is the fastest and easiest option to make changes.
- Non-technical people can change basic styles (very basic CSS) or copy from the editor without major issues.
Cons:
- It can generate flickering while the changes are applied → poor CLS (Cumulative Layout Shift) results.
- The changes that can be made are a bit limited. It is more suitable for minor style or copy changes; to change functionality, it is better to get a technical person involved or use the other options.
- Weird results depending on the technology your website is built with → check how the Next.js hydration process makes this harder (a sketch of one possible mitigation follows this list).
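One possible mitigation (a sketch, not necessarily how we handle it at Factorial): instead of letting Optimize apply the editor’s changes on page load, configure the experiment with a custom activation event and fire it from the app once React has finished hydrating, and again on client-side navigations. The snippet below assumes a Next.js _app and that the experiment in the Optimize container is set to activate on an event named "optimize.activate"; adapt the event name to whatever your container uses.

```tsx
// A minimal sketch, assuming the Optimize experiment uses a custom activation
// event ("optimize.activate" here) instead of activating on page load. Firing
// the event after hydration, and again on client-side route changes, lets the
// editor's DOM changes apply once React is done, reducing flicker and
// hydration mismatches.
import { useEffect } from "react";
import { useRouter } from "next/router";
import type { AppProps } from "next/app";

declare global {
  interface Window {
    dataLayer?: Record<string, unknown>[];
  }
}

function activateOptimize(): void {
  window.dataLayer = window.dataLayer || [];
  window.dataLayer.push({ event: "optimize.activate" });
}

export default function MyApp({ Component, pageProps }: AppProps) {
  const router = useRouter();

  useEffect(() => {
    activateOptimize(); // the first render is already hydrated at this point
    router.events.on("routeChangeComplete", activateOptimize);
    return () => router.events.off("routeChangeComplete", activateOptimize);
  }, [router.events]);

  return <Component {...pageProps} />;
}
```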
Inject information from Optimize and handle changes from code
Pros:
- Devs have full control over how and when the experiment runs, so flickering can be handled more efficiently (see the sketch after this list).
- More complex experiments can be implemented with simpler configurations from Optimize.
Cons:
- Slower process, as it requires a developer to get involved.
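As a concrete illustration of this option, here is a minimal sketch of a React hook that asks Optimize which variant the current user landed in, using the documented "optimize.callback" registration in gtag.js. The experiment ID and the component names in the usage comment are placeholders, not our real setup.

```ts
// A minimal sketch: Optimize picks the variant, and the application reads it
// through the documented "optimize.callback" registration in gtag.js.
// EXPERIMENT_ID is a placeholder for the ID shown in the Optimize UI.
import { useEffect, useState } from "react";

declare const gtag: (...args: unknown[]) => void;

const EXPERIMENT_ID = "YOUR_EXPERIMENT_ID";

export function useOptimizeVariant(): string | undefined {
  const [variant, setVariant] = useState<string | undefined>(undefined);

  useEffect(() => {
    gtag("event", "optimize.callback", {
      name: EXPERIMENT_ID,
      // value is the variant index as a string: "0" (original), "1", "2", ...
      callback: (value: string) => setVariant(value),
    });
  }, []);

  return variant;
}

// Usage (hypothetical components): render a different headline per variant.
// const variant = useOptimizeVariant();
// return variant === "1" ? <NewHeadline /> : <OriginalHeadline />;
```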
Working with redirects + DatoCMS
We recently discovered this option for running our experiments, and it comes in very handy given that we fetch our public pages’ content from a CMS. In some cases, it is as easy as creating a new page per variant and using a redirect experiment with those variant pages.
Pros:
- Suitable if creating a page in your CMS is a fast and frictionless process.
- Avoids flickering, because there are no changes to the DOM after the page loads.
- Allows simple variants or more complex ones, because it is possible to pass information in the redirected URL as query params (see the sketch after this list).
Cons (a bit biased):
- The CMS is usually shared with other teams, so the pages you are using for the experiments need to be “locked” to avoid changes during the evaluation phase.
- Also, if the experiment implies testing a new feature for your CMS, e.g. you add a new layout to display the content in a different way, you need to ensure others don’t use it until the evaluation finishes.
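To make the query-param idea from the pros list concrete, here is a minimal sketch of a Next.js page that renders a different amount of content depending on a variant parameter appended by the redirect. The route, parameter name, and content are illustrative; our real pages pull their content from DatoCMS.

```tsx
// A minimal sketch, assuming the Optimize redirect experiment sends each
// variant to the same route with a query param, e.g. /thank-you?variant=short.
// The route, param name, and content below are illustrative only.
import type { GetServerSideProps } from "next";

type Props = { showExtraContent: boolean };

export const getServerSideProps: GetServerSideProps<Props> = async ({ query }) => {
  const variant = typeof query.variant === "string" ? query.variant : "control";
  return { props: { showExtraContent: variant !== "short" } };
};

export default function ThankYouPage({ showExtraContent }: Props) {
  return (
    <main>
      <h1>Thank you!</h1>
      {showExtraContent && (
        <p>While you wait, here are some other resources you may like.</p>
      )}
    </main>
  );
}
```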
This is a brief overview of how we carry out A/B testing at Factorial, our decision-making process, and how we evaluate success. I had a swell time writing this, and I hope you have had the same reading it 😋