Over the past couple of years we have learned a great deal from our users and built out significant depth in features to meet their needs. We have worked to make communicating with beneficiaries at scale easier and more efficient through features like Google Sheets integration, IVRS integrations, Natural Language Processing (NLP) using Dialogflow, and multiple profiles, to name a few.
Beyond ease of use, we learned that it is crucial to be able to evaluate and improve your chat flows internally. Designing a chatbot is not a one-and-done exercise. It is an iterative process in which one builds, deploys, gathers user data and tweaks the chat flow to generate more value for the user. A/B testing is a method you can employ to do this.
Glific enables A/B testing using the ‘Split Randomly’ node. This node allows you to test two or more (up to 10) user journeys (a single message, a flow or a campaign) against each other by measuring their performance against key metrics for a sample group, before deciding what to deploy into production.
It does this by randomly sending each user down one of the paths that you have created following this node. For example:
- You could send out different types of multimedia files and check which file generates a more favourable response from your users (metric: desired answer).
- If users are not moving past a certain point in your flows, you could test modified versions of the flow against the original one to see which performs better (metric: flow completion).
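To make the idea of a random split concrete, here is a minimal Python sketch. It is not how Glific implements the node internally. The function name `assign_buckets` and the contact IDs are hypothetical, and the sketch assumes every bucket is equally likely.

```python
import random
from collections import Counter


def assign_buckets(contacts, num_buckets, seed=None):
    """Assign each contact to one bucket, chosen uniformly at random.

    Illustrative only: assumes equal-probability buckets and mirrors
    the 2-to-10 bucket range described for the 'Split Randomly' node.
    """
    if not 2 <= num_buckets <= 10:
        raise ValueError("num_buckets must be between 2 and 10")
    rng = random.Random(seed)  # a seed makes the split reproducible
    return {contact: rng.randrange(num_buckets) for contact in contacts}


# Hypothetical contact IDs, used only for this illustration.
contacts = [f"contact_{i}" for i in range(1000)]
assignments = assign_buckets(contacts, num_buckets=3, seed=42)

# Count how many contacts landed in each bucket.
bucket_sizes = Counter(assignments.values())
print(sorted(bucket_sizes.items()))
```

With a large enough sample the buckets come out roughly equal in size, which is what lets you compare the paths fairly.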
The best way to design your A/B test is as follows:
- Decide why and what you are testing.
- Design the alternate flow/flows.
- Make sure outcomes are trackable.
- Deploy the test.
1) Decide what and why you are A/B testing
It is valuable to begin by outlining the following in a short document:
- Define your aims and use case: Do you want to share information, collect information, generate interest, teach something, etc.?
- State your hypothesis: What do you think will work better, and why?
- List what you want to test: Flow length, flow language, phrasing, multimedia design, flow triggers, etc.
- Define your sample: The A/B test should be run with a representative sample. You need to decide which users, and how many of them, will give you enough data to make your decisions with confidence.
- Be specific: Do not make several different types of changes in a single A/B test. Make a few specific and related changes so that you can attribute differences in outcomes to those changes.
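If the metric you care about is a rate (for example, the share of users who complete a flow), a standard two-proportion sample size formula can give a rough idea of how many users each variant needs. The sketch below is a generic statistical estimate, not a Glific feature. The baseline and expected rates are made-up assumptions, as are the default significance level (5%) and power (80%).

```python
import math
from statistics import NormalDist


def sample_size_per_variant(p_baseline, p_expected, alpha=0.05, power=0.8):
    """Approximate users needed per variant to detect a change in a rate.

    Uses the usual normal-approximation formula for comparing two
    proportions with a two-sided test.
    """
    if p_baseline == p_expected:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_bar = (p_baseline + p_expected) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(
            p_baseline * (1 - p_baseline) + p_expected * (1 - p_expected)
        )
    ) ** 2
    return math.ceil(numerator / (p_baseline - p_expected) ** 2)


# Assumed example: flow completion is 40% today and we hope for 50%.
users_needed = sample_size_per_variant(0.40, 0.50)
print(f"Users needed per variant: {users_needed}")
```

The smaller the difference you expect between variants, the larger the sample you need, so it helps to state a realistic expected improvement in your hypothesis.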
2) Design the alternate flow/flows
- If you are testing an existing flow against a new one, go to the flows section of the Glific platform, select ‘make a copy’ on the existing flow and rename the copy as per your convenience. Ex: AB_Pilot_Registration
- If you are trying out something new, then simply create a new flow.
- Add a message and select the ‘split randomly’ node option. You can choose to create and test up to 10 different paths or ‘buckets’ for your users.
- Create or connect one flow for each bucket you have selected on the ‘split randomly’ node.
3) Make sure test outcomes are trackable
The entire purpose of A/B testing is to enable data-driven decision making for flow design. To ensure that you get the insights you need, design your flows so that they are trackable, and build out visualisations to see the data come in. Methods to do this are as follows:
- Add flow labels (start_a, start_b, end_a, end_b, other labels at every stage): Flow data will be captured in the “flow labels” field in the messages table. You would run a unique count of phone numbers for a particular flow label.
- Add contact fields: Create and update contact fields based on the flow and the stage in the flow. Contact data is captured as field label > field value. You would find this data in the contacts dataset. You would run a unique count of contacts with a particular field label and field value. (Ex: field label = Start_AB, field value = FlowA)
- Add contacts to collections: Add contacts to collections at the start and end of any flow, or at any other relevant point as per your use case. You can get a unique count of contacts in any collection within the Glific interface itself. We would not suggest this when you need many data points, as you would end up creating many collections.
For visualisation and analysis, flow labels are the best way to go. For multiple-step flows, they allow you to identify which stage the problem is at, if any. Otherwise, you could use the collections method for a quick start and end check on the Glific platform itself.
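As an illustration of the flow label method, the sketch below runs a unique count of phone numbers per flow label and derives a completion rate for each variant. The rows are made-up sample data, and the column names `phone` and `flow_label` are assumptions. Your exported messages table may use different names.

```python
from collections import defaultdict

# Hypothetical rows, shaped like a simplified export of the messages table.
messages = [
    {"phone": "+910000000001", "flow_label": "start_a"},
    {"phone": "+910000000001", "flow_label": "start_a"},  # repeat message
    {"phone": "+910000000002", "flow_label": "start_a"},
    {"phone": "+910000000003", "flow_label": "start_a"},
    {"phone": "+910000000001", "flow_label": "end_a"},
    {"phone": "+910000000002", "flow_label": "end_a"},
    {"phone": "+910000000004", "flow_label": "start_b"},
    {"phone": "+910000000005", "flow_label": "start_b"},
    {"phone": "+910000000004", "flow_label": "end_b"},
]


def unique_contacts_per_label(rows):
    """Count distinct phone numbers seen for each flow label."""
    phones_by_label = defaultdict(set)
    for row in rows:
        phones_by_label[row["flow_label"]].add(row["phone"])
    return {label: len(phones) for label, phones in phones_by_label.items()}


def completion_rate(counts, variant):
    """Share of contacts who reached end_<variant> after start_<variant>."""
    started = counts.get(f"start_{variant}", 0)
    ended = counts.get(f"end_{variant}", 0)
    return ended / started if started else 0.0


counts = unique_contacts_per_label(messages)
rate_a = completion_rate(counts, "a")
rate_b = completion_rate(counts, "b")
print(counts)
print(f"Flow A completion: {rate_a:.0%}, Flow B completion: {rate_b:.0%}")
```

Counting distinct phone numbers rather than messages matters here, because a user who triggers the same label twice should still count as one user.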
4) Deploy the A/B test
Just publish your flow and watch the data come in. We advise that you first run a small internal pilot to identify and iron out any kinks in your flow design (or your Data Studio report).
After this, you can roll the A/B test out to your intended sample.
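Once the data is in, you may want to check whether the difference between two variants is larger than what chance alone would produce. One common option is a two-proportion z-test, sketched below. This is a generic statistical check, not something built into Glific, and the counts in the example are invented.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(completed_a, started_a, completed_b, started_b):
    """Two-sided z-test for a difference between two completion rates.

    Returns (z, p_value). A small p_value suggests the difference is
    unlikely to be due to random variation alone.
    """
    rate_a = completed_a / started_a
    rate_b = completed_b / started_b
    pooled = (completed_a + completed_b) / (started_a + started_b)
    std_error = math.sqrt(
        pooled * (1 - pooled) * (1 / started_a + 1 / started_b)
    )
    if std_error == 0:
        raise ValueError("rates are all 0% or all 100%; test is undefined")
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented example: 160 of 400 users completed flow A, 200 of 400 flow B.
z, p_value = two_proportion_z_test(160, 400, 200, 400)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A common convention is to treat a p-value below 0.05 as evidence of a real difference, but the threshold is a judgment call and should be agreed before the test starts.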
With this feature, we move another step towards helping our users leverage data-driven decision making in their chatbot programs. Organisations like Antarang Foundation and Key Education Foundation have already identified use cases for A/B tests, and Glific will be working closely with them to see what improvements, if any, need to be made to this functionality.