Build a prediction

This article will help you:

  • Build a prediction in Amplitude Audiences
  • Create a predictive cohort from your prediction
  • Analyze your predictive cohort

This article will outline the steps of building and analyzing predictions.

NOTE: Be sure to check out our other articles on predictions: "Predictions: use Amplitude's AI to help maximize lift" and "Use predictions in your campaigns."

Build a prediction

To build a prediction in Amplitude Audiences, follow these steps:

  1. In Amplitude Audiences, click Predictions in the left rail. Then click + Create Prediction.
  2. By default, Amplitude Audiences assumes you want this prediction to apply to all users who’ve been active in the last 90 days. To change this and use a different starting cohort, click Define your own.
  3. Select the users who will be included in the starting cohort. Under Select Starting Cohort, choose the events, properties, or statuses that users in your cohort share.
  4. Next, specify the action you want the starting cohort to take. Under Define a Future Outcome, you can specify events you want—or don’t want—your users to fire, the properties you want them to have after taking an action, or some combination of all three. Be sure to specify the time frame in which you want your users to take this future action.

    TIP: Another way to think about a prediction is as a cohort transition: you’re predicting the relative likelihood that a user will transition from Cohort A (the starting cohort) to Cohort B (the future outcome) within the time frame you specified.

  5. Click Next. The Save Prediction tab will open.
  6. Give your prediction a name and add a brief description. Then click Build. It will take about an hour for Amplitude Audiences to build your prediction. You’ll receive an email when the process is done.

Analyze your prediction

Once Amplitude Audiences has finished building your prediction, you’ll want to take a look at the results. Depending on what you see, you’ll either save the prediction as a cohort, or start over with a new prediction.

  1. To view the results of your prediction, click the Predictions tab from the Cohorts page. This will show you a list of all the predictions created so far.
  2. Find your prediction and click it to open the prediction explorer. Here, you’ll see the distribution of all users in your starting cohort:
          • The Y-axis shows the likelihood a user will convert (i.e., arrive at the future outcome you specified earlier)
          • The X-axis shows the percentile of users

You can select a range of users by percentile and see how many users fall in the range, the predicted conversion rate of users in that range, and the likelihood of conversion for those users relative to the average.

NOTE: Percentile and probability are not the same thing. If you select the 80% - 100% percentile range, this does NOT mean the users in it have an 80% - 100% probability to convert. Instead, it means they’re in the top 20% of users, as ranked by probability to convert.
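To make the distinction concrete, here is a minimal Python sketch. The ten probabilities are made-up values, and `percentile_ranks` is a hypothetical helper written for this illustration, not part of Amplitude Audiences.

```python
def percentile_ranks(probabilities):
    """Return each user's percentile (0-100), ranked by predicted probability."""
    n = len(probabilities)
    # Indices of users, ordered from lowest to highest probability.
    order = sorted(range(n), key=lambda i: probabilities[i])
    ranks = [0.0] * n
    for position, idx in enumerate(order):
        ranks[idx] = 100.0 * (position + 1) / n
    return ranks


# Hypothetical predicted conversion probabilities for ten users.
probs = [0.01, 0.02, 0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.08, 0.12]
ranks = percentile_ranks(probs)

# Users in the 80% - 100% percentile range, i.e. the top 20% by rank.
top = [p for p, r in zip(probs, ranks) if r > 80]
print(top)  # [0.08, 0.12]
```

In this made-up data, the users in the top 20% have predicted probabilities of only 8% and 12%. Their percentile is high because they outrank the other users, not because they are likely to convert in absolute terms.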

At this point, you’ll want to evaluate whether your prediction is accurate. Amplitude Audiences provides four metrics to help you do this:

  • Accuracy: technically, this is the area under the ROC curve (AUC), a measure that weighs both true positive and false positive rates
  • True Positive Rate: this is the share of users who actually convert that the model correctly predicted would convert
  • False Positive Rate: this is the share of users who do not convert that the model incorrectly predicted would convert
  • Predicted vs Actuals: this compares the predicted conversion rates to observed historical conversion rates and gives you the difference, in percentage terms

Generally speaking, a good model will have an accuracy of at least 70%. Any model with an accuracy of 50% or less will be no better than a coin flip in its predictive ability.
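If you want to see how these metrics relate to one another, the sketch below computes them for a small made-up data set. It uses the standard textbook definitions (true positive rate is the share of actual converters the model flags, false positive rate is the share of non-converters it flags). The scores, labels, threshold, and function names are all assumptions for illustration, and how Amplitude Audiences computes the metrics internally is not described here.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (true positive rate, false positive rate) when every user
    with score >= threshold is predicted to convert."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, fp / negatives


def auc(scores, labels):
    """Area under the ROC curve: the chance that a randomly chosen converter
    has a higher score than a randomly chosen non-converter (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical model scores and observed outcomes (1 = converted).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

tpr, fpr = rates_at_threshold(scores, labels, threshold=0.5)
print(tpr, fpr)             # 0.75 0.25
print(auc(scores, labels))  # 0.75
```

An AUC of 0.5 means a converter outscores a non-converter only half the time, which is why a model at that level is no better than a coin flip.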

Feature importance

“Black box” predictions aren’t generally insightful. That’s why Amplitude Audiences ranks the events and user properties that are most important to your predictive model in the Feature Importance table, which you can find just below the Percentile Breakdown chart.

The Ratio column is a ranking of events or properties according to their importance to the model. It’s computed by comparing the percentage of users in the selected percentile range who fire an event to the percentage of users outside the selected range who fire it.

The % in Range column specifies the percentage of users in the selected percentile range who fired the respective event. Sort by this column to rank events by how widely they are used among users in the selected range.

The Frequency column displays the average number of times a user in the selected range fires an event. Sort by this column to rank events according to overall level of engagement.
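The three columns can be illustrated with a short Python sketch. This is an approximation under stated assumptions, not Amplitude's implementation: it assumes Ratio is the in-range percentage divided by the out-of-range percentage, and that Frequency averages over every user in the range, including those who never fired the event. The `feature_columns` function, the event name, and the user data are all hypothetical.

```python
def feature_columns(users, event, in_range_ids):
    """Compute illustrative Ratio, % in Range, and Frequency values for one event.

    users: dict mapping user_id -> dict of event_name -> times fired
    in_range_ids: set of user_ids inside the selected percentile range
    """
    in_range = [u for uid, u in users.items() if uid in in_range_ids]
    out_range = [u for uid, u in users.items() if uid not in in_range_ids]

    # Share of users who fired the event at least once, inside and outside the range.
    pct_in = sum(1 for u in in_range if u.get(event, 0) > 0) / len(in_range)
    pct_out = sum(1 for u in out_range if u.get(event, 0) > 0) / len(out_range)

    # Assumption: Ratio is a simple quotient of the two percentages.
    ratio = pct_in / pct_out if pct_out else float("inf")

    # Assumption: Frequency averages over all users in the range.
    frequency = sum(u.get(event, 0) for u in in_range) / len(in_range)
    return {"ratio": ratio, "pct_in_range": pct_in, "frequency": frequency}


# Hypothetical event counts for six users; u1 and u2 are in the selected range.
users = {
    "u1": {"Add to Cart": 4},
    "u2": {"Add to Cart": 2},
    "u3": {},
    "u4": {"Add to Cart": 1},
    "u5": {},
    "u6": {},
}
row = feature_columns(users, "Add to Cart", in_range_ids={"u1", "u2"})
print(row)  # {'ratio': 4.0, 'pct_in_range': 1.0, 'frequency': 3.0}
```

Here every user in the range fired the event, compared with one in four outside it, so the event is four times as common among the selected users.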

Build a cohort from your prediction

Once you’ve got a useful prediction, you can save it as a cohort. This enables you to return to it in the future and repeatedly use it in targeting campaigns.

To save your prediction as a cohort, follow these steps:

  1. Use the slider to select the desired percentile range on the chart. Then click Save as a Cohort.
  2. Give your cohort a name, toggle the discoverability switch to your preferred setting, and click Save.

While it can be tempting to just slice the starting cohort into two sections—i.e., top 20% vs bottom 80%, or top 50% vs bottom 50%—other approaches can give you far more useful results:

  • Probability inflection: Find the spot where the distribution graph spikes exponentially, and split users along the spikes. This will group users into broadly similar buckets of predicted conversion rates.
  • Sample size: If you have an idea of how many users you want to target in a growth campaign, then select that percentage on the right side of the chart. For example, if you want to target 2000 users and you have 20,000 users in the starting cohort, then simply select the top 10%.
  • Minimum detectable lift: If you plan to target the selected users in a growth campaign, make sure the sample size is large enough to detect incremental lift. For example, if the top 20% of a prediction is 20,000 users, but the predicted conversion rate is 1%, you won’t be able to detect lift at statistically significant levels. Instead, you must increase the sample size to the top 45% of users, or 45,000 users.
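The arithmetic behind the sample size and minimum detectable lift approaches can be sketched in Python. The lift calculation below is a common normal-approximation formula for comparing two conversion rates. It assumes an even split between control and treatment groups, a 5% significance level, and 80% power. These choices and both function names are assumptions for illustration, not a description of how Amplitude Audiences evaluates lift.

```python
from math import sqrt
from statistics import NormalDist


def top_fraction(target_users, cohort_size):
    """Fraction of the starting cohort to select to reach a target user count."""
    return target_users / cohort_size


def minimum_detectable_lift(baseline_rate, users_per_group, alpha=0.05, power=0.80):
    """Approximate smallest absolute lift in conversion rate that a two-group
    test can detect, using a normal approximation for two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)
    standard_error = sqrt(2 * baseline_rate * (1 - baseline_rate) / users_per_group)
    return (z_alpha + z_power) * standard_error


# Sample size: 2,000 target users out of a 20,000-user starting cohort.
print(top_fraction(2000, 20000))  # 0.1, i.e. select the top 10%

# Minimum detectable lift: 20,000 users split evenly, 1% baseline conversion.
mdl = minimum_detectable_lift(baseline_rate=0.01, users_per_group=10_000)
print(round(mdl, 4))  # 0.0039, i.e. roughly 0.4 percentage points
```

Under these assumptions, a campaign on 20,000 users with a 1% baseline would need to raise conversion by about 0.4 percentage points, a relative lift of nearly 40%, before the difference became detectable. Adding users shrinks that threshold.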

NOTE: When a user’s probabilities change, Amplitude Audiences will automatically adjust their cohort membership if they fall into or out of the selected percentile range.

Analyze your predictive cohort

Once you save a prediction as a cohort, you can use it for analysis in any Amplitude Analytics chart. Here are some suggestions for analyses using prediction-derived cohorts:

  • Create top 20% and bottom 80% cohorts to compare the best and worst users. Set them as different segments in the right module of any chart.
  • Event Segmentation: see the historical behavioral trends of best users vs worst users prior to converting.
  • Pathfinder: identify the different sequences of actions users take if they have a high likelihood vs low likelihood to convert.
  • Composition: break down the property values of the respective cohorts to see differences in user properties (e.g., which countries the best users vs worst users are in).
  • Engagement Matrix: compare the events fired by the best users vs the worst users, based on the balance of frequency and % of users.
  • Funnel: compare relative conversion rates for any sequence of actions between the best users and worst users.