- How are the data points calculated?
- In the breakdown table, why doesn't the user count for the interval equal the total users row?
How are the data points calculated?
Amplitude uses a weighted average to calculate each data point. Each data point is the number of unique users who fired the return event during a specific day, week, or month, divided by the total number of unique users who have reached that point in time. In other words, it's a weighted average percentage of the row values below. Because the weighted average uses unique users, duplicates are removed: if a user fires the start event and the return event multiple times, Amplitude still counts them only once in the denominator, and at most once in the numerator of any single data point.
For N-day retention specifically, the numerator is the unique count of users who fired the return event on the day, week, or month indicated, while the denominator is the total number of unique users who fired their start event on the first day, week, or month of the specified time frame. A user who fires the start and return events multiple times in a time frame is counted only once in the denominator. However, they may be included in the numerator once for each day, week, or month in which they returned.
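The N-day rule described above can be sketched in a few lines of Python. This is only an illustration, not Amplitude's implementation: the event log, the user IDs, and the `n_day_retention` function are hypothetical, days are measured from each user's first start event, and the sketch assumes every user has already completed every interval.

```python
from datetime import date

# Hypothetical event log: (user_id, event_type, day). Not real Amplitude data.
events = [
    ("a", "start", date(2024, 6, 1)),
    ("a", "start", date(2024, 6, 1)),   # duplicate start event
    ("a", "return", date(2024, 6, 2)),
    ("a", "return", date(2024, 6, 2)),  # duplicate return event on the same day
    ("a", "return", date(2024, 6, 4)),
    ("b", "start", date(2024, 6, 1)),
    ("b", "return", date(2024, 6, 4)),
    ("c", "start", date(2024, 6, 1)),
]

def n_day_retention(events, n):
    """Return (numerator, denominator) for the day-n data point."""
    # Denominator: each user counts once, anchored to their first start event.
    first_start = {}
    for user, kind, day in events:
        if kind == "start":
            first_start[user] = min(day, first_start.get(user, day))

    # Numerator: unique users who fired the return event exactly n days
    # after their first start event. A set removes same-day duplicates.
    returned = {
        user
        for user, kind, day in events
        if kind == "return"
        and user in first_start
        and (day - first_start[user]).days == n
    }
    return len(returned), len(first_start)

print(n_day_retention(events, 1))  # (1, 3): only user "a" returned on day 1
print(n_day_retention(events, 3))  # (2, 3): users "a" and "b" returned on day 3
```

User "a" appears once in the denominator but in the numerator of both the day 1 and day 3 data points, which matches the counting rules described above.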
For unbound retention specifically, the denominator for a specific day, week, or month (let's call this number X) is the number of users who have completed that day (or week, or month). A user counts toward the overall unbound retention as long as X days, weeks, or months have passed since that user's starting event. The numerator is the unique count of users who fired the return event on the Xth day (or week, or month) or later.
Since unbound retention measures users who returned on the Xth day or later, a user is included in every data point up to and including the day they fired the return event. A user who fires the event on day two, for example, is also included in the data points for days one and zero.
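As a rough illustration of the unbound rule, the sketch below counts a user in the numerator for day X if they fired the return event on day X or any later day. The `users` dictionary, its field names, and the `unbound_retention` function are hypothetical and exist only for this example; `days_completed` stands for the number of days that have passed since the user's starting event.

```python
# Hypothetical users. "return_offsets" holds the day numbers (relative to the
# start event) on which the user fired the return event.
users = {
    "a": {"days_completed": 5, "return_offsets": {2}},
    "b": {"days_completed": 5, "return_offsets": {0, 4}},
    "c": {"days_completed": 1, "return_offsets": {1}},
    "d": {"days_completed": 5, "return_offsets": set()},
}

def unbound_retention(users, x):
    """Return (numerator, denominator) for the day-x unbound data point."""
    # Denominator: users for whom at least x days have passed since the start event.
    eligible = {u: info for u, info in users.items() if info["days_completed"] >= x}
    # Numerator: eligible users who fired the return event on day x or later.
    retained = [
        u for u, info in eligible.items()
        if any(offset >= x for offset in info["return_offsets"])
    ]
    return len(retained), len(eligible)

for x in range(4):
    print(x, unbound_retention(users, x))
# 0 (3, 4)
# 1 (3, 4)
# 2 (2, 3)
# 3 (1, 3)
```

User "a" returned only on day two, yet is counted in the data points for days zero, one, and two. User "c" drops out of the denominator from day two onward because only one day has passed since their starting event.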
This article in the Amplitude Community has more details.
In the breakdown table, why doesn't the user count for the interval equal the number shown in the total users row?
The total user count in the retention analysis breakdown table is the number of unique users who fired the start event at any point in the entire time period. Say a user fired the start event on both June 7th and June 8th. The user is unique within each of those days, and is therefore counted in the user totals for both days. However, this user is counted only once in the overall user total. This is one reason why the sum of each day's user totals will not always equal the total user count.
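The June 7th and 8th example can be reproduced with a small Python sketch. The user IDs, the year, and the variable names are made up for illustration; the point is only that per-day unique counts and the overall unique count deduplicate over different scopes.

```python
from datetime import date

# Hypothetical start events: (user_id, day). "u1" fires the start event on both days.
start_events = [
    ("u1", date(2024, 6, 7)),
    ("u1", date(2024, 6, 8)),
    ("u2", date(2024, 6, 7)),
    ("u3", date(2024, 6, 8)),
]

# Unique users per day: deduplicated within each day only.
daily_users = {}
for user, day in start_events:
    daily_users.setdefault(day, set()).add(user)
daily_totals = {day: len(ids) for day, ids in daily_users.items()}

# Overall total: deduplicated across the whole time period.
overall_unique = len({user for user, _ in start_events})

print(sum(daily_totals.values()))  # 4 (2 users on June 7th + 2 users on June 8th)
print(overall_unique)              # 3 ("u1" is counted once)
```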
Another reason is that the current day is excluded from this sum because it's not over yet; Amplitude is still collecting the day's events and performing calculations with them.
In the breakdown table, any value with an asterisk indicates a data point that is still being calculated because its time frame is not yet over. Amplitude will not include these users in the overall retention calculation until the time frame is complete.
Why is my overall retention higher or lower than expected?
There are two potential explanations for this:
- Incomplete data points are not included in the retention calculation. If all the data points are higher or lower than the values in the breakdown table, it may be that many users have not completed the full time frame yet. Incomplete values that have not been included in the data points have an asterisk beside them.
- The data points are a deduplicated calculation of all users in the time frame. Amplitude counts each user only once in the overall retention calculation, but the same user may appear more than once in the breakdown table, which can make your retention appear higher than expected.
See this community article for more in-depth details on this topic.
I can only see 12 months of data in this retention chart—how can I see more?
Use Amplitude's custom bracket feature to add additional time frames beyond the default limits.
Why does the retention curve go up?
If you are looking at a time period that is still ongoing, you may see your retention curve trend upward. This is because users are only included in the weighted average calculation once they have had enough time to convert.
For example, if you are looking at the last 30 days, a user can only be in the day 4 data point once four days have elapsed since their starting event. Since users are only included once they pass that milestone, the later data points may seem higher. This is because the users who did not retain never passed that milestone. This usually occurs when only a small percentage of users have finished the chart's entire time frame. Once more users do so, you'll see your chart start to even out and more accurately reflect retention.
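The sketch below shows one way an upward curve can arise when each data point only averages over users who have had enough time to reach it. The two cohorts, their sizes, and their return counts are invented for this example, and `data_point` is a hypothetical helper rather than Amplitude's actual calculation.

```python
# Hypothetical cohorts grouped by start day. "returned" maps a day number to
# the count of unique users in that cohort who fired the return event that day.
cohorts = [
    {"size": 100, "days_elapsed": 4, "returned": {1: 50, 2: 45, 3: 40, 4: 40}},
    {"size": 100, "days_elapsed": 3, "returned": {1: 20, 2: 15, 3: 10}},
]

def data_point(cohorts, day):
    """Weighted average retention for `day`, using only cohorts that reached it."""
    eligible = [c for c in cohorts if c["days_elapsed"] >= day]
    numerator = sum(c["returned"][day] for c in eligible)
    denominator = sum(c["size"] for c in eligible)
    return numerator / denominator

print(data_point(cohorts, 3))  # 0.25 (both cohorts count: 50 of 200 users)
print(data_point(cohorts, 4))  # 0.4  (only the older cohort counts: 40 of 100 users)
```

The newer cohort has not yet reached day 4, so it drops out of that data point entirely and day 4 ends up higher than day 3. Once the newer cohort completes day 4, its users rejoin the denominator and the curve evens out.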
How far out does unbound retention go?
Unbound retention extends to the current day. The last point in an unbound retention chart shows users who fired the return event on or after a specific day, week, or month, and checks whether they fired that event as recently as yesterday.