Many Amplitude analyses depend on accurately tracking unique users. This is often trickier than it sounds. Depending on the nature of your product, your users can log in and out at will, browse anonymously, or use multiple devices.
To ensure an accurate count of unique users, Amplitude employs a solution using a combination of device IDs, user IDs, and Amplitude IDs. This article will explain how that process works, and the roles each of these properties plays in it.
How Amplitude identifies unique users
As mentioned above, Amplitude uses a system of three different IDs to track users: device ID, user ID, and Amplitude ID:
- Device ID: Amplitude pulls the IDFV or generates a random alphanumeric string for the device ID; this string is stored locally in the browser's cookie or mobile device. For web-based applications, Amplitude will set the device ID to a randomly-generated UUID by default. It will persist unless a user clears their browser cookies and/ or is browsing in private mode.
- Prior to Version 6.1.0 of the Amplitude JavaScript SDK, the generated device ID ended with the letter 'R'.
- User ID: This identifier is configured by you. Many products use a username or an internal unique identifier to track their users. A user ID should be something that does not and will not change. If it does, Amplitude will have no way of knowing it's intended to represent a currently-existing user, and will instead assume it stands for a separate user.
Anonymous users should not be assigned a user ID. These users will still have Amplitude IDs and device IDs. Users will later be merged when Amplitude recognizes the different user IDs belong to the same user.
To set a user ID, use thesetUserId
method outlined in our SDK documentation. - Amplitude ID: After gathering the device and user IDs, Amplitude generates the Amplitude ID and associates it with the user and device IDs it has already collected for this user. Amplitude only needs one or the other to generate an Amplitude ID; however, the user ID is preferred, as multiple unique users could share the same device.
This section explains in detail how Amplitude generates an Amplitude ID in different scenarios.
Use advertising identifiers as the device ID
You can instruct Amplitude to use the Identifier for Advertiser (IDFA for iOS) and the Advertising Identifier (AdID for Android) as the device ID—see the relevant SDK installation guide for more information. This will cause the device ID to persist across installs.
iOS users have the option of resetting their IDFA on their devices at any time. As of iOS 10, if a user limits ad tracking, this generates an IDFA of all zeros. Amplitude will instead set the device ID as either IDFV or a randomly generated string.
Android users have the option of resetting the AdID on their devices at any time.
NOTE: Do you have questions about Apple's IDFA changes with iOS 14? See this article here.
Set up cross domain tracking
You can track anonymous behavior across two different domains. For example, imagine you have the following two domains:
- Site 1: https://www.landingpage.com
- Site 2: https://www.productpage.com
To track an anonymous user who starts on Site 1 and navigates to Site 2, you'll need to pass the device ID from Site 1 as a parameter to Site 2. After that, you'll have to reinitialize the SDK with the passed device ID:
- From Site 1, grab the device ID from
amplitude.options.deviceId
. - Pass the device ID to Site 2 via a URL parameter or third party cookie.
- Initialize the Amplitude SDK on Site 2 with
amplitude.init('YOUR_API_KEY', null, {deviceId: '$DEVICE_ID'})
.
How Amplitude assigns Amplitude IDs
In this section, we'll explore some common situations Amplitude encounters when trying to track unique users. Each one will include an explanation of how Amplitude uses the device ID or the user ID to either generate a new Amplitude ID, or determine if there is an existing Amplitude ID it can use instead.
No user ID is assigned
If your product doesn't assign user IDs, an Amplitude ID will be generated when Amplitude sees a particular device ID for the first time.
Device ID | User ID | Amplitude ID |
---|---|---|
A | null | 1 |
B | null | 2 |
B | null | 2 |
C | null | 3 |
A | null | 1 |
C | null | 3 |
There are three unique users in this set.
Here, the first event came from device A. Because there was no record of this device, Amplitude assigned it a new Amplitude ID of 1.
The second event came from device B. Since there was no record of this device, Amplitude assigned it a new Amplitude ID of 2.
The third event also came from device B, and since Amplitude already had a record of this device, it was assigned an Amplitude ID of 2. The same logic applies for device C.
User ID is assigned after anonymous events
If your product does assign user IDs, Amplitude assumes any previous events logged anonymously on that device belong to a single user, until a user ID is used. The Amplitude ID will then be used for both.
Device ID | User ID | Amplitude ID |
---|---|---|
G | null | 4 |
G | null | 4 |
G | John | 4 |
There is one unique user in this set.
Here, the first two events are logged anonymously on device G and are assigned an Amplitude ID of 4. The third event is the first event received from device G that includes a user ID. Amplitude assigns that user ID the same Amplitude ID as the anonymous events. We can then say that user John did all three events.
Same user ID on multiple device IDs, with no anonymous events
Amplitude prioritizes user IDs over device IDs when assigning an Amplitude ID.
Device ID | User ID | Amplitude ID |
---|---|---|
K | Zack | 5 |
L | Zack | 5 |
There is one unique user in this set.
User Zack fired an event on device K and was assigned an Amplitude ID of 5. Whenever Zack's user ID shows up again on any device, Amplitude will automatically assign it an Amplitude ID of 5, regardless of the device ID.
Multiple user IDs on the same device
If an event is sent anonymously after at least one user ID has already been seen associated with that device, Amplitude assumes the anonymous event was fired by the last known user and assigns it that user's Amplitude ID.
Device ID | User ID | Amplitude ID |
---|---|---|
R | Jane | 6 |
R | null | 6 |
R | Mary | 7 |
R | null | 7 |
R | null | 7 |
There are two unique users in this set.
Here, Jane logs the first event on device R and is assigned an Amplitude ID of 6. The next event is sent anonymously—maybe Jane logged out—but is still assigned an Amplitude ID of 6, because Jane was the last known user.
The third event is fired by Mary on device R and is assigned and Amplitude ID of 7. The next two events are fired anonymously; Amplitude assigns them an Amplitude ID of 7. Mary is assumed to have fired those events because she was the last known user.
If you want to explicitly log users out or log events under an anonymous user, then you will need to set the user ID to null
and regenerate a new device ID. Just follow the instructions in our iOS or Android SDK documentation to do so.
Same user ID on multiple devices, with anonymous events
Device ID | User ID | Amplitude ID |
---|---|---|
Y | David | 8 |
Z | null | 9 |
Z | David | 8 |
Z | null | 8 |
Here, David logs an event on device Y and is assigned an Amplitude ID of 8.
Next, an anonymous user logs an event on device Z and is assigned an Amplitude ID of 9, since there is no user associated with the device.
Then David logs into the product on device Z. This event has an Amplitude ID of 8, because that's David's Amplitude ID.
Finally, an event is logged anonymously on device Z. This events gets an Amplitude ID of 8; David was the last known user, so the event is attributed to him.
In this scenario, the anonymous events on device Z have two different Amplitude IDs, even though there's just one user ID associated with the device. Amplitude interprets this to mean that both Amplitude IDs 8 and 9 refer to a single user: David. When this happens, Amplitude will merge these IDs, so only one user will be counted. This merge can only happen because the event with Amplitude ID 9 did not have a User ID present.
NOTE: if you're using Redshift, the data are immutable, and the anonymous event will still have an Amplitude ID of 9 in the raw data. See the below merged users section for more details.
Merged users
The merged user problem arises when Amplitude determines that an anonymous user—i.e., one whose only ID is a device ID—is actually a recognized user with an Amplitude ID.
For example, when a user gets a new device, they'll log events in your product anonymously before signing in. To Amplitude, these anonymous events appear to be coming from a new user, so they're mapped to a new Amplitude ID.
But when the user logs into their existing account, all subsequent events are mapped to that user's existing Amplitude ID. A couple of problems arise from this:
- The user is counted twice in active and new user counts.
- Events received while the user was anonymous are “lost,” in that they won't be attributed to the true Amplitude ID and the user's actual User ID.
Amplitude solves this problem by cross referencing the list of Amplitude IDs with an internal mapping of Amplitude IDs that have been "merged" (those that Amplitude has determined to be the same user). This all occurs when you query on the dashboard. You can view a list of a user's merged IDs in their User Activity page, along with the merge times.
Important Note:
- User IDs cannot be merged. If you create a new user ID for an existing user, Amplitude will only recognize them as different unique users.
- This merged users solution does not apply to the raw data in Amazon Redshift, as the database contains the raw event logs and not a merged users table. Without this merged users logic, we have seen an average change of about 5% in DAU numbers for our users with higher-than-average change for web data. Merged users are available in Snowflake with the Query add-on.
Event IDs, device IDs, and merging users
When users are merged, the event ID count continues on a device-by-device basis:
Device ID | User ID | Amplitude ID | Event ID |
---|---|---|---|
A | null | 8 | 1 |
A | David | 8 | 2 |
A | David | 8 | 3 |
B | null | 9 | 1 |
B | David | 8 | 2 |
B | David | 8 | 3 |
If the product data is ever cleared (for example, if a user deletes the app and reinstalls), the Event ID will reset to 1 . When a user deletes the app and reinstalls, Amplitude will usually generate a new Device ID as well, which will subsequently trigger a new merging.
When users are merged, the user could "lose" user property values that were never meant to be changed (e.g. 'Start Version' or initial UTM parameters) because the new user property values will overwrite the original user property values. If you are a paying customer and this affects you, please reach out to our Support team.