Exporting Merged User ID Mappings
This article provides a high-level overview of how to obtain a regularly exported list of User ID Mappings from Amplitude. This is useful for customers who want to perform advanced analysis and/or create cohorts outside of Amplitude, and who need the merged identity mappings to reconcile Amplitude IDs.
Note: This feature is only available on our Enterprise and Growth plans. You will have to reach out to your Account Executive or Customer Success Manager to initiate this process.
Steps to Take to Obtain Merged User ID Mapping File(s)
There are three steps to obtain the Amplitude Merged User ID Mapping:
Step 1: The customer provides the Amazon S3 bucket that will store the Amplitude merged User ID data (see requirements below).
Step 2: Amplitude back-fills the merged User ID data.
Step 3: Once the initial back-fill is complete, Amplitude checks every 15 minutes whether the Merged User ID Mapping file for each hour is available and, if it is, copies it to the customer’s bucket.
Amazon S3 Bucket requirements
Before you make any changes per the requirements below, you may want to check the current properties of your Amazon S3 bucket. For guidance on how to check them, refer to the section How Do I View the Properties for an S3 Bucket? of the Amazon S3 Console guide.
Set the Amazon Web Services (AWS) Role to:
Amazon Resource Name (ARN): arn:aws:iam::358203115967:root
The AWS Bucket permissions should be set to:
S3:GetObject
S3:PutObject
S3:DeleteObject
S3:AbortMultipartUpload
S3:ListBucket
Whitelisted IPs: 52.33.3.219, 35.162.216.242, 52.27.10.221
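The requirements above could be expressed as an S3 bucket policy. The following is a minimal sketch in Python that only builds and prints the policy JSON; it does not apply anything to AWS. The bucket name example-amplitude-mappings and the helper build_bucket_policy are hypothetical, and expressing the IP whitelist as an aws:SourceIp condition is an assumption rather than something this article specifies, so confirm the final policy with your Amplitude contact before using it.

```python
import json

# Values taken from the requirements listed above.
AMPLITUDE_PRINCIPAL = "arn:aws:iam::358203115967:root"
AMPLITUDE_IPS = ["52.33.3.219", "35.162.216.242", "52.27.10.221"]

# Object-level actions apply to the objects inside the bucket.
OBJECT_ACTIONS = [
    "s3:GetObject",
    "s3:PutObject",
    "s3:DeleteObject",
    "s3:AbortMultipartUpload",
]


def build_bucket_policy(bucket_name: str) -> dict:
    """Build a bucket policy document for the given (hypothetical) bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    # Assumption: the IP whitelist is enforced with an aws:SourceIp condition.
    ip_condition = {"IpAddress": {"aws:SourceIp": AMPLITUDE_IPS}}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AmplitudeObjectAccess",
                "Effect": "Allow",
                "Principal": {"AWS": AMPLITUDE_PRINCIPAL},
                "Action": OBJECT_ACTIONS,
                "Resource": f"{bucket_arn}/*",
                "Condition": ip_condition,
            },
            {
                # s3:ListBucket is a bucket-level action, so it targets the bucket ARN.
                "Sid": "AmplitudeListBucket",
                "Effect": "Allow",
                "Principal": {"AWS": AMPLITUDE_PRINCIPAL},
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,
                "Condition": ip_condition,
            },
        ],
    }


print(json.dumps(build_bucket_policy("example-amplitude-mappings"), indent=2))
```

The policy is split into two statements because the object actions and s3:ListBucket apply to different resources (the objects and the bucket itself, respectively).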
Please refer to the AWS documentation if you have questions on ARN, Bucket permissions or how to whitelist IP addresses: AWS Batch IAM Policies, Roles, and Permissions.
Once you have set the ARN, the bucket permissions, and the whitelisted IPs, reach out to your Account Representative or your Customer Success Manager to initiate the process with Amplitude.
Example of the Data Format and File Name
The S3 file name follows the pattern -[application]_[date]_[hour].json.gz, where the hour uses a 24-hour format.
-app_date_hour.json.gz
-36958_2018-08-30_1.json.gz
-36958_2018-08-30_13.json.gz
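If you process these files downstream, it may help to split a file name into its parts. The sketch below parses the pattern shown above; the names MappingFile and parse_file_name are hypothetical, and the regular expression assumes a numeric application ID and a YYYY-MM-DD date, as in the examples. The article does not state a time zone, so the resulting timestamp is left time-zone naive.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Matches "-[application]_[date]_[hour].json.gz" at the end of a name or key.
# Assumption: the application ID is numeric, as in the examples above.
FILE_NAME = re.compile(
    r"-(?P<app>\d+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<hour>\d{1,2})\.json\.gz$"
)


class MappingFile(NamedTuple):
    app_id: int
    hour_start: datetime  # naive: the time zone is not stated in the article


def parse_file_name(name: str) -> MappingFile:
    """Extract the application ID and the hour covered by a mapping file."""
    match = FILE_NAME.search(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    hour = int(match["hour"])
    if hour > 23:
        raise ValueError(f"hour out of range in file name: {name!r}")
    day = datetime.strptime(match["date"], "%Y-%m-%d")
    return MappingFile(int(match["app"]), day.replace(hour=hour))


parsed = parse_file_name("-36958_2018-08-30_13.json.gz")
print(parsed.app_id, parsed.hour_start.isoformat())  # 36958 2018-08-30T13:00:00
```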
The contents of the data file follow the format {"parameter name":value, "parameter name":value, etc.}.
{"scope":2,"merge_time":1483920000000,"merge_server_time":1484006400000,"amplitude_id":111,"merged_amplitude_id":2}
- scope: scope of the apps, or app_id (the Amplitude Project ID)
- merge_time: event_time
- merge_server_time: server_upload_time
- amplitude_id: the user’s Amplitude ID
- merged_amplitude_id: the canonical Amplitude ID
Amplitude merges different amplitude_id values into this merged_amplitude_id.
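To reconcile Amplitude IDs outside of Amplitude, you could load a mapping file into a lookup table. The following is a minimal sketch, assuming each line of the gzipped file holds one JSON object in the format shown above (the article shows a single record, so the one-record-per-line layout is an assumption). The helpers load_merge_map and canonical_id are hypothetical names, and the sample data is the example record from this article.

```python
import gzip
import io
import json
from typing import IO, Dict


def load_merge_map(stream: IO[bytes]) -> Dict[int, int]:
    """Read a gzipped mapping file and return amplitude_id -> merged_amplitude_id."""
    mapping: Dict[int, int] = {}
    # Assumption: one JSON object per line.
    with gzip.open(stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            mapping[record["amplitude_id"]] = record["merged_amplitude_id"]
    return mapping


def canonical_id(mapping: Dict[int, int], amplitude_id: int) -> int:
    """Return the merged ID if one is recorded, otherwise the ID unchanged."""
    return mapping.get(amplitude_id, amplitude_id)


# Build an in-memory file from the example record shown above.
sample = (
    b'{"scope":2,"merge_time":1483920000000,"merge_server_time":1484006400000,'
    b'"amplitude_id":111,"merged_amplitude_id":2}\n'
)
mapping = load_merge_map(io.BytesIO(gzip.compress(sample)))
print(canonical_id(mapping, 111))  # 2
print(canonical_id(mapping, 999))  # 999
```

To read a real export, pass an open binary file handle for a downloaded .json.gz file instead of the in-memory sample.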