This article describes Redshift schema for customers still using the Redshift ETL add-on. Each project has its own schema.
NOTE: For most customers, the Help Center article on exporting Amplitude data to Redshift will be more useful.
Schema structure
There is a schema created for each project you have on Amplitude. To list out all the schemas on Redshift, please use the following command when you log into pSQL:
\dn
To see all of the tables and views associated with a schema, use the following commands (with XXXX = app_id):
set search_path = appXXXX;
\d
Use \d+ view_name
to see a definition of each view.
By default, we have views created for each project in the public schema for easy access. The format of these views is: eventsXXXX
, where XXXX = app_id.
Tables nomenclature
There are three types of tables on Amplitude: dedicated tables, generic tables, and a universal table. Please note that the easiest way to query Amplitude data is to query off of the views.
Dedicated Tables: The first 20 event-types get their own tables.
- Example format:
events<app_id>_<event_type_id>
. event_type_id is internal identifier on Amplitude.
Generic Tables: There are 50 more generic tables with each generic table accommodating 20 event-types. Each event in this table has an individual view created in the project schema.
- Example format:
generic_<app_id>_y
, wherey
goes from 1 to 20 because there are 20 generic tables.- First generic table created is
generic_<app_id>_1
, and once 50 event types are put in it, for the 51st event type, we createdgeneric_<app_id>_2
and so on.
- First generic table created is
Universal Tables: Everything else after the first 1020 event-types fall into a universal table. First 1020 event-types also have individual views created in the project schema.
Views nomenclature
You may find that there are various tables (they are technically views) you can perform your Redshift queries on. Find below explanations of the nomenclature of the views you will be querying from.
- events###: A view of all event data in a specific project. The numbers following "event" are your Project ID, which you can find in your project's Settings page.
- events### is located in the public schema. An equivalent view can be found in the project schema, where "appXXX" is the project schema and "events" is the table name.
- Example: event123 is a view of all event data for Project 123 in the public schema. This very same view is called
events
in the app schema. Soevents###
=app###
.events. - Query on this view when you want to query on data of multiple events.
- Event and user properties are stored in a single column as a JSON string.
- app###.event_type: A view of all data for a specific event (event_type) in a specific project. The numbers following "app" are your Project ID and event_type is the name of the event you are querying on.
- Example: app123.add_friend is a view of all the add_friend event data for Project 123.
- Each event and user property has its own column.
Column schema
There is a limit of 400 user properties and 50 event properties that will be pulled into their own columns. Anything past the limit will still require the JSON_EXTRACT_PATH_TEXT function on the 'event_properties' or 'user_properties' columns.
Below is the list of the columns in an event view (event###), specifically the column type and a brief description of the column.
Column | Description |
---|---|
id
bigint |
A deprecated column. |
uuid
character varying (2048) |
A unique identifier per row (event sent). bf0b9b2a-304d-11e6-934f-22000b56058f |
app
integer |
Project ID found in your project's Settings page.
123456 |
amplitude_id
bigint |
An internal ID used to count unique users.
1234567890 |
device_id
character varying (1024) |
The device specific identifier.
"C8F9E604-F01A-4BD9-95C6-8E5357DF265D" |
user_id
character varying (1024) |
A readable ID specified by you.
"datamonster@gmail.com" |
event_time
timestamp w/o time zone |
Amplitude timestamp (UTC) which is the client_event_time adjusted by the difference between server_upload_time and client_upload_time, specifically: event_time = client_event_time + (server_received_time - client_upload_time) 2015-08-10T12:00:00.000000 |
client_event_time
timestamp w/o time zone |
Local timestamp (UTC) of when the device logged the event. 2015-08-10T12:00:00.000000 |
client_upload_time
timestamp w/o time zone |
The local timestamp (UTC) of when the device uploaded the event.
2015-08-10T12:00:00.000000 |
server_upload_time
timestamp w/o time zone |
Amplitude timestamp (UTC) of when our servers received the event. 2015-08-10T12:00:00.000000 |
event_id
integer |
A counter that distinguishes events.
1 |
session_id
bigint |
The session start time in milliseconds since epoch.
1396381378123 |
event_type
character varying (1024) |
The assigned type of event.
"Add Friend" |
amplitude_event_type
character varying (1024) |
Amplitude specific identifiers based on events Amplitude generates. This is a legacy field so event_type should suffice for all queries. |
first_event
boolean |
True if the event is the first event for a given amplitude_id, otherwise none (null).
true |
version_name
character varying (1024) |
The app version.
"1.0.0" |
os_name
character varying (1024) |
OS name.
"ios" |
os_version
character varying (1024) |
OS version.
"1.0" |
device_brand
character varying (1024) |
Device brand.
"Apple" |
device_manufacturer
character varying (1024) |
Device manufacturer.
"Apple" |
device_model
character varying (1024) |
The device model.
"iPad Mini" |
device_family
character varying (1024) |
Device family.
"Apple iPhone" |
device_type
character varying (1024) |
Device type.
"Apple iPhone 5s" |
device_carrier
character varying (1024) |
Device carrier.
"Verizon" |
country
character varying (1024) |
Country.
"United States" |
language
character varying (1024) |
Language.
"English" |
revenue
double precision |
Revenue generated by a revenue event.
1.00 |
product_id
character varying (1024) |
Product ID of a revenue event.
"SpecialOffer1" |
quantity
integer |
Quantity of a revenue event.
1 |
price
double precision |
Price of a revenue event.
1.00 |
location_lat
double precision |
Latitude.
12.3456789 |
location_lng
double precision |
Longitude.
-123.4567890 |
ip_address
character varying (1024) |
IP address.
"123.11.111.11" |
event_properties
character varying (65535) |
JSON string of event properties.
{"load_time": 0.8371, "cause": "button"} |
user_properties
character varying (65535) |
JSON string of user properties.
{"cohort": "Test A", "gender": "female"} |
region
character varying (1024) |
Region.
"California" |
city
character varying (1024) |
City.
"San Francisco" |
dma
character varying (1024) |
Designated marketing area (DMA).
"San Francisco-Oakland-San Jose, CA" |
paying
boolean |
True if the user has ever logged any revenue, otherwise none (null). Note: The property value can be modified via the Identify API. true |
platform
character varying (1024) |
Platform.
"iOS", "Android", "Web" |
start_version
character varying (1024) |
App version the user was first tracked on.
"1.0.0" |
user_creation_time
timestamp w/o time zone |
Event_time (UTC) of the user's first event.
2015-08-10T12:00:00.000000 |
library
character varying (1024) |
Library being used to send the event data.
"amplitude-js/2.5.0", "http/1.0" |
e_type
character varying (1024 |
Example custom event property 'type'. Any custom event property you have instrumented will be prefixed with "e_". |
e_length
character varying (1024) |
Example custom event property 'length'. Any custom user property you have instrumented will be prefixed with "u_". |
u_age
character varying (1024) |
Example custom user property 'age'. Any custom user property you have instrumented will be prefixed with "u_". |
u_gender
character varying (1024) |
Example custom user property 'gender'. Any custom user property you have instrumented will be prefixed with "u_". |