This article will help you:
- Query your Snowflake database using custom SQL
The Amplitude Query product add-on allows customers to query their raw data via their Amplitude-managed Snowflake database. Data is loaded into Snowflake every 30 minutes. Query also includes a powerful new chart type called Amplitude SQL that allows customers to write custom SQL against their Amplitude data directly inside the Amplitude platform.
NOTE: This feature is only available to Enterprise and Growth customers who have purchased the Query product add-on.
Getting started with Query
The Amplitude SQL is accessible just like any other chart type, via Create New > Chart > All chart types. To learn more about how to utilize this chart type, see the Amplitude SQL section below.
Alternatively, you can connect directly to your Snowflake database via a terminal or 3rd-party application such as SQL Workbench. Snowflake also has a connector for Python here. Please reach out to us here or contact your Success Manager for your Snowflake credentials.
NOTE: This feature is only available to Enterprise and Growth customers who have purchased the Query add-on.
Query schema description
Tables nomenclature
The Query package uses a simplified table schema for Snowflake and Amplitude SQL. Amplitude SQL uses a single table schema that can be referenced using the shorthand $events
for a particular project. If you want to access other tables, you can use the full name which can be found by checking Show Schema.
While the $events
table handles the merged user mappings automatically, the schema for the merged users table is also made available below the schema for $events
. This is helpful if you want to view the number of users that have been merged into one.
With Query, one of the major benefits is the unlimited number of columns that can exist in the table. Custom user properties and event properties are stored as variants, and they are queryable as individual columns. Custom user properties are prefixed with user_properties:
and all event properties are prefixed with event_properties:
.
NOTE: If your user or event properties contain a period or a space, then you will need to wrap the name of the property in quotes. For example, user_properties:"first name"
.
If you are searching for a certain value, you will have to wrap the value of the property in single quotes. For example, user_properties:"plan type"='enterprise'
.
Column schema
$events
Table
Column | Description |
---|---|
$amplitude_id NUMBER(38,0) |
The original Amplitude ID for the user. Use this field to automatically handle merged users. 2234540891 |
adid VARCHAR(16777216) |
(Android) Google Play Services advertising ID (AdID). This usually is wiped after ingestion and therefore will be blank "AEBE52E7-03EE-455A-B3C4-E57283966239" |
amplitude_attribution_ids | Anonymized hash of the advertising IDs that we store for internal purposes; not useful for the customer by any means. But this will appear if advertising IDs were sent which proves that adid/idfv existed even though currently wiped |
amplitude_event_type
VARCHAR(16777216) |
Amplitude specific identifiers based on events Amplitude generates. This is a legacy field so event_type should suffice for all queries. |
amplitude_id
NUMBER(38,0) |
An internal ID used to count unique users.
1234567890 |
app
NUMBER(38,0) |
Project ID found in your project's Settings page.
123456 |
city
VARCHAR |
City. "San Francisco" |
client_event_time
TIMESTAMP |
Local timestamp (UTC) of when the device logged the event. 2015-08-10T12:00:00.000000 |
client_upload_time
TIMESTAMP |
The local timestamp (UTC) of when the device uploaded the event.
2015-08-10T12:00:00.000000 |
country
VARCHAR |
Country.
"United States" |
data
VARIANT |
Dictionary where certain fields such as first_event and merged_amplitude_id are stored. |
device_brand
VARCHAR(16777216) |
Device brand.
"Apple" |
device_carrier
VARCHAR(16777216) |
Device carrier.
"Verizon" |
device_family
VARCHAR(16777216) |
Device family.
"Apple iPhone" |
device_id
VARCHAR(16777216) |
The device specific identifier.
"C8F9E604-F01A-4BD9-95C6-8E5357DF265D" |
device_manufacturer
VARCHAR(16777216) |
Device manufacturer.
"Apple" |
device_model
VARCHAR(16777216) |
The device model.
"iPad Mini" |
device_type
VARCHAR(16777216) |
Device type.
"Apple iPhone 5s" |
dma VARCHAR(16777216) |
Designated marketing area (DMA).
"San Francisco-Oakland-San Jose, CA" |
event_id
NUMBER(38,0) |
A counter that distinguishes events.
1 |
event_time
TIMESTAMP |
Amplitude timestamp (UTC) which is the client_event_time adjusted by the difference between server_received_time and client_upload_time, specifically: event_time = client_event_time + (server_received_time - client_upload_time) 2015-08-10T12:00:00.000000 We use this timestamp to organize events on Amplitude charts. Note: If the difference between server_received_time and client_upload_time is less than 60 seconds, the event_time will not be adjusted and will equal the client_event_time |
event_type
VARCHAR(16777216) |
The assigned type of event.
"Add Friend" |
followed_an_identify
BOOLEAN |
True if there was an |
groups
VARIANT |
Group types. See the Accounts documentation for more information. |
idfa
VARCHAR(16777216) |
(iOS) Identifier for Advertiser. This usually is wiped after ingestion and therefore will be blank
"AEBE52E7-03EE-455A-B3C4-E57283966239" |
ip_address
VARCHAR(16777216) |
IP address.
"123.11.111.11" |
location_lat
FLOAT |
Latitude.
12.3456789 |
location_lng
FLOAT |
Longitude.
-123.4567890 |
os_name
VARCHAR(16777216) |
OS name.
"ios" |
os_version
VARCHAR(16777216) |
OS version.
"1.0" |
paying
VARCHAR |
True if the user has ever logged any revenue, otherwise '(none)'. The property value can be modified via the Identify API. true |
region
VARCHAR |
Region.
"California" |
server_upload_time TIMESTAMP |
Amplitude timestamp (UTC) of when our servers received the event. 2015-08-10T12:00:00.000000 |
session_id
NUMBER(38,0) |
The session start time in milliseconds since epoch.
1396381378123 |
start_version
VARCHAR |
App version the user was first tracked on.
"1.0.0" |
user_creation_time
TIMESTAMP |
Event_time (UTC) of the user's first event. 2015-08-10T12:00:00.000000 |
user_id VARCHAR(16777216) |
A readable ID specified by you.
"datamonster@gmail.com" |
uuid
VARCHAR(16777216) |
A unique identifier per row (event sent). bf0b9b2a-304d-11e6-934f-22000b56058f |
version_name
VARCHAR(16777216) |
The app version.
"1.0.0" |
Merged users table
See this article to learn more about how Amplitude tracks unique users.
Column | Description |
---|---|
amplitude_id NUMBER(38,0) |
The Amplitude ID that is being merged into a user's original Amplitude ID. |
merge_event_time TIMESTAMP |
The time of the event a user's new Amplitude ID was associated with their original Amplitude ID. |
merge_server_time TIMESTAMP |
The server time of the event when a user's new Amplitude ID was associated with their original Amplitude ID. |
merged_amplitude_id NUMBER(38,0) |
The originally assigned Amplitude ID when the user is first created. |
Amplitude SQL chart type
Amplitude SQL analyses can be saved, shared, and added to a dashboard just like any other chart. This chart type allows you to write custom SQL queries against your own Amplitude data.
NOTE: This feature currently does not support queuing data for Portfolio Views.
Chart setup
By default, a simple SQL query is shown returning the date, unique users, and total events performed in the past 30 days. SQL syntax will be highlighted to help you distinguish SQL commands from the rest of your query.
In addition, Amplitude SQL supports autocomplete of columns in the table. As you type, the query editor will recommend columns to help expedite your typing.
Special fields
Because Amplitude SQL is built directly into the Amplitude chart experience, you can leverage much of the same familiar Amplitude user interface such as the datepicker chart saving experience. You can do this by utilizing Amplitude SQL's special fields for powerful shortcuts:
$date
: When using this shortcut, the time range chosen by the datepicker is automatically applied and will update the query over time. It refers to the event time on the event and will respect the timezone the project has been set to. Otherwise, Amplitude SQL will return data in UTC.- Note: This must be used in conjunction with
$events
.
- Note: This must be used in conjunction with
$events
: The shorthand used to refer to the table in your current project. When using this table, merged users are automatically handled.- Note: This must be used in conjunction with
$date
.
- Note: This must be used in conjunction with
$amplitude_id
: The original Amplitude ID for the user. Use this field to automatically handle merged users.
Query results
To execute your query, click Compute. When the query has finished executing, the results will be displayed below in a data table. The results of the data table and the CSV export are capped at 1,000 rows. In addition, a visualization of the data returned will be presented in a chart below the table. At this time, only a time series can be displayed.
To customize the visualization, there is a set of controls that you can use below the query editor. The options available in the visualization controls are the fields that you return in your SQL SELECT
statement. For example, in the query above, the 3 fields we can visualize are "DATE", "UNIQUES", and "TOTALS".
- X-axis column: Select what is plotted on the X-axis. Currently, only time series are supported.
- Metric column: Select which field is returned in your SQL
SELECT
statement is plotted on the Y-axis.
Applying group-bys
To group a chart by a column, enter the name of the column in the "Label columns" field. Then click Compute.
Sharing and saving queries
Like other Amplitude charts, you are able to export the results as a PNG, PDF, or CSV file by navigating to More > Export. Furthermore, you can save your analyses and share it with your team or even add the visualizations you create to a dashboard in Amplitude.
Snowflake ETL
In addition to the Query add-on, paying customers have access to a self-service UI to export Amplitude data to their own Snowflake account. This is included in all paid Amplitude plans and is supported in all regions. Through this feature, you can maintain ownership and privacy of your data warehouse, as well as perform maintenance, administration, and customization work on your own, without needing support from Amplitude engineering.
Amplitude's previous data sharing service, Snowflake ETL, is undergoing the sunsetting process. If you are currently using Snowflake ETL, please reach out to your success manager for guidance on migrating to the self-serve export.
This table outlines the advantages of the self-serve Snowflake export:
Features |
Data exported to customer-owned and managed Snowflake instance. Complete control and edit permissions over data and access. |
Accounts |
Requires snowflake relationship and warehouse. |
Best For |
Customers who want to use Snowflake as their primary data warehouse, with write capabilities for other data sets, and edit and access control. |