Write custom SQL with Amplitude Query

  • Updated

This article will help you:

  • Query your Snowflake database using custom SQL

The Amplitude Query product add-on allows customers to query their raw data via their Amplitude-managed Snowflake database. Data is loaded into Snowflake every 30 minutes. Query also includes a powerful new chart type called Amplitude SQL that allows customers to write custom SQL against their Amplitude data directly inside the Amplitude platform.

NOTE: This feature is only available to Enterprise and Growth customers who have purchased the Query product add-on.

Getting started with Query

The Amplitude SQL is accessible just like any other chart type, via Create New > Chart > All chart types. To learn more about how to utilize this chart type, see the Amplitude SQL section below.

Alternatively, you can connect directly to your Snowflake database via a terminal or 3rd-party application such as SQL Workbench. Snowflake also has a connector for Python here. Please reach out to us here or contact your Success Manager for your Snowflake credentials. 

NOTE: This feature is only available to Enterprise and Growth customers who have purchased the Query add-on.

Query schema description

Tables nomenclature

The Query package uses a simplified table schema for Snowflake and Amplitude SQL. Amplitude SQL uses a single table schema that can be referenced using the shorthand $events for a particular project. If you want to access other tables, you can use the full name which can be found by checking Show Schema.

While the $events table handles the merged user mappings automatically, the schema for the merged users table is also made available below the schema for $events. This is helpful if you want to view the number of users that have been merged into one.

Screen_Shot_2019-10-30_at_09.59.54.png

With Query, one of the major benefits is the unlimited number of columns that can exist in the table. Custom user properties and event properties are stored as variants, and they are queryable as individual columns. Custom user properties are prefixed with user_properties: and all event properties are prefixed with event_properties:.

NOTE: If your user or event properties contain a period or a space, then you will need to wrap the name of the property in quotes. For example, user_properties:"first name".

If you are searching for a certain value, you will have to wrap the value of the property in single quotes. For example, user_properties:"plan type"='enterprise'.

Column schema

$events Table

Column Description
$amplitude_id

NUMBER(38,0)

The original Amplitude ID for the user. Use this field to automatically handle merged users.

2234540891

adid

VARCHAR(16777216)

(Android) Google Play Services advertising ID (AdID). This usually is wiped after ingestion and therefore will be blank

"AEBE52E7-03EE-455A-B3C4-E57283966239" 

amplitude_attribution_ids Anonymized hash of the advertising IDs that we store for internal purposes; not useful for the customer by any means. But this will appear if advertising IDs were sent which proves that adid/idfv existed even though currently wiped
amplitude_event_type

VARCHAR(16777216)

Amplitude specific identifiers based on events Amplitude generates. This is a legacy field so event_type should suffice for all queries.
amplitude_id

NUMBER(38,0)

An internal ID used to count unique users.

1234567890

app

NUMBER(38,0)

Project ID found in your project's Settings page.

123456

city

VARCHAR

City.

"San Francisco"

client_event_time

TIMESTAMP

Local timestamp (UTC) of when the device logged the event.

2015-08-10T12:00:00.000000

client_upload_time

TIMESTAMP

The local timestamp (UTC) of when the device uploaded the event.

2015-08-10T12:00:00.000000

country

VARCHAR

Country.

"United States" 

data

VARIANT

Dictionary where certain fields such as first_event and merged_amplitude_id are stored.
device_brand

VARCHAR(16777216)

Device brand.

"Apple"

device_carrier

VARCHAR(16777216)

Device carrier.

"Verizon"

device_family

VARCHAR(16777216)

Device family.

"Apple iPhone"

device_id

VARCHAR(16777216)

The device specific identifier.

"C8F9E604-F01A-4BD9-95C6-8E5357DF265D"

device_manufacturer

VARCHAR(16777216)

Device manufacturer.

"Apple"

device_model

VARCHAR(16777216)

The device model.

"iPad Mini"

device_type

VARCHAR(16777216)

Device type.

"Apple iPhone 5s"

dma

VARCHAR(16777216)

Designated marketing area (DMA).

"San Francisco-Oakland-San Jose, CA"

event_id

NUMBER(38,0)

A counter that distinguishes events.

1

event_time

TIMESTAMP

Amplitude timestamp (UTC) which is the client_event_time adjusted by the difference between server_received_time and client_upload_time, specifically:

event_time = client_event_time + (server_received_time - client_upload_time)

2015-08-10T12:00:00.000000

We use this timestamp to organize events on Amplitude charts.

Note: If the difference between server_received_time and client_upload_time is less than 60 seconds, the event_time will not be adjusted and will equal the client_event_time

event_type

VARCHAR(16777216)

The assigned type of event.

"Add Friend"

followed_an_identify

BOOLEAN

True if there was an identify event between this current SDK event and the last SDK event seen.

groups

VARIANT

Group types. See the Accounts documentation for more information.
idfa

VARCHAR(16777216)

(iOS) Identifier for Advertiser. This usually is wiped after ingestion and therefore will be blank

"AEBE52E7-03EE-455A-B3C4-E57283966239"

ip_address

VARCHAR(16777216)

IP address.

"123.11.111.11"

location_lat

FLOAT

Latitude.

12.3456789 

location_lng

FLOAT

Longitude.

-123.4567890

os_name

VARCHAR(16777216)

OS name.

"ios"

os_version

VARCHAR(16777216)

OS version.

"1.0"

paying

VARCHAR

True if the user has ever logged any revenue, otherwise '(none)'. The property value can be modified via the Identify API.

true

region

VARCHAR

Region.

"California"

server_upload_time

TIMESTAMP

Amplitude timestamp (UTC) of when our servers received the event.

2015-08-10T12:00:00.000000

session_id

NUMBER(38,0)

The session start time in milliseconds since epoch.

1396381378123

start_version

VARCHAR

App version the user was first tracked on.

"1.0.0"

user_creation_time

TIMESTAMP

Event_time (UTC) of the user's first event.

2015-08-10T12:00:00.000000

user_id

VARCHAR(16777216)

A readable ID specified by you.

"datamonster@gmail.com"

uuid

VARCHAR(16777216)

A unique identifier per row (event sent).

bf0b9b2a-304d-11e6-934f-22000b56058f

version_name

VARCHAR(16777216)

The app version.

"1.0.0"

Merged users table

See this article to learn more about how Amplitude tracks unique users.

Column Description
amplitude_id

NUMBER(38,0)

The Amplitude ID that is being merged into a user's original Amplitude ID.

merge_event_time

TIMESTAMP

The time of the event a user's new Amplitude ID was associated with their original Amplitude ID.

merge_server_time

TIMESTAMP

The server time of the event when a user's new Amplitude ID was associated with their original Amplitude ID.

merged_amplitude_id

NUMBER(38,0)

The originally assigned Amplitude ID when the user is first created.

Amplitude SQL chart type

Amplitude SQL analyses can be saved, shared, and added to a dashboard just like any other chart. This chart type allows you to write custom SQL queries against your own Amplitude data.

NOTE: This feature currently does not support queuing data for Portfolio Views.

Chart setup

By default, a simple SQL query is shown returning the date, unique users, and total events performed in the past 30 days. SQL syntax will be highlighted to help you distinguish SQL commands from the rest of your query.

Screen_Shot_2019-10-30_at_10.01.20.png

In addition, Amplitude SQL supports autocomplete of columns in the table. As you type, the query editor will recommend columns to help expedite your typing.

Screen_Shot_2019-10-30_at_10.02.27.png

Special fields

Because Amplitude SQL is built directly into the Amplitude chart experience, you can leverage much of the same familiar Amplitude user interface such as the datepicker chart saving experience. You can do this by utilizing Amplitude SQL's special fields for powerful shortcuts:

  • $date: When using this shortcut, the time range chosen by the datepicker is automatically applied and will update the query over time. It refers to the event time on the event and will respect the timezone the project has been set to. Otherwise, Amplitude SQL will return data in UTC.
    • Note: This must be used in conjunction with $events.
  • $events: The shorthand used to refer to the table in your current project. When using this table, merged users are automatically handled.
    • Note: This must be used in conjunction with $date.
  • $amplitude_id: The original Amplitude ID for the user. Use this field to automatically handle merged users.

Query results

To execute your query, click Compute. When the query has finished executing, the results will be displayed below in a data table. The results of the data table and the CSV export are capped at 1,000 rows. In addition, a visualization of the data returned will be presented in a chart below the table. At this time, only a time series can be displayed.

Screen_Shot_2019-10-30_at_10.03.11.png

To customize the visualization, there is a set of controls that you can use below the query editor. The options available in the visualization controls are the fields that you return in your SQL SELECT statement. For example, in the query above, the 3 fields we can visualize are "DATE", "UNIQUES", and "TOTALS".

  • X-axis column: Select what is plotted on the X-axis. Currently, only time series are supported.
  • Metric column: Select which field is returned in your SQL SELECT statement is plotted on the Y-axis.

Screen_Shot_2019-10-30_at_10.03.55.png

Applying group-bys

To group a chart by a column, enter the name of the column in the "Label columns" field. Then click Compute.

Screen_Shot_2019-05-24_at_4.13.18_PM.png

Sharing and saving queries

Like other Amplitude charts, you are able to export the results as a PNG, PDF, or CSV file by navigating to More > Export. Furthermore, you can save your analyses and share it with your team or even add the visualizations you create to a dashboard in Amplitude.

Screen_Shot_2019-10-30_at_10.04.39.png

Snowflake ETL

In addition to the Query add-on, paying customers have access to a self-service UI to export Amplitude data to their own Snowflake account. This is included in all paid Amplitude plans and is supported in all regions. Through this feature, you can maintain ownership and privacy of your data warehouse, as well as perform maintenance, administration, and customization work on your own, without needing support from Amplitude engineering. 

Amplitude's previous data sharing service, Snowflake ETL, is undergoing the sunsetting process. If you are currently using Snowflake ETL, please reach out to your success manager for guidance on migrating to the self-serve export.

This table outlines the advantages of the self-serve Snowflake export:

Features

Data exported to customer-owned and managed Snowflake instance.

Complete control and edit permissions over data and access.

Accounts

Requires snowflake relationship and warehouse.

Best For

Customers who want to use Snowflake as their primary data warehouse, with write capabilities for other data sets, and edit and access control.