This article will help you:
This article describes several tips, tricks, and best practices for Amplitude customers who've purchased the Redshift ETL add-on.
1. Data for respective projects will be kept in their own schemas namespaces/packages in Redshift.
By default, every Redshift command you run will be under the public schema. However, you can select a different schema instead, using the
SET search_path command:
SET search_path = app123;
SELECT COUNT(*) FROM events;
Or you can include the schema as a prefix to the table:
SELECT COUNT(*) FROM app123.events;
2. Query directly from each project's table instead of the entire events.
The events from each of your Amplitude projects are stored in their own tables. The table name for each project is
events### , where the ### is the project number. You can find this in the URL of the Amplitude dashboard. The union of each project's events appears in a table called
events . Selecting
FROM events### whenever possible will make your queries faster and more efficient.
3. Custom event properties and custom user properties associated with an event type will be pulled into their own columns in the respective
Custom user properties will be appended by a
u_ and custom event properties will be appended by an
NOTE: This is limited to a maximum of 400 user properties and 50 event properties. Anything past the limit will still require the
4. Always include a date range in the
WHERE clause of your query.
Amplitude's Redshift tables do not have a primary key. Instead, they're sorted by the
event_time column. Adding a date range in the
WHERE clause of your query will significantly increase query speeds. We recommend using the
DATE() function with
event_time as the input.
SELECT * queries when possible.
The more columns you select, the slower your query will be. Selecting only relevant columns as opposed to all (*) columns will significantly increase query speeds and show only relevant data.