NOTE: For most customers, the Help Center article on exporting Amplitude data to Redshift will be more useful. This article is maintained as a courtesy.
Query Objective: Show the Distribution of Event Property Totals
SELECT DATE (event_time) AS DATE, e_type, COUNT(*) FROM app123.signUp WHERE DATE (event_time) BETWEEN '2017-03-30' AND '2017-04-06' GROUP BY DATE, e_type ORDER BY DATE, COUNT DESC;
Explanation
The query shows the distribution of the type property of the 'signUp' event every day for the first week in March. Because event properties are pulled into their own columns, we can query on the event property type directly and use GROUP BY to capture each property on each day. The red text of the query above should be adjusted to your specific case.
Query Objective: Count the Number of Users Who Did an Event More than Twice on a Specific Date
SELECT amplitude_id, COUNT(*) AS total FROM app123.signUp WHERE DATE(event_time) = '2017-04-06' GROUP BY amplitude_id HAVING COUNT(*) >= 2;
Explanation
The query above counts how many users did the 'signUp' event two or more times on April 6. The inner SELECT creates the table of users and how many times they did the 'signUp' event, and the outer SELECT only chooses those who have done it two or more times. The red text of the query above should be adjusted to your specific case.
Query Objective: Count the Number of Events Done by a Specific Set of Users Who Did Another Event
Specifically, we will count the number of ‘sentMessage’ events done by people who did the ‘signUp’ event in California between March 30 and April 6.
There will be two steps. First, we need to get the set of users who did ‘signUp’ in California from March 30 through April 6. The query below gets us this set. This will be an intermediate query that we will use in the final query.
SELECT DISTINCT(amplitude_id) FROM app123.signUp WHERE region = 'California' AND DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06';
The red text of the query above should be adjusted to your specific case. The next step is that we need to answer the question for users in this set, how many 'sentMessage' events happened during the same time period?
There are two ways to get this. One way is to use an IN and the other way is to use a JOIN. Both will require the intermediate query we defined above. We will explain both so you can choose which you feel more comfortable using.
Using an IN
SELECT COUNT(*) FROM app123.sentMessage WHERE DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06' AND amplitude_id IN (SELECT DISTINCT(amplitude_id) FROM app123.signUp WHERE region = 'California' AND DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06' ;
Explanation
The outer SELECT counts the number of ‘sentMessage’ events. The condition where amplitude_id IN() means it will only select from rows there the amplitude_id is IN the set of users inside that function. So what do we do? We add our intermediate query inside the IN function so that we are only counting messages from users who have done ‘signUp’ in California from March 30 to April 6. The red text of the query above should be adjusted to your specific case.
Using a JOIN DELETE
CREATE OR REPLACE VIEW CalisignUp0330to0406 AS SELECT DISTINCT(amplitude_id) FROM app123.signUp WHERE region = 'California' AND DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06'; SELECT COUNT(*) FROM app123.sentMessage INNER JOIN CalisignUp0330to0406 ON app123.sentMessage.amplitude_id = CalisignUp0330to0406.amplitude_id
WHERE DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06';
Explanation
The first part of the query is the intermediate query we defined above. We have turned it into a view (CalisignUp0330to0406) to make the query cleaner. The second part of the query is a JOIN -- we JOIN the events table with the created view. The JOIN selects out the amplitude_ids that appear in both tables (the users who did ‘signUp’) and the rest of the query only picks from these rows. The red text of the query above should be adjusted to your specific case.
Query Objective: Show the Distribution of Users Who Have Done an Event by Number of Times Done
SELECT amplitude_id, COUNT(*) AS messages FROM app123.sentMessage WHERE DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06' GROUP BY amplitude_id ORDER BY COUNT(*) DESC;
Explanation
The query’s output is a table of the messages and the number of users who sent that number of messages in between March 30 and April 6. Here is a sample of the output:
messages | users |
---|---|
1 | 29588 |
2 | 12625 |
3 | 6151 |
4 | 3568 |
5 | 2469 |
6 | 1808 |
7 | 1363 |
The inner SELECT creates a table of unique users and how many messages they have logged during the first two weeks of March. The outer SELECT creates a table based on the number of messages and the number of users who fell into that bucket.
Query Objective: Find the Last Three Events a User Does Before Churning
We will limit the analysis to people who used the product the month prior to last.
1. Define churn as people who have not logged in during the last month.
CREATE VIEW churned AS ( SELECT DISTINCT(amplitude_id) FROM events123 WHERE DATE(event_time) BETWEEN '2017-01-01' AND '2017-01-31' AND amplitude_id NOT IN ( SELECT DISTINCT(amplitude_id) FROM events123 WHERE DATE(event_time) BETWEEN '2017-02-01' AND '2017-02-28' ) );
2. Fetch the last three events per user.
CREATE TEMPORARY TABLE last3 AS (SELECT * FROM (
SELECT amplitude_id, event_type, row_number() over (PARTITION BY amplitude_id ORDER BY event_time DESC)
FROM events123
WHERE event_type NOT IN ('session_start', 'session_end')
AND amplitude_id IN (
SELECT amplitude_id
FROM churned)
)
WHERE row_number <= 3
);
3. Join the tables to combine the three events into one row.
CREATE VIEW last3joined AS ( SELECT a.amplitude_id, a.event_type AS e1, b.event_type AS e2, c.event_type AS e3 FROM (SELECT * FROM last3 WHERE row_number=1) AS a JOIN (SELECT * FROM last3 WHERE row_number=2) AS b ON a.amplitude_id = b.amplitude_id JOIN (SELECT * FROM last3 WHERE row_number=3) AS c ON a.amplitude_id = c.amplitude_id );
4. What were the last three events before the user churned?
SELECT e1 || ',' || e2 || ',' || e3 AS last3, COUNT(*) FROM last3joined GROUP BY last3 ORDER BY count DESC;
Explanation
This query shows the last three events users did out of the set of users who were active in January but not active in February. The red text of the query above should be adjusted to your specific case.