As of 1/17/17, we now refer to "apps" as "projects."
You can see the duration of time people are using your product. On the dashboard, session lengths are calculated by subtracting the MAX(client_event_time) and session_id (which is the number of milliseconds since epoch). The values in red are what you will need to replace with your own.
Table of Contents
- Query Objective: Show the Distribution of Session Lengths on a Specific Date
- Query Objective: Show the Average Session Length per Segment
Query Objective: Show the Distribution of Session Lengths on a Specific Date
SELECT DATEDIFF('milliseconds',timestamp 'epoch' + session_id / 1000.0 * INTERVAL '1 second',max) AS diff_millisec FROM (SELECT session_id, amplitude_id,MIN(client_event_time) as min, MAX(client_event_time) AS max FROM events123 WHERE session_id != -1 AND DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06' GROUP BY session_id, amplitude_id) WHERE DATE(min) = '2017-04-06' ORDER BY diff_millisec ASC;
The inner SELECT chooses distinct pairs of session_id and amplitude_id as well as the minimum and maximum timestamps per unique pair. The outer SELECT uses the datediff function to subtract the MAX(client_event_time) and session_id by turning the session_id into a timestamp. It does so by dividing by 1000 (gets to seconds), then multiplying by the 1-second interval, and then adding it to the epoch timestamp (which is 0).
The final WHERE clause restricts the calculation to sessions that started today (because they could have extended into the following day).
The red text of the query above should be adjusted to your specific case.
Query Objective: Show the Average Session Length per Segment
SELECT( SELECT SUM(length) FROM( SELECT DISTINCT session_id, amplitude_id, DATEDIFF('milliseconds',timestamp 'epoch' + session_id / 1000.0 * INTERVAL '1 second',max) AS length FROM( SELECT amplitude_id, session_id, MAX(client_event_time) OVER(PARTITION BY session_id ORDER BY amplitude_id, client_event_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)AS max, MIN(client_event_time) OVER(PARTITION BY session_id ORDER BY amplitude_id, client_event_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS min FROM events123 WHERE country = 'United States' AND DATE(event_time) BETWEEN '2017-03-30' AND '2017-04-06' AND session_id != '-1') WHERE DATE(min)= '2017-03-30') ) / (SELECT CAST(COUNT(DISTINCT session_id) AS float) FROM events123 WHERE session_id != '-1' AND DATE(event_time)= '2017-03-30' AND country = 'United States') /1000 AS average ;
In the purple subquery, we are selecting amplitude_id, session_id, event_time, the MAX value for the client_event_time in a given session, and the MIN value for the client_event_time from your events table while ONLY looking at users from the United States and on January 1st. We partition the table by session_id. PARTITION is the group function, but it does not aggregate the IDs (each row with the same Amplitude ID stays independent) and within each partition, the client_event_time is sorted from earliest to latest.
The blue subquery selects the distinct number of session_ids, amplitude_ids, and the difference between the maximum and minimum client_event_times (to give you session length in milliseconds.)
The orange subquery sums the lengths of the sessions, which should give you the TOTAL time for all sessions.
The green subquery gives you the number of distinct sessions from users who were in the United States and on January 1st.
Finally, the black outer subquery simply divides the TOTAL session time by the number of sessions, giving you the average session length. We then divide by 1000 to get our average in seconds.
Text in red can be adjusted for your specific case.