Skip to content

Instantly share code, notes, and snippets.

@anirudhXshetty
Created August 28, 2019 16:26
Show Gist options
  • Select an option

  • Save anirudhXshetty/75ea493b61d004eaee83931196747bd6 to your computer and use it in GitHub Desktop.

Select an option

Save anirudhXshetty/75ea493b61d004eaee83931196747bd6 to your computer and use it in GitHub Desktop.
--User list with their favourite stores ordered by rank based on number of visits.
--Corresponding number for how often they visit on weekly basis for these 52 weeks.
--How many days between next purchase in each store for a user.
--Can be used out to filter and check users and stores for other features,
--like total regular visitors, total visitors who have visited only once for a store etc..
WITH usr_str_visits AS (
SELECT
user_id,
store_id,
count (DISTINCT transaction_id) AS num_stores_visits,
row_number() OVER (
PARTITION BY user_id ORDER BY count(distinct transaction_id) DESC) AS rank,
count (DISTINCT transaction_week) AS num_weeks_visited
FROM source_data_dunnhumby.line_item
WHERE user_id IS NOT NULL
AND transaction_week > 200636
AND transaction_week < 200737
GROUP BY user_id, store_id
ORDER BY 1 DESC),
user_store_ipt AS (
SELECT
user_id,
store_id,
round(avg(next_in_store_purchase), 0) AS avg_user_store_ipt
FROM (
SELECT
user_id,
store_id,
transaction_date,
lead(transaction_date, 1) OVER (PARTITION BY user_id, store_id ORDER BY transaction_date ASC) AS next_in_store_transaction,
timestampdiff('day', transaction_date, lead(transaction_date, 1) OVER (PARTITION BY user_id, store_id ORDER BY transaction_date ASC)) AS next_in_store_purchase
FROM (
SELECT DISTINCT
user_id,
store_id,
transaction_date
FROM source_data_dunnhumby.line_item
WHERE user_id IS NOT NULL
AND transaction_week > 200636
AND transaction_week < 200737
) li
) nb
GROUP BY 1,2 ORDER BY 1,2)
SELECT usv.user_id, usv.store_id, rank, num_stores_visits, num_weeks_visited, avg_user_store_ipt AS user_ipt
FROM usr_str_visits usv
JOIN user_store_ipt ui USING (user_id, store_id)
ORDER BY usv.user_id, rank;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment