Skip to content

Instantly share code, notes, and snippets.

@savadev
savadev / sampling_example.sql
Created October 25, 2024 04:14 — forked from ks--ks/sampling_example.sql
Data sampling in PostgreSQL
-- sample data and assign new or returning user type
sampledata as (
select a.user_id,
a.created_at
, case
when a.created_at between a.firstdate and a.firstdate+27 then 'new'
else 'returning'
end as type
from (
select user_id
@savadev
savadev / PostreSQL_creatingid.sql
Created October 25, 2024 04:13 — forked from ks--ks/PostreSQL_creatingid.sql
PostgreSQL: joining two tables which are not related by a foreign key relationship
WITH actions AS
(
SELECT ca.created_at::date as actdate,
COUNT(*) AS act,
CASE when ca.state IN ('CA', 'WA', 'DC', 'NY', 'IL') THEN ca.state ELSE 'other' END AS state,
CASE WHEN ca.current_medium IN ('source1', 'source2', 'source3') THEN ca.current_medium ELSE 'other' END AS source,
lower('id-' || ca.created_at::date ||
'-' || CASE when ca.state IN ('CA', 'WA', 'DC', 'NY', 'IL') THEN ca.state ELSE 'other' END ||
'-' || CASE WHEN ca.current_medium IN ('source1', 'source2', 'source3') THEN ca.current_medium ELSE 'other' END
) AS id
@savadev
savadev / InfluenceScore.sql
Created October 25, 2024 04:12 — forked from ks--ks/InfluenceScore.sql
PowerUser_Step2
, influence AS (
SELECT user_id
, recruit_score
, share_score
, days_share_score
, channel_share_score
, NTILE(100) OVER (ORDER BY recruit_score DESC, share_score DESC, days_share_score DESC, channel_share_score DESC)
FROM (
SELECT a.user_id
@savadev
savadev / UserProfiling.sql
Created October 25, 2024 04:10 — forked from ks--ks/UserProfiling.sql
Create 4 user profiles based on user actions
-- quantitative profiling
with profile_1 as (
select user_id
from (
select p.user_id as user_id, count(distinct p.item_id) as purchased
from user u
join tags t on t.id = u.tag_id
join purchase p on p.item_id = u.item_id
where t.name = 'scarf'
group by 1)
@savadev
savadev / UserEngagementHistogram.sql
Created October 25, 2024 04:09 — forked from ks--ks/UserEngagementHistogram.sql
User Engagement Histogram
SELECT active.num_days_active
, COUNT(DISTINCT active.user_id) AS num_users -- get how many users have a number of active days
FROM (
SELECT a.user_id, COUNT(DISTINCT a.created_at) AS num_days_active
FROM activity a
GROUP BY a.user_id) AS active
LEFT JOIN activity a2 ON a2.user_id = active.user_id
WHERE a2.created_at >= CURRENT_DATE - 28
GROUP BY active.num_days_active
ORDER BY active.num_days_active ASC;
@savadev
savadev / subscription_id.sql
Created October 25, 2024 04:04 — forked from ks--ks/subscription_id.sql
A way to generate a subscription id and group transactions per subscription per user
WITH aggregated_subs AS (
SELECT
USER_ID
,PRODUCT_ID
,TRANSACTION_START
,TRANSACTION_DATE_DUE
,ROW_NUMBER() OVER (PARTITION BY PRODUCT_ID ORDER BY TRANSACTION_START) AS rn
,LAG(TRANSACTION_DATE_DUE, 1) OVER (PARTITION BY PRODUCT_ID ORDER BY TRANSACTION_DATE_DUE) AS previous_sub_end
FROM SUBSCRIPTIONS
-- ORDER BY TRANSACTION_START, rn
@savadev
savadev / retention_cohorted_example.sql
Created October 25, 2024 03:55 — forked from ks--ks/retention_cohorted_example.sql
Multiple ways to get cohorted user retention in SQL
/* B2B. Getting daily or monthly retention for free users
Input tables:
signups - user_id, s.signup_date
activity - user_id, activity_date
*/
-- CREATE VIEW cohort_user_retention AS
WITH
recruits AS (
SELECT recruiter_id AS user_id
, item_name
, COUNT(DISTINCT recruited_id) AS recruits
FROM recruit_table
WHERE created_at::DATE BETWEEN '2017-01-01'-180 AND '2017-01-08'
GROUP BY 1, 2
)
with
dau as (
-- This part of the query can be pretty much anything.
-- The only requirement is that it have three columns:
-- dt, user_id, inc_amt
-- Where dt is a date and user_id is some unique identifier for a user.
-- Each dt-user_id pair should be unique in this table.
-- inc_amt represents the amount of value that this user created on dt.
-- The most common case is
-- inc_amt = incremental revenue from the user on dt

Advanced SQL techniques for beginners

image by author

On a scale from 1 to 10 how good are your data warehousing skills? Want to go above 7/10? This article is for you then.

How good is your SQL? Want to get ready for a job interview asap? This blog post explains the most intricate data warehouse SQL techniques in high detail. I will use BigQuery standard SQL dialect to scribble down a few thoughts on this.

1. Incremental tables and MERGE