savadev’s gists

savadev / sampling_example.sql

Created October 25, 2024 04:14 — forked from ks--ks/sampling_example.sql

Data sampling in PostgreSQL

	-- sample data and assign new or returning user type
	sampledata as (
	select a.user_id,
	a.created_at
	, case
	when a.created_at between a.firstdate and a.firstdate+27 then 'new'
	else 'returning'
	end as type
	from (
	select user_id

savadev / PostreSQL_creatingid.sql

Created October 25, 2024 04:13 — forked from ks--ks/PostreSQL_creatingid.sql

PostgreSQL: joining two tables which are not related by a foreign key relationship

	WITH actions AS
	(
	SELECT ca.created_at::date as actdate,
	COUNT(*) AS act,
	CASE when ca.state IN ('CA', 'WA', 'DC', 'NY', 'IL') THEN ca.state ELSE 'other' END AS state,
	CASE WHEN ca.current_medium IN ('source1', 'source2', 'source3') THEN ca.current_medium ELSE 'other' END AS source,
	lower('id-' \|\| ca.created_at::date \|\|
	'-' \|\| CASE when ca.state IN ('CA', 'WA', 'DC', 'NY', 'IL') THEN ca.state ELSE 'other' END \|\|
	'-' \|\| CASE WHEN ca.current_medium IN ('source1', 'source2', 'source3') THEN ca.current_medium ELSE 'other' END
	) AS id

savadev / InfluenceScore.sql

Created October 25, 2024 04:12 — forked from ks--ks/InfluenceScore.sql

PowerUser_Step2

	, influence AS (
	SELECT user_id
	, recruit_score
	, share_score
	, days_share_score
	, channel_share_score
	, NTILE(100) OVER (ORDER BY recruit_score DESC, share_score DESC, days_share_score DESC, channel_share_score DESC)

	FROM (
	SELECT a.user_id

savadev / UserProfiling.sql

Created October 25, 2024 04:10 — forked from ks--ks/UserProfiling.sql

Create 4 user profiles based on user actions

	-- quantitative profiling
	with profile_1 as (
	select user_id
	from (
	select p.user_id as user_id, count(distinct p.item_id) as purchased
	from user u
	join tags t on t.id = u.tag_id
	join purchase p on p.item_id = u.item_id
	where t.name = 'scarf'
	group by 1)

savadev / UserEngagementHistogram.sql

Created October 25, 2024 04:09 — forked from ks--ks/UserEngagementHistogram.sql

User Engagement Histogram

	SELECT active.num_days_active
	, COUNT(DISTINCT active.user_id) AS num_users -- get how many users have a number of active days
	FROM (
	SELECT a.user_id, COUNT(DISTINCT a.created_at) AS num_days_active
	FROM activity a
	GROUP BY a.user_id) AS active
	LEFT JOIN activity a2 ON a2.user_id = active.user_id
	WHERE a2.created_at >= CURRENT_DATE - 28
	GROUP BY active.num_days_active
	ORDER BY active.num_days_active ASC;

savadev / subscription_id.sql

Created October 25, 2024 04:04 — forked from ks--ks/subscription_id.sql

A way to generate a subscription id and group transactions per subscription per user

	WITH aggregated_subs AS (
	SELECT
	USER_ID
	,PRODUCT_ID
	,TRANSACTION_START
	,TRANSACTION_DATE_DUE
	,ROW_NUMBER() OVER (PARTITION BY PRODUCT_ID ORDER BY TRANSACTION_START) AS rn
	,LAG(TRANSACTION_DATE_DUE, 1) OVER (PARTITION BY PRODUCT_ID ORDER BY TRANSACTION_DATE_DUE) AS previous_sub_end
	FROM SUBSCRIPTIONS
	-- ORDER BY TRANSACTION_START, rn

savadev / retention_cohorted_example.sql

Created October 25, 2024 03:55 — forked from ks--ks/retention_cohorted_example.sql

Multiple ways to get cohorted user retention in SQL

	/* B2B. Getting daily or monthly retention for free users

	Input tables:
	signups - user_id, s.signup_date
	activity - user_id, activity_date
	*/


	-- CREATE VIEW cohort_user_retention AS

savadev / PowerUserAnalysis.sql

Created October 25, 2024 03:51 — forked from ks--ks/PowerUserAnalysis.sql

	WITH

	recruits AS (
	SELECT recruiter_id AS user_id
	, item_name
	, COUNT(DISTINCT recruited_id) AS recruits
	FROM recruit_table
	WHERE created_at::DATE BETWEEN '2017-01-01'-180 AND '2017-01-08'
	GROUP BY 1, 2
	)

savadev / growth_accounting_and_ltv.sql

Created November 19, 2023 23:56 — forked from hsurreal/growth_accounting_and_ltv.sql

	with
	dau as (
	-- This part of the query can be pretty much anything.
	-- The only requirement is that it have three columns:
	-- dt, user_id, inc_amt
	-- Where dt is a date and user_id is some unique identifier for a user.
	-- Each dt-user_id pair should be unique in this table.
	-- inc_amt represents the amount of value that this user created on dt.
	-- The most common case is
	-- inc_amt = incremental revenue from the user on dt

savadev / sql_for_beginners.md

Created December 5, 2022 01:35 — forked from mshakhomirov/sql_for_beginners.md

Advanced SQL techniques for beginners

On a scale from 1 to 10 how good are your data warehousing skills? Want to go above 7/10? This article is for you then.

How good is your SQL? Want to get ready for a job interview asap? This blog post explains the most intricate data warehouse SQL techniques in high detail. I will use BigQuery standard SQL dialect to scribble down a few thoughts on this.

savadev

Advanced SQL techniques for beginners

1. Incremental tables and MERGE