@karamanbk
Created August 23, 2019 12:56

Revisions

  1. karamanbk created this gist Aug 23, 2019.
    21 changes: 21 additions & 0 deletions g9_twa_dataset.py
    # Assumes pandas and numpy under their usual aliases.
    import numpy as np
    import pandas as pd

    # Create the high-value segment: 20,000 customers, first 10,000 in test.
    df_hv = pd.DataFrame()
    df_hv['customer_id'] = np.arange(20000)
    df_hv['segment'] = 'high-value'
    df_hv['group'] = 'control'
    df_hv.loc[df_hv.index < 10000, 'group'] = 'test'
    # Purchase counts are Poisson draws: mean 0.6 for control, 0.8 for test.
    df_hv.loc[df_hv.group == 'control', 'purchase_count'] = np.random.poisson(0.6, 10000)
    df_hv.loc[df_hv.group == 'test', 'purchase_count'] = np.random.poisson(0.8, 10000)


    # Create the low-value segment: 80,000 customers, first 40,000 in test.
    df_lv = pd.DataFrame()
    df_lv['customer_id'] = np.arange(20000, 100000)
    df_lv['segment'] = 'low-value'
    df_lv['group'] = 'control'
    df_lv.loc[df_lv.index < 40000, 'group'] = 'test'
    # Purchase counts are Poisson draws: mean 0.2 for control, 0.3 for test.
    df_lv.loc[df_lv.group == 'control', 'purchase_count'] = np.random.poisson(0.2, 40000)
    df_lv.loc[df_lv.group == 'test', 'purchase_count'] = np.random.poisson(0.3, 40000)

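    # Stack both segments. Index labels repeat across the two frames;
    # pass ignore_index=True to pd.concat if a unique index is needed.
    df_customers = pd.concat([df_hv, df_lv], axis=0)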
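
A quick way to sanity-check a dataset like this is to confirm that each segment/group cell has the expected size and a sample mean near its Poisson rate. The sketch below is not part of the gist: `make_segment` is a hypothetical helper, and the seeded `np.random.default_rng(42)` generator and `ignore_index=True` are assumptions added for reproducibility and a unique index.

```python
import numpy as np
import pandas as pd


def make_segment(segment, start_id, n, control_rate, test_rate, rng):
    """Hypothetical helper: build one segment; first half test, second half control."""
    half = n // 2
    df = pd.DataFrame({
        "customer_id": np.arange(start_id, start_id + n),
        "segment": segment,
        "group": np.where(np.arange(n) < half, "test", "control"),
    })
    is_test = (df["group"] == "test").to_numpy()
    # Draw Poisson purchase counts separately for each group.
    counts = np.empty(n, dtype=np.int64)
    counts[is_test] = rng.poisson(test_rate, is_test.sum())
    counts[~is_test] = rng.poisson(control_rate, (~is_test).sum())
    df["purchase_count"] = counts
    return df


rng = np.random.default_rng(42)  # assumed seed, only for repeatable output
df_hv = make_segment("high-value", 0, 20000, 0.6, 0.8, rng)
df_lv = make_segment("low-value", 20000, 80000, 0.2, 0.3, rng)
df_customers = pd.concat([df_hv, df_lv], ignore_index=True)

# Group sizes and mean purchase counts per segment and group.
summary = (
    df_customers
    .groupby(["segment", "group"])["purchase_count"]
    .agg(["size", "mean"])
)
print(summary)
```

The means in `summary` should land close to 0.6/0.8 for high-value and 0.2/0.3 for low-value; exact values depend on the seed.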