@masterdezign
Created November 1, 2021 17:35
Revisions

  1. masterdezign created this gist Nov 1, 2021.
    42 changes: 42 additions & 0 deletions pandas101.py
    @@ -0,0 +1,42 @@
    import pandas as pd
    # .read_csv()
    # .shape
    # .head(N) .tail(N)
    # .dtypes
    # .loc[3, 'sepal_length'] .iloc
    # .to_csv()
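The basic I/O and inspection calls listed above can be sketched as follows. This is a minimal illustration, not part of the original gist: the inline CSV text and the `sepal_width` and `species` columns are made-up stand-ins for a real file.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for a CSV file on disk.
csv_text = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
6.3,3.3,virginica
5.8,2.7,virginica
"""

# read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df.shape      # number of rows and columns
first_two = df.head(2)         # first N rows
last_one = df.tail(1)          # last N rows
column_types = df.dtypes       # one dtype per column

# .loc selects by label, .iloc by integer position.
by_label = df.loc[3, 'sepal_length']
by_position = df.iloc[3, 0]

# With no path argument, to_csv returns the CSV text as a string.
csv_out = df.to_csv(index=False)
```

With the default integer index, `.loc[3, ...]` and `.iloc[3, ...]` happen to address the same row; they diverge as soon as the index labels are not 0..N-1.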


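    # pd.set_option('display.max_columns', 2)  # full option name; the bare 'max_columns' pattern can match more than one option in recent pandas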
    # pd.options.display.float_format = '{:,.2f}'.format
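A short sketch of how the two display settings above might be used together. The sample values are invented, and the options are reset at the end so the example does not leak global state.

```python
import pandas as pd

# Limit how many columns are shown when a wide frame is printed.
pd.set_option('display.max_columns', 2)

# Render floats with a thousands separator and two decimals.
pd.options.display.float_format = '{:,.2f}'.format

prices = pd.Series([1234.5, 0.125])   # made-up sample values
rendered = repr(prices)               # uses the float_format above
max_cols = pd.get_option('display.max_columns')

# Restore the defaults.
pd.reset_option('display.max_columns')
pd.reset_option('display.float_format')
```

These options only affect how data is displayed; the stored values are unchanged.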

    # .isna()
    # .cumsum(skipna=False)
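To illustrate what `skipna=False` changes, here is a small sketch on an invented three-element Series containing one missing value.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 2.0])   # made-up data with one gap

missing = s.isna()                  # boolean mask of missing entries

# Default (skipna=True): the NaN position stays NaN,
# but the running total continues after it.
skipping = s.cumsum()

# skipna=False: once a NaN is hit, every later total is NaN too.
propagating = s.cumsum(skipna=False)
```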

    # df['Profit'] = df.apply(lambda x:..., axis=1)
    # df['Xx'].map()
    # df.applymap(lambda x: len(str(x))) # To every element (deprecated since pandas 2.1 in favor of df.map)
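The three transformation styles above differ in what the function receives: a whole row, one value of a single column, or every element of the frame. A minimal sketch follows. The `Revenue` and `Cost` columns and the profit formula are hypothetical, since the gist leaves the lambda body elided.

```python
import pandas as pd

# Hypothetical columns; only 'Profit' and 'Xx' appear in the gist.
df = pd.DataFrame({
    'Revenue': [100, 250],
    'Cost': [60, 300],
    'Xx': ['a', 'b'],
})

# Row-wise: with axis=1 the lambda receives one row at a time.
df['Profit'] = df.apply(lambda row: row['Revenue'] - row['Cost'], axis=1)

# Column-wise: Series.map transforms each value of one column;
# it accepts a function or, as here, a lookup dict.
df['Xx_label'] = df['Xx'].map({'a': 'alpha', 'b': 'beta'})

# Element-wise: DataFrame.map exists from pandas 2.1 on;
# fall back to applymap on older versions.
elementwise = df.map if hasattr(df, 'map') else df.applymap
lengths = elementwise(lambda x: len(str(x)))
```

For simple arithmetic like this, the vectorized form `df['Revenue'] - df['Cost']` is usually faster than a row-wise `apply`.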


    # .pivot .stack
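As a rough illustration of reshaping, the sketch below pivots a long table to wide form and stacks it back. The `date`, `product`, and `sales` columns are invented for the example.

```python
import pandas as pd

# Made-up long-format data: one row per (date, product) pair.
long_df = pd.DataFrame({
    'date': ['2021-01', '2021-01', '2021-02', '2021-02'],
    'product': ['A', 'B', 'A', 'B'],
    'sales': [10, 20, 30, 40],
})

# pivot: one row per date, one column per product.
wide = long_df.pivot(index='date', columns='product', values='sales')

# stack: move the column labels back into the index,
# giving a Series indexed by (date, product).
stacked = wide.stack()
```

`pivot` requires each (index, columns) pair to be unique; for data with duplicates, `pivot_table` with an aggregation function is the usual alternative.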


    # .plot()
    # .plot.area(stacked=False)
    # .boxplot()
    # .describe()
    # .corr()
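The summary calls above can be tried on a tiny invented frame. Plotting is left out here because it needs matplotlib installed.

```python
import pandas as pd

# Made-up data where y is exactly 2 * x.
df = pd.DataFrame({
    'x': [1.0, 2.0, 3.0, 4.0],
    'y': [2.0, 4.0, 6.0, 8.0],
})

summary = df.describe()      # count, mean, std, min, quartiles, max
correlations = df.corr()     # pairwise Pearson correlation by default

mean_x = summary.loc['mean', 'x']
corr_xy = correlations.loc['x', 'y']
```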


    # Packages
    # - pandas_profiling (since renamed ydata-profiling; newer releases import as ydata_profiling)
    # from pandas_profiling import ProfileReport
    # profile = ProfileReport(df, title="Title")
    # profile.to_notebook_iframe()


    # Dask - parallel computing
    # import dask.dataframe as dd
    # df = dd.read_csv('...')

    # Koalas -> pandas API on Apache Spark (bundled with PySpark as pyspark.pandas since Spark 3.2)