yifeiacc / Spark Dataframe Cheat Sheet.py
Created December 13, 2017 04:17 — forked from evenv/Spark Dataframe Cheat Sheet.py
Cheat sheet for Spark Dataframes (using Python)
# A simple cheat sheet of Spark Dataframe syntax
# Current for Spark 1.6.1
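# Import statements
from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.sql.types import *
from pyspark.sql.functions import *  # note: shadows Python built-ins such as sum, min and max

# Creating DataFrames
# The pyspark shell predefines `sc` and `sqlContext`; a standalone script must create them:
sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)
df = sqlContext.createDataFrame([(1, 4), (2, 5), (3, 6)], ["A", "B"])  # from manual data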
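# A minimal sketch of how the arguments to createDataFrame line up: each tuple is one row,
# and each position in the tuple pairs with one column name. The helper names `check_rows`
# and `build_dataframe` are hypothetical (they are not part of PySpark), and `sql_context`
# is assumed to be an existing SQLContext such as the `sqlContext` used above.

```python
rows = [(1, 4), (2, 5), (3, 6)]  # one tuple per row
columns = ["A", "B"]             # one name per tuple position


def check_rows(rows, columns):
    """Return True if every row has exactly one value per column name."""
    return all(len(row) == len(columns) for row in rows)


def build_dataframe(sql_context, rows, columns):
    """Validate the row shape, then create the DataFrame.

    Assumes `sql_context` is an already-constructed SQLContext.
    """
    if not check_rows(rows, columns):
        raise ValueError("row length does not match number of column names")
    return sql_context.createDataFrame(rows, columns)
```

# With a live context, build_dataframe(sqlContext, rows, columns) would make the same call as the `df = ...` line above.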