git clone xxx                                              # clone your own fork
git remote add <another-fork-alias> <another-fork-URL>     # register the other fork as an extra remote
git checkout <local_branch>                                # switch to the local branch that should receive the commit
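A concrete instance of the setup, with made-up names and URLs purely for illustration (alice's repo stands in for your fork, bob's for the other fork):

git clone https://github.com/alice/spark.git               # your fork; its remote alias defaults to "origin"
cd spark
git remote add bobs-fork https://github.com/bob/spark.git  # the other fork, under a made-up alias
git checkout fix-shuffle-bug                               # made-up local branch name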
Nothing gives you more detail about Spark internals than actually reading its source code. In addition, you get to learn many design techniques and improve your Scala coding skills. These are the random notes I make while reading the Spark code. The best way to comprehend the notes is to load the Spark code into an IDE, e.g. IntelliJ, and navigate the code on the side.
The scripts for creating a Spark cluster are start-master.sh and start-slave.sh. Read them carefully, and you can see that the two scripts are very similar except for the value of the $CLASS variable. For start-master.sh the value is CLASS="org.apache.spark.deploy.master.Master", while the value for start-slave.sh is shown below with more context.
# NOTE: This exact class name is matched downstream by SparkSubmit.
CLASS="org.apache.spark.deploy.worker.Worker"
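In other words, only the deploy class differs between the two scripts. As a rough sketch of how they are used to bring up a small standalone cluster (the host name and the default master port 7077 below are assumptions, not taken from these notes):

$SPARK_HOME/sbin/start-master.sh                              # runs org.apache.spark.deploy.master.Master
$SPARK_HOME/sbin/start-slave.sh spark://my-master-host:7077   # runs org.apache.spark.deploy.worker.Worker, pointed at the master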
NOTE: This is a question I found on StackOverflow which I've archived here, because the answer is so effing phenomenal. If you are not into long explanations, see [Paolo Bergantino's answer][2].
git checkout <branch>               # the local branch that should receive the commit
git fetch <another-fork-alias>      # make the other fork's commits available locally
git cherry-pick <commit-hash>       # apply the chosen commit on top of your branch
git push <your-fork-alias>          # publish the result to your fork
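With purely made-up names (bobs-fork as the other fork's remote alias, origin as your own fork, and a fabricated commit hash), the cherry-pick round trip would look like:

git checkout fix-shuffle-bug
git fetch bobs-fork
git cherry-pick 1a2b3c4d                       # fabricated hash; substitute the real commit you want
git push origin fix-shuffle-bug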