## On your client machine

### As the `root` user

1. Install DSE
2. In the `cassandra.yaml` file, ensure the cluster name and datacenter settings match your analytics datacenter
3. In the `cassandra-env.sh` file, add this configuration line toward the bottom:
   `JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"`
   This makes your DSE node a coordinator only; it will not own any data. You can use this node to submit jobs to DSE locally without needing to know which node is the master.
4. Start DSE
5. Install Python
6. Install virtualenv (one way to do steps 3-6 is sketched below)
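
A rough sketch of steps 3-6, assuming a yum-based system and a package install of DSE (the `/etc/dse` path and package names are assumptions; adjust for your distribution and install method):
```
> echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.join_ring=false"' | sudo tee -a /etc/dse/cassandra/cassandra-env.sh
> sudo service dse start
> sudo yum install -y python python-pip
> sudo pip install virtualenv
```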

### As the `cassandra` user
```
> virtualenv .jupyter
> source .jupyter/bin/activate
> pip install ipython
> pip install jupyter
> PYSPARK_SUBMIT_ARGS="$PYSPARK_SUBMIT_ARGS pyspark-shell" IPYTHON_OPTS="notebook --ip='*' --no-browser" dse pyspark
```
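
Once the notebook server is up, it should be reachable from your workstation at `http://<client-host>:8888` (8888 is the notebook's default port; substitute your client machine's hostname).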

## Notes
You can use something like `supervisord` to keep jupyter running in the background.
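
For example, you could point `supervisord` (or any process manager) at a small wrapper script like the sketch below. The virtualenv location is an assumption based on the steps above:
```
#!/bin/bash
# hypothetical wrapper script for a process manager such as supervisord;
# assumes the .jupyter virtualenv created above is in the cassandra user's home directory
source "$HOME/.jupyter/bin/activate"
export PYSPARK_SUBMIT_ARGS="$PYSPARK_SUBMIT_ARGS pyspark-shell"
export IPYTHON_OPTS="notebook --ip='*' --no-browser"
exec dse pyspark
```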

If you are getting a permission denied error when starting pyspark that looks like this:
`OSError: [Errno 13] Permission denied: '/run/user/505/jupyter'`
it is because `XDG_RUNTIME_DIR` points to the runtime directory of your logged-in user, which the `cassandra` user cannot write to. In that case, set the following environment variable before starting pyspark:
`JUPYTER_RUNTIME_DIR="$HOME/.jupyter/runtime"`
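
For example (a sketch; the runtime directory location is just a suggestion):
```
> mkdir -p "$HOME/.jupyter/runtime"
> JUPYTER_RUNTIME_DIR="$HOME/.jupyter/runtime" PYSPARK_SUBMIT_ARGS="$PYSPARK_SUBMIT_ARGS pyspark-shell" IPYTHON_OPTS="notebook --ip='*' --no-browser" dse pyspark
```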