@bartekdobija
Last active August 31, 2015 13:33
#!/usr/bin/env bash
# Build a Spark distribution without bundled Hadoop dependencies.
# Remember to install the snappy and snappy-devel packages on RHEL/CentOS and similar distributions.
# Configure Spark's Hadoop dependencies as described in
# https://spark.apache.org/docs/latest/hadoop-provided.html
#
# spark-defaults.conf:
#   spark.rdd.compress true
#   spark.serializer org.apache.spark.serializer.KryoSerializer
#   spark.localExecution.enabled true
#   spark.master yarn
#   spark.yarn.jar hdfs:///user/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar
#
# spark-env.sh:
#   HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
#   SPARK_DIST_CLASSPATH=$(hadoop classpath)
./make-distribution.sh --name without-hadoop --tgz -Phadoop-2.6 -Psparkr -Phadoop-provided -Phive -Phive-thriftserver -Pyarn -DzincPort=3038 -DskipTests -Dmaven.javadoc.skip=true
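
# A minimal sketch of how the settings listed in the header comments could be
# written to disk once the build has finished. write_spark_conf and its
# directory argument are hypothetical names for illustration; they are not part
# of Spark or of the build script. The values are copied from the comments at
# the top of this script and may need adjusting for a given cluster.
write_spark_conf() {
  local conf_dir="$1"   # assumed target, e.g. the conf/ directory of the unpacked distribution
  mkdir -p "$conf_dir"

  # Quoted heredoc delimiter: the contents are written literally, nothing is expanded here.
  cat > "$conf_dir/spark-defaults.conf" <<'EOF'
spark.rdd.compress true
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.localExecution.enabled true
spark.master yarn
spark.yarn.jar hdfs:///user/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar
EOF

  # $(hadoop classpath) stays unexpanded in the file, so it is evaluated
  # only when Spark sources spark-env.sh.
  cat > "$conf_dir/spark-env.sh" <<'EOF'
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
SPARK_DIST_CLASSPATH=$(hadoop classpath)
EOF
}

# Example (writes to a throwaway directory):
#   write_spark_conf "$(mktemp -d)"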