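# Start a screen session
screen -S spark
# Start Spark
# Spark properties are not standalone flags: pass them as --conf key=value.
# Executor and driver memory have dedicated flags (--executor-memory, --driver-memory).
# NOTE: shuffle.memoryFraction + storage.memoryFraction add up to 1.2 here; in Spark 1.x
# both are fractions of the same executor heap, so consider lowering one of them.
/opt/spark-1.3.1-bin-hadoop2.4/bin/pyspark --master yarn-client --num-executors 34 \
  --executor-memory 4g --driver-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=2000 \
  --conf spark.shuffle.spill=true \
  --conf spark.shuffle.memoryFraction=.6 \
  --conf spark.storage.memoryFraction=.6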
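# SKETCH (not part of the original notes): pyspark takes spark.* properties as
# --conf key=value pairs, and memory sizes via --executor-memory / --driver-memory.
# The helper below is one possible way to assemble that command from a dict so the
# settings are easier to tweak. build_pyspark_command and settings are hypothetical
# names made up for this illustration; the path and values are the ones used above.

```python
import shlex

# Path to the pyspark launcher used in these notes.
PYSPARK = "/opt/spark-1.3.1-bin-hadoop2.4/bin/pyspark"


def build_pyspark_command(master, num_executors, executor_memory, driver_memory, conf):
    """Return the argv list for launching pyspark (hypothetical helper)."""
    argv = [
        PYSPARK,
        "--master", master,
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,
        "--driver-memory", driver_memory,
    ]
    # Every remaining Spark property becomes one "--conf key=value" pair.
    for key in sorted(conf):
        argv += ["--conf", "{}={}".format(key, conf[key])]
    return argv


# Property values copied from the launch command in these notes.
settings = {
    "spark.yarn.executor.memoryOverhead": "2000",
    "spark.shuffle.spill": "true",
    "spark.shuffle.memoryFraction": ".6",
    "spark.storage.memoryFraction": ".6",
}

command = build_pyspark_command("yarn-client", 34, "4g", "4g", settings)

# Print a shell-safe command line that can be pasted into the terminal.
print(" ".join(shlex.quote(arg) for arg in command))
```

# Keeping the properties in a dict means a new setting is one extra entry rather
# than another edit to a long command line.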
###
# INITIALISE PYSPARK CONSOLE (copy and paste into console)
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
# GUIDE
# https://spark.apache.org/docs/latest/sql-programming-guide.html#overview
###
## HOW TO START PYSPARK CONSOLE (copy and paste into terminal)
###
/opt/spark-1.3.1-bin-hadoop2.4/bin/pyspark --master yarn-client --num-executors 34 \
  --executor-memory 4g --driver-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=2000 \
  --conf spark.shuffle.spill=true \
  --conf spark.shuffle.memoryFraction=.6 \
  --conf spark.storage.memoryFraction=.6
###
###