### Tested with:
- Spark 2.0.0 pre-built for Hadoop 2.7
- Mac OS X 10.11
- Python 3.5.2

Use S3 within PySpark with minimal hassle.
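As a rough illustration of what this setup involves, here is a minimal sketch that wires S3 credentials into a PySpark session through the `s3a` connector. The `hadoop-aws` version (chosen to line up with a Hadoop 2.7 build), the helper names `s3a_spark_conf`, `submit_args`, and `read_csv_from_s3`, and the use of `PYSPARK_SUBMIT_ARGS` are assumptions for illustration, not something this README prescribes.

```python
import os

# Assumed package coordinate; pick the hadoop-aws version that matches your Hadoop build.
HADOOP_AWS_PACKAGE = "org.apache.hadoop:hadoop-aws:2.7.3"


def s3a_spark_conf(access_key, secret_key):
    """Return Spark settings that hand the credentials to the s3a filesystem."""
    return {
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
    }


def submit_args(package=HADOOP_AWS_PACKAGE):
    """Build a PYSPARK_SUBMIT_ARGS value that pulls in the hadoop-aws package."""
    return "--packages {} pyspark-shell".format(package)


def read_csv_from_s3(path, access_key, secret_key):
    """Start a session and read a CSV from an s3a:// path (hypothetical helper)."""
    # Must be set before the JVM starts so the package is on the classpath.
    os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args()

    # Imported lazily so the helpers above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("s3-example")
    for key, value in s3a_spark_conf(access_key, secret_key).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    return spark.read.csv(path, header=True)
```

A call such as `read_csv_from_s3("s3a://my-bucket/data.csv", key, secret)` would then return a DataFrame; the bucket and file name here are placeholders.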
docker rmi $(docker images | grep "^<none>" | awk '{print $3}')
| name | gender | race |
|---|---|---|
| shivani | f | indian |
| isha | f | indian |
| smt shyani devi | f | indian |
| divya | f | indian |
| mansi | f | indian |
| mazida | f | indian |
| pooja | f | indian |
| kajal | f | indian |
| meena | f | indian |
# References:
# https://www.digitalocean.com/community/tutorials/how-to-install-apache-kafka-on-ubuntu-14-04
# https://chongyaorobin.wordpress.com/2015/07/08/step-by-step-of-install-apache-kafka-on-ubuntu-standalone-mode/
$ sudo useradd kafka -m
# Configuration
HOME_DIR=/home/[user]/
VERSION=3.2.0
# Installation
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install -y build-essential cmake pkg-config
sudo apt-get install -y libjpeg8-dev libtiff5-dev libjasper-dev libpng12-dev
sudo apt-get install -y libavcodec-dev libavformat-dev libswscale-dev libv4l-dev