Last active
December 2, 2021 03:54
Revisions
codspire revised this gist
Jul 7, 2017. 1 changed file with 2 additions and 0 deletions.
@@ -1,3 +1,5 @@

# Zeppelin, Spark, PySpark Setup on Windows (10)

I wish running Zeppelin on Windows wasn't as hard as it is. Things go haywire if you already have Spark installed on your computer. Zeppelin's embedded Spark interpreter does not work nicely with an existing Spark installation, and you may need to perform the steps below (hacks!) to make it work. I am hoping that these will be fixed in newer Zeppelin versions.

If you try to run Zeppelin after extracting the package, you might encounter **"The filename, directory name, or volume label syntax is incorrect."**
codspire revised this gist
Jun 20, 2017. 1 changed file with 1 addition and 0 deletions.
@@ -50,6 +50,7 @@ sc.version ``` It should print: value = 20 '2.1.0' ```bash %spark
codspire revised this gist
Jun 20, 2017. 2 changed files with 59 additions and 1 deletion.
@@ -1 +0,0 @@

@@ -0,0 +1,59 @@

I wish running Zeppelin on Windows wasn't as hard as it is. Things go haywire if you already have Spark installed on your computer. Zeppelin's embedded Spark interpreter does not work nicely with an existing Spark installation, and you may need to perform the steps below (hacks!) to make it work. I am hoping that these will be fixed in newer Zeppelin versions.

If you try to run Zeppelin after extracting the package, you might encounter **"The filename, directory name, or volume label syntax is incorrect."**

A Google search led me to https://issues.apache.org/jira/browse/ZEPPELIN-1584. That link was helpful but wasn't enough to get Zeppelin working. Below is what I had to do to make it work on my Windows 10 computer.

Existing software & configurations:

* Spark version 2.1.0
* Python version 3.6.1
* Zeppelin version zeppelin-0.7.2-bin-all
* My `SPARK_HOME` is "C:\Applications\spark-2.1.1-bin-hadoop2.7"

Steps

* Extract the Zeppelin package to a folder (mine was "C:\Applications\zeppelin-0.7.2-bin-all"); `%ZEPPELIN_HOME%` below refers to this folder
* Copy the jars from the existing Spark installation into Zeppelin, then delete the datanucleus jars from Zeppelin's Spark interpreter folder

```bat
rem Copy all Spark jars into Zeppelin's Spark interpreter folder
copy "%SPARK_HOME%\jars\*.jar" "%ZEPPELIN_HOME%\interpreter\spark"
rem Remove the datanucleus jars from that folder
del "%ZEPPELIN_HOME%\interpreter\spark\datanucleus*.jar"
```

* Copy `pyspark` from the existing Spark installation

```bat
copy "%SPARK_HOME%\python\lib\*.zip" "%ZEPPELIN_HOME%\interpreter\spark"
```

* Rename `%ZEPPELIN_HOME%\conf\zeppelin-env.cmd.template` to `%ZEPPELIN_HOME%\conf\zeppelin-env.cmd`
* Update `zeppelin-env.cmd` with the following line

```bat
set PYTHONPATH=%SPARK_HOME%\python;%SPARK_HOME%\python\lib\py4j-0.10.4-src.zip;%SPARK_HOME%\python\lib\pyspark.zip
```

* You have to suppress the existing Spark installation to make it work nicely with Zeppelin. Add the line below at the top of the `%ZEPPELIN_HOME%\bin\zeppelin.cmd` file

```bat
set SPARK_HOME=
```

* Start Zeppelin and validate `Spark` & `pyspark`

```bat
cd /d "%ZEPPELIN_HOME%"
bin\zeppelin.cmd
```

* Open http://localhost:8080, create a new notebook, and try the code below

Validate `pyspark`

```python
%pyspark
a=5*4
print("value = %i" % (a))
sc.version
```

It should print:

value = 20
'2.1.0'

```scala
%spark
sc.version
```

It should print:

'2.1.0'
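The two copy steps and the `del` step in the instructions above can also be scripted. The following is a minimal Python sketch, not part of the original setup: the function name `stage_spark_files` and its return value are hypothetical, and it assumes the Spark folder has the same `jars` and `python\lib` subfolders that the commands above rely on.

```python
import shutil
from pathlib import Path


def stage_spark_files(spark_home, zeppelin_home):
    """Copy Spark jars and PySpark zips into Zeppelin's Spark interpreter folder.

    Hypothetical helper: mirrors the manual copy/del commands, nothing more.
    Returns (copied, removed) as two lists of file names.
    """
    spark_home = Path(spark_home)
    target = Path(zeppelin_home) / "interpreter" / "spark"
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    # Step 1: copy every jar from the existing Spark installation.
    for jar in sorted((spark_home / "jars").glob("*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)

    # Step 2: delete datanucleus jars from the target folder,
    # whether they were just copied or were already there.
    removed = []
    for jar in sorted(target.glob("datanucleus*.jar")):
        jar.unlink()
        removed.append(jar.name)

    # Step 3: copy the PySpark archives (the *.zip files under python/lib).
    for archive in sorted((spark_home / "python" / "lib").glob("*.zip")):
        shutil.copy2(archive, target / archive.name)
        copied.append(archive.name)

    return copied, removed
```

It could be called as `stage_spark_files(os.environ["SPARK_HOME"], os.environ["ZEPPELIN_HOME"])` after `import os`, provided both variables are set in the shell that runs it. The sketch does not edit `zeppelin-env.cmd` or `zeppelin.cmd`; those steps remain manual.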
codspire revised this gist
Jun 20, 2017. 1 changed file with 1 addition and 59 deletions.
@@ -1,59 +1 @@

**test**
codspire created this gist
Jun 20, 2017