Last active
April 16, 2019 08:25
Revisions
ssimeonov revised this gist
Jul 22, 2015. 1 changed file with 3 additions and 0 deletions.
@@ -1,3 +1,6 @@
+// This code is designed to be pasted in spark-shell in a *nix environment
+// On Windows, replace sys.env("HOME") with a directory of your choice
+
 import java.io.File
 import java.io.PrintWriter
 import org.apache.spark.sql.hive.HiveContext
ssimeonov created this gist
Jul 22, 2015.
@@ -0,0 +1,29 @@
+import java.io.File
+import java.io.PrintWriter
+import org.apache.spark.sql.hive.HiveContext
+import org.apache.spark.sql.SaveMode
+
+val ctx = sqlContext.asInstanceOf[HiveContext]
+import ctx.implicits._
+
+val json = """{"category" : "A", "num" : 5}"""
+val path = sys.env("HOME") + "/spark_sql_first.jsonlines"
+new PrintWriter(path) { write(json); close }
+
+ctx.read.json("file://" + path).registerTempTable("test_first")
+
+// OK, proof that the data was loaded correctly
+ctx.sql("select * from test_first").show
+
+// org.apache.spark.sql.AnalysisException: expression 'num' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() if you don't care which value you get.
+ctx.sql("select num from test_first group by category").show
+
+// ERROR RetryingHMSHandler: MetaException(message:NoSuchObjectException(message:Function default.first does not exist))
+// INFO FunctionRegistry: Unable to lookup UDF in metastore: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:NoSuchObjectException(message:Function default.first does not exist))
+// java.lang.RuntimeException: Couldn't find function first
+ctx.sql("select first(num) from test_first group by category").show
+
+// OK
+ctx.sql("select first_value(num) from test_first group by category").show
+
+new File(path).delete()
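The script above shows `first()` failing when called from SQL text through a HiveContext, while `first_value()` works. A possible workaround, sketched below, is to skip SQL function lookup and use the DataFrame API's `first` from `org.apache.spark.sql.functions`. This is a sketch under assumptions, not something the gist itself tests: it assumes a Spark 1.x spark-shell where `sqlContext` is predefined, and the names `df`, `firstPerCategory`, and `first_num` are made up for illustration. It builds its own one-row DataFrame so it does not depend on the temporary file, which the script deletes at the end.

```scala
// Sketch only: assumes a Spark 1.x spark-shell with `sqlContext` predefined.
import org.apache.spark.sql.functions.first
import sqlContext.implicits._

// Same single row as the JSON file in the gist, built in memory
// (hypothetical name `df`).
val df = Seq(("A", 5)).toDF("category", "num")

// Aggregate through the DataFrame API instead of SQL text, so the
// function name is not resolved through the Hive function registry.
val firstPerCategory = df.groupBy("category").agg(first("num").as("first_num"))

firstPerCategory.show()
```

If the SQL form is required, `first_value(num)` remains the variant that the script reports as working.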