@Hugheym
Last active April 27, 2020 11:56
import org.apache.spark.sql.functions._

// Assumes geoToH3 is a UDF defined elsewhere that maps (latitude, longitude, resolution) to an H3 cell index.
val h3PickupStats = spark.table("ny_taxi_sample")
  .withColumn("h3_pickup", geoToH3(col("pickup_latitude"), col("pickup_longitude"), lit(11))) // H3 index at resolution 11
  .groupBy("h3_pickup") // one row per H3 cell
  .agg(
    count("*").alias("numPickups"),
    sum("passenger_count").alias("totalPassengerCount"),
    avg("tip_amount").alias("avg_tip")
  )