@SharathHebbar
Created October 16, 2024 13:58
Revisions

  1. SharathHebbar created this gist Oct 16, 2024.
    15 changes: 15 additions & 0 deletions New.py
    @@ -0,0 +1,15 @@

    # Read the CSV (with the first row as data)
    df = spark.read.format("csv").option("header", "false").load("/path/to/csvfile")

    # Extract the first row as the header
    new_header = df.first()

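    # Create a new DataFrame without the first row (dropped by position, so data rows whose _c0 matches the header value or is null are kept)
    df_without_first_row = df.rdd.zipWithIndex().filter(lambda r: r[1] > 0).map(lambda r: r[0]).toDF(df.schema)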

    # Rename columns to match the values from the first row (header)
    new_column_names = [new_header[col] for col in df.columns]
    df_with_new_header = df_without_first_row.toDF(*new_column_names)

    df_with_new_header.show()
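
The header-promotion step can also be sketched in plain Python, without Spark, to show what the result should look like on a small input. This is a minimal illustration, not part of the gist: the helper name `promote_first_row` and the sample data are hypothetical. It also shows how a filter that compares first-column values against the header value behaves differently from dropping the first row by position.

```python
import csv
import io


def promote_first_row(rows):
    """Split parsed rows into (header, records), dropping only the first row by position."""
    rows = list(rows)
    if not rows:
        return [], []
    header, data = rows[0], rows[1:]
    # Pair each data row with the header names to build one dict per record.
    return header, [dict(zip(header, row)) for row in data]


# Hypothetical sample: the second data row repeats the header's first value ("id").
sample = "id,name\n1,alice\nid,bob\n"
rows = list(csv.reader(io.StringIO(sample)))

header, records = promote_first_row(rows)

# Value-based filtering, by contrast, also drops the "id,bob" data row.
filtered_by_value = [row for row in rows if row[0] != rows[0][0]]

print(header)
print(records)
print(filtered_by_value)
```

Dropping by position keeps both data rows, while the value-based filter keeps only one. When the file is known to have a header row, reading it with the CSV reader's `header` option set to `"true"` avoids the manual step entirely.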