@arsalanyavari
Last active August 7, 2024 04:55
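import pandas as pd

# Maximum number of rows to keep for each distinct "major_item" value
numberOfDuplications = 20
fileName = "file.csv"
outputFileName = "cleaned_file.csv"

df = pd.read_csv(fileName)
# Keep only the first numberOfDuplications rows of each group, in original row order
df = df.groupby("major_item").head(numberOfDuplications)
# index=False leaves the DataFrame index out of the output file
df.to_csv(outputFileName, index=False)
print(f"Cleaned data saved to {outputFileName}")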
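
To make the effect of the `groupby("major_item").head(...)` step easier to see, here is a minimal plain-Python sketch of the same idea: walk the rows in order and keep a row only while its group has fewer than `n` kept rows. The helper name `keep_first_n_per_group` and the sample rows are hypothetical, introduced only for illustration. This is not how pandas implements it internally.

```python
from collections import defaultdict


def keep_first_n_per_group(rows, key, n):
    """Return rows in their original order, keeping at most n rows per value of row[key]."""
    kept_count = defaultdict(int)  # rows kept so far for each key value
    kept = []
    for row in rows:
        if kept_count[row[key]] < n:
            kept_count[row[key]] += 1
            kept.append(row)
    return kept


# Hypothetical sample data standing in for the contents of file.csv
rows = [
    {"major_item": "a", "value": 1},
    {"major_item": "b", "value": 2},
    {"major_item": "a", "value": 3},
    {"major_item": "a", "value": 4},
    {"major_item": "b", "value": 5},
]

result = keep_first_n_per_group(rows, "major_item", 2)
print([r["value"] for r in result])  # prints [1, 2, 3, 5]
```

With a limit of 2, the third "a" row (value 4) is dropped and every other row keeps its original position, which mirrors what the script does with a limit of 20.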