@gundamp
Created July 4, 2023 06:43
categorise bank transactions - load data from csv files
import pandas as pd

# The exported CSV files have no header row, so column names are supplied here
header_name = ["Date", "Amount", "Description", "Balance"]
# Read in raw data
## These are downloaded from CBA netbank, filtering for "Outgo" only
path_CA = "/content/drive/MyDrive/Expense Tracking/Outgoing_Complete_Access_CY2022.csv"
path_label = "/content/drive/MyDrive/Expense Tracking/manual_class_training.csv"
data_raw_CA = pd.read_csv(path_CA, encoding = 'ISO-8859-1', names = header_name)
data_raw_label = pd.read_csv(path_label, encoding = 'ISO-8859-1', names = ["Description", "Class", "Source"])
data_raw_label.info()
# Import new data
path_CA_new = "/content/drive/MyDrive/Expense Tracking/Outgoing_Complete_Access_CY2021.csv"
data_raw_CA_new = pd.read_csv(path_CA_new, encoding = 'ISO-8859-1', names = header_name)
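# Only the Complete Access (CA) file is loaded here, so there is nothing to merge yet;
# drop the 'Balance' column as it's not useful
df_transaction_raw = data_raw_CA_new
# Named CY21 to match the CY2021 source file (the original mixed CY22 and CY21, causing a NameError)
df_transaction_CY21 = df_transaction_raw.drop('Balance', axis = 1)
text_test_raw = df_transaction_CY21['Description']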