@Intelrunner
Last active August 6, 2024 20:36
Revisions

  1. Intelrunner revised this gist Aug 6, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion big-ole-file.py
    @@ -6,7 +6,7 @@
    import random

    # 1000000 and 62 == roughly 1.3GB (will take a bit of time, go get a coffee)
    -rows = 1000000
    +rows = 1200000
    columns = 62

    def generate_random_row(col):
  2. Intelrunner revised this gist Aug 6, 2024. 1 changed file with 1 addition and 1 deletion.
    2 changes: 1 addition & 1 deletion big-ole-file.py
    @@ -5,7 +5,7 @@
    import csv
    import random

    -# 1000000 and 102 == roughly 2GB (will take a bit of time, go get a coffee)
    +# 1000000 and 62 == roughly 1.3GB (will take a bit of time, go get a coffee)
    rows = 1000000
    columns = 62

  3. Intelrunner created this gist Aug 6, 2024.
    25 changes: 25 additions & 0 deletions big-ole-file.py
    @@ -0,0 +1,25 @@
    # This is not an original work, but crafted based on: https://gist.github.com/momota/ba302f0f0720ff5b2445fb81820c5b82
    # I updated it to make a file closer to the size I needed consistently. All praise goes to: @momota and @andrewFarley for
    # the original gist.

    import csv
    import random

    # 1000000 and 102 == roughly 2GB (will take a bit of time, go get a coffee)
    rows = 1000000
    columns = 62

    def generate_random_row(col):
        a = []
        l = [i]  # i is the module-level loop index set in the __main__ block below
        for j in range(col):
            l.append(random.random())
        a.append(l)
        return a

    if __name__ == '__main__':
        f = open('sample.csv', 'w')
        w = csv.writer(f, lineterminator='\n')
        for i in range(rows):
            w.writerows(generate_random_row(columns))
        f.close()
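
The size comments in the revisions above (rows and columns mapped to a rough file size) can be checked before committing to a long run. The following is a minimal sketch, not part of the original gist: the helper names `generate_row`, `estimate_csv_bytes`, and `write_csv`, the `seed` parameter, and the 200-row sample size are all assumptions made for illustration. It passes the row index explicitly instead of reading a global, and it extrapolates the final size from a small in-memory sample.

```python
import csv
import io
import random


def generate_row(index, columns, rng):
    """Return one row: the row index followed by `columns` random floats."""
    return [index] + [rng.random() for _ in range(columns)]


def estimate_csv_bytes(rows, columns, sample_rows=200, seed=0):
    """Extrapolate the output size from a small in-memory sample (hypothetical helper)."""
    rng = random.Random(seed)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for index in range(sample_rows):
        writer.writerow(generate_row(index, columns, rng))
    average_row_bytes = len(buffer.getvalue().encode("utf-8")) / sample_rows
    return int(average_row_bytes * rows)


def write_csv(path, rows, columns, seed=None):
    """Stream rows to `path` one at a time; the file is closed even on error."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for index in range(rows):
            writer.writerow(generate_row(index, columns, rng))


if __name__ == "__main__":
    estimate = estimate_csv_bytes(rows=1_200_000, columns=62)
    print(f"estimated size: {estimate / 1e9:.2f} GB")
    # write_csv("sample.csv", rows=1_200_000, columns=62)  # uncomment to generate
```

The estimate is approximate: the sampled rows carry short index values, so the real file comes out slightly larger. Adjust `rows` or `columns` until the estimate lands near the size you need, then uncomment the `write_csv` call.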