@timehaven
Created July 19, 2017 15:48

Revisions

  1. timehaven created this gist Jul 19, 2017.
    16 changes: 16 additions & 0 deletions mm3.py
    @@ -0,0 +1,16 @@
    while 1:
        ...
        df = df.sample(frac=1)  # shuffle all rows
        ...
        i, j = 0, batch_size
        for _ in range(nbatches):
            sub = df.iloc[i:j]
            idx = sub.index.values
            X2 = bcolz.open(bcolz_dir)[idx]
            ...
            # Calculate X and Y appropriately
            ...
            yield [X, X2], Y
            i = j
            j += batch_size
            ...
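
The fragment above shuffles the DataFrame once per pass and then uses each batch's original index labels to pull the matching rows from the on-disk array. A minimal runnable sketch of that pattern follows. It is not the author's full code, and it rests on several assumptions: an in-memory NumPy array (`features`) stands in for the object returned by `bcolz.open(bcolz_dir)`, the columns `x` and `y` are hypothetical placeholders for the elided "Calculate X and Y" step, `nbatches` is derived by rounding up, and the DataFrame keeps a default integer index so that index labels equal row positions in the array.

```python
import numpy as np
import pandas as pd


def batch_generator(df, features, batch_size, seed=0):
    """Yield ([X, X2], Y) batches forever, reshuffling the rows on every pass.

    `features` is a stand-in (assumption) for the array opened with
    bcolz.open(bcolz_dir); any object that accepts an integer array as an
    index works here.
    """
    # Assumption: the last batch may be short, so round up.
    nbatches = int(np.ceil(len(df) / batch_size))
    epoch = 0
    while True:
        # Shuffle all rows; the original index labels travel with them.
        shuffled = df.sample(frac=1, random_state=seed + epoch)
        i, j = 0, batch_size
        for _ in range(nbatches):
            sub = shuffled.iloc[i:j]
            idx = sub.index.values       # original row labels for this batch
            X2 = features[idx]           # array rows aligned with `sub`
            X = sub[["x"]].to_numpy()    # hypothetical input column
            Y = sub["y"].to_numpy()      # hypothetical target column
            yield [X, X2], Y
            i = j
            j += batch_size
        epoch += 1


# Toy data: row k of `features` is [k, k], which makes alignment easy to check.
df = pd.DataFrame({"x": np.arange(10.0), "y": np.arange(10) % 2})
features = np.repeat(np.arange(10)[:, None], 2, axis=1)

gen = batch_generator(df, features, batch_size=4)
(X, X2), Y = next(gen)
print(X.shape, X2.shape, Y.shape)
```

Because `idx` holds index labels rather than positions, the lookup only stays aligned while the DataFrame's index still matches the row order of the array on disk; resetting the index after filtering or shuffling would break that link.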