Skip to content

Instantly share code, notes, and snippets.

@burningsky250
Last active January 7, 2019 18:46
Show Gist options
  • Select an option

  • Save burningsky250/852bc59f71b270d54262c7d1e7c70ce6 to your computer and use it in GitHub Desktop.

Select an option

Save burningsky250/852bc59f71b270d54262c7d1e7c70ce6 to your computer and use it in GitHub Desktop.
comparison with many ways to construct a big list(dict):

comparison with many ways to construct a big list:

  1. list repeat
python -m timeit -n 10 -r 10 -v '[0]*10000000'
raw times: 0.561 0.528 0.519 0.516 0.516 0.557 0.516 0.513 0.516 0.509
10 loops, best of 10: 50.9 msec per loop

hint: create python object 'int' once, repeat it 10,000,000 times.

  1. list comprehension
python -m timeit -n 10 -r 10 -v '[0 for _ in xrange(10000000)]'

raw times: 3.57 3.89 3.73 3.68 3.67 3.5 3.59 3.43 3.46 3.7
10 loops, best of 10: 343 msec per loop

hint: create python object 'int' 10,000,000 times, append 10000000 times

  1. repeatedly append
python -m timeit -n 10 -r 10 -v -s 'a=[]' "for _ in xrange(10000000): a.append(0)"

raw times: 8.94 7.85 11.5 8.11 10.2 9.41 9.43 6.92 6.98 7.22
10 loops, best of 10: 692 msec per loop

hint: create python object 'int' 10000000 times, get 'append' attribute (from memory) 10000000 times, append 10000000 times

  1. list factory
python -m timeit -n 10 -r 10 -v "list(xrange(10000000))"

raw times: 2.13 1.72 1.72 1.72 1.71 1.72 1.79 1.78 1.69 1.71
10 loops, best of 10: 169 msec per loop

hint: pre-allocate memory block,

question: why is list factory faster than list comprehension?

  • answer: The list comprehension executes the loop in Python bytecode, just like a regular for loop. While the list factory call iterates entirely in C code, which is far faster, besides, it can use the length of the xrange() object to pre-allocate the list object rather than grow it dynamically and possibly do memory copy infrequently.
  • likewise, dict.fromkeys(someList, True) is faster than {d:True for d in someList}. dict.fromkeys(<dict>|<set>|<frozenset>, True) will pre-allocate memory area, so is very fast. dict.fromkeys(<list>|<iterable>, True) will grow the dict dynamically, thus decay the performance slightly.

example:

list:
python -m timeit -n 10 -r 10 -v -s "a=list(xrange(1000000))" 'dict.fromkeys(a, True)'
raw times: 0.743 0.711 0.701 0.695 0.69 0.697 0.693 0.698 0.692 0.687
10 loops, best of 10: 68.7 msec per loop

dict:
python -m timeit -n 10 -r 10 -v -s "a=dict((x, True) for x in xrange(1000000))" 'dict.fromkeys(a, True)'
raw times: 0.424 0.399 0.389 0.405 0.401 0.422 0.422 0.416 0.429 0.42
10 loops, best of 10: 38.9 msec per loop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment