12.1. The timeit Module

“Premature optimization is the root of all evil” is a common saying in software development. (It’s often attributed to computer scientist Donald Knuth, who attributes it to computer scientist Tony Hoare. Tony Hoare, in turn, attributes it to Donald Knuth.) Premature optimization, or optimizing before knowing what needs to be optimized, often manifests itself when programmers try to use clever tricks to save memory or write faster code. For example, one of these tricks is using the XOR algorithm to swap two integer values without using a third, temporary variable:

>>> a, b = 42, 101 # Set up the two variables.
>>> print(a, b)
42 101
>>> # A series of ^ XOR operations will end up swapping their values:
>>> a = a ^ b
>>> b = a ^ b
>>> a = a ^ b
>>> print(a, b) # The values are now swapped.
101 42

Unless you’re already familiar with the XOR algorithm (which uses the ^ bitwise operator), this code looks cryptic. The problem with using clever programming tricks is that they can produce complicated, unreadable code. Recall that one of the Zen of Python tenets is “Readability counts.”

Even worse, your clever trick might turn out not to be so clever. You can’t just assume a crafty trick is faster or that the old code it’s replacing was even all that slow to begin with. The only way to find out is by measuring and comparing the runtime: the amount of time it takes to run a program or piece of code. Keep in mind that increasing the runtime means the program is slowing down: the program is taking more time to do the same amount of work. (We also sometimes use the term runtime to refer to the period during which the program is running. Saying that an error happened at runtime means the error happened while the program was running, as opposed to when the program was being compiled into bytecode.)
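As a rough illustration of what measuring runtime involves, the following sketch times a piece of code by hand with time.perf_counter(). The measure() and xor_swap() functions are hypothetical helpers written for this example, not part of any library, and the loop itself adds some overhead to the result.

```python
import time

def xor_swap():
    """Swap two integers with the XOR trick and return them."""
    a, b = 42, 101
    a = a ^ b
    b = a ^ b
    a = a ^ b
    return a, b

def measure(func, trials=100_000):
    """Return the total seconds taken to call func() `trials` times."""
    start = time.perf_counter()
    for _ in range(trials):
        func()
    return time.perf_counter() - start

elapsed = measure(xor_swap)
print(f'{elapsed:.4f} seconds for 100,000 calls')
```

Repeating the call many times matters because a single run is too brief to time reliably.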

The Python standard library’s timeit module can measure the runtime speed of a small snippet of code by running it thousands or millions of times, letting you determine an average runtime. The timeit module also temporarily disables the automatic garbage collector to ensure more consistent runtimes. If you want to test multiple lines, you can pass a multiline code string or separate the code lines using semicolons:

>>> import timeit
>>> timeit.timeit('a, b = 42, 101; a = a ^ b; b = a ^ b; a = a ^ b')
0.12210549495648593

>>> timeit.timeit("""a, b = 42, 101
... a = a ^ b
... b = a ^ b
... a = a ^ b""")
0.13515726800000039

On my computer, the XOR algorithm takes roughly one-tenth of a second to run this code one million times (the default number of trials). Is this fast? Let’s compare it to some integer swapping code that uses a third temporary variable:

>>> import timeit
>>> timeit.timeit('a, b = 42, 101; temp = a; a = b; b = temp')
0.03074154700152576

That’s a surprise! Not only is using a third temporary variable more readable, but it’s also about four times as fast in this measurement! The clever XOR trick might save a few bytes of memory but at the expense of speed and code readability. Sacrificing code readability to reduce a few bytes of memory usage or nanoseconds of runtime isn’t worthwhile.

Better still, you can swap two variables using the multiple assignment trick, also called iterable unpacking, which also runs in a small amount of time:

>>> timeit.timeit('a, b = 42, 101; a, b = b, a')
0.03199427493382245

Not only is this the most readable code, it’s also among the quickest: its time is nearly identical to that of the temporary variable version. We know this not because we assumed it, but because we objectively measured it.
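Because individual timings vary from run to run, one way to compare the three swaps more fairly is to time each snippet several times with timeit.repeat() and keep the fastest result. The following is a minimal sketch; the SNIPPETS dictionary and the best_times() function are names invented for this example, and the numbers it prints will differ from machine to machine.

```python
import timeit

# The three swap snippets from this section, keyed by a descriptive label.
SNIPPETS = {
    'xor': 'a, b = 42, 101; a = a ^ b; b = a ^ b; a = a ^ b',
    'temp variable': 'a, b = 42, 101; temp = a; a = b; b = temp',
    'multiple assignment': 'a, b = 42, 101; a, b = b, a',
}

def best_times(snippets, repeat=5, number=100_000):
    """Return {label: fastest total time in seconds} across `repeat` runs."""
    results = {}
    for label, stmt in snippets.items():
        # timeit.repeat() returns a list with one total time per run.
        runs = timeit.repeat(stmt, repeat=repeat, number=number)
        results[label] = min(runs)
    return results

if __name__ == '__main__':
    ranked = sorted(best_times(SNIPPETS).items(), key=lambda item: item[1])
    for label, seconds in ranked:
        print(f'{label:20} {seconds:.4f} seconds')
```

Taking the minimum rather than the average is a common choice here, since slower runs usually reflect interference from other processes rather than the code itself.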

The timeit.timeit() function can also take a second string argument of setup code. The setup code runs only once before running the first string’s code. You can also change the default number of trials by passing an integer for the number keyword argument. For example, the following test measures how quickly Python’s random module can generate 10,000,000 random numbers from 1 to 100. (On my machine, it takes about six seconds.)

>>> timeit.timeit('random.randint(1, 100)', 'import random', number=10000000)
5.8313132780604064
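The value returned by timeit.timeit() is the total time for all trials, not the time for a single one. A short sketch of deriving a per-call average follows; the trial count of 100,000 is an arbitrary choice made so that the example finishes quickly.

```python
import timeit

number = 100_000  # Fewer trials than the example above, so this finishes quickly.

# The setup string runs once; only the statement string is timed.
total = timeit.timeit('random.randint(1, 100)', setup='import random', number=number)

# Divide the total by the number of trials to get an average per call.
per_call = total / number
print(f'total: {total:.3f} s, average per call: {per_call * 1e6:.3f} microseconds')
```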

By default, the code in the string you pass to timeit.timeit() won’t be able to access the variables and the functions in the rest of the program:

>>> import timeit
>>> spam = 'hello' # We define the spam variable.
>>> timeit.timeit('print(spam)', number=1) # We measure printing spam.
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "C:\Users\Al\AppData\Local\Programs\Python\Python37\lib\timeit.py", line 232, in timeit
        return Timer(stmt, setup, timer, globals).timeit(number)
    File "C:\Users\Al\AppData\Local\Programs\Python\Python37\lib\timeit.py", line 176, in timeit
        timing = self.inner(it, self.timer)
    File "<timeit-src>", line 6, in inner
NameError: name 'spam' is not defined

To fix this, pass the function the return value of globals() for the globals keyword argument:

>>> spam = 'hello'
>>> timeit.timeit('print(spam)', number=1, globals=globals())
hello
8.083891589194536e-05
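The globals argument accepts any dictionary, so you can expose only the names the snippet needs instead of the whole module namespace. Alternatively, timeit.timeit() accepts a callable in place of a code string. The sketch below shows both approaches; shout() is a hypothetical function invented for this illustration.

```python
import timeit

spam = 'hello'

def shout(text):
    """Hypothetical helper used only for this illustration."""
    return text.upper() + '!'

# Option 1: expose only the names the snippet needs.
namespace = {'shout': shout, 'spam': spam}
t1 = timeit.timeit('shout(spam)', globals=namespace, number=10_000)

# Option 2: pass a callable, which can already see shout and spam.
t2 = timeit.timeit(lambda: shout(spam), number=10_000)

print(f'string with globals: {t1:.4f} s')
print(f'callable:            {t2:.4f} s')
```

The callable form includes the overhead of an extra function call in each trial, so its timings tend to be slightly higher than those of the equivalent code string.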

A good rule for writing your code is to first make it work and then make it fast. Only once you have a working program should you focus on making it more efficient.