10.9. Consider Static Analysis via typing to Obviate Bugs

Providing documentation is a great way to help users of an API understand how to use it properly (see Item 84: “Write Docstrings for Every Function, Class, and Module”), but often it’s not enough, and incorrect usage still causes bugs. Ideally, there would be a programmatic mechanism to verify that callers are using your APIs the right way, and that you are using your downstream dependencies correctly. Many programming languages address part of this need with compile-time type checking, which can identify and eliminate some categories of bugs.

Historically Python has focused on dynamic features and has not provided compile-time type safety of any kind. However, more recently Python has introduced special syntax and the built-in typing module, which allow you to annotate variables, class fields, functions, and methods with type information. These type hints allow for gradual typing, where a codebase can be incrementally updated to specify types as desired.

The benefit of adding type information to a Python program is that you can run static analysis tools to ingest a program’s source code and identify where bugs are most likely to occur. The typing built-in module doesn’t actually implement any of the type checking functionality itself. It merely provides a common library for defining types, including generics, that can be applied to Python code and consumed by separate tools.
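To make that division of labor concrete, here's a minimal sketch. The function double is a hypothetical example of my own, not part of any library. It shows that the interpreter records type hints as metadata but does not enforce them when the program runs; only a separate static analysis tool would flag the second call:

```python
def double(value: int) -> int:
    # The annotations say "int", but nothing checks that at runtime.
    return value * 2


# The interpreter stores the hints as plain metadata on the function.
print(double.__annotations__)

# This call violates the annotation, yet it runs without an error
# because str also supports the * operator.
print(double('ab'))
```

The mismatch in the last call only becomes visible when a type checker reads the source code.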

Much as there are multiple distinct implementations of the Python interpreter (e.g., CPython, PyPy), there are multiple implementations of static analysis tools for Python that use typing. As of the time of this writing, the most popular tools are mypy (https://github.com/python/mypy), pytype (https://github.com/google/pytype), pyright (https://github.com/microsoft/pyright), and pyre (https://pyre-check.org). For the typing examples in this book, I’ve used mypy with the --strict flag, which enables all of the various warnings supported by the tool. Here’s what running it from the command line looks like:

$ python3 -m mypy --strict example.py

These tools can be used to detect a large number of common errors before a program is ever run, which can provide an added layer of safety in addition to having good unit tests (see Item 76: “Verify Related Behaviors in TestCase Subclasses”). For example, can you find the bug in this simple function that causes it to compile fine but raise an exception at runtime?

def subtract(a, b):
    return a - b

subtract(10, '5')

>>>
Traceback ...
TypeError: unsupported operand type(s) for -: 'int' and 'str'

Parameter and variable type annotations are delineated with a colon (such as name: type). Return value types are specified with -> type following the argument list. Using such type annotations and mypy, I can easily spot the bug:

def subtract(a: int, b: int) -> int:  # Function annotation
    return a - b

subtract(10, '5') # Oops: passed string value

$ python3 -m mypy --strict example.py
.../example.py:4: error: Argument 2 to "subtract" has incompatible type "str"; expected "int"

Another common mistake, especially for programmers who have recently moved from Python 2 to Python 3, is mixing bytes and str instances together (see Item 3: “Know the Differences Between bytes and str”). Do you see the problem in this example that causes a runtime error?

def concat(a, b):
    return a + b

concat('first', b'second')

>>>
Traceback ...
TypeError: can only concatenate str (not "bytes") to str

Using type hints and mypy, this issue can be detected statically before the program runs:

def concat(a: str, b: str) -> str:
    return a + b

concat('first', b'second') # Oops: passed bytes value

$ python3 -m mypy --strict example.py
.../example.py:4: error: Argument 2 to "concat" has incompatible type "bytes"; expected "str"
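Once the checker has pointed out the mismatch, one possible fix is to normalize values at the boundary before they reach concat. The helper to_str below is a hypothetical sketch, and the choice of UTF-8 is my assumption; the right encoding depends on where the bytes come from:

```python
from typing import Union


def to_str(value: Union[str, bytes]) -> str:
    # Decode bytes (assumed here to be UTF-8); pass str through unchanged.
    if isinstance(value, bytes):
        return value.decode('utf-8')
    return value


def concat(a: str, b: str) -> str:
    return a + b


result = concat('first', to_str(b'second'))
print(result)
```

Because to_str is annotated as returning str, the call to concat now matches its signature.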

Type annotations can also be applied to classes. For example, this class has two bugs in it that will raise exceptions when the program is run:

class Counter:
    def __init__(self):
        self.value = 0

    def add(self, offset):
        value += offset

    def get(self) -> int:
        self.value

The first one happens when I call the add method:

counter = Counter()
counter.add(5)

>>>
Traceback ...
UnboundLocalError: local variable 'value' referenced before assignment

The second bug happens when I call get:

counter = Counter()
found = counter.get()
assert found == 0, found

>>>
Traceback ...
AssertionError: None

Both of these problems are easily found by mypy:

class Counter:
    def __init__(self) -> None:
        self.value: int = 0  # Field / variable annotation

    def add(self, offset: int) -> None:
        value += offset      # Oops: forgot "self."

    def get(self) -> int:
        self.value           # Oops: forgot "return"

counter = Counter()
counter.add(5)
counter.add(3)
assert counter.get() == 8

$ python3 -m mypy --strict example.py
.../example.py:6: error: Name 'value' is not defined
.../example.py:8: error: Missing return statement
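For comparison, here's a sketch of the class with both reported problems addressed. The annotations are unchanged; only the two flagged statements differ:

```python
class Counter:
    def __init__(self) -> None:
        self.value: int = 0

    def add(self, offset: int) -> None:
        self.value += offset  # Fixed: refer to the attribute via "self."

    def get(self) -> int:
        return self.value     # Fixed: actually return the value


counter = Counter()
counter.add(5)
counter.add(3)
assert counter.get() == 8
```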

One of the strengths of Python’s dynamism is the ability to write generic functionality that operates on duck types (see Item 15: “Be Cautious When Relying on dict Insertion Ordering” and Item 43: “Inherit from collections.abc for Custom Container Types”). This allows one implementation to accept a wide range of types, saving a lot of duplicative effort and simplifying testing. Here, I’ve defined such a generic function for combining values from a list. Do you understand why the last assertion fails?

def combine(func, values):
    assert len(values) > 0

    result = values[0]
    for next_value in values[1:]:
        result = func(result, next_value)

    return result

def add(x, y):
    return x + y

inputs = [1, 2, 3, 4j]
result = combine(add, inputs)
assert result == 10, result  # Fails

>>>
Traceback ...
AssertionError: (6+4j)

I can use the typing module’s support for generics to annotate this function and detect the problem statically:

from typing import Callable, List, TypeVar

Value = TypeVar('Value')
Func = Callable[[Value, Value], Value]

def combine(func: Func[Value], values: List[Value]) -> Value:
    assert len(values) > 0

    result = values[0]
    for next_value in values[1:]:
        result = func(result, next_value)

    return result

Real = TypeVar('Real', int, float)

def add(x: Real, y: Real) -> Real:
    return x + y

inputs = [1, 2, 3, 4j]  # Oops: included a complex number
result = combine(add, inputs)
assert result == 10

$ python3 -m mypy --strict example.py
.../example.py:21: error: Argument 1 to "combine" has incompatible type "Callable[[Real, Real], Real]"; expected "Callable[[complex, complex], complex]"
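The error arises because the list contains a complex value, so the Value type variable has to be complex, which the int-or-float constraint on Real doesn't allow. As a sketch of the intended usage, restricting the inputs to int values lets both type variables be solved consistently, and I'd expect the checker to accept it:

```python
from typing import Callable, List, TypeVar

Value = TypeVar('Value')
Func = Callable[[Value, Value], Value]


def combine(func: Func[Value], values: List[Value]) -> Value:
    assert len(values) > 0

    result = values[0]
    for next_value in values[1:]:
        result = func(result, next_value)

    return result


# Constrained type variable: it may only be int or float.
Real = TypeVar('Real', int, float)


def add(x: Real, y: Real) -> Real:
    return x + y


inputs = [1, 2, 3, 4]  # Only int values this time
result = combine(add, inputs)
assert result == 10, result
```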

Another extremely common error is to encounter a None value when you thought you’d have a valid object (see Item 20: “Prefer Raising Exceptions to Returning None”). This problem can affect seemingly simple code. Do you see the issue here?

def get_or_default(value, default):
    if value is not None:
        return value
    return value

found = get_or_default(3, 5)
assert found == 3

found = get_or_default(None, 5)
assert found == 5, found  # Fails

>>>
Traceback ...
AssertionError: None

The typing module supports option types, which ensure that programs only interact with values after proper null checks have been performed. This allows mypy to infer that there’s a bug in this code: The type used in the return statement must be None, and that doesn’t match the int type required by the function signature:

from typing import Optional

def get_or_default(value: Optional[int],
                   default: int) -> int:
    if value is not None:
        return value
    return value  # Oops: should have returned "default"

$ python3 -m mypy --strict example.py
.../example.py:7: error: Incompatible return value type (got "None", expected "int")
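Here's a sketch of the corrected function. Inside the if block the checker narrows value from Optional[int] to int, and the fallback path now returns default, so every return statement matches the declared int return type:

```python
from typing import Optional


def get_or_default(value: Optional[int], default: int) -> int:
    if value is not None:
        return value  # Narrowed to int by the None check
    return default    # Fixed: fall back to the default


assert get_or_default(3, 5) == 3
assert get_or_default(None, 5) == 5
assert get_or_default(0, 5) == 0  # Zero is a valid value, not a missing one
```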

A wide variety of other options are available in the typing module. See https://docs.python.org/3.8/library/typing for all of the details. Notably, exceptions are not included. Unlike Java, which has checked exceptions that are enforced at the API boundary of every method, Python’s type annotations are more similar to C#’s: Exceptions are not considered part of an interface’s definition. Thus, if you want to verify that you’re raising and catching exceptions properly, you need to write tests.
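As a sketch of what such a test might look like, here's a hypothetical helper, parse_port, whose signature says nothing about the ValueError it can raise. The function, its range check, and the test names are all my own illustration:

```python
import unittest


def parse_port(text: str) -> int:
    # Hypothetical helper: the annotations don't mention ValueError.
    port = int(text)  # Raises ValueError for non-numeric text
    if not 0 < port < 65536:
        raise ValueError(f'Port out of range: {port}')
    return port


class ParsePortTest(unittest.TestCase):
    def test_valid(self) -> None:
        self.assertEqual(parse_port('8080'), 8080)

    def test_out_of_range(self) -> None:
        with self.assertRaises(ValueError):
            parse_port('70000')

    def test_not_a_number(self) -> None:
        with self.assertRaises(ValueError):
            parse_port('http')


# Run the test case explicitly so the outcome can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePortTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

A type checker would accept parse_port('http') because the argument is a str; only the tests exercise the exception behavior.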

One common gotcha in using the typing module occurs when you need to deal with forward references (see Item 88: “Know How to Break Circular Dependencies” for a similar problem). For example, imagine that I have two classes and one holds a reference to the other:

class FirstClass:
    def __init__(self, value):
        self.value = value

class SecondClass:
    def __init__(self, value):
        self.value = value

second = SecondClass(5)
first = FirstClass(second)

If I apply type hints to this program and run mypy, it will say that there are no issues:

class FirstClass:
    def __init__(self, value: SecondClass) -> None:
        self.value = value

class SecondClass:
    def __init__(self, value: int) -> None:
        self.value = value

second = SecondClass(5)
first = FirstClass(second)

$ python3 -m mypy --strict example.py

However, if you actually try to run this code, it will fail because SecondClass is referenced by the type annotation in the FirstClass.__init__ method’s parameters before it’s actually defined:

class FirstClass:
    def __init__(self, value: SecondClass) -> None:  # Breaks
        self.value = value

class SecondClass:
    def __init__(self, value: int) -> None:
        self.value = value

second = SecondClass(5)
first = FirstClass(second)

>>>
Traceback ...
NameError: name 'SecondClass' is not defined

One workaround supported by these static analysis tools is to use a string as the type annotation that contains the forward reference. The string value is later parsed and evaluated to extract the type information to check:

class FirstClass:
    def __init__(self, value: 'SecondClass') -> None:  # OK
        self.value = value

class SecondClass:
    def __init__(self, value: int) -> None:
        self.value = value

second = SecondClass(5)
first = FirstClass(second)
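The deferred evaluation of string annotations can also be observed at runtime. This sketch assumes the classes are defined at module level and uses typing.get_type_hints, which resolves the string against the function's global namespace once SecondClass exists:

```python
from typing import get_type_hints


class FirstClass:
    def __init__(self, value: 'SecondClass') -> None:
        self.value = value


class SecondClass:
    def __init__(self, value: int) -> None:
        self.value = value


# The annotation is stored as the literal string, not as a class.
raw = FirstClass.__init__.__annotations__['value']
print(repr(raw))

# get_type_hints evaluates the string now that SecondClass is defined.
hints = get_type_hints(FirstClass.__init__)
print(hints['value'] is SecondClass)
```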

A better approach is to use from __future__ import annotations, which is available in Python 3.7 and later (it was originally planned to become the default behavior in a future release). This instructs the Python interpreter not to evaluate the expressions supplied in type annotations when the program is being run; they are kept as strings instead. This resolves the forward reference problem and provides a performance improvement at program start time:

from __future__ import annotations

class FirstClass:
    def __init__(self, value: SecondClass) -> None:  # OK
        self.value = value

class SecondClass:
    def __init__(self, value: int) -> None:
        self.value = value

second = SecondClass(5)
first = FirstClass(second)

Now that you’ve seen how to use type hints and their potential benefits, it’s important to be thoughtful about when to use them. Here are some of the best practices to keep in mind:

It’s going to slow you down if you try to use type annotations from the start when writing a new piece of code. A general strategy is to write a first version without annotations, then write tests, and then add type information where it’s most valuable.

Type hints are most important at the boundaries of a codebase, such as an API you provide that many callers (and thus other people) depend on. Type hints complement integration tests (see Item 77: “Isolate Tests from Each Other with setUp, tearDown, setUpModule, and tearDownModule”) and warnings (see Item 89: “Consider warnings to Refactor and Migrate Usage”) to ensure that your API callers aren’t surprised or broken by your changes.

It can be useful to apply type hints to the most complex and error-prone parts of your codebase that aren’t part of an API. However, it may not be worth striving for 100% coverage in your type annotations because you’ll quickly encounter diminishing returns.

If possible, you should include static analysis as part of your automated build and test system to ensure that every commit to your codebase is vetted for errors. In addition, the configuration used for type checking should be maintained in the repository to ensure that all of the people you collaborate with are using the same rules.

As you add type information to your code, it’s important to run the type checker as you go. Otherwise, you may nearly finish sprinkling type hints everywhere and then be hit by a huge wall of errors from the type checking tool, which can be disheartening and make you want to abandon type hints altogether.

Finally, it’s important to acknowledge that in many situations, you may not need or want to use type annotations at all. For small programs, ad-hoc code, legacy codebases, and prototypes, type hints may require far more effort than they’re worth.
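As one way to fold the type checker into an automated build, here's a minimal sketch of a script step that wraps the same mypy invocation used throughout this item. The function names are hypothetical, and the sketch assumes mypy is installed for the interpreter running the script:

```python
import subprocess
import sys
from typing import List


def mypy_command(paths: List[str]) -> List[str]:
    # Same invocation as "python3 -m mypy --strict", applied to each path.
    return [sys.executable, '-m', 'mypy', '--strict', *paths]


def run_type_check(paths: List[str]) -> int:
    # Returns the checker's exit code; nonzero means the check didn't pass.
    completed = subprocess.run(mypy_command(paths))
    return completed.returncode
```

A build script could then call sys.exit(run_type_check(['example.py'])) so that a failed check fails the build.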

10.9.1. Things to Remember

✦ Python has special syntax and the typing built-in module for annotating variables, fields, functions, and methods with type information.

✦ Static type checkers can leverage type information to help you avoid many common bugs that would otherwise happen at runtime.

✦ There are a variety of best practices for adopting types in your programs, using them in APIs, and making sure they don’t get in the way of your productivity.
