10.3. Write Docstrings for Every Function, Class, and Module

Documentation in Python is extremely important because of the dynamic nature of the language. Python provides built-in support for attaching documentation to blocks of code. Unlike with many other languages, the documentation from a program’s source code is directly accessible as the program runs.

For example, you can add documentation by providing a docstring immediately after the def statement of a function:

>>> def palindrome(word):
>>>     """Return True if the given word is a palindrome."""
>>>     return word == word[::-1]
>>>
>>> assert palindrome('tacocat')
>>> assert not palindrome('banana')

You can retrieve the docstring from within the Python program by accessing the function’s __doc__ special attribute:

>>> print(repr(palindrome.__doc__))
'Return True if the given word is a palindrome.'

You can also use the built-in pydoc module from the command line to run a local web server that hosts all of the Python documentation that’s accessible to your interpreter, including modules that you’ve written:

$ python3 -m pydoc -p 1234
Server ready at http://localhost:1234/
Server commands: [b]rowser, [q]uit
server> b

Docstrings can be attached to functions, classes, and modules. This connection is part of the process of compiling and running a Python program. Support for docstrings and the __doc__ attribute has three consequences:

  • The accessibility of documentation makes interactive development easier. You can inspect functions, classes, and modules to see their documentation by using the help built-in function. This makes the Python interactive interpreter (the Python “shell”) and tools like IPython Notebook (https://ipython.org) a joy to use while you’re developing algorithms, testing APIs, and writing code snippets.

  • A standard way of defining documentation makes it easy to build tools that convert the text into more appealing formats (like HTML). This has led to excellent documentation-generation tools for the Python community, such as Sphinx (https://www.sphinx-doc.org). It has also enabled community-funded sites like Read the Docs (https://readthedocs.org) that provide free hosting of beautiful-looking documentation for open source Python projects.

  • Python’s first-class, accessible, and good-looking documentation encourages people to write more documentation. The members of the Python community have a strong belief in the importance of documentation. There’s an assumption that “good code” also means well-documented code. This means that you can expect most open source Python libraries to have decent documentation.
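The first consequence can also be tried outside an interactive session. The sketch below uses pydoc.render_doc from the standard library, which produces essentially the text that help displays, together with the pydoc.plaintext renderer so that the result is an ordinary string. The palindrome function is repeated from the earlier example; the variable name help_text is only illustrative.

```python
import pydoc


def palindrome(word):
    """Return True if the given word is a palindrome."""
    return word == word[::-1]


# render_doc builds the documentation text for an object; the
# plaintext renderer leaves out the terminal bold markup that the
# default renderer adds.
help_text = pydoc.render_doc(palindrome, renderer=pydoc.plaintext)
print(help_text)
```

Calling help(palindrome) in the interactive interpreter pages through similar text, which is why a one-line docstring already pays off during development.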

To participate in this excellent culture of documentation, follow a few guidelines when you write docstrings. PEP 257 (https://www.python.org/dev/peps/pep-0257/) discusses the full details; the best practices below are the ones you should be sure to follow.

10.3.1. Documenting Modules

Each module should have a top-level docstring: a string literal that is the first statement in a source file. It should use three double quotes ("""). The goal of this docstring is to introduce the module and its contents.

The first line of the docstring should be a single sentence describing the module’s purpose. The paragraphs that follow should contain the details that all users of the module should know about its operation. The module docstring is also a jumping-off point where you can highlight important classes and functions found in the module.

Here’s an example of a module docstring:

>>> #!/usr/bin/env python3
>>> # words.py
>>> """Library for finding linguistic patterns in words.
>>>
>>> Testing how words relate to each other can be tricky sometimes!
>>> This module provides easy ways to determine when words you've
>>> found have special properties.
>>>
>>> Available functions:
>>> - palindrome: Determine if a word is a palindrome.
>>> - check_anagram: Determine if two words are anagrams.
>>> ...
>>> """
>>> ...

If the module is a command-line utility, the module docstring is also a great place to put usage information for running the tool.
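The requirement that the docstring be the first statement in the file can be checked mechanically. The following is a minimal sketch that uses ast.get_docstring from the standard library; the helper module_docstring and the two source strings are hypothetical and exist only for this illustration. Comments, including a shebang line, are not statements, so they may precede the docstring, but an import may not.

```python
import ast

# Comments before the docstring are fine: they are not statements.
GOOD_SOURCE = '''#!/usr/bin/env python3
# words.py
"""Library for finding linguistic patterns in words."""

import re
'''

# Here the string literal comes after an import, so it is just an
# unused expression and not the module docstring.
BAD_SOURCE = '''import re

"""Library for finding linguistic patterns in words."""
'''


def module_docstring(source):
    """Return the module docstring found in source, or None."""
    return ast.get_docstring(ast.parse(source))


print(module_docstring(GOOD_SOURCE))
print(module_docstring(BAD_SOURCE))
```

The second call finds no docstring, which is the same reason help and pydoc would show nothing for a module laid out that way.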

Documenting Classes

Each class should have a class-level docstring. This largely follows the same pattern as the module-level docstring. The first line is the single-sentence purpose of the class. Paragraphs that follow discuss important details of the class’s operation.

Important public attributes and methods of the class should be highlighted in the class-level docstring. It should also provide guidance to subclasses on how to properly interact with protected attributes (see Item 42: “Prefer Public Attributes Over Private Ones”) and the superclass’s methods.

Here’s an example of a class docstring:

>>> class Player:
>>>     """Represents a player of the game.
>>>
>>>     Subclasses may override the 'tick' method to provide
>>>     custom animations for the player's movement depending
>>>     on their power level, etc.
>>>
>>>     Public attributes:
>>>     - power: Unused power-ups (float between 0 and 1).
>>>     - coins: Coins found during the level (integer).
>>>     """
>>>
>>>     ...
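
To see how the guidance to subclasses plays out, here is one way the elided class body might be filled in. Only the class docstring comes from the example above. The tick signature, its return values, and the TurboPlayer subclass are assumptions made for this sketch.

```python
import inspect


class Player:
    """Represents a player of the game.

    Subclasses may override the 'tick' method to provide
    custom animations for the player's movement depending
    on their power level, etc.

    Public attributes:
    - power: Unused power-ups (float between 0 and 1).
    - coins: Coins found during the level (integer).
    """

    def __init__(self):
        self.power = 0.0
        self.coins = 0

    def tick(self):
        """Return the name of the animation for the next frame."""
        return "walk"


class TurboPlayer(Player):
    """Player whose animation changes at high power levels."""

    def tick(self):
        """Return 'dash' above half power; otherwise defer to Player."""
        if self.power > 0.5:
            return "dash"
        return super().tick()


# inspect.getdoc returns the docstring with its indentation
# normalized, which is convenient for display.
print(inspect.getdoc(Player))
```

Because the class docstring names tick as the extension point, the author of TurboPlayer knows which method is safe to override without reading the rest of the implementation.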

10.3.2. Documenting Functions

Each public function and method should have a docstring. This follows the same pattern as the docstrings for modules and classes. The first line is a single-sentence description of what the function does. The paragraphs that follow should describe any specific behaviors and the arguments for the function. Any return values should be mentioned. Any exceptions that callers must handle as part of the function’s interface should be explained (see Item 20: “Prefer Raising Exceptions to Returning None” for how to document raised exceptions).

Here’s an example of a function docstring:

>>> def find_anagrams(word, dictionary):
>>>     """Find all anagrams for a word.
>>>
>>>     This function only runs as fast as the test for
>>>     membership in the 'dictionary' container.
>>>
>>>     Args:
>>>         word: String of the target word.
>>>         dictionary: collections.abc.Container with all
>>>             strings that are known to be actual words.
>>>
>>>     Returns:
>>>         List of anagrams that were found. Empty if
>>>         none were found.
>>>     """
>>>     ...
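
The performance note in this docstring hints at one possible implementation: generate every rearrangement of the word and test each for membership. The body is elided in the example, so the version below is only a sketch under that assumption. It also assumes that the word itself does not count as its own anagram and that sorted output is acceptable.

```python
from itertools import permutations


def find_anagrams(word, dictionary):
    """Find all anagrams for a word.

    This function only runs as fast as the test for
    membership in the 'dictionary' container.

    Args:
        word: String of the target word.
        dictionary: collections.abc.Container with all
            strings that are known to be actual words.

    Returns:
        List of anagrams that were found. Empty if
        none were found.
    """
    # A set removes duplicate rearrangements of repeated letters.
    candidates = {"".join(letters) for letters in permutations(word)}
    # Assumption: the word itself is not reported as an anagram.
    return sorted(c for c in candidates
                  if c != word and c in dictionary)


known_words = {"silent", "enlist", "tinsel", "banana"}
print(find_anagrams("listen", known_words))
```

The number of rearrangements grows factorially with the length of the word, so this approach is only reasonable for short words.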

There are also some special cases in writing docstrings for functions that are important to know:

  • If a function has no arguments and a simple return value, a single-sentence description is probably good enough.

  • If a function doesn’t return anything, it’s better to leave out any mention of the return value instead of saying “returns None.”

  • If a function’s interface includes raising exceptions (see Item 20: “Prefer Raising Exceptions to Returning None” for an example), its docstring should describe each exception that’s raised and when it’s raised.

  • If you don’t expect a function to raise an exception during normal operation, don’t mention that fact.

  • If a function accepts a variable number of arguments (see Item 22: “Reduce Visual Noise with Variable Positional Arguments”) or keyword arguments (see Item 23: “Provide Optional Behavior with Keyword Arguments”), use *args and **kwargs in the documented list of arguments to describe their purpose.

  • If a function has arguments with default values, those defaults should be mentioned (see Item 24: “Use None and Docstrings to Specify Dynamic Default Arguments”).

  • If a function is a generator (see Item 30: “Consider Generators Instead of Returning Lists”), its docstring should describe what the generator yields when it’s iterated.

  • If a function is an asynchronous coroutine (see Item 60: “Achieve Highly Concurrent I/O with Coroutines”), its docstring should explain when it will stop execution.
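
Several of these special cases can be seen together in a short sketch. The two functions below are hypothetical and written only for illustration. The first is a generator that accepts variable positional arguments and a default value. The second raises an exception as part of its interface and describes its return value in the summary line alone.

```python
def palindromes(*words, min_length=1):
    """Yield each given word that is a palindrome.

    Args:
        *words: Candidate words to test, in the order given.
        min_length: Shortest word length to consider.
            Defaults to 1.

    Yields:
        Each word of at least min_length characters that
        reads the same forward and backward.
    """
    for word in words:
        if len(word) >= min_length and word == word[::-1]:
            yield word


def first_palindrome(*words):
    """Return the first given word that is a palindrome.

    Args:
        *words: Candidate words to test, in the order given.

    Raises:
        ValueError: If none of the words is a palindrome.
    """
    for word in palindromes(*words):
        return word
    raise ValueError("no palindrome among the given words")


print(list(palindromes("tacocat", "banana", "aa", min_length=3)))
print(first_palindrome("banana", "noon"))
```

Neither docstring mentions exceptions that are not part of the interface, and the generator documents what it yields in place of a return value.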

10.3.3. Using Docstrings and Type Annotations

Python now supports type annotations for a variety of purposes (see Item 90: “Consider Static Analysis via typing to Obviate Bugs” for how to use them). The information they contain may be redundant with typical docstrings. For example, here is the function signature for find_anagrams with type annotations applied:

>>> from typing import Container, List
>>>
>>> def find_anagrams(word: str,
>>>                   dictionary: Container[str]) -> List[str]:
>>>     ...

There is no longer a need to specify in the docstring that the word argument is a string, since the type annotation has that information. The same goes for the dictionary argument being a collections.abc.Container. There’s no reason to mention that the return type will be a list, since this fact is clearly annotated. And when no anagrams are found, the return value still must be a list, so it’s implied that it will be empty; that doesn’t need to be noted in the docstring. Here, I write the same function signature from above along with the docstring that has been shortened accordingly:

>>> def find_anagrams(word: str,
>>>                   dictionary: Container[str]) -> List[str]:
>>>     """Find all anagrams for a word.
>>>
>>>     This function only runs as fast as the test for
>>>     membership in the 'dictionary' container.
>>>
>>>     Args:
>>>         word: Target word.
>>>         dictionary: All known actual words.
>>>
>>>     Returns:
>>>         Anagrams that were found.
>>>     """
>>>     ...

The redundancy between type annotations and docstrings should be similarly avoided for instance fields, class attributes, and methods. It’s best to have type information in only one place so there’s less risk that it will skew from the actual implementation.
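
As a sketch of what this might look like for a class, here is the earlier Player example with annotated attributes. The spend method and the constructor signature are assumptions added for illustration. The docstrings keep the meaning and valid range of each value but leave the types to the annotations, which remain available at runtime through typing.get_type_hints.

```python
from typing import get_type_hints


class Player:
    """Represents a player of the game.

    Public attributes:
    - power: Unused power-ups (between 0 and 1).
    - coins: Coins found during the level.
    """

    power: float
    coins: int

    def __init__(self, power: float = 0.0, coins: int = 0) -> None:
        self.power = power
        self.coins = coins

    def spend(self, amount: int) -> int:
        """Spend coins and return how many remain.

        Args:
            amount: Number of coins to spend.

        Raises:
            ValueError: If amount exceeds the coins available.
        """
        if amount > self.coins:
            raise ValueError("not enough coins")
        self.coins -= amount
        return self.coins


# The types live in the annotations, not in the docstrings.
print(get_type_hints(Player))
print(get_type_hints(Player.spend))
```

If the type of coins ever changes, only the annotation needs to be updated; the docstring stays accurate.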

10.3.4. Things to Remember

✦ Write documentation for every module, class, method, and function using docstrings. Keep them up-to-date as your code changes.

✦ For modules: Introduce the contents of a module and any important classes or functions that all users should know about.

✦ For classes: Document behavior, important attributes, and subclass behavior in the docstring following the class statement.

✦ For functions and methods: Document every argument, returned value, raised exception, and other behaviors in the docstring following the def statement.

✦ If you’re using type annotations, omit the information that’s already present in type annotations from docstrings since it would be redundant to have it in both places.