2.8. Know How to Construct Key-Dependent Default Values with __missing__

The built-in dict type’s setdefault method results in shorter code when handling missing keys in some specific circumstances (see Item 16: “Prefer get Over in and KeyError to Handle Missing Dictionary Keys” for examples). For many of those situations, the better tool for the job is the defaultdict type from the collections built-in module (see Item 17: “Prefer defaultdict Over setdefault to Handle Missing Items in Internal State” for why). However, there are times when neither setdefault nor defaultdict is the right fit.

For example, say that I’m writing a program to manage social network profile pictures on the filesystem. I need a dictionary to map profile picture pathnames to open file handles so I can read and write those images as needed. Here, I do this by using a normal dict instance and checking for the presence of keys using the get method and an assignment expression (introduced in Python 3.8; see Item 10: “Prevent Repetition with Assignment Expressions”):

>>> pictures = {}
>>> path = 'profile_1234.png'
>>>
>>> if (handle := pictures.get(path)) is None:
>>>     try:
>>>         handle = open(path, 'a+b')
>>>     except OSError:
>>>         print(f'Failed to open path {path}')
>>>         raise
>>>     else:
>>>         pictures[path] = handle
>>>
>>> handle.seek(0)
>>> image_data = handle.read()

When the file handle already exists in the dictionary, this code makes only a single dictionary access. In the case that the file handle doesn’t exist, the dictionary is accessed once by get, and then it is assigned in the else clause of the try/except block. (This approach also works with finally; see Item 65: “Take Advantage of Each Block in try/except/else/finally.”) The call to the read method stands clearly separate from the code that calls open and handles exceptions.

Although it’s possible to use the in expression or KeyError approaches to implement this same logic, those options require more dictionary accesses and levels of nesting. Given that these other options work, you might also assume that the setdefault method would work, too:

>>> try:
>>>     handle = pictures.setdefault(path, open(path, 'a+b'))
>>> except OSError:
>>>     print(f'Failed to open path {path}')
>>>     raise
>>> else:
>>>     handle.seek(0)
>>>     image_data = handle.read()

This code has many problems. The open built-in function to create the file handle is always called, even when the path is already present in the dictionary. This results in an additional file handle that may conflict with existing open handles in the same program. Exceptions may be raised by the open call and need to be handled, but it may not be possible to differentiate them from exceptions that may be raised by the setdefault call on the same line (which is possible for other dictionary-like implementations; see Item 43: “Inherit from collections.abc for Custom Container Types”).
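The eager evaluation behind the first problem is easy to see in isolation. Here is a minimal sketch in which a hypothetical make_default function stands in for open (the function name, the calls list, and the cache dictionary are all illustrative assumptions, not part of the example above). Python evaluates the argument before setdefault ever runs, so the function is called even when the key is already present:

```python
calls = []

def make_default(key):
    # Hypothetical stand-in for an expensive or failure-prone call like open
    calls.append(key)
    return f'value for {key}'

cache = {}
first = cache.setdefault('a', make_default('a'))   # Key missing: value is stored
second = cache.setdefault('a', make_default('a'))  # Key present: new value is discarded

print(len(calls))       # 2
print(first is second)  # True
```

The second call's return value is created and then thrown away. When the value is an open file handle, that discarded object is exactly the extra handle described above.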

If you’re trying to manage internal state, another assumption you might make is that a defaultdict could be used for keeping track of these profile pictures. Here, I attempt to implement the same logic as before but now using a helper function and the defaultdict class:

from collections import defaultdict

def open_picture(profile_path):
    try:
        return open(profile_path, 'a+b')
    except OSError:
        print(f'Failed to open path {profile_path}')
        raise

pictures = defaultdict(open_picture)
handle = pictures[path]
handle.seek(0)
image_data = handle.read()

>>>
Traceback ...
TypeError: open_picture() missing 1 required positional argument: 'profile_path'

The problem is that defaultdict expects that the function passed to its constructor doesn’t require any arguments. This means that the helper function that defaultdict calls doesn’t know which specific key is being accessed, which eliminates my ability to call open. In this situation, both setdefault and defaultdict fall short of what I need.
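To confirm what defaultdict actually passes to its factory, here is a small sketch using a hypothetical factory function that accepts any positional arguments and records them (the names factory and received_args are illustrative assumptions):

```python
from collections import defaultdict

received_args = []

def factory(*args):
    # Record whatever positional arguments defaultdict passes in
    received_args.append(args)
    return 'default'

d = defaultdict(factory)
value = d['some_key']

print(received_args)  # [()]
print(value)          # default
print(dict(d))        # {'some_key': 'default'}
```

The recorded argument tuple is empty: the key 'some_key' never reaches the factory, so the factory has no way to build a value from it.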

Fortunately, this situation is common enough that Python has another built-in solution. You can subclass the dict type and implement the __missing__ special method to add custom logic for handling missing keys. Here, I do this by defining a new class that takes advantage of the same open_picture helper function defined above:

class Pictures(dict):
    def __missing__(self, key):
        value = open_picture(key)
        self[key] = value
        return value

pictures = Pictures()
handle = pictures[path]
handle.seek(0)
image_data = handle.read()

When the pictures[path] dictionary access finds that the path key isn’t present in the dictionary, the __missing__ method is called. This method must create the new default value for the key, insert it into the dictionary, and return it to the caller. Subsequent accesses of the same path will not call __missing__ since the corresponding item is already present (similar to the behavior of __getattr__; see Item 47: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”).
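The call-once behavior can be checked without touching the filesystem. Here is a minimal sketch using a hypothetical Squares class (the class, its missing_calls counter, and the squaring logic are assumptions made purely for illustration) whose default value is computed from the key:

```python
class Squares(dict):
    """Hypothetical dict subclass whose default value depends on the key."""

    def __init__(self):
        super().__init__()
        self.missing_calls = 0

    def __missing__(self, key):
        self.missing_calls += 1
        value = key * key
        self[key] = value  # Store it so later lookups skip __missing__
        return value

squares = Squares()
print(squares[4])             # 16
print(squares[4])             # 16
print(squares.missing_calls)  # 1
print(squares.get(5))         # None
print(5 in squares)           # False
```

Note that only the square-bracket lookup triggers __missing__. The get method and the in operator behave as they do on a plain dict, so they neither call __missing__ nor insert the key.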

2.8.1. Things to Remember

✦ The setdefault method of dict is a bad fit when creating the default value has high computational cost or may raise exceptions.

✦ The function passed to defaultdict must not require any arguments, which makes it impossible to have the default value depend on the key being accessed.

✦ You can define your own dict subclass with a __missing__ method in order to construct default values that must know which key is being accessed.