2.6. Prefer get Over in and KeyError to Handle Missing Dictionary Keys
The three fundamental operations for interacting with dictionaries are accessing, assigning, and deleting keys and their associated values. The contents of dictionaries are dynamic, and thus it’s entirely possible—even likely—that when you try to access or delete a key, it won’t already be present.
For example, say that I’m trying to determine people’s favorite type of bread to devise the menu for a sandwich shop. Here, I define a dictionary of counters with the current votes for each style:
>>> counters = {
>>>     'pumpernickel': 2,
>>>     'sourdough': 1,
>>> }
To increment the counter for a new vote, I need to see if the key exists, insert the key with a default counter value of zero if it’s missing, and then increment the counter’s value. This requires accessing the key two times and assigning it once. Here, I accomplish this task using an if statement with an in expression that returns True when the key is present:
>>> key = 'wheat'
>>>
>>> if key in counters:
>>>     count = counters[key]
>>> else:
>>>     count = 0
>>>
>>> counters[key] = count + 1
Another way to accomplish the same behavior is by relying on how dictionaries raise a KeyError exception when you try to get the value for a key that doesn’t exist. This approach is more efficient because it requires only one access and one assignment:
>>> try:
>>>     count = counters[key]
>>> except KeyError:
>>>     count = 0
>>>
>>> counters[key] = count + 1
This flow of fetching a key that exists or returning a default value is so common that the dict built-in type provides the get method to accomplish this task. The second parameter to get is the default value to return when the key (the first parameter) isn’t present. This also requires only one access and one assignment, but it’s much shorter than the KeyError example:
>>> count = counters.get(key, 0)
>>> counters[key] = count + 1
It’s possible to shorten the in expression and KeyError approaches in various ways, but all of these alternatives suffer from requiring code duplication for the assignments, which makes them less readable and worth avoiding:
>>> if key not in counters:
>>>     counters[key] = 0
>>> counters[key] += 1
>>>
>>> if key in counters:
>>>     counters[key] += 1
>>> else:
>>>     counters[key] = 1
>>>
>>> try:
>>>     counters[key] += 1
>>> except KeyError:
>>>     counters[key] = 1
Thus, for a dictionary with simple types, using the get method is the shortest and clearest option.
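As a minimal sketch of how the get approach reads when applied repeatedly, the loop below tallies a batch of votes. The ballots list is hypothetical data added for illustration; the starting counters match the example above.

```python
# Hypothetical batch of new votes, for illustration only
ballots = ['wheat', 'sourdough', 'wheat', 'rye']
counters = {'pumpernickel': 2, 'sourdough': 1}

for key in ballots:
    # One access (get) and one assignment per vote
    counters[key] = counters.get(key, 0) + 1

print(counters)
# {'pumpernickel': 2, 'sourdough': 2, 'wheat': 2, 'rye': 1}
```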
Note
If you’re maintaining dictionaries of counters like this, it’s worth considering the Counter class from the collections built-in module, which provides most of the facilities you are likely to need.
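A short sketch of what the note suggests, assuming the same illustrative vote data as the counters example: a Counter treats a missing key as having a count of zero, so no explicit default is needed when incrementing.

```python
from collections import Counter

# Same illustrative starting votes as the counters example
counters = Counter({'pumpernickel': 2, 'sourdough': 1})

# Missing keys read as zero, so incrementing works without a default
counters['wheat'] += 1
counters['sourdough'] += 1

print(counters['wheat'])   # 1
print(counters['rye'])     # 0
print('rye' in counters)   # False: reading a missing key does not insert it
```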
What if the values of the dictionary are a more complex type, like a list? For example, say that instead of only counting votes, I also want to know who voted for each type of bread. Here, I do this by associating a list of names with each key:
>>> votes = {
>>>     'baguette': ['Bob', 'Alice'],
>>>     'ciabatta': ['Coco', 'Deb'],
>>> }
>>> key = 'brioche'
>>> who = 'Elmer'
>>>
>>> if key in votes:
>>>     names = votes[key]
>>> else:
>>>     votes[key] = names = []
>>>
>>> names.append(who)
>>> print(votes)
{'baguette': ['Bob', 'Alice'], 'ciabatta': ['Coco', 'Deb'], 'brioche': ['Elmer']}
Relying on the in expression requires two accesses if the key is present, or one access and one assignment if the key is missing. This example is different from the counters example above because the value for each key can be assigned blindly to the default value of an empty list if the key doesn’t already exist. The triple assignment statement (votes[key] = names = []) populates the key in one line instead of two. Once the default value has been inserted into the dictionary, I don’t need to assign it again because the list is modified by reference in the later call to append.
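The reference behavior described here can be checked directly. This is a small sketch that reuses the names from the example; it shows that the chained assignment binds one list object to both targets, so appending through names is visible in the dictionary.

```python
votes = {'baguette': ['Bob', 'Alice']}
key = 'brioche'

# Chained assignment binds the same new list to both targets
votes[key] = names = []
names.append('Elmer')

print(names is votes[key])  # True
print(votes[key])           # ['Elmer']
```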
It’s also possible to rely on the KeyError exception being raised when the dictionary value is a list. This approach requires one key access if the key is present, or one key access and one assignment if it’s missing, which makes it more efficient than the in condition:
>>> try:
>>>     names = votes[key]
>>> except KeyError:
>>>     votes[key] = names = []
>>>
>>> names.append(who)
Similarly, you can use the get method to fetch a list value when the key is present, or do one fetch and one assignment if the key isn’t present:
>>> names = votes.get(key)
>>> if names is None:
>>>     votes[key] = names = []
>>>
>>> names.append(who)
The approach that involves using get to fetch list values can be shortened by one more line if you use an assignment expression (introduced in Python 3.8; see Item 10: “Prevent Repetition with Assignment Expressions”) in the if statement, which improves readability:
>>> if (names := votes.get(key)) is None:
>>>     votes[key] = names = []
>>>
>>> names.append(who)
The dict type also provides the setdefault method to help shorten this pattern even further. setdefault tries to fetch the value of a key in the dictionary. If the key isn’t present, the method assigns that key to the default value provided. And then the method returns the value for that key: either the originally present value or the newly inserted default value. Here, I use setdefault to implement the same logic as in the get example above:
>>> names = votes.setdefault(key, [])
>>> names.append(who)
This works as expected, and it is shorter than using get with an assignment expression. However, the readability of this approach isn’t ideal. The method name setdefault doesn’t make its purpose immediately obvious. Why is it set when what it’s doing is getting a value? Why not call it get_or_set? I’m arguing about the color of the bike shed here, but the point is that if you were a new reader of the code and not completely familiar with Python, you might have trouble understanding what this code is trying to accomplish because setdefault isn’t self-explanatory.
There’s also one important gotcha: The default value passed to setdefault is assigned directly into the dictionary when the key is missing instead of being copied. Here, I demonstrate the effect of this when the value is a list:
>>> data = {}
>>> key = 'foo'
>>> value = []
>>> data.setdefault(key, value)
>>> print('Before:', data)
>>> value.append('hello')
>>> print('After: ', data)
Before: {'foo': []}
After:  {'foo': ['hello']}
This means that I need to make sure that I’m always constructing a new default value for each key I access with setdefault. This leads to a significant performance overhead in this example because I have to allocate a list instance for each call. If I reuse an object for the default value—which I might try to do to increase efficiency or readability—I might introduce strange behavior and bugs (see Item 24: “Use None and Docstrings to Specify Dynamic Default Arguments” for another example of this problem).
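To make the risk concrete, here is a sketch contrasting a reused default object with a fresh one per call. The names shared, buggy, and safe are hypothetical and exist only for this illustration.

```python
# Reusing one list as the default: every missing key ends up
# pointing at the same object
shared = []
buggy = {}
buggy.setdefault('rye', shared).append('Ann')
buggy.setdefault('wheat', shared).append('Ben')
print(buggy)  # {'rye': ['Ann', 'Ben'], 'wheat': ['Ann', 'Ben']}

# Constructing a new list for each call keeps the values independent
safe = {}
safe.setdefault('rye', []).append('Ann')
safe.setdefault('wheat', []).append('Ben')
print(safe)   # {'rye': ['Ann'], 'wheat': ['Ben']}
```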
Going back to the earlier example that used counters for dictionary values instead of lists of who voted: Why not also use the setdefault method in that case? Here, I reimplement the same example using this approach:
>>> count = counters.setdefault(key, 0)
>>> counters[key] = count + 1
The problem here is that the call to setdefault is superfluous. You always need to assign the key in the dictionary to a new value after you increment the counter, so the extra assignment done by setdefault is unnecessary. The earlier approach of using get for counter updates requires only one access and one assignment, whereas using setdefault requires one access and two assignments.
There are only a few circumstances in which using setdefault is the shortest way to handle missing dictionary keys, such as when the default values are cheap to construct, mutable, and there’s no potential for raising exceptions (e.g., list instances). In these very specific cases, it may seem worth accepting the confusing method name setdefault instead of having to write more characters and lines to use get. However, often what you really should do in these situations is to use defaultdict instead (see Item 17: “Prefer defaultdict Over setdefault to Handle Missing Items in Internal State”).
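For comparison, a minimal sketch of the defaultdict alternative mentioned above, using the same illustrative names as the votes example. The list passed to defaultdict is called to build a value whenever a missing key is accessed.

```python
from collections import defaultdict

# list() is called to create the value for each missing key
votes = defaultdict(list)
votes['baguette'].extend(['Bob', 'Alice'])
votes['brioche'].append('Elmer')

print(dict(votes))
# {'baguette': ['Bob', 'Alice'], 'brioche': ['Elmer']}
```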
2.6.1. Things to Remember
✦ There are four common ways to detect and handle missing keys in dictionaries: using in expressions, KeyError exceptions, the get method, and the setdefault method.
✦ The get method is best for dictionaries that contain basic types like counters, and it is preferable along with assignment expressions when creating dictionary values has a high cost or may raise exceptions.
✦ When the setdefault method of dict seems like the best fit for your problem, you should consider using defaultdict instead.
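The second point can be sketched as follows. The function make_expensive_default and the construction_log list are hypothetical stand-ins for a costly or failure-prone constructor. Python evaluates the argument to setdefault before the call, so the default is built even when the key already exists, whereas get with an assignment expression builds it only when needed.

```python
construction_log = []  # records each time a default value is built

def make_expensive_default():
    """Hypothetical stand-in for a costly or failure-prone constructor."""
    construction_log.append('built')
    return []

votes = {'baguette': ['Bob']}

# The argument is evaluated before setdefault runs, even though
# the key is already present and the new list is discarded
votes.setdefault('baguette', make_expensive_default())
print(len(construction_log))  # 1

# get with an assignment expression builds the default only if missing
if (names := votes.get('baguette')) is None:
    votes['baguette'] = names = make_expensive_default()
names.append('Alice')
print(len(construction_log))  # 1 (unchanged)
```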