6.3. Use Descriptors for Reusable @property Methods

The big problem with the @property built-in (see Item 44: “Use Plain Attributes Instead of Setter and Getter Methods” and Item 45: “Consider @property Instead of Refactoring Attributes”) is reuse. The methods it decorates can’t be reused for multiple attributes of the same class. They also can’t be reused by unrelated classes.

For example, say I want a class to validate that the grade received by a student on a homework assignment is a percentage:

>>> class Homework:
>>>     def __init__(self):
>>>         self._grade = 0
>>>
>>>     @property
>>>     def grade(self):
>>>         return self._grade
>>>
>>>     @grade.setter
>>>     def grade(self, value):
>>>         if not (0 <= value <= 100):
>>>             raise ValueError(
>>>                 'Grade must be between 0 and 100')
>>>         self._grade = value

Using @property makes this class easy to use:

>>> galileo = Homework()
>>> galileo.grade = 95

Say that I also want to give the student a grade for an exam, where the exam has multiple subjects, each with a separate grade:

>>> class Exam:
>>>     def __init__(self):
>>>         self._writing_grade = 0
>>>         self._math_grade = 0
>>>
>>>     @staticmethod
>>>     def _check_grade(value):
>>>         if not (0 <= value <= 100):
>>>             raise ValueError(
>>>                 'Grade must be between 0 and 100')

This quickly gets tedious. For each section of the exam I need to add a new @property and related validation:

>>>     # Exam class body, continued
>>>     @property
>>>     def writing_grade(self):
>>>         return self._writing_grade
>>>
>>>     @writing_grade.setter
>>>     def writing_grade(self, value):
>>>         self._check_grade(value)
>>>         self._writing_grade = value
>>>
>>>     @property
>>>     def math_grade(self):
>>>         return self._math_grade
>>>
>>>     @math_grade.setter
>>>     def math_grade(self, value):
>>>         self._check_grade(value)
>>>         self._math_grade = value

Also, this approach is not general. If I want to reuse this percentage validation in other classes beyond homework and exams, I’ll need to write the @property boilerplate and _check_grade method over and over again.

The better way to do this in Python is to use a descriptor. The descriptor protocol defines how attribute access is interpreted by the language. A descriptor class can provide __get__ and __set__ methods that let you reuse the grade validation behavior without boilerplate. For this purpose, descriptors are also better than mix-ins (see Item 41: “Consider Composing Functionality with Mix-in Classes”) because they let you reuse the same logic for many different attributes in a single class.

Here, I define a new class called Exam with class attributes that are Grade instances. The Grade class implements the descriptor protocol:

>>> class Grade:
>>>     def __get__(self, instance, instance_type):
>>>         ...
>>>
>>>     def __set__(self, instance, value):
>>>         ...
>>>
>>> class Exam:
>>>     # Class attributes
>>>     math_grade = Grade()
>>>     writing_grade = Grade()
>>>     science_grade = Grade()

Before I explain how the Grade class works, it’s important to understand what Python will do when such descriptor attributes are accessed on an Exam instance. When I assign a property:

>>> exam = Exam()
>>> exam.writing_grade = 40

it is interpreted as:

>>> Exam.__dict__['writing_grade'].__set__(exam, 40)

When I retrieve a property:

>>> exam.writing_grade

it is interpreted as:

>>> Exam.__dict__['writing_grade'].__get__(exam, Exam)

What drives this behavior is the __getattribute__ method of object (see Item 47: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”). In short, Python looks for writing_grade on the Exam class as well as on the Exam instance. If the class attribute is an object that has __get__ and __set__ methods, Python follows the descriptor protocol and calls those methods instead of reading or writing the instance dictionary.
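To make this dispatch visible, here is a minimal sketch that records each protocol call. The LoggedGrade and Quiz classes are hypothetical names introduced only for this illustration; they are not part of the Grade and Exam example:

```python
class LoggedGrade:
    """Hypothetical descriptor that records every protocol call."""

    def __init__(self):
        self.calls = []
        self._value = 0  # Shared state; acceptable for a one-instance demo

    def __get__(self, instance, instance_type):
        self.calls.append(('get', instance, instance_type))
        return self._value

    def __set__(self, instance, value):
        self.calls.append(('set', instance, value))
        self._value = value


class Quiz:
    score = LoggedGrade()


quiz = Quiz()
quiz.score = 40        # Routed to LoggedGrade.__set__(quiz, 40)
result = quiz.score    # Routed to LoggedGrade.__get__(quiz, Quiz)

# Reading through the class __dict__ returns the descriptor object
# itself without triggering __get__.
descriptor = Quiz.__dict__['score']
print(descriptor.calls[0] == ('set', quiz, 40))    # True
print(descriptor.calls[1] == ('get', quiz, Quiz))  # True
print('score' in quiz.__dict__)                    # False
```

The last line shows that the assignment never reached the instance dictionary: the descriptor's __set__ method handled it entirely.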

Knowing this behavior and how I used @property for grade validation in the Homework class, here’s a reasonable first attempt at implementing the Grade descriptor:

>>> class Grade:
>>>     def __init__(self):
>>>         self._value = 0
>>>
>>>     def __get__(self, instance, instance_type):
>>>         return self._value
>>>
>>>     def __set__(self, instance, value):
>>>         if not (0 <= value <= 100):
>>>             raise ValueError(
>>>                 'Grade must be between 0 and 100')
>>>         self._value = value

Unfortunately, this is wrong and results in broken behavior. Accessing multiple attributes on a single Exam instance works as expected:

>>> class Exam:
>>>     math_grade = Grade()
>>>     writing_grade = Grade()
>>>     science_grade = Grade()
>>>
>>> first_exam = Exam()
>>> first_exam.writing_grade = 82
>>> first_exam.science_grade = 99
>>> print('Writing', first_exam.writing_grade)
>>> print('Science', first_exam.science_grade)
Writing 82
Science 99

But accessing these attributes on multiple Exam instances causes unexpected behavior:

>>> second_exam = Exam()
>>> second_exam.writing_grade = 75
>>> print(f'Second {second_exam.writing_grade} is right')
>>> print(f'First  {first_exam.writing_grade} is wrong; '
>>>       f'should be 82')
Second 75 is right
First  75 is wrong; should be 82

The problem is that a single Grade instance is shared across all Exam instances for the class attribute writing_grade. The Grade instance for this attribute is constructed once in the program lifetime, when the Exam class is first defined, not each time an Exam instance is created.
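The sharing can be confirmed directly. The following sketch repeats the flawed first attempt and inspects the one Grade object stored in the class dictionary; the variable name shared is chosen only for this illustration:

```python
class Grade:
    # Same flawed first attempt: the value lives on the descriptor
    def __init__(self):
        self._value = 0

    def __get__(self, instance, instance_type):
        return self._value

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._value = value


class Exam:
    writing_grade = Grade()  # Constructed once, at class definition


first_exam = Exam()
second_exam = Exam()

# Neither instance stores anything itself; both go through the
# single Grade object held in the class dictionary.
shared = Exam.__dict__['writing_grade']
print(first_exam.__dict__, second_exam.__dict__)  # {} {}

second_exam.writing_grade = 75
print(shared._value)             # 75
print(first_exam.writing_grade)  # 75, although only second_exam was set
```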

To solve this, I need the Grade class to keep track of its value for each unique Exam instance. I can do this by saving the per-instance state in a dictionary:

>>> class Grade:
>>>     def __init__(self):
>>>         self._values = {}
>>>
>>>     def __get__(self, instance, instance_type):
>>>         if instance is None:
>>>             return self
>>>         return self._values.get(instance, 0)
>>>
>>>     def __set__(self, instance, value):
>>>         if not (0 <= value <= 100):
>>>             raise ValueError(
>>>                 'Grade must be between 0 and 100')
>>>         self._values[instance] = value

This implementation is simple and works well, but there’s still one gotcha: It leaks memory. The _values dictionary holds a reference to every instance of Exam ever passed to __set__ over the lifetime of the program. This causes instances to never have their reference count go to zero, preventing cleanup by the garbage collector (see Item 81: “Use tracemalloc to Understand Memory Usage and Leaks” for how to detect this type of problem).
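One way to observe the leak is sketched below. It assumes a weak reference (the tracker variable, a name introduced only for this demonstration) is an acceptable way to check whether the Exam instance is still alive after its last ordinary name is deleted:

```python
import gc
import weakref


class Grade:
    def __init__(self):
        self._values = {}  # Plain dict: holds strong references to keys

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    writing_grade = Grade()


exam = Exam()
exam.writing_grade = 82
tracker = weakref.ref(exam)  # Does not keep exam alive by itself

del exam      # Drop the only name that refers to the instance
gc.collect()  # Rule out anything the cycle collector could free

leaked = Exam.__dict__['writing_grade']._values
print(len(leaked))            # 1: the dict still holds the instance
print(tracker() is not None)  # True: the instance was never freed
```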

To fix this, I can use Python’s weakref built-in module. This module provides a special class called WeakKeyDictionary that can take the place of the simple dictionary used for _values. The unique behavior of WeakKeyDictionary is that it removes Exam instances from its set of items when the Python runtime knows it’s holding the instance’s last remaining reference in the program. Python does the bookkeeping for me and ensures that the _values dictionary will be empty when all Exam instances are no longer in use:

>>> from weakref import WeakKeyDictionary
>>>
>>> class Grade:
>>>     def __init__(self):
>>>         self._values = WeakKeyDictionary()
>>>
>>>     def __get__(self, instance, instance_type):
>>>         ...
>>>
>>>     def __set__(self, instance, value):
>>>         ...

Using this implementation of the Grade descriptor, everything works as expected:

>>> class Exam:
>>>     math_grade = Grade()
>>>     writing_grade = Grade()
>>>     science_grade = Grade()
>>>
>>> first_exam = Exam()
>>> first_exam.writing_grade = 82
>>> second_exam = Exam()
>>> second_exam.writing_grade = 75
>>> print(f'First  {first_exam.writing_grade} is right')
>>> print(f'Second {second_exam.writing_grade} is right')
First  82 is right
Second 75 is right
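For reference, here is a complete, runnable sketch of the WeakKeyDictionary version. It assumes the elided __get__ and __set__ bodies match the dictionary-based implementation shown earlier; the tracked variable and the explicit gc.collect() call are additions for the demonstration only:

```python
import gc
from weakref import WeakKeyDictionary


class Grade:
    def __init__(self):
        # Keys are held weakly, so entries vanish with their instances
        self._values = WeakKeyDictionary()

    def __get__(self, instance, instance_type):
        if instance is None:
            return self  # Accessed on the class, not an instance
        return self._values.get(instance, 0)

    def __set__(self, instance, value):
        if not (0 <= value <= 100):
            raise ValueError('Grade must be between 0 and 100')
        self._values[instance] = value


class Exam:
    writing_grade = Grade()


first_exam = Exam()
first_exam.writing_grade = 82
second_exam = Exam()
second_exam.writing_grade = 75
print(first_exam.writing_grade, second_exam.writing_grade)  # 82 75

tracked = Exam.writing_grade._values
print(len(tracked))  # 2

del first_exam
gc.collect()
print(len(tracked))  # 1: the entry for first_exam is gone
```

Note that WeakKeyDictionary requires its keys to be hashable and weak-referenceable. Instances of ordinary classes such as Exam qualify, but a class that defines __slots__ without including '__weakref__' would not.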

6.3.1. Things to Remember

✦ Reuse the behavior and validation of @property methods by defining your own descriptor classes.

✦ Use WeakKeyDictionary to ensure that your descriptor classes don’t cause memory leaks.

✦ Don’t get bogged down trying to understand exactly how __getattribute__ uses the descriptor protocol for getting and setting attributes.