6.7. Annotate Class Attributes with __set_name__
One more useful feature enabled by metaclasses is the ability to modify or annotate properties after a class is defined but before the class is actually used. This approach is commonly used with descriptors (see Item 46: “Use Descriptors for Reusable @property Methods”) to give them more introspection into how they’re being used within their containing class.
For example, say that I want to define a new class that represents a row in a customer database. I’d like to have a corresponding property on the class for each column in the database table. Here, I define a descriptor class to connect attributes to column names:
>>> class Field:
>>>     def __init__(self, name):
>>>         self.name = name
>>>         self.internal_name = '_' + self.name
>>>
>>>     def __get__(self, instance, instance_type):
>>>         if instance is None:
>>>             return self
>>>         return getattr(instance, self.internal_name, '')
>>>
>>>     def __set__(self, instance, value):
>>>         setattr(instance, self.internal_name, value)
With the column name stored in the Field descriptor, I can save all of the per-instance state directly in the instance dictionary as protected fields by using the setattr built-in function, and later I can load state with getattr. At first, this seems to be much more convenient than building descriptors with the weakref built-in module to avoid memory leaks.
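As a minimal sketch of the point above, the following repeats the Field descriptor and checks two things: each instance keeps its own value in its own `__dict__`, and the descriptor holds no reference that would keep an instance alive. The `Row` class, the `city` column, and the `probe` variable are hypothetical names introduced only for this illustration, and the final check assumes CPython's reference-counting behavior.

```python
import gc
import weakref


class Field:
    def __init__(self, name):
        self.name = name
        self.internal_name = '_' + self.name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')

    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)


class Row:  # hypothetical example class
    city = Field('city')


a = Row()
b = Row()
a.city = 'Alexandria'
b.city = 'Basel'

# Each instance stores its value in its own instance dictionary.
print(a.__dict__)  # {'_city': 'Alexandria'}
print(b.__dict__)  # {'_city': 'Basel'}

# The descriptor keeps no reference to the instances, so dropping the
# last reference lets the instance be reclaimed (assumes CPython).
probe = weakref.ref(a)
del a
gc.collect()
print(probe() is None)
```

Here weakref is used only to observe the instance's lifetime; the descriptor itself needs no weak references because it stores nothing per instance.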
Defining the class representing a row requires supplying the database table’s column name for each class attribute:
>>> class Customer:
>>>     # Class attributes
>>>     first_name = Field('first_name')
>>>     last_name = Field('last_name')
>>>     prefix = Field('prefix')
>>>     suffix = Field('suffix')
Using the class is simple. Here, you can see how the Field descriptors modify the instance dictionary __dict__ as expected:
>>> cust = Customer()
>>> print(f'Before: {cust.first_name!r} {cust.__dict__}')
>>>
>>> cust.first_name = 'Euclid'
>>> print(f'After: {cust.first_name!r} {cust.__dict__}')
Before: '' {}
After: 'Euclid' {'_first_name': 'Euclid'}
But the class definition seems redundant. I already declared the name of the field for the class on the left (first_name =). Why do I also have to pass a string containing the same information to the Field constructor (Field('first_name')) on the right?
>>> class Customer:
>>>     # Left side is redundant with right side
>>>     first_name = Field('first_name')
>>>     ...
The problem is that the order of operations in the Customer class definition is the opposite of how it reads from left to right. First, the Field constructor is called as Field('first_name'). Then, the return value of that call is assigned to Customer.first_name. There's no way for a Field instance to know upfront which class attribute it will be assigned to.
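The ordering described above can be observed directly. This is only an illustrative sketch: `Probe`, `Holder`, and the `events` list are hypothetical names, with `Probe` standing in for a descriptor like Field.

```python
events = []


class Probe:  # hypothetical stand-in for Field
    def __init__(self, label):
        # At this point the object cannot know which attribute name
        # it is about to be bound to.
        events.append(f'constructed {label}')


class Holder:
    events.append('class body starts')
    attr = Probe('first')
    events.append('attr bound in class namespace')


print(events)
# ['class body starts', 'constructed first', 'attr bound in class namespace']
```

The constructor finishes before the name `attr` exists in the class namespace, which is why the name has to be supplied some other way.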
To eliminate this redundancy, I can use a metaclass. Metaclasses let you hook the class statement directly and take action as soon as a class body is finished (see Item 48: “Validate Subclasses with __init_subclass__” for details on how they work). In this case, I can use the metaclass to assign Field.name and Field.internal_name on the descriptor automatically instead of manually specifying the field name multiple times:
>>> class Meta(type):
>>>     def __new__(meta, name, bases, class_dict):
>>>         for key, value in class_dict.items():
>>>             if isinstance(value, Field):
>>>                 value.name = key
>>>                 value.internal_name = '_' + key
>>>         cls = type.__new__(meta, name, bases, class_dict)
>>>         return cls
Here, I define a base class that uses the metaclass. All classes representing database rows should inherit from this class to ensure that they use the metaclass:
>>> class DatabaseRow(metaclass=Meta):
>>>     pass
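To make the inheritance requirement concrete, here is a small sketch showing that a metaclass's `__new__` runs for the base class and again for every subclass, each time receiving that class body's namespace. `RecordingMeta`, `Base`, `Child`, and `seen` are hypothetical names used only for this illustration.

```python
seen = {}


class RecordingMeta(type):  # hypothetical, for illustration only
    def __new__(meta, name, bases, class_dict):
        # Record the user-defined (non-dunder) names from the class body.
        seen[name] = [k for k in class_dict if not k.startswith('__')]
        return super().__new__(meta, name, bases, class_dict)


class Base(metaclass=RecordingMeta):
    pass


class Child(Base):  # inherits the metaclass from Base
    x = 1
    y = 2


print(seen)  # {'Base': [], 'Child': ['x', 'y']}
print(type(Child) is RecordingMeta)  # True
```

A class that does not inherit from `Base` never passes through `RecordingMeta.__new__`, so nothing would be recorded for it.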
To work with the metaclass, the Field descriptor is largely unchanged. The only difference is that it no longer requires arguments to be passed to its constructor. Instead, its attributes are set by the Meta.__new__ method above:
>>> class Field:
>>>     def __init__(self):
>>>         # These will be assigned by the metaclass.
>>>         self.name = None
>>>         self.internal_name = None
>>>
>>>     def __get__(self, instance, instance_type):
>>>         if instance is None:
>>>             return self
>>>         return getattr(instance, self.internal_name, '')
>>>
>>>     def __set__(self, instance, value):
>>>         setattr(instance, self.internal_name, value)
By using the metaclass, the new DatabaseRow base class, and the new Field descriptor, the class definition for a database row no longer has the redundancy from before:
>>> class BetterCustomer(DatabaseRow):
>>>     first_name = Field()
>>>     last_name = Field()
>>>     prefix = Field()
>>>     suffix = Field()
The behavior of the new class is identical to the behavior of the old one:
>>> cust = BetterCustomer()
>>> print(f'Before: {cust.first_name!r} {cust.__dict__}')
>>>
>>> cust.first_name = 'Euler'
>>> print(f'After: {cust.first_name!r} {cust.__dict__}')
Before: '' {}
After: 'Euler' {'_first_name': 'Euler'}
The trouble with this approach is that you can’t use the Field class for properties unless you also inherit from DatabaseRow. If you somehow forget to subclass DatabaseRow, or if you don’t want to due to other structural requirements of the class hierarchy, the code will break:
>>> class BrokenCustomer:
>>>     first_name = Field()
>>>     last_name = Field()
>>>     prefix = Field()
>>>     suffix = Field()
>>>
>>> cust = BrokenCustomer()
>>> cust.first_name = 'Mersenne'
Traceback ...
TypeError: attribute name must be string, not 'NoneType'
The solution to this problem is to use the __set_name__ special method for descriptors. This method, introduced in Python 3.6, is called on every descriptor instance when its containing class is defined. It receives as parameters the owning class that contains the descriptor instance and the attribute name to which the descriptor instance was assigned. Here, I avoid defining a metaclass entirely and move what the Meta.__new__ method from above was doing into __set_name__:
>>> class Field:
>>>     def __init__(self):
>>>         self.name = None
>>>         self.internal_name = None
>>>
>>>     def __set_name__(self, owner, name):
>>>         # Called on class creation for each descriptor
>>>         self.name = name
>>>         self.internal_name = '_' + name
>>>
>>>     def __get__(self, instance, instance_type):
>>>         if instance is None:
>>>             return self
>>>         return getattr(instance, self.internal_name, '')
>>>
>>>     def __set__(self, instance, value):
>>>         setattr(instance, self.internal_name, value)
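To see what the `__set_name__` hook actually receives, a small probe can record its arguments. This is a sketch only: `Recorder`, `Invoice`, and `calls` are hypothetical names that are not part of the example above.

```python
calls = []


class Recorder:  # hypothetical class used only to observe the hook
    def __set_name__(self, owner, name):
        # owner is the class being created; name is the attribute name.
        calls.append((owner.__name__, name))


class Invoice:  # hypothetical example class
    number = Recorder()
    total = Recorder()


print(calls)  # [('Invoice', 'number'), ('Invoice', 'total')]
```

The hook fires once per attribute, in definition order, as soon as the class statement finishes and without any metaclass being declared.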
Now, I can get the benefits of the Field descriptor without having to inherit from a specific parent class or having to use a metaclass:
>>> class FixedCustomer:
>>>     first_name = Field()
>>>     last_name = Field()
>>>     prefix = Field()
>>>     suffix = Field()
>>>
>>> cust = FixedCustomer()
>>> print(f'Before: {cust.first_name!r} {cust.__dict__}')
>>> cust.first_name = 'Mersenne'
>>> print(f'After: {cust.first_name!r} {cust.__dict__}')
Before: '' {}
After: 'Mersenne' {'_first_name': 'Mersenne'}
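One caveat worth keeping in mind: `__set_name__` is only called implicitly while the class is being created. A descriptor attached to a class afterward is not notified, so the hook has to be called by hand. The sketch below illustrates this; `LateCustomer`, `nickname`, and `late` are hypothetical names chosen for the example.

```python
class Field:
    def __init__(self):
        self.name = None
        self.internal_name = None

    def __set_name__(self, owner, name):
        self.name = name
        self.internal_name = '_' + name

    def __get__(self, instance, instance_type):
        if instance is None:
            return self
        return getattr(instance, self.internal_name, '')

    def __set__(self, instance, value):
        setattr(instance, self.internal_name, value)


class LateCustomer:  # hypothetical example class
    pass


# Attached after the class statement has finished, so the hook is skipped.
late = Field()
LateCustomer.nickname = late
print(late.name)  # None

# Calling the hook manually restores the expected behavior.
late.__set_name__(LateCustomer, 'nickname')
cust = LateCustomer()
cust.nickname = 'Gauss'
print(cust.__dict__)  # {'_nickname': 'Gauss'}
```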
6.7.1. Things to Remember
✦ Metaclasses enable you to modify a class’s attributes before the class is fully defined.
✦ Descriptors and metaclasses make a powerful combination for declarative behavior and runtime introspection.
✦ Define __set_name__ on your descriptor classes to allow them to take into account their surrounding class and its property names.
✦ Avoid memory leaks and the weakref built-in module by having descriptors store data they manipulate directly within a class’s instance dictionary.