- Item 37: Compose Classes Instead of Nesting Many Levels of Built-in Types
- Item 38: Accept Functions Instead of Classes for Simple Interfaces
- Item 39: Use @classmethod Polymorphism to Construct Objects Generically
- Item 40: Initialize Parent Classes with super
- Item 41: Consider Composing Functionality with Mix-in Classes
- Item 42: Prefer Public Attributes Over Private Ones
- Item 43: Inherit from collections.abc for Custom Container Types
Item 42: Prefer Public Attributes Over Private Ones
In Python, there are only two types of visibility for a class’s attributes, public and private:
class MyObject:
    def __init__(self):
        self.public_field = 5
        self.__private_field = 10

    def get_private_field(self):
        return self.__private_field
Public attributes can be accessed by anyone using the dot operator on the object:
foo = MyObject()
assert foo.public_field == 5
Private fields are specified by prefixing an attribute’s name with a double underscore. They can be accessed directly by methods of the containing class:
assert foo.get_private_field() == 10
However, directly accessing private fields from outside the class raises an exception:
foo.__private_field

>>>
Traceback ...
AttributeError: 'MyObject' object has no attribute '__private_field'
Class methods also have access to private attributes because they are declared within the surrounding class block:
class MyOtherObject:
    def __init__(self):
        self.__private_field = 71

    @classmethod
    def get_private_field_of_instance(cls, instance):
        return instance.__private_field

bar = MyOtherObject()
assert MyOtherObject.get_private_field_of_instance(bar) == 71
As you’d expect with private fields, a subclass can’t access its parent class’s private fields:
class MyParentObject:
    def __init__(self):
        self.__private_field = 71

class MyChildObject(MyParentObject):
    def get_private_field(self):
        return self.__private_field

baz = MyChildObject()
baz.get_private_field()

>>>
Traceback ...
AttributeError: 'MyChildObject' object has no attribute '_MyChildObject__private_field'
The private attribute behavior is implemented with a simple transformation of the attribute name. When the Python compiler sees private attribute access in methods like MyChildObject.get_private_field, it translates the __private_field attribute access to use the name _MyChildObject__private_field instead. In the example above, __private_field is only defined in MyParentObject.__init__, which means the private attribute’s real name is _MyParentObject__private_field. Accessing the parent’s private attribute from the child class fails simply because the transformed attribute name doesn’t exist (_MyChildObject__private_field instead of _MyParentObject__private_field).
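The compile-time nature of this transformation can be observed directly. The following is a minimal sketch, assuming CPython, where the names a function refers to are recorded in its code object's co_names attribute. The classes repeat the ones defined above, and child_names and parent_names are illustrative variable names:

```python
class MyParentObject:
    def __init__(self):
        self.__private_field = 71

class MyChildObject(MyParentObject):
    def get_private_field(self):
        return self.__private_field

# The compiler already rewrote each attribute name using the name of
# the class whose body contains the method.
child_names = MyChildObject.get_private_field.__code__.co_names
parent_names = MyParentObject.__init__.__code__.co_names

assert '_MyChildObject__private_field' in child_names
assert '_MyParentObject__private_field' in parent_names
```

Neither code object contains the bare name __private_field, which is why the lookup in the child class can never match what the parent stored.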
Knowing this scheme, you can easily access the private attributes of any class—from a subclass or externally—without asking for permission:
assert baz._MyParentObject__private_field == 71
If you look in the object’s attribute dictionary, you can see that private attributes are actually stored with the names as they appear after the transformation:
print(baz.__dict__)

>>>
{'_MyParentObject__private_field': 71}
Why doesn’t the syntax for private attributes actually enforce strict visibility? The simplest answer is one often-quoted motto of Python: “We are all consenting adults here.” What this means is that we don’t need the language to prevent us from doing what we want to do. It’s our individual choice to extend functionality as we wish and to take responsibility for the consequences of such a risk. Python programmers believe that the benefits of being open—permitting unplanned extension of classes by default—outweigh the downsides.
Beyond that, having the ability to hook language features like attribute access (see Item 47: “Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes”) enables you to mess around with the internals of objects whenever you wish. If you can do that, what is the value of Python trying to prevent private attribute access otherwise?
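As a rough illustration of that point, the sketch below uses a __getattr__ hook in a subclass to expose a parent's private field under its unmangled name. Vault and OpenVault are hypothetical classes invented for this example, and the hook is only one of several ways to reach the same data:

```python
class Vault:
    def __init__(self):
        self.__secret = 42  # Stored as _Vault__secret

class OpenVault(Vault):
    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        # Retry the lookup using the parent's mangled name.
        mangled = f'_Vault{name}'
        try:
            return vars(self)[mangled]
        except KeyError:
            raise AttributeError(name) from None

box = OpenVault()
# A string passed to getattr is never mangled, so normal lookup of
# '__secret' fails and the hook supplies the private value instead.
assert getattr(box, '__secret') == 42
```

Nothing in Vault had to change for its private field to become reachable this way.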
To minimize damage from accessing internals unknowingly, Python programmers follow a naming convention defined in the style guide (see Item 2: “Follow the PEP 8 Style Guide”). Fields prefixed by a single underscore (like _protected_field) are protected by convention, meaning external users of the class should proceed with caution.
However, many programmers who are new to Python use private fields to indicate an internal API that shouldn’t be accessed by subclasses or externally:
class MyStringClass:
    def __init__(self, value):
        self.__value = value

    def get_value(self):
        return str(self.__value)

foo = MyStringClass(5)
assert foo.get_value() == '5'
This is the wrong approach. Inevitably someone—maybe even you—will want to subclass your class to add new behavior or to work around deficiencies in existing methods (e.g., the way that MyStringClass.get_value always returns a string). By choosing private attributes, you’re only making subclass overrides and extensions cumbersome and brittle. Your potential subclassers will still access the private fields when they absolutely need to do so:
class MyIntegerSubclass(MyStringClass):
    def get_value(self):
        return int(self._MyStringClass__value)

foo = MyIntegerSubclass('5')
assert foo.get_value() == 5
But if the class hierarchy changes beneath you, these classes will break because the private attribute references are no longer valid. Here, the MyIntegerSubclass class’s immediate parent, MyStringClass, has had another parent class added, called MyBaseClass:
class MyBaseClass:
    def __init__(self, value):
        self.__value = value

    def get_value(self):
        return self.__value

class MyStringClass(MyBaseClass):
    def get_value(self):
        return str(super().get_value())         # Updated

class MyIntegerSubclass(MyStringClass):
    def get_value(self):
        return int(self._MyStringClass__value)  # Not updated
The __value attribute is now assigned in the MyBaseClass parent class, not the MyStringClass parent. This causes the private variable reference self._MyStringClass__value to break in MyIntegerSubclass:
foo = MyIntegerSubclass(5)
foo.get_value()

>>>
Traceback ...
AttributeError: 'MyIntegerSubclass' object has no attribute '_MyStringClass__value'
In general, it’s better to err on the side of allowing subclasses to do more by using protected attributes. Document each protected field and explain which fields are internal APIs available to subclasses and which should be left alone entirely. This is as much advice to other programmers as it is guidance for your future self on how to extend your own code safely:
class MyStringClass:
    def __init__(self, value):
        # This stores the user-supplied value for the object.
        # It should be coercible to a string. Once assigned in
        # the object it should be treated as immutable.
        self._value = value

    ...
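To see why the protected form is less brittle, here is a sketch of the earlier three-level hierarchy rewritten to use _value. The class names repeat those from the earlier listings; the method bodies are an assumption about how such a rewrite might look, not a listing from the original:

```python
class MyBaseClass:
    def __init__(self, value):
        # Protected by convention: subclasses may read this value,
        # but should treat it as immutable once assigned.
        self._value = value

    def get_value(self):
        return self._value

class MyStringClass(MyBaseClass):
    def get_value(self):
        return str(super().get_value())

class MyIntegerSubclass(MyStringClass):
    def get_value(self):
        # No class name is baked into this reference, so it keeps
        # working no matter which ancestor assigns the attribute.
        return int(self._value)

foo = MyIntegerSubclass('5')
assert foo.get_value() == 5
```

Moving the assignment from MyStringClass up to MyBaseClass required no change in MyIntegerSubclass, because a single-underscore name is the same in every class of the hierarchy.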
The only time to seriously consider using private attributes is when you’re worried about naming conflicts with subclasses. This problem occurs when a child class unwittingly defines an attribute that was already defined by its parent class:
class ApiClass:
    def __init__(self):
        self._value = 5

    def get(self):
        return self._value

class Child(ApiClass):
    def __init__(self):
        super().__init__()
        self._value = 'hello'  # Conflicts

a = Child()
print(f'{a.get()} and {a._value} should be different')

>>>
hello and hello should be different
This is primarily a concern with classes that are part of a public API; the subclasses are out of your control, so you can’t refactor to fix the problem. Such a conflict is especially possible with attribute names that are very common (like value). To reduce the risk of this issue occurring, you can use a private attribute in the parent class to ensure that there are no attribute names that overlap with child classes:
class ApiClass:
    def __init__(self):
        self.__value = 5       # Double underscore

    def get(self):
        return self.__value    # Double underscore

class Child(ApiClass):
    def __init__(self):
        super().__init__()
        self._value = 'hello'  # OK!

a = Child()
print(f'{a.get()} and {a._value} are different')

>>>
5 and hello are different
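The reason the two attributes no longer collide becomes visible in the instance dictionary. This short sketch repeats the classes above and inspects the stored names with vars; the variable stored is introduced only for illustration:

```python
class ApiClass:
    def __init__(self):
        self.__value = 5

    def get(self):
        return self.__value

class Child(ApiClass):
    def __init__(self):
        super().__init__()
        self._value = 'hello'

a = Child()
stored = vars(a)
# The parent's field lives under its mangled name, so the child's
# _value occupies a separate key.
assert stored == {'_ApiClass__value': 5, '_value': 'hello'}
```

A child class that defines its own __value would also be safe, since that name would be stored as _Child__value.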
Things to Remember
- Private attributes aren’t rigorously enforced by the Python compiler.
- Plan from the beginning to allow subclasses to do more with your internal APIs and attributes instead of choosing to lock them out.
- Use documentation of protected fields to guide subclasses instead of trying to force access control with private attributes.
- Only consider using private attributes to avoid naming conflicts with subclasses that are out of your control.