Object Behavior and Special Methods
Objects in Python are generally classified according to their behaviors and the features that they implement. For example, all of the sequence types such as strings, lists, and tuples are grouped together merely because they all happen to support a common set of sequence operations such as s[n], len(s), etc. All basic interpreter operations are implemented through special object methods. The names of special methods are always preceded and followed by double underscores (__). These methods are automatically triggered by the interpreter as a program executes. For example, the operation x + y is mapped to an internal method, x.__add__(y), and an indexing operation, x[k], is mapped to x.__getitem__(k). The behavior of each data type depends entirely on the set of special methods that it implements.
User-defined classes can define new objects that behave like the built-in types simply by supplying an appropriate subset of the special methods described in this section. In addition, built-in types such as lists and dictionaries can be specialized (via inheritance) by redefining some of the special methods.
The next few sections describe the special methods associated with different categories of interpreter features.
Object Creation and Destruction
The methods in Table 3.11 create, initialize, and destroy instances. __new__() is a class method that is called to create an instance. The __init__() method initializes the attributes of an object and is called immediately after an object has been newly created. The __del__() method is invoked when an object is about to be destroyed, which happens only when the object is no longer in use. Note that the statement del x only decrements an object’s reference count and doesn’t necessarily result in a call to this method. Further details about these methods can be found in Chapter 7.
Table 3.11 Special Methods for Object Creation and Destruction
Method | Description |
__new__(cls [,*args [,**kwargs]]) | A class method called to create a new instance |
__init__(self [,*args [,**kwargs]]) | Called to initialize a new instance |
__del__(self) | Called when an instance is being destroyed |
The __new__() and __init__() methods are used together to create and initialize new instances. When an object is created by calling A(args), it is translated into the following steps:
x = A.__new__(A, args)
if isinstance(x, A): x.__init__(args)
In user-defined objects, it is rare to define __new__() or __del__(). __new__() is usually only defined in metaclasses or in user-defined objects that happen to inherit from one of the immutable types (integers, strings, tuples, and so on). __del__() is only defined in situations in which there is some kind of critical resource management issue, such as releasing a lock or shutting down a connection.
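The immutable-type case mentioned above can be sketched as follows. This is only an illustration under the assumption of Python 3; the class name UpperStr and its original attribute are hypothetical and do not come from the text. Because a string's value cannot change after creation, it has to be fixed in __new__() rather than in __init__().

```python
class UpperStr(str):
    """Hypothetical str subclass whose value is always uppercase."""

    def __new__(cls, value):
        # str is immutable, so the value must be chosen at creation time
        return super().__new__(cls, value.upper())

    def __init__(self, value):
        # Runs after __new__() and receives the same original argument
        self.original = value


s = UpperStr("hello")
print(s)             # HELLO
print(s.original)    # hello
```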
Object String Representation
The methods in Table 3.12 are used to create various string representations of an object.
Table 3.12 Special Methods for Object Representation
Method | Description |
__format__(self, format_spec) | Creates a formatted representation |
__repr__(self) | Creates a string representation of an object |
__str__(self) | Creates a simple string representation |
The __repr__() and __str__() methods create simple string representations of an object. The __repr__() method normally returns an expression string that can be evaluated to re-create the object. This is also the method responsible for creating the output of values you see when inspecting variables in the interactive interpreter. This method is invoked by the built-in repr() function. Here’s an example of using repr() and eval() together:
a = [2,3,4,5]      # Create a list
s = repr(a)        # s = '[2, 3, 4, 5]'
b = eval(s)        # Turns s back into a list
If a string expression cannot be created, the convention is for __repr__() to return a string of the form <...message...>, as shown here:
f = open("foo")
a = repr(f)        # a = "<open file 'foo', mode 'r' at dc030>"
The __str__() method is called by the built-in str() function and by functions related to printing. It differs from __repr__() in that the string it returns can be more concise and informative to the user. If this method is undefined, the __repr__() method is invoked.
The __format__() method is called by the format() function or the format() method of strings. The format_spec argument is a string containing the format specification. This string is the same as the format_spec argument to format(). For example:
format(x,"spec")              # Calls x.__format__("spec")
"x is {0:spec}".format(x)     # Calls x.__format__("spec")
The syntax of the format specification is arbitrary and can be customized on an object-by-object basis. However, a standard syntax is described in Chapter 4.
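To make the three representation methods concrete, here is a minimal sketch. The Point class is hypothetical and exists only for illustration; its __format__() simply passes the format specification through to each coordinate, which is one possible design among many.

```python
class Point:
    """Hypothetical 2D point used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Expression string that can be evaluated to re-create the object
        return "Point({!r}, {!r})".format(self.x, self.y)

    def __str__(self):
        # Friendlier form used by str() and print()
        return "({}, {})".format(self.x, self.y)

    def __format__(self, format_spec):
        # Apply the same specification to each coordinate
        return "({}, {})".format(format(self.x, format_spec),
                                 format(self.y, format_spec))


p = Point(1.5, 2.0)
print(repr(p))                      # Point(1.5, 2.0)
print(str(p))                       # (1.5, 2.0)
print(format(p, ".2f"))             # (1.50, 2.00)
print("p is {0:.2f}".format(p))     # p is (1.50, 2.00)
```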
Object Comparison and Ordering
Table 3.13 shows methods that can be used to perform simple tests on an object. The __bool__() method is used for truth-value testing and should return True or False. If undefined, the __len__() method is a fallback that is invoked to determine truth. The __hash__() method is defined on objects that want to work as keys in a dictionary. The value returned is an integer that should be identical for two objects that compare as equal. Furthermore, mutable objects should not define this method; any changes to an object will alter the hash value and make it impossible to locate an object on subsequent dictionary lookups.
Table 3.13 Special Methods for Object Testing and Hashing
Method | Description |
__bool__(self) | Returns False or True for truth-value testing |
__hash__(self) | Computes an integer hash index |
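The truth-testing rules can be sketched as follows, assuming Python 3 (Python 2 spells the method __nonzero__() instead of __bool__()). The Inbox and Switch classes are hypothetical names chosen for this illustration.

```python
class Inbox:
    """Hypothetical container: truth is determined by __len__()."""

    def __init__(self, messages):
        self.messages = list(messages)

    def __len__(self):
        return len(self.messages)


class Switch:
    """Hypothetical object: __bool__() takes priority over __len__()."""

    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on

    def __len__(self):
        # Not consulted for truth testing because __bool__() is defined
        return 5


print(bool(Inbox([])))         # False
print(bool(Inbox(["hi"])))     # True
print(bool(Switch(False)))     # False
```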
Objects can implement one or more of the relational operators (<, >, <=, >=, ==, !=). Each of these methods takes two arguments and is allowed to return any kind of object, including a Boolean value, a list, or any other Python type. For instance, a numerical package might use this to perform an element-wise comparison of two matrices, returning a matrix with the results. If a comparison can’t be made, these functions may also raise an exception. Table 3.14 shows the special methods for comparison operators.
Table 3.14 Methods for Comparisons
Method | Result |
__lt__(self,other) | self < other |
__le__(self,other) | self <= other |
__gt__(self,other) | self > other |
__ge__(self,other) | self >= other |
__eq__(self,other) | self == other |
__ne__(self,other) | self != other |
It is not necessary for an object to implement all of the operations in Table 3.14. However, if you want to be able to compare objects using == or use an object as a dictionary key, the __eq__() method should be defined. If you want to be able to sort objects or use functions such as min() or max(), then __lt__() must be minimally defined.
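As a minimal sketch of these requirements, the hypothetical Version class below defines only __eq__(), __hash__(), and __lt__(). It assumes that instances are never modified after creation, which is what makes defining __hash__() safe.

```python
class Version:
    """Hypothetical version number; treated as immutable after creation."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        return (self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self):
        # Objects that compare equal must produce the same hash value
        return hash(self._key())

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


versions = [Version(3, 1), Version(2, 7), Version(3, 0)]
ordered = sorted(versions)            # Uses __lt__()
oldest = min(versions)                # Uses __lt__()
names = {Version(2, 7): "legacy"}     # Uses __hash__() and __eq__()
print(names[Version(2, 7)])           # legacy
```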
Type Checking
The methods in Table 3.15 can be used to redefine the behavior of the type checking functions isinstance() and issubclass(). The most common application of these methods is in defining abstract base classes and interfaces, as described in Chapter 7.
Table 3.15 Methods for Type Checking
Method | Result |
__instancecheck__(cls,object) | isinstance(object, cls) |
__subclasscheck__(cls, sub) | issubclass(sub, cls) |
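The following sketch shows one way these hooks might be used. It assumes Python 3 metaclass syntax, and the names DuckMeta, Duck, and Robot are hypothetical. Note that the methods are defined on the metaclass, because isinstance() and issubclass() look them up on the type of the class being tested against.

```python
class DuckMeta(type):
    """Hypothetical metaclass: anything with a quack() method qualifies."""

    def __instancecheck__(cls, obj):
        return callable(getattr(obj, "quack", None))

    def __subclasscheck__(cls, sub):
        return callable(getattr(sub, "quack", None))


class Duck(metaclass=DuckMeta):
    pass


class Robot:
    # Does not inherit from Duck
    def quack(self):
        return "beep"


print(isinstance(Robot(), Duck))    # True
print(issubclass(Robot, Duck))      # True
print(isinstance(42, Duck))         # False
```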
Attribute Access
The methods in Table 3.16 read, write, and delete the attributes of an object using the dot (.) operator and the del operator, respectively.
Table 3.16 Special Methods for Attribute Access
Method | Description |
__getattribute__(self,name) | Returns the attribute self.name. |
__getattr__(self, name) | Returns the attribute self.name if not found through normal attribute lookup or raises AttributeError. |
__setattr__(self, name, value) | Sets the attribute self.name = value. Overrides the default mechanism. |
__delattr__(self, name) | Deletes the attribute self.name. |
Whenever an attribute is accessed, the __getattribute__() method is always invoked. If the attribute is located, it is returned. Otherwise, the __getattr__() method is invoked. The default behavior of __getattr__() is to raise an AttributeError exception. The __setattr__() method is always invoked when setting an attribute, and the __delattr__() method is always invoked when deleting an attribute.
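A common use of these methods is a wrapper that forwards attribute access to another object. The sketch below is illustrative only; Proxy, Account, and the log attribute are hypothetical names. It relies on the behavior described above: __getattr__() runs only when normal lookup fails, whereas __setattr__() runs on every assignment.

```python
class Proxy:
    """Hypothetical wrapper that forwards attributes to a target object."""

    def __init__(self, target):
        # Use object.__setattr__() to bypass our own __setattr__()
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "log", [])

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        return getattr(self._target, name)

    def __setattr__(self, name, value):
        # Called for every attribute assignment on the proxy
        self.log.append(name)
        setattr(self._target, name, value)


class Account:
    def __init__(self):
        self.balance = 0


p = Proxy(Account())
p.balance = 50          # Routed through Proxy.__setattr__()
print(p.balance)        # 50, found via Proxy.__getattr__()
print(p.log)            # ['balance']
```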
Attribute Wrapping and Descriptors
A subtle aspect of attribute manipulation is that sometimes the attributes of an object are wrapped with an extra layer of logic that interacts with the get, set, and delete operations described in the previous section. This kind of wrapping is accomplished by creating a descriptor object that implements one or more of the methods in Table 3.17. Keep in mind that descriptors are optional and rarely need to be defined.
Table 3.17 Special Methods for Descriptor Objects
Method | Description |
__get__(self,instance,cls) | Returns an attribute value or raises AttributeError |
__set__(self,instance,value) | Sets the attribute to value |
__delete__(self,instance) | Deletes the attribute |
The __get__(), __set__(), and __delete__() methods of a descriptor are meant to interact with the default implementation of __getattribute__(), __setattr__(), and __delattr__() methods on classes and types. This interaction occurs if you place an instance of a descriptor object in the body of a user-defined class. In this case, all access to the descriptor attribute will implicitly invoke the appropriate method on the descriptor object itself. Typically, descriptors are used to implement the low-level functionality of the object system including bound and unbound methods, class methods, static methods, and properties. Further examples appear in Chapter 7.
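As a rough sketch of the mechanism, the hypothetical Typed descriptor below enforces a type on a single attribute and stores the actual value in the instance dictionary. The names Typed and Stock are assumptions made for this illustration.

```python
class Typed:
    """Hypothetical descriptor that enforces a type on one attribute."""

    def __init__(self, name, expected_type):
        self.name = name
        self.expected_type = expected_type

    def __get__(self, instance, cls):
        if instance is None:
            return self          # Accessed on the class, not an instance
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name)

    def __set__(self, instance, value):
        if not isinstance(value, self.expected_type):
            raise TypeError("expected " + self.expected_type.__name__)
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        del instance.__dict__[self.name]


class Stock:
    shares = Typed("shares", int)    # Descriptor placed in the class body

    def __init__(self, shares):
        self.shares = shares         # Invokes Typed.__set__()


s = Stock(100)
print(s.shares)                      # 100, via Typed.__get__()
try:
    s.shares = "many"                # Rejected by Typed.__set__()
except TypeError as e:
    print(e)                         # expected int
```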
Sequence and Mapping Methods
The methods in Table 3.18 are used by objects that want to emulate sequence and mapping objects.
Table 3.18 Methods for Sequences and Mappings
Method | Description |
__len__(self) | Returns the length of self |
__getitem__(self, key) | Returns self[key] |
__setitem__(self, key, value) | Sets self[key] = value |
__delitem__(self, key) | Deletes self[key] |
__contains__(self,obj) | Returns True if obj is in self; otherwise, returns False |
Here’s an example:
a = [1,2,3,4,5,6]
len(a)               # a.__len__()
x = a[2]             # x = a.__getitem__(2)
a[1] = 7             # a.__setitem__(1,7)
del a[2]             # a.__delitem__(2)
5 in a               # a.__contains__(5)
The __len__() method is called by the built-in len() function to return a nonnegative length. This method also determines truth values unless the __bool__() method has also been defined.
For manipulating individual items, the __getitem__() method can return an item by key value. The key can be any Python object but is typically an integer for sequences. The __setitem__() method assigns a value to an element. The __delitem__() method is invoked whenever the del operation is applied to a single element. The __contains__() method is used to implement the in operator.
The slicing operations such as x = s[i:j] are also implemented using __getitem__(), __setitem__(), and __delitem__(). However, for slices, a special slice object is passed as the key. This object has attributes that describe the range of the slice being requested. For example:
a = [1,2,3,4,5,6]
x = a[1:5]               # x = a.__getitem__(slice(1,5,None))
a[1:3] = [10,11,12]      # a.__setitem__(slice(1,3,None), [10,11,12])
del a[1:4]               # a.__delitem__(slice(1,4,None))
The slicing features of Python are actually more powerful than many programmers realize. For example, the following variations of extended slicing are all supported and might be useful for working with multidimensional data structures such as matrices and arrays:
a = m[0:100:10]              # Strided slice (stride=10)
b = m[1:10, 3:20]            # Multidimensional slice
c = m[0:100:10, 50:75:5]     # Multiple dimensions with strides
m[0:5, 5:10] = n             # Extended slice assignment
del m[:10, 15:]              # Extended slice deletion
The general format for each dimension of an extended slice is i:j[:stride], where stride is optional. As with ordinary slices, you can omit the starting or ending values for each part of a slice. In addition, the ellipsis (written as ...) is available to denote any number of trailing or leading dimensions in an extended slice:
a = m[..., 10:20]            # Extended slice access with Ellipsis
m[10:20, ...] = n
When using extended slices, the __getitem__(), __setitem__(), and __delitem__() methods implement access, modification, and deletion, respectively. However, instead of an integer, the value passed to these methods is a tuple containing a combination of slice or Ellipsis objects. For example,
a = m[0:10, 0:100:5, ...]
invokes __getitem__() as follows:
a = m.__getitem__((slice(0,10,None), slice(0,100,5), Ellipsis))
Python strings, tuples, and lists currently provide some support for extended slices, which is described in Chapter 4. Special-purpose extensions to Python, especially those with a scientific flavor, may provide new types and objects with advanced support for extended slicing operations.
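To show how a user-defined type might interpret these keys, here is a minimal sketch of a two-dimensional container. The Grid class is hypothetical; it assumes exactly two dimensions and does not handle Ellipsis, so it is far simpler than what a real array package provides.

```python
class Grid:
    """Hypothetical 2D grid stored as a list of rows."""

    def __init__(self, rows):
        self.rows = rows

    def __getitem__(self, key):
        if isinstance(key, tuple):
            # m[r, c]: the key arrives as a tuple of integers and/or slices
            row_key, col_key = key
            rows = self.rows[row_key]
            if isinstance(row_key, slice):
                return [row[col_key] for row in rows]
            return rows[col_key]
        # m[r] or m[i:j]: a single integer or slice object
        return self.rows[key]


m = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(m[1])            # [4, 5, 6]
print(m[0:2])          # [[1, 2, 3], [4, 5, 6]]
print(m[1, 2])         # 6
print(m[0:2, 1:3])     # [[2, 3], [5, 6]]
print(m[::2, 0])       # [1, 7]
```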
Iteration
If an object, obj, supports iteration, it must provide a method, obj.__iter__(), that returns an iterator object. The iterator object iter, in turn, must implement a single method, iter.next() (or iter.__next__() in Python 3), that returns the next object or raises StopIteration to signal the end of iteration. Both of these methods are used by the implementation of the for statement as well as other operations that implicitly perform iteration. For example, the statement for x in s is carried out by performing steps equivalent to the following:
_iter = s.__iter__()
while 1:
    try:
        x = _iter.next()       # _iter.__next__() in Python 3
    except StopIteration:
        break
    # Do statements in body of for loop
    ...
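The protocol above can be implemented by hand, as in the following sketch. Countdown and CountdownIterator are hypothetical names; the next alias is included only so that the same iterator would also satisfy the Python 2 spelling.

```python
class Countdown:
    """Hypothetical iterable that counts down from start to 1."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Return a fresh iterator each time iteration begins
        return CountdownIterator(self.start)


class CountdownIterator:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration     # Signals the end of iteration
        self.n -= 1
        return self.n + 1

    next = __next__                 # Python 2 spelling of the same method


for x in Countdown(3):
    print(x)                        # 3, then 2, then 1
```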
Mathematical Operations
Table 3.19 lists special methods that objects must implement to emulate numbers. Mathematical operations are always evaluated from left to right according to the precedence rules described in Chapter 4; when an expression such as x + y appears, the interpreter tries to invoke the method x.__add__(y). The special methods beginning with r support operations with reversed operands. These are invoked only if the left operand doesn’t implement the specified operation. For example, if x in x + y doesn’t support the __add__() method, the interpreter tries to invoke the method y.__radd__(x).
Table 3.19 Methods for Mathematical Operations
Method | Result |
__add__(self,other) | self + other |
__sub__(self,other) | self - other |
__mul__(self,other) | self * other |
__div__(self,other) | self / other (Python 2 only) |
__truediv__(self,other) | self / other (Python 3) |
__floordiv__(self,other) | self // other |
__mod__(self,other) | self % other |
__divmod__(self,other) | divmod(self,other) |
__pow__(self,other [,modulo]) | self ** other, pow(self, other, modulo) |
__lshift__(self,other) | self << other |
__rshift__(self,other) | self >> other |
__and__(self,other) | self & other |
__or__(self,other) | self | other |
__xor__(self,other) | self ^ other |
__radd__(self,other) | other + self |
__rsub__(self,other) | other - self |
__rmul__(self,other) | other * self |
__rdiv__(self,other) | other / self (Python 2 only) |
__rtruediv__(self,other) | other / self (Python 3) |
__rfloordiv__(self,other) | other // self |
__rmod__(self,other) | other % self |
__rdivmod__(self,other) | divmod(other,self) |
__rpow__(self,other) | other ** self |
__rlshift__(self,other) | other << self |
__rrshift__(self,other) | other >> self |
__rand__(self,other) | other & self |
__ror__(self,other) | other | self |
__rxor__(self,other) | other ^ self |
__iadd__(self,other) | self += other |
__isub__(self,other) | self -= other |
__imul__(self,other) | self *= other |
__idiv__(self,other) | self /= other (Python 2 only) |
__itruediv__(self,other) | self /= other (Python 3) |
__ifloordiv__(self,other) | self //= other |
__imod__(self,other) | self %= other |
__ipow__(self,other) | self **= other |
__iand__(self,other) | self &= other |
__ior__(self,other) | self |= other |
__ixor__(self,other) | self ^= other |
__ilshift__(self,other) | self <<= other |
__irshift__(self,other) | self >>= other |
__neg__(self) | -self |
__pos__(self) | +self |
__abs__(self) | abs(self) |
__invert__(self) | ~self |
__int__(self) | int(self) |
__long__(self) | long(self) (Python 2 only) |
__float__(self) | float(self) |
__complex__(self) | complex(self) |
The methods __iadd__(), __isub__(), and so forth are used to support in-place arithmetic operators such as a+=b and a-=b (also known as augmented assignment). A distinction is made between these operators and the standard arithmetic methods because the implementation of the in-place operators might be able to provide certain customizations such as performance optimizations. For instance, if the self parameter is not shared, the value of an object could be modified in place without having to allocate a newly created object for the result.
The three flavors of division operators, __div__(), __truediv__(), and __floordiv__(), are used to implement true division (/) and truncating division (//) operations. There are three operations because of a change in the semantics of integer division that was introduced in Python 2.2 and became the default behavior in Python 3. In Python 2, the / operator maps to __div__() by default. For integers, this operation truncates the result to an integer. In Python 3, division is mapped to __truediv__(), and for integers a float is returned. This latter behavior can be enabled in Python 2 as an optional feature by including the statement from __future__ import division in a program.
The conversion methods __int__(), __long__(), __float__(), and __complex__() convert an object into one of the four built-in numerical types. These methods are invoked by explicit type conversions such as int() and float(). However, these methods are not used to implicitly coerce types in mathematical operations. For example, the expression 3 + x produces a TypeError even if x is a user-defined object that defines __int__() for integer conversion.
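The following sketch pulls together the standard, reversed, and in-place forms for a single operator. The Vec class is hypothetical and supports only addition. In the expression 10 + a, the int type does not know how to add a Vec, so the interpreter falls back to a.__radd__(10).

```python
class Vec:
    """Hypothetical 2D vector supporting only a few operations."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):
        if isinstance(other, Vec):
            return Vec(self.x + other.x, self.y + other.y)
        if isinstance(other, (int, float)):
            return Vec(self.x + other, self.y + other)
        return NotImplemented

    def __radd__(self, other):
        # Reached for number + Vec; addition here is commutative
        return self.__add__(other)

    def __iadd__(self, other):
        # In-place form: modify self instead of allocating a new Vec
        if isinstance(other, Vec):
            self.x += other.x
            self.y += other.y
            return self
        return NotImplemented

    def __neg__(self):
        return Vec(-self.x, -self.y)

    def __abs__(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


a = Vec(1, 2)
b = a + Vec(3, 4)      # a.__add__(Vec(3, 4))
c = 10 + a             # a.__radd__(10)
alias = a
a += Vec(1, 1)         # a.__iadd__(Vec(1, 1)); alias sees the change
print(abs(Vec(3, 4)))  # 5.0
```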
Callable Interface
An object can emulate a function by providing the __call__(self [,*args [, **kwargs]]) method. If an object, x, provides this method, it can be invoked like a function. That is, x(arg1, arg2, ...) invokes x.__call__(arg1, arg2, ...). Objects that emulate functions can be useful for creating functors or proxies. Here is a simple example:
class DistanceFrom(object):
    def __init__(self,origin):
        self.origin = origin
    def __call__(self, x):
        return abs(x - self.origin)

nums = [1, 37, 42, 101, 13, 9, -20]
nums.sort(key=DistanceFrom(10))      # Sort by distance from 10
In this example, the DistanceFrom class creates instances that emulate a single-argument function. These can be used in place of a normal function—for instance, in the call to sort() in the example.
Context Management Protocol
The with statement allows a sequence of statements to execute under the control of another object known as a context manager. The general syntax is as follows:
with context [ as var]:
    statements
The context object shown here is expected to implement the methods shown in Table 3.20. The __enter__() method is invoked when the with statement executes. The value returned by this method is placed into the variable specified with the optional as var specifier. The __exit__() method is called as soon as control-flow leaves from the block of statements associated with the with statement. As arguments, __exit__() receives the current exception type, value, and traceback if an exception has been raised. If no errors are being handled, all three values are set to None.
Table 3.20 Special Methods for Context Managers
Method | Description |
__enter__(self) | Called when entering a new context. The return value is placed in the variable listed with the as specifier to the with statement. |
__exit__(self, type, value, tb) | Called when leaving a context. If an exception occurred, type, value, and tb have the exception type, value, and traceback information. |
The primary use of the context management interface is to allow for simplified resource control on objects involving system state such as open files, network connections, and locks. By implementing this interface, an object can safely clean up resources when execution leaves a context in which an object is being used. Further details are found in Chapter 5, “Program Structure and Control Flow.”
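As an illustration of the protocol, the sketch below applies changes to a list only if the with block finishes without an exception. The ListTransaction class and its attribute names are assumptions made for this example, not part of any library.

```python
class ListTransaction:
    """Hypothetical context manager that commits list changes on success."""

    def __init__(self, thelist):
        self.thelist = thelist

    def __enter__(self):
        # The returned working copy is bound to the 'as' variable
        self.workingcopy = list(self.thelist)
        return self.workingcopy

    def __exit__(self, type, value, tb):
        if type is None:
            # No exception occurred: commit the changes
            self.thelist[:] = self.workingcopy
        return False        # Do not suppress any exception


items = [1, 2, 3]
with ListTransaction(items) as working:
    working.append(4)
print(items)                # [1, 2, 3, 4]

try:
    with ListTransaction(items) as working:
        working.append(5)
        raise RuntimeError("abort")
except RuntimeError:
    pass
print(items)                # [1, 2, 3, 4]
```

Returning False from __exit__() lets the exception propagate to the caller after the cleanup logic has run.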
Object Inspection and dir()
The dir() function is commonly used to inspect objects. An object can supply the list of names returned by dir() by implementing __dir__(self). Defining this makes it easier to hide the internal details of objects that you don’t want a user to directly access. However, keep in mind that a user can still inspect the underlying __dict__ attribute of instances and classes to see everything that is defined.
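A minimal sketch of this idea follows. The Config class and its attribute names are hypothetical. It also shows that the instance dictionary still exposes everything, as noted above.

```python
class Config:
    """Hypothetical object that advertises only its public settings."""

    def __init__(self):
        self.host = "localhost"
        self.port = 8080
        self._secret = "hidden"

    def __dir__(self):
        # Names reported by dir(); internal details are left out
        return ["host", "port"]


c = Config()
print(dir(c))              # ['host', 'port']
print(sorted(vars(c)))     # ['_secret', 'host', 'port']
```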