Special Methods
All the built-in data types implement a collection of special object methods. The names of special methods are always preceded and followed by double underscores (__). These methods are automatically triggered by the interpreter as a program executes. For example, the operation x + y is mapped to an internal method, x.__add__(y), and an indexing operation, x[k], is mapped to x.__getitem__(k). The behavior of each data type depends entirely on the set of special methods that it implements.
User-defined classes can define new objects that behave like the built-in types simply by supplying an appropriate subset of the special methods described in this section. In addition, built-in types such as lists and dictionaries can be specialized (via inheritance) by redefining some of the special methods.
Object Creation, Destruction, and Representation
The methods in Table 3.9 create, initialize, destroy, and represent objects. __new__() is a static method that is called to create an instance (although this method is rarely redefined). The __init__() method initializes the attributes of an object and is called immediately after an object has been newly created. The __del__() method is invoked when an object is about to be destroyed, which happens only when the object is no longer in use. It’s important to note that the statement del x only decrements an object’s reference count and doesn’t necessarily result in a call to this method. Further details about these methods can be found in Chapter 7.
Table 3.9 Special Methods for Object Creation, Destruction, and Representation
Method | Description
__new__(cls [,*args [,**kwargs]]) | A static method called to create a new instance
__init__(self [,*args [,**kwargs]]) | Called to initialize a new instance
__del__(self) | Called to destroy an instance
__repr__(self) | Creates a full string representation of an object
__str__(self) | Creates an informal string representation
__cmp__(self,other) | Compares two objects and returns negative, zero, or positive
__hash__(self) | Computes an integer hash index
__nonzero__(self) | Returns 0 or 1 for truth-value testing
__unicode__(self) | Creates a Unicode string representation
The __new__() and __init__() methods are used to create and initialize new instances. When an object is created by calling A(args), it is translated into the following steps:
x = A.__new__(A,args)
if isinstance(x,A): x.__init__(args)
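The two-step sequence can be observed with a small class that records each call. This is only a minimal sketch: the class name Point and the calls list are hypothetical names introduced for illustration.

```python
calls = []   # Hypothetical log that records the order of method calls

class Point(object):
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Creation is delegated to object; only the class is passed along
        return object.__new__(cls)

    def __init__(self, x, y):
        calls.append("__init__")
        self.x = x
        self.y = y

# Ordinary creation: __new__() runs first, then __init__()
p = Point(2, 3)

# The same steps carried out by hand
q = Point.__new__(Point, 2, 3)
if isinstance(q, Point):
    q.__init__(2, 3)
```

After this code runs, calls holds the pair "__new__", "__init__" twice, once for each object.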
The __repr__() and __str__() methods create string representations of an object. The __repr__() method normally returns an expression string that can be evaluated to re-create the object. This method is invoked by the built-in repr() function and by the backquotes operator (`). For example:
a = [2,3,4,5]      # Create a list
s = repr(a)        # s = '[2, 3, 4, 5]'
                   # Note : could have also used s = `a`
b = eval(s)        # Turns s back into a list
If a string expression cannot be created, the convention is for __repr__() to return a string of the form <...message...>, as shown here:
f = open("foo")
a = repr(f)        # a = "<open file 'foo', mode 'r' at dc030>"
The __str__() method is called by the built-in str() function and by the print statement. It differs from __repr__() in that the string it returns can be more concise and informative to the user. If this method is undefined, the __repr__() method is invoked.
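As a rough illustration of the difference, the following sketch defines both methods on one class and only __repr__() on another. The class names Ratio and Tag are hypothetical and exist only for this example.

```python
class Ratio(object):
    def __init__(self, num, den):
        self.num = num
        self.den = den

    def __repr__(self):
        # An expression string that can be passed to eval()
        return "Ratio(%r, %r)" % (self.num, self.den)

    def __str__(self):
        # A shorter form intended for display
        return "%s/%s" % (self.num, self.den)

class Tag(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        # No sensible expression string, so use the <...> convention
        return "<Tag %s>" % self.name

f = Ratio(3, 4)
r = repr(f)          # 'Ratio(3, 4)'
s = str(f)           # '3/4'
g = eval(r)          # Re-creates an equivalent Ratio object
t = str(Tag("b"))    # No __str__() defined, so __repr__() is used: '<Tag b>'
```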
The __cmp__(self,other) method is used by all the comparison operators. It returns a negative number if self < other, zero if self == other, and positive if self > other. If this method is undefined for an object, the object will be compared by object identity. In addition, an object may define an alternative set of comparison functions for each of the relational operators. These are known as rich comparisons and are described shortly. The __nonzero__() method is used for truth-value testing and should return 0 or 1 (or True or False). If undefined, the __len__() method is invoked to determine truth.
Finally, the __hash__() method computes an integer hash key used in dictionary operations (the hash value can also be returned using the built-in function hash()). The value returned should be identical for two objects that compare as equal. Furthermore, mutable objects should not define this method; any changes to an object will alter the hash value and make it impossible to locate an object on subsequent dictionary lookups. An object should not define a __hash__() method without also defining __cmp__().
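The rule that equal objects must hash alike can be sketched as follows. Note the assumptions: the class Key is hypothetical, and equality is supplied through the rich comparison method __eq__() rather than __cmp__(), so that the sketch also runs on interpreters that do not consult __cmp__().

```python
class Key(object):
    def __init__(self, name, version):
        # Treated as immutable after creation
        self.name = name
        self.version = version

    def __eq__(self, other):
        return (self.name, self.version) == (other.name, other.version)

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Built from the same values that __eq__() compares,
        # so equal objects always produce equal hash values
        return hash((self.name, self.version))

k1 = Key("spam", 1)
k2 = Key("spam", 1)                  # A distinct object with an equal value
table = {k1: "first"}
value = table[k2]                    # Found, because k2 hashes and compares like k1
missing = Key("spam", 2) in table    # False: compares unequal to k1
```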
Attribute Access
The methods in Table 3.10 read, write, and delete the attributes of an object using the dot (.) operator and the del operator, respectively.
Table 3.10 Special Methods for Attribute Access
Method | Description
__getattribute__(self,name) | Returns the attribute self.name.
__getattr__(self, name) | Returns the attribute self.name if not found through normal attribute lookup.
__setattr__(self, name, value) | Sets the attribute self.name = value. Overrides the default mechanism.
__delattr__(self, name) | Deletes the attribute self.name.
An example will illustrate:
class Foo(object):
    def __init__(self):
        self.x = 37

f = Foo()
a = f.x      # Invokes __getattribute__(f,"x")
b = f.y      # Invokes __getattribute__(f,"y") --> Not found
             # Then invokes __getattr__(f,"y")
f.x = 42     # Invokes __setattr__(f,"x",42)
f.y = 93     # Invokes __setattr__(f,"y",93)
del f.y      # Invokes __delattr__(f,"y")
Whenever an attribute is accessed, the __getattribute__() method is always invoked. If the attribute is located, it is returned. Otherwise, the __getattr__() method is invoked. The default behavior of __getattr__() is to raise an AttributeError exception. The __setattr__() method is always invoked when setting an attribute, and the __delattr__() method is always invoked when deleting an attribute.
A subtle aspect of attribute access concerns a special kind of attribute known as a descriptor. A descriptor is an object that implements one or more of the methods in Table 3.11.
Table 3.11 Special Methods for Descriptor Attributes
Method | Description
__get__(self,instance,owner) | Returns an attribute value or raises AttributeError
__set__(self,instance,value) | Sets the attribute to value
__delete__(self,instance) | Deletes the attribute
Essentially, a descriptor attribute knows how to compute, set, and delete its own value whenever it is accessed. Typically, it is used to provide advanced features of classes such as static methods and properties. For example:
import math

class SimpleProperty(object):
    def __init__(self,fget,fset):
        self.fget = fget
        self.fset = fset
    def __get__(self,instance,cls):
        return self.fget(instance)           # Calls instance.fget()
    def __set__(self,instance,value):
        return self.fset(instance,value)     # Calls instance.fset(value)

class Circle(object):
    def __init__(self,radius):
        self.radius = radius
    def getArea(self):
        return math.pi*self.radius**2
    def setArea(self,area):
        self.radius = math.sqrt(area/math.pi)
    area = SimpleProperty(getArea,setArea)
In this example, the class SimpleProperty defines a descriptor in which two functions, fget and fset, are supplied by the user to get and set the value of an attribute (note that a more advanced version of this is already provided using the property() function described in Chapter 7). In the Circle class that follows, these functions are used to create a descriptor attribute called area. In subsequent code, the area attribute is accessed transparently.
c = Circle(10)
a = c.area         # Implicitly calls c.getArea()
c.area = 10.0      # Implicitly calls c.setArea(10.0)
Underneath the covers, access to the attribute c.area is being translated into an operation such as Circle.__dict__['area'].__get__(c,Circle).
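It is important to emphasize that descriptors only take effect at the class level. A descriptor object that is stored on a per-instance basis, for example by assigning it to an attribute inside __init__() or another method, is treated as an ordinary value: its __get__(), __set__(), and __delete__() methods are not invoked when the attribute is accessed.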
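The difference between class-level and instance-level placement can be sketched as follows. SimpleProperty and Circle repeat the example above so that the block is self-contained; the diameter attribute is a hypothetical addition used only to show what happens when a descriptor object is stored in an instance.

```python
import math

class SimpleProperty(object):
    def __init__(self, fget, fset):
        self.fget = fget
        self.fset = fset
    def __get__(self, instance, cls):
        return self.fget(instance)
    def __set__(self, instance, value):
        return self.fset(instance, value)

class Circle(object):
    def __init__(self, radius):
        self.radius = radius
        # Stored in the instance: NOT treated as a descriptor
        self.diameter = SimpleProperty(lambda s: 2 * s.radius, None)
    def getArea(self):
        return math.pi * self.radius ** 2
    def setArea(self, area):
        self.radius = math.sqrt(area / math.pi)
    # Stored in the class: treated as a descriptor
    area = SimpleProperty(getArea, setArea)

c = Circle(10)
a = c.area              # Calls getArea() through __get__()
c.area = 4 * math.pi    # Calls setArea() through __set__(); radius becomes 2.0
d = c.diameter          # Just the SimpleProperty object; __get__() is not called
```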
Sequence and Mapping Methods
The methods in Table 3.12 are used by objects that want to emulate sequence and mapping objects.
Table 3.12 Methods for Sequences and Mappings
Method | Description
__len__(self) | Returns the length of self
__getitem__(self, key) | Returns self[key]
__setitem__(self, key, value) | Sets self[key] = value
__delitem__(self, key) | Deletes self[key]
__getslice__(self,i,j) | Returns self[i:j]
__setslice__(self,i,j,s) | Sets self[i:j] = s
__delslice__(self,i,j) | Deletes self[i:j]
__contains__(self,obj) | Returns True if obj is in self; otherwise, returns False
Here's an example:
a = [1,2,3,4,5,6]
len(a)                  # __len__(a)
x = a[2]                # __getitem__(a,2)
a[1] = 7                # __setitem__(a,1,7)
del a[2]                # __delitem__(a,2)
x = a[1:5]              # __getslice__(a,1,5)
a[1:3] = [10,11,12]     # __setslice__(a,1,3,[10,11,12])
del a[1:4]              # __delslice__(a,1,4)
The __len__() method is called by the built-in len() function to return a nonnegative length. This method also determines truth values unless the __nonzero__() method has also been defined.
For manipulating individual items, the __getitem__() method can return an item by key value. The key can be any Python object, but is typically an integer for sequences. The __setitem__() method assigns a value to an element. The __delitem__() method is invoked whenever the del operation is applied to a single element.
The slicing methods support the slicing operator s[i:j]. The __getslice__() method returns a slice, which is normally the same type of sequence as the original object. The indices i and j must be integers, but their interpretation is up to the method. Missing values for i and j are replaced with 0 and sys.maxint, respectively. The __setslice__() method assigns values to a slice. Similarly, __delslice__() deletes all the elements in a slice.
The __contains__() method is used to implement the in operator.
In addition to implementing the methods just described, sequences and mappings implement a number of mathematical methods, including __add__(), __radd__(), __mul__(), and __rmul__() to support concatenation and sequence replication. These methods are described shortly.
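A user-defined container needs only the item-level methods to support indexing, len(), and the in operator. The following is a minimal sketch; the class name Deck and its _items attribute are hypothetical, and the slicing methods are deliberately left out.

```python
class Deck(object):
    def __init__(self, items):
        self._items = list(items)     # Underlying storage

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        return self._items[key]

    def __setitem__(self, key, value):
        self._items[key] = value

    def __delitem__(self, key):
        del self._items[key]

    def __contains__(self, obj):
        return obj in self._items

d = Deck([1, 2, 3, 4, 5, 6])
n = len(d)                        # __len__()            -> 6
x = d[2]                          # __getitem__(2)       -> 3
d[1] = 7                          # __setitem__(1, 7)
del d[2]                          # __delitem__(2); items are now [1, 7, 4, 5, 6]
found = 7 in d                    # __contains__(7)      -> True
empty_is_false = not Deck([])     # Truth testing falls back to __len__()
```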
Finally, Python supports an extended slicing operation that's useful for working with multidimensional data structures such as matrices and arrays. Syntactically, you specify an extended slice as follows:
a = m[0:100:10]             # Strided slice (stride=10)
b = m[1:10, 3:20]           # Multidimensional slice
c = m[0:100:10, 50:75:5]    # Multiple dimensions with strides
m[0:5, 5:10] = n            # extended slice assignment
del m[:10, 15:]             # extended slice deletion
The general format for each dimension of an extended slice is i:j[:stride], where stride is optional. As with ordinary slices, you can omit the starting or ending values for each part of a slice. In addition, a special object known as the Ellipsis and written as ... is available to denote any number of trailing or leading dimensions in an extended slice:
a = m[..., 10:20]           # extended slice access with Ellipsis
m[10:20, ...] = n
When using extended slices, the __getitem__(), __setitem__(), and __delitem__() methods implement access, modification, and deletion, respectively. However, instead of an integer, the value passed to these methods is a single slice object (for a one-dimensional strided slice such as m[0:100:10]) or, when several dimensions are given, a tuple containing one or more slice objects and at most one instance of the Ellipsis type. For example,
a = m[0:10, 0:100:5, ...]
invokes __getitem__() as follows:
a = __getitem__(m, (slice(0,10,None), slice(0,100,5), Ellipsis))
Python strings, tuples, and lists currently provide some support for extended slices, which is described in Chapter 4. Special-purpose extensions to Python, especially those with a scientific flavor, may provide new types and objects with advanced support for extended slicing operations.
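One way to see exactly what an extended slice passes to __getitem__() is to define a class that simply returns its key. The class name Grid is hypothetical and serves only as a probe.

```python
class Grid(object):
    def __getitem__(self, key):
        # Return the key unchanged so it can be inspected
        return key

m = Grid()
k1 = m[0:100:10]              # A single slice object
k2 = m[1:10, 3:20]            # A tuple of two slice objects
k3 = m[0:10, 0:100:5, ...]    # A tuple ending with Ellipsis
```

A real multidimensional type would examine each element of the tuple (its start, stop, and step attributes) to decide which items to return.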
Iteration
If an object, obj, supports iteration, it must provide a method, obj.__iter__(), that returns an iterator object. The iterator object iter, in turn, must implement a single method, iter.next(), that returns the next object or raises StopIteration to signal the end of iteration. Both of these methods are used by the implementation of the for statement as well as other operations that implicitly perform iteration. For example, the statement for x in s is carried out by performing steps equivalent to the following:
_iter = s.__iter__()
while 1:
    try:
        x = _iter.next()
    except StopIteration:
        break
    # Do statements in body of for loop
    ...
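The protocol can be sketched with a pair of small classes. The names Countdown and CountdownIter are hypothetical. The __next__ alias is an addition not described in the text; it is included on the assumption that the code may also be run on interpreters that look up the iterator method under that name.

```python
class Countdown(object):
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call returns a fresh, independent iterator
        return CountdownIter(self.start)

class CountdownIter(object):
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # By convention, an iterator returns itself here
        return self

    def next(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

    __next__ = next    # Assumption: alias for interpreters that call __next__()

result = []
for x in Countdown(3):
    result.append(x)            # Collects 3, 2, 1

again = list(Countdown(2))      # The object can be iterated more than once
```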
Mathematical Operations
Table 3.13 lists special methods that objects must implement to emulate numbers. Mathematical operations are always evaluated from left to right; when an expression such as x + y appears, the interpreter tries to invoke the method x.__add__(y). The special methods beginning with r support operations with reversed operands. These are invoked only if the left operand doesn’t implement the specified operation. For example, if x in x + y doesn’t support the __add__() method, the interpreter tries to invoke the method y.__radd__(x).
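The interplay between a normal method and its reversed counterpart can be sketched as follows. The class name Meters is hypothetical. Returning NotImplemented is how a method signals that it does not support the given operand, which lets the interpreter try the other operand.

```python
class Meters(object):
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        return NotImplemented      # Operand type not supported

    def __radd__(self, other):
        # Addition is commutative here, so reuse __add__()
        return self.__add__(other)

a = Meters(2) + Meters(3)    # Meters.__add__()
b = Meters(2) + 1            # Meters.__add__() with an integer
c = 1 + Meters(2)            # The integer cannot handle Meters,
                             # so Meters.__radd__(1) is invoked
```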
Table 3.13 Methods for Mathematical Operations
Method | Result
__add__(self,other) | self + other
__sub__(self,other) | self - other
__mul__(self,other) | self * other
__div__(self,other) | self / other
__truediv__(self,other) | self / other (future)
__floordiv__(self,other) | self // other
__mod__(self,other) | self % other
__divmod__(self,other) | divmod(self,other)
__pow__(self,other [,modulo]) | self ** other, pow(self, other, modulo)
__lshift__(self,other) | self << other
__rshift__(self,other) | self >> other
__and__(self,other) | self & other
__or__(self,other) | self | other
__xor__(self,other) | self ^ other
__radd__(self,other) | other + self
__rsub__(self,other) | other - self
__rmul__(self,other) | other * self
__rdiv__(self,other) | other / self
__rtruediv__(self,other) | other / self (future)
__rfloordiv__(self,other) | other // self
__rmod__(self,other) | other % self
__rdivmod__(self,other) | divmod(other,self)
__rpow__(self,other) | other ** self
__rlshift__(self,other) | other << self
__rrshift__(self,other) | other >> self
__rand__(self,other) | other & self
__ror__(self,other) | other | self
__rxor__(self,other) | other ^ self
__iadd__(self,other) | self += other
__isub__(self,other) | self -= other
__imul__(self,other) | self *= other
__idiv__(self,other) | self /= other
__itruediv__(self,other) | self /= other (future)
__ifloordiv__(self,other) | self //= other
__imod__(self,other) | self %= other
__ipow__(self,other) | self **= other
__iand__(self,other) | self &= other
__ior__(self,other) | self |= other
__ixor__(self,other) | self ^= other
__ilshift__(self,other) | self <<= other
__irshift__(self,other) | self >>= other
__neg__(self) | -self
__pos__(self) | +self
__abs__(self) | abs(self)
__invert__(self) | ~self
__int__(self) | int(self)
__long__(self) | long(self)
__float__(self) | float(self)
__complex__(self) | complex(self)
__oct__(self) | oct(self)
__hex__(self) | hex(self)
__coerce__(self,other) | Type coercion
The methods __iadd__(), __isub__(), and so forth are used to support in-place arithmetic operators such as a+=b and a-=b (also known as augmented assignment). A distinction is made between these operators and the standard arithmetic methods because the implementation of the in-place operators might be able to provide certain customizations such as performance optimizations. For instance, if the self parameter is not shared, it might be possible to modify its value in place without having to allocate a newly created object for the result.
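The distinction can be sketched with a container that extends itself in place for += but builds a new object for +. The class name Bag is hypothetical. Note that __iadd__() must return the result, which is normally self.

```python
class Bag(object):
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Ordinary addition allocates a new object
        return Bag(self.items + other.items)

    def __iadd__(self, other):
        # In-place addition modifies self and returns it
        self.items.extend(other.items)
        return self

a = Bag([1, 2])
alias = a
a += Bag([3])          # __iadd__(): a is still the same object as alias

b = Bag([1, 2])
c = b + Bag([3])       # __add__(): c is a new object, b is unchanged
```

Because the in-place version modifies the existing object, every other reference to it (such as alias) sees the change.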
The three flavors of division operators, __div__(), __truediv__(), and __floordiv__(), are used to implement true division (/) and truncating division (//) operations. The separation of division into two types of operators is a relatively recent change to Python that was started in Python 2.2, but which has far-reaching effects. As of this writing, the default behavior of Python is to map the / operator to __div__(). In the future, it will be remapped to __truediv__(). This latter behavior can currently be enabled as an optional feature by including the statement from __future__ import division in a program.
The conversion methods __int__(), __long__(), __float__(), and __complex__() convert an object into one of the four built-in numerical types. The __oct__() and __hex__() methods return strings representing the octal and hexadecimal values of an object, respectively.
The __coerce__(x,y) method is used in conjunction with mixed-mode numerical arithmetic. This method returns either a 2-tuple containing the values of x and y converted to a common numerical type, or NotImplemented (or None) if no such conversion is possible. To evaluate the operation x op y, where op is an operation such as +, the following rules are applied, in order:
1. If x has a __coerce__() method, replace x and y with the values returned by x.__coerce__(y). If None is returned, skip to step 3.
2. If x has a method __op__(), return x.__op__(y). Otherwise, restore x and y to their original values and continue.
3. If y has a __coerce__() method, replace x and y with the values returned by y.__coerce__(x). If None is returned, raise an exception.
4. If y has a method __rop__(), return y.__rop__(x). Otherwise, raise an exception.
Although strings define a few arithmetic operations, the __coerce__() method is not used in mixed-string operations involving standard and Unicode strings.
The interpreter supports only a limited number of mixed-type operations involving the built-in types, in particular the following:
If x is a string, x % y invokes the string-formatting operation, regardless of the type of y.
If x is a sequence, x + y invokes sequence concatenation.
If either x or y is a sequence and the other operand is an integer, x * y invokes sequence repetition.
Comparison Operations
Table 3.14 lists special methods that objects can implement to provide individualized versions of the relational operators (<, >, <=, >=, ==, !=). These are known as rich comparisons. Each of these functions takes two arguments and is allowed to return any kind of object, including a Boolean value, a list, or any other Python type. For instance, a numerical package might use this to perform an element-wise comparison of two matrices, returning a matrix with the results. If a comparison can’t be made, these functions may also raise an exception.
Table 3.14 Methods for Comparisons
Method | Result
__lt__(self,other) | self < other
__le__(self,other) | self <= other
__gt__(self,other) | self > other
__ge__(self,other) | self >= other
__eq__(self,other) | self == other
__ne__(self,other) | self != other
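The element-wise comparison mentioned above can be sketched with a small vector class whose comparison methods return lists rather than Boolean values. The class name Vec is hypothetical, and only two of the six methods are shown.

```python
class Vec(object):
    def __init__(self, *values):
        self.values = list(values)

    def __lt__(self, other):
        # One result per pair of elements
        return [a < b for a, b in zip(self.values, other.values)]

    def __eq__(self, other):
        return [a == b for a, b in zip(self.values, other.values)]

u = Vec(1, 5, 3)
v = Vec(2, 4, 3)
less = u < v       # [True, False, False]
same = u == v      # [False, False, True]
```

Because the results are lists, they should not be used directly in an if statement; any nonempty list is true regardless of its contents.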
Callable Objects
Finally, an object can emulate a function by providing the __call__(self [,*args [, **kwargs]]) method. If an object, x, provides this method, it can be invoked like a function. That is, x(arg1, arg2, ...) invokes x.__call__(arg1, arg2, ...).
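A callable object is useful when a function needs to carry state. The following is a minimal sketch; the class name Scale and its factor attribute are hypothetical.

```python
class Scale(object):
    def __init__(self, factor):
        self.factor = factor      # State remembered between calls

    def __call__(self, x, offset=0):
        return self.factor * x + offset

double = Scale(2)
y = double(5)                # Invokes double.__call__(5)            -> 10
z = double(5, offset=1)      # Keyword arguments are passed through  -> 11
w = double.__call__(5)       # The explicit form of the first call   -> 10
ok = callable(double)        # True
```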