Effective Python: 4 Best Practices for Function Arguments
Item 18: Reduce Visual Noise with Variable Positional Arguments
Accepting optional positional arguments (often called star args in reference to the conventional name for the parameter, *args) can make a function call clearer and remove visual noise.
For example, say you want to log some debug information. With a fixed number of arguments, you would need a function that takes a message and a list of values.
def log(message, values):
    if not values:
        print(message)
    else:
        values_str = ', '.join(str(x) for x in values)
        print('%s: %s' % (message, values_str))

log('My numbers are', [1, 2])
log('Hi there', [])

>>>
My numbers are: 1, 2
Hi there
Having to pass an empty list when you have no values to log is cumbersome and noisy. It’d be better to leave out the second argument entirely. You can do this in Python by prefixing the last positional parameter name with *. The first parameter for the log message is required, whereas any number of subsequent positional arguments are optional. The function body doesn’t need to change, only the callers do.
def log(message, *values):  # The only difference
    if not values:
        print(message)
    else:
        values_str = ', '.join(str(x) for x in values)
        print('%s: %s' % (message, values_str))

log('My numbers are', 1, 2)
log('Hi there')  # Much better

>>>
My numbers are: 1, 2
Hi there
If you already have a list and want to call a variable argument function like log, you can do this by using the * operator. This instructs Python to pass items from the sequence as positional arguments.
favorites = [7, 33, 99]
log('Favorite colors', *favorites)

>>>
Favorite colors: 7, 33, 99
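As a small sketch of the same idea, the * operator accepts any iterable, not only lists, and it can be mixed with ordinary positional arguments. The function name describe is hypothetical; it mirrors log but returns the string instead of printing it so the result is easy to check.

```python
def describe(message, *values):
    """Format a message followed by any number of values."""
    if not values:
        return message
    values_str = ', '.join(str(x) for x in values)
    return '%s: %s' % (message, values_str)


favorites = [7, 33, 99]

from_list = describe('Favorite numbers', *favorites)
from_tuple = describe('Favorite numbers', *(7, 33, 99))
# Explicit positional arguments can precede the unpacked sequence.
mixed = describe('Favorite numbers', 1, *favorites)

print(from_list)
print(mixed)
```

In each call the function receives the values as one tuple, regardless of the type of the sequence that was unpacked.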
There are two problems with accepting a variable number of positional arguments.
The first issue is that the variable arguments are always turned into a tuple before they are passed to your function. This means that if the caller of your function uses the * operator on a generator, it will be iterated until it’s exhausted. The resulting tuple will include every value from the generator, which could consume a lot of memory and cause your program to crash.
def my_generator():
    for i in range(10):
        yield i

def my_func(*args):
    print(args)

it = my_generator()
my_func(*it)

>>>
(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
Functions that accept *args are best for situations where you know the number of inputs in the argument list will be reasonably small. It’s ideal for function calls that pass many literals or variable names together. It’s primarily for the convenience of the programmer and the readability of the code.
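One way to sidestep the memory concern is to accept a single iterable parameter instead of *args. The sketch below assumes the function only needs to walk its inputs once; all of the names are hypothetical. It records how many items each approach pulls from a generator.

```python
def count_up_to(limit, consumed):
    """Yield 0..limit-1, recording each value as it is produced."""
    for i in range(limit):
        consumed.append(i)
        yield i


def first_with_star_args(*args):
    # args is already a fully built tuple by the time the body runs.
    return args[0]


def first_from_iterable(values):
    # Pulls a single item; the rest of the generator is never produced.
    return next(iter(values))


eager_log = []
first_with_star_args(*count_up_to(1000, eager_log))

lazy_log = []
first_from_iterable(count_up_to(1000, lazy_log))

print(len(eager_log), len(lazy_log))
```

The *args version exhausts the generator before the function body starts, while the iterable version consumes only what it asks for.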
The second issue with *args is that you can’t add new positional arguments to your function in the future without migrating every caller. If you try to add a positional argument in the front of the argument list, existing callers will subtly break if they aren’t updated.
def log(sequence, message, *values):
    if not values:
        print('%s: %s' % (sequence, message))
    else:
        values_str = ', '.join(str(x) for x in values)
        print('%s: %s: %s' % (sequence, message, values_str))

log(1, 'Favorites', 7, 33)      # New usage is OK
log('Favorite numbers', 7, 33)  # Old usage breaks

>>>
1: Favorites: 7, 33
Favorite numbers: 7: 33
The problem here is that the second call to log used 7 as the message parameter because a sequence argument wasn’t given. Bugs like this are hard to track down because the code still runs without raising any exceptions. To avoid this possibility entirely, you should use keyword-only arguments when you want to extend functions that accept *args (see Item 21: “Enforce Clarity with Keyword-Only Arguments”).
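A minimal sketch of that suggestion, assuming Python 3 syntax: the new sequence parameter is declared after *values, so it can only be supplied by keyword and existing positional callers keep working. Returning the string instead of printing it, and defaulting sequence to None, are choices made for this illustration only.

```python
def log(message, *values, sequence=None):
    """Return a log line; sequence can only be passed by keyword."""
    parts = []
    if sequence is not None:
        parts.append(str(sequence))
    parts.append(str(message))
    if values:
        parts.append(', '.join(str(x) for x in values))
    return ': '.join(parts)


old_call = log('Favorite numbers', 7, 33)       # Existing caller, unchanged
new_call = log('Favorites', 7, 33, sequence=1)  # New caller opts in

print(old_call)
print(new_call)
```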
Things to Remember
- Functions can accept a variable number of positional arguments by using *args in the def statement.
- You can use the items from a sequence as the positional arguments for a function with the * operator.
- Using the * operator with a generator may cause your program to run out of memory and crash.
- Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
Item 19: Provide Optional Behavior with Keyword Arguments
As in most other programming languages, you can pass arguments by position when calling a function in Python.
def remainder(number, divisor):
    return number % divisor

assert remainder(20, 7) == 6
All positional arguments to Python functions can also be passed by keyword, where the name of the argument is used in an assignment within the parentheses of a function call. The keyword arguments can be passed in any order as long as all of the required positional arguments are specified. You can mix and match keyword and positional arguments. These calls are equivalent:
remainder(20, 7)
remainder(20, divisor=7)
remainder(number=20, divisor=7)
remainder(divisor=7, number=20)
Positional arguments must be specified before keyword arguments.
remainder(number=20, 7)

>>>
SyntaxError: non-keyword arg after keyword arg
Each argument can only be specified once.
remainder(20, number=7)

>>>
TypeError: remainder() got multiple values for argument 'number'
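The two failures above surface at different times, which the following sketch tries to make visible. It uses the built-in compile function to show that the misordered call is rejected before anything runs, whereas the duplicated argument is only caught when the call is made. The variable names are hypothetical, and the exact wording of the SyntaxError message varies between Python versions, so it is not checked here.

```python
def remainder(number, divisor):
    return number % divisor


# A positional argument after a keyword argument is rejected when the
# source is compiled, so the call itself never runs.
try:
    compile('remainder(number=20, 7)', '<example>', 'eval')
except SyntaxError:
    rejected_at_compile_time = True
else:
    rejected_at_compile_time = False

# A duplicated argument is only detected when the call is made.
try:
    remainder(20, number=7)
except TypeError as error:
    duplicate_message = str(error)

print(rejected_at_compile_time)
print(duplicate_message)
```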
The flexibility of keyword arguments provides three significant benefits.
The first advantage is that keyword arguments make the function call clearer to new readers of the code. With the call remainder(20, 7), it’s not evident which argument is the number and which is the divisor without looking at the implementation of the remainder function. In the call with keyword arguments, number=20 and divisor=7 make it immediately obvious which parameter is being used for each purpose.
The second benefit of keyword arguments is that they can have default values specified in the function definition. This allows a function to provide additional capabilities when you need them but lets you accept the default behavior most of the time. This can eliminate repetitive code and reduce noise.
For example, say you want to compute the rate of fluid flowing into a vat. If the vat is also on a scale, then you could use the difference between two weight measurements at two different times to determine the flow rate.
def flow_rate(weight_diff, time_diff):
    return weight_diff / time_diff

weight_diff = 0.5
time_diff = 3
flow = flow_rate(weight_diff, time_diff)
print('%.3f kg per second' % flow)

>>>
0.167 kg per second
In the typical case, it’s useful to know the flow rate in kilograms per second. Other times, it’d be helpful to use the last sensor measurements to approximate larger time scales, like hours or days. You can provide this behavior in the same function by adding an argument for the time period scaling factor.
def flow_rate(weight_diff, time_diff, period):
    return (weight_diff / time_diff) * period
The problem is that now you need to specify the period argument every time you call the function, even in the common case of flow rate per second (where the period is 1).
flow_per_second = flow_rate(weight_diff, time_diff, 1)
To make this less noisy, I can give the period argument a default value.
def flow_rate(weight_diff, time_diff, period=1):
    return (weight_diff / time_diff) * period
The period argument is now optional.
flow_per_second = flow_rate(weight_diff, time_diff)
flow_per_hour = flow_rate(weight_diff, time_diff, period=3600)
This works well for simple default values. It gets tricky for complex default values (see Item 20: “Use None and Docstrings to Specify Dynamic Default Arguments”).
The third reason to use keyword arguments is that they provide a powerful way to extend a function’s parameters while remaining backwards compatible with existing callers. This lets you provide additional functionality without having to migrate a lot of code, reducing the chance of introducing bugs.
For example, say you want to extend the flow_rate function above to calculate flow rates in weight units besides kilograms. You can do this by adding a new optional parameter that provides a conversion rate to your preferred measurement units.
def flow_rate(weight_diff, time_diff, period=1, units_per_kg=1):
    return ((weight_diff * units_per_kg) / time_diff) * period
The default argument value for units_per_kg is 1, which makes the returned weight units remain as kilograms. This means that all existing callers will see no change in behavior. New callers to flow_rate can specify the new keyword argument to see the new behavior.
pounds_per_hour = flow_rate(weight_diff, time_diff,
                            period=3600, units_per_kg=2.2)
The only problem with this approach is that optional keyword arguments like period and units_per_kg may still be specified as positional arguments.
pounds_per_hour = flow_rate(weight_diff, time_diff, 3600, 2.2)
Supplying optional arguments positionally can be confusing because it isn’t clear what the values 3600 and 2.2 correspond to. The best practice is to always specify optional arguments using the keyword names and never pass them as positional arguments.
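If you would rather enforce this practice than rely on convention, one option in Python 3 is the keyword-only syntax covered in Item 21. The sketch below is an assumption about how flow_rate could be adapted, not a definition taken from this item; the positional_rejected flag is a hypothetical name used only to record the outcome.

```python
def flow_rate(weight_diff, time_diff, *, period=1, units_per_kg=1):
    """Flow rate; period and units_per_kg must be passed by keyword."""
    return ((weight_diff * units_per_kg) / time_diff) * period


weight_diff = 0.5
time_diff = 3

pounds_per_hour = flow_rate(weight_diff, time_diff,
                            period=3600, units_per_kg=2.2)

# Passing the optional values by position is now an error.
try:
    flow_rate(weight_diff, time_diff, 3600, 2.2)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

print('%.1f pounds per hour' % pounds_per_hour)
print(positional_rejected)
```

Callers that already pass period and units_per_kg by keyword are unaffected by the change.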
Things to Remember
- Function arguments can be specified by position or by keyword.
- Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
- Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
- Optional keyword arguments should always be passed by keyword instead of by position.
Item 20: Use None and Docstrings to Specify Dynamic Default Arguments
Sometimes you need to use a non-static type as a keyword argument’s default value. For example, say you want to print logging messages that are marked with the time of the logged event. In the default case, you want the message to include the time when the function was called. You might try the following approach, assuming the default arguments are reevaluated each time the function is called.
from datetime import datetime
from time import sleep

def log(message, when=datetime.now()):
    print('%s: %s' % (when, message))

log('Hi there!')
sleep(0.1)
log('Hi again!')

>>>
2014-11-15 21:10:10.371432: Hi there!
2014-11-15 21:10:10.371432: Hi again!
The timestamps are the same because datetime.now is only executed a single time: when the function is defined. Default argument values are evaluated only once per module load, which usually happens when a program starts up. After the module containing this code is loaded, the datetime.now default argument will never be evaluated again.
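The single evaluation can be observed directly. In this sketch, make_default and show are hypothetical names; make_default records every time it runs, so the length of the list shows how often the default expression was evaluated.

```python
evaluations = []


def make_default():
    """Record each time a default value is computed."""
    evaluations.append('evaluated')
    return len(evaluations)


def show(value=make_default()):  # make_default() runs here, at definition
    return value


first = show()
second = show()

print(first, second, len(evaluations))
```

Both calls return the value computed when the def statement ran, and make_default is never invoked again.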
The convention for achieving the desired result in Python is to provide a default value of None and to document the actual behavior in the docstring (see Item 49: “Write Docstrings for Every Function, Class, and Module”). When your code sees an argument value of None, you allocate the default value accordingly.
def log(message, when=None):
    """Log a message with a timestamp.

    Args:
        message: Message to print.
        when: datetime of when the message occurred.
            Defaults to the present time.
    """
    when = datetime.now() if when is None else when
    print('%s: %s' % (when, message))
Now the timestamps will be different.
log('Hi there!')
sleep(0.1)
log('Hi again!')

>>>
2014-11-15 21:10:10.472303: Hi there!
2014-11-15 21:10:10.573395: Hi again!
Using None for default argument values is especially important when the arguments are mutable. For example, say you want to load a value encoded as JSON data. If decoding the data fails, you want an empty dictionary to be returned by default. You might try this approach.
import json

def decode(data, default={}):
    try:
        return json.loads(data)
    except ValueError:
        return default
The problem here is the same as the datetime.now example above. The dictionary specified for default will be shared by all calls to decode because default argument values are only evaluated once (at module load time). This can cause extremely surprising behavior.
foo = decode('bad data')
foo['stuff'] = 5
bar = decode('also bad')
bar['meep'] = 1
print('Foo:', foo)
print('Bar:', bar)

>>>
Foo: {'stuff': 5, 'meep': 1}
Bar: {'stuff': 5, 'meep': 1}
You’d expect two different dictionaries, each with a single key and value. But modifying one seems to also modify the other. The culprit is that foo and bar are both equal to the default parameter. They are the same dictionary object.
assert foo is bar
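You can also see the shared object by inspecting the function itself. This sketch repeats the buggy decode definition so that it runs on its own, then reads the function's __defaults__ attribute, which holds the default values stored when the function was defined. The name stored_default is hypothetical.

```python
import json


def decode(data, default={}):  # Deliberately buggy: shared mutable default
    try:
        return json.loads(data)
    except ValueError:
        return default


foo = decode('bad data')
foo['stuff'] = 5

# The dictionary stored on the function is the one the caller modified.
stored_default = decode.__defaults__[0]
print(stored_default)
```

Every later call that falls back to the default will return this same, already modified dictionary.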
The fix is to set the keyword argument default value to None and then document the behavior in the function’s docstring.
def decode(data, default=None):
    """Load JSON data from a string.

    Args:
        data: JSON data to decode.
        default: Value to return if decoding fails.
            Defaults to an empty dictionary.
    """
    if default is None:
        default = {}
    try:
        return json.loads(data)
    except ValueError:
        return default
Now, running the same test code as before produces the expected result.
foo = decode('bad data')
foo['stuff'] = 5
bar = decode('also bad')
bar['meep'] = 1
print('Foo:', foo)
print('Bar:', bar)

>>>
Foo: {'stuff': 5}
Bar: {'meep': 1}
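As a further check, here is a self-contained sketch of the corrected function exercised three ways: falling back to a fresh dictionary, falling back to a caller-supplied value, and decoding valid input. The fallback value and variable names are made up for illustration.

```python
import json


def decode(data, default=None):
    """Load JSON data from a string, returning default on failure."""
    if default is None:
        default = {}  # A new dictionary is created on every call
    try:
        return json.loads(data)
    except ValueError:
        return default


foo = decode('bad data')
bar = decode('also bad')
fallback = decode('still bad', default={'status': 'unknown'})
parsed = decode('{"stuff": 5}')

print(foo is bar)
print(fallback)
print(parsed)
```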
Things to Remember
- Default arguments are only evaluated once: during function definition at module load time. This can cause odd behaviors for dynamic values (like {} or []).
- Use None as the default value for keyword arguments that have a dynamic value. Document the actual default behavior in the function’s docstring.
Item 21: Enforce Clarity with Keyword-Only Arguments
Passing arguments by keyword is a powerful feature of Python functions (see Item 19: “Provide Optional Behavior with Keyword Arguments”). The flexibility of keyword arguments enables you to write code that will be clear for your use cases.
For example, say you want to divide one number by another but be very careful about special cases. Sometimes you want to ignore ZeroDivisionError exceptions and return infinity instead. Other times, you want to ignore OverflowError exceptions and return zero instead.
def safe_division(number, divisor, ignore_overflow, ignore_zero_division):
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        else:
            raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        else:
            raise
Using this function is straightforward. This call will ignore the float overflow from division and will return zero.
# The float dividend forces float division, so the huge divisor overflows.
result = safe_division(1.0, 10**500, True, False)
This call will ignore the error from dividing by zero and will return infinity.
result = safe_division(1, 0, False, True)
The problem is that it’s easy to confuse the position of the two Boolean arguments that control the exception-ignoring behavior. This can easily cause bugs that are hard to track down. One way to improve the readability of this code is to use keyword arguments. By default, the function can be overly cautious and can always re-raise exceptions.
def safe_division_b(number, divisor,
                    ignore_overflow=False,
                    ignore_zero_division=False):
    # ...
Then callers can use keyword arguments to specify which of the ignore flags they want to flip for specific operations, overriding the default behavior.
safe_division_b(1.0, 10**500, ignore_overflow=True)
safe_division_b(1, 0, ignore_zero_division=True)
The problem is, since these keyword arguments are optional behavior, there’s nothing forcing callers of your functions to use keyword arguments for clarity. Even with the new definition of safe_division_b, you can still call it the old way with positional arguments.
safe_division_b(1.0, 10**500, True, False)
With complex functions like this, it’s better to require that callers are clear about their intentions. In Python 3, you can demand clarity by defining your functions with keyword-only arguments. These arguments can only be supplied by keyword, never by position.
Here, I redefine the safe_division function to accept keyword-only arguments. The * symbol in the argument list indicates the end of positional arguments and the beginning of keyword-only arguments.
def safe_division_c(number, divisor, *,
                    ignore_overflow=False,
                    ignore_zero_division=False):
    # ...
Now, calling the function with positional arguments for the keyword arguments won’t work.
safe_division_c(1.0, 10**500, True, False)

>>>
TypeError: safe_division_c() takes 2 positional arguments but 4 were given
Keyword arguments and their default values work as expected.
safe_division_c(1, 0, ignore_zero_division=True)  # OK

try:
    safe_division_c(1, 0)
except ZeroDivisionError:
    pass  # Expected
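For readers who want to run the keyword-only version end to end, here is a sketch that fills in the elided body. It assumes the body is the same as the original safe_division, which the text implies but does not show. A float dividend is used in the overflow case so that Python has to convert the huge divisor to a float.

```python
def safe_division_c(number, divisor, *,
                    ignore_overflow=False,
                    ignore_zero_division=False):
    # Body assumed to match the original safe_division function.
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_division:
            return float('inf')
        raise


infinite = safe_division_c(1, 0, ignore_zero_division=True)
zero = safe_division_c(1.0, 10**500, ignore_overflow=True)

# The flags cannot be passed by position.
try:
    safe_division_c(1.0, 10**500, True, False)
except TypeError as error:
    positional_error = str(error)

print(infinite, zero)
print(positional_error)
```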
Keyword-Only Arguments in Python 2
Unfortunately, Python 2 doesn’t have explicit syntax for specifying keyword-only arguments as Python 3 does. But you can achieve the same behavior of raising a TypeError for invalid function calls by using the ** operator in argument lists. The ** operator is similar to the * operator (see Item 18: “Reduce Visual Noise with Variable Positional Arguments”), except that instead of accepting a variable number of positional arguments, it accepts any number of keyword arguments, even when they’re not defined.
# Python 2
def print_args(*args, **kwargs):
    print 'Positional:', args
    print 'Keyword: ', kwargs

print_args(1, 2, foo='bar', stuff='meep')

>>>
Positional: (1, 2)
Keyword:  {'foo': 'bar', 'stuff': 'meep'}
To make safe_division take keyword-only arguments in Python 2, you have the function accept **kwargs. Then you pop keyword arguments that you expect out of the kwargs dictionary, using the pop method’s second argument to specify the default value when the key is missing. Finally, you make sure there are no more keyword arguments left in kwargs to prevent callers from supplying arguments that are invalid.
# Python 2
def safe_division_d(number, divisor, **kwargs):
    ignore_overflow = kwargs.pop('ignore_overflow', False)
    ignore_zero_div = kwargs.pop('ignore_zero_division', False)
    if kwargs:
        raise TypeError('Unexpected **kwargs: %r' % kwargs)
    # ...
Now, you can call the function with or without keyword arguments.
safe_division_d(1, 10)
safe_division_d(1, 0, ignore_zero_division=True)
safe_division_d(1.0, 10**500, ignore_overflow=True)
Trying to pass keyword-only arguments by position won’t work, just like in Python 3.
safe_division_d(1, 0, False, True)

>>>
TypeError: safe_division_d() takes 2 positional arguments but 4 were given
Trying to pass unexpected keyword arguments also won’t work.
safe_division_d(0, 0, unexpected=True)

>>>
TypeError: Unexpected **kwargs: {'unexpected': True}
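The **kwargs pattern is plain Python, so it can be tried under Python 3 as well. The sketch below is written for Python 3 and fills in the elided body on the assumption that it matches the original safe_division. The errors list and the loop that drives the two failing calls are hypothetical scaffolding for the demonstration.

```python
def safe_division_d(number, divisor, **kwargs):
    ignore_overflow = kwargs.pop('ignore_overflow', False)
    ignore_zero_div = kwargs.pop('ignore_zero_division', False)
    if kwargs:
        raise TypeError('Unexpected **kwargs: %r' % kwargs)
    # Body assumed to match the original safe_division function.
    try:
        return number / divisor
    except OverflowError:
        if ignore_overflow:
            return 0
        raise
    except ZeroDivisionError:
        if ignore_zero_div:
            return float('inf')
        raise


errors = []
bad_calls = [
    ((1, 0, False, True), {}),        # Flags passed by position
    ((0, 0), {'unexpected': True}),   # Unknown keyword argument
]
for args, kwargs in bad_calls:
    try:
        safe_division_d(*args, **kwargs)
    except TypeError as error:
        errors.append(str(error))

print(safe_division_d(1, 0, ignore_zero_division=True))
for message in errors:
    print(message)
```

The first error comes from Python's own argument checking, while the second is the one raised manually after the expected keys have been popped.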
Things to Remember
- Keyword arguments make the intention of a function call clearer.
- Use keyword-only arguments to force callers to supply keyword arguments for potentially confusing functions, especially those that accept multiple Boolean flags.
- Python 3 supports explicit syntax for keyword-only arguments in functions.
- Python 2 can emulate keyword-only arguments for functions by using **kwargs and manually raising TypeError exceptions.