- Item 30: Know That Function Arguments Can Be Mutated
- Item 31: Return Dedicated Result Objects Instead of Requiring Function Callers to Unpack More Than Three Variables
- Item 32: Prefer Raising Exceptions to Returning None
- Item 33: Know How Closures Interact with Variable Scope and nonlocal
- Item 34: Reduce Visual Noise with Variable Positional Arguments
- Item 35: Provide Optional Behavior with Keyword Arguments
- Item 36: Use None and Docstrings to Specify Dynamic Default Arguments
- Item 37: Enforce Clarity with Keyword-Only and Positional-Only Arguments
- Item 38: Define Function Decorators with functools.wraps
- Item 39: Prefer functools.partial over lambda Expressions for Glue Functions
Item 33: Know How Closures Interact with Variable Scope and nonlocal
Imagine that I want to sort a list of numbers but prioritize one group of numbers to come first. This pattern is useful when you’re rendering a user interface and want important messages or exceptional events to be displayed before everything else. A common way to do this is to pass a helper function as the key argument to a list’s sort method (see Item 100: “Sort by Complex Criteria Using the key Parameter” for details). The helper’s return value will be used as the value for sorting each item in the list. The helper can check whether the given item is in the important group and can vary the sorting value accordingly:
def sort_priority(values, group):
    def helper(x):
        if x in group:
            return (0, x)
        return (1, x)

    values.sort(key=helper)
This function works for simple inputs:
numbers = [8, 3, 1, 2, 5, 4, 7, 6]
group = {2, 3, 5, 7}
sort_priority(numbers, group)
print(numbers)

>>>
[2, 3, 5, 7, 1, 4, 6, 8]
There are three reasons this function operates as expected:
- Python supports closures: functions that refer to variables from the scope in which they were defined. This is why the helper function is able to access the group argument for the sort_priority function.
- Functions are first-class objects in Python, which means you can refer to them directly, assign them to variables, pass them as arguments to other functions, compare them in expressions and if statements, and so on. This is how the sort method can accept a closure function as the key argument.
- Python has specific rules for comparing sequences (including tuples). It first compares items at index zero; then, if those are equal, it compares items at index one; if they are still equal, it compares items at index two, and so on. This is why the return value from the helper closure causes the sort order to have two distinct groups.
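The tuple comparison rule is easy to check in isolation. The sketch below is only an illustration: priority_key is a hypothetical stand-alone version of the helper closure, with the group hard-coded as a default argument so that the example is self-contained.

```python
# Tuples compare item by item, from left to right.
assert (0, 8) < (1, 1)     # Index 0 decides: 0 < 1
assert (1, 4) < (1, 6)     # Index 0 ties, so index 1 decides
assert (1, 4) < (1, 4, 0)  # All shared indexes tie; the shorter tuple sorts first


def priority_key(x, group=frozenset({2, 3, 5, 7})):
    # Hypothetical stand-in for the helper closure shown earlier
    if x in group:
        return (0, x)
    return (1, x)


keys = sorted(priority_key(x) for x in [8, 3, 1, 2])
print(keys)  # [(0, 2), (0, 3), (1, 1), (1, 8)]
```

Every key that starts with 0 sorts before every key that starts with 1, and the second item orders the numbers within each group.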
It’d be nice if this function returned whether higher-priority items were seen at all so the user interface code could act accordingly. Adding such behavior seems straightforward. There’s already a closure function for deciding which group each number is in. Why not also use the closure to flip a flag when high-priority items are seen? Then, the function could return the flag value after it’s modified by the closure.
Here, I try to do that in a seemingly obvious way:
def sort_priority2(numbers, group):
    found = False  # Flag initial value

    def helper(x):
        if x in group:
            found = True  # Flip the flag
            return (0, x)
        return (1, x)

    numbers.sort(key=helper)
    return found  # Flag final value
I can run the function on the same inputs as before:
found = sort_priority2(numbers, group)
print("Found:", found)
print(numbers)

>>>
Found: False
[2, 3, 5, 7, 1, 4, 6, 8]
The sorted results are correct, which means items from group were definitely found in numbers. However, the found result returned by the function is False when it should be True. How could this happen?
When you reference a variable in an expression, the Python interpreter traverses the nested scopes to resolve the reference in this order:
1. The current function’s scope
2. Any enclosing scopes (such as other containing functions)
3. The scope of the module that contains the code (also called the global scope)
4. The built-in scope (that contains functions like len and str)
If none of these places has defined a variable with the referenced name, then a NameError exception is raised:
foo = does_not_exist * 5

>>>
Traceback ...
NameError: name 'does_not_exist' is not defined
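To make the lookup order concrete, here is a small sketch in which the same name is resolved from different scopes. All of the names (label, outer, inner_local, and so on) are hypothetical and exist only for this illustration.

```python
label = "module"  # Module (global) scope


def outer():
    label = "enclosing"  # Scope of the containing function

    def inner_local():
        label = "local"  # The current function's scope is checked first
        return label

    def inner_enclosing():
        return label  # Not local, so it resolves to outer's variable

    return inner_local(), inner_enclosing()


def reads_module():
    return label  # No local or enclosing match, so the module's is used


def reads_builtin():
    return len("abc")  # len isn't defined in this file; the built-in is used


results = (outer(), reads_module(), reads_builtin())
print(results)  # (('local', 'enclosing'), 'module', 3)
```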
Assigning a value to a variable works differently. If the variable is already defined in the current scope, that name will take on the new value in that scope. If the variable doesn’t exist in the current scope, Python treats the assignment as a variable definition. Critically, the scope of the newly defined variable is the function that contains the assignment, not an enclosing scope with an earlier assignment.
This assignment behavior explains the wrong return value of the sort_priority2 function. The found variable is assigned to True in the helper closure. The closure’s assignment is treated as a new variable definition within the scope of helper, not as an assignment within the scope of sort_priority2:
def sort_priority2(numbers, group):
    found = False  # Scope: 'sort_priority2'

    def helper(x):
        if x in group:
            found = True  # Scope: 'helper' -- Bad!
            return (0, x)
        return (1, x)

    numbers.sort(key=helper)
    return found
This problem is sometimes called the scoping bug because it can be so surprising to newbies. But this behavior is the intended result: It prevents local variables in a function from polluting the containing module. Otherwise, every assignment in a function would put garbage into the global module scope. Not only would that be noise, but the interplay of the resulting global variables could cause obscure bugs.
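A related consequence of the same rule, not covered above but worth knowing: because any assignment makes a name local to the whole function, an augmented assignment such as += in a closure fails outright instead of silently creating a second variable. The count_matches function below is a hypothetical example, not part of the sorting code.

```python
def count_matches(values, group):
    count = 0  # Scope: 'count_matches'

    def helper(x):
        if x in group:
            # '+=' is an assignment, so 'count' is local to helper, and
            # reading it before it has a value raises an exception.
            count += 1
        return x

    for value in values:
        helper(value)
    return count


error_name = None
try:
    count_matches([1, 2, 3], {2})
except UnboundLocalError as e:
    error_name = type(e).__name__

print(error_name)  # UnboundLocalError
```

The exact error message varies between Python versions, so only the exception type is shown here.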
In Python, there is special syntax for assigning to a variable outside of a closure’s scope. The nonlocal statement is used to indicate that scope traversal should happen upon assignment for a specific variable name. The only limit is that nonlocal searches enclosing function scopes only; it won’t traverse up to the module-level scope (to avoid polluting globals).
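The module-level limit is enforced when the code is compiled, not when it runs. The following sketch compiles a hypothetical snippet from a string so that the resulting SyntaxError can be caught and inspected; the names total and increment are made up for the example.

```python
source = """
total = 0  # Module-level name

def increment():
    nonlocal total  # No enclosing function scope defines 'total'
    total += 1
"""

message = None
try:
    compile(source, "<example>", "exec")
except SyntaxError as e:
    message = e.msg

# CPython reports: no binding for nonlocal 'total' found
print(message)
```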
Here, I define the same function again, now using nonlocal:
def sort_priority3(numbers, group):
    found = False

    def helper(x):
        nonlocal found  # Added
        if x in group:
            found = True
            return (0, x)
        return (1, x)

    numbers.sort(key=helper)
    return found
Now the found flag works as expected:
found = sort_priority3(numbers, group)
print("Found:", found)
print(numbers)

>>>
Found: True
[2, 3, 5, 7, 1, 4, 6, 8]
The nonlocal statement makes it clear when data is being assigned out of a closure and into another scope. It’s complementary to the global statement, which indicates that a variable’s assignment should go directly into the module scope.
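For comparison, here is a minimal sketch of the global statement, meant to be run as a top-level script. The names call_count, record_call, and record_call_without_global are hypothetical.

```python
call_count = 0  # Module-level name


def record_call():
    global call_count  # Assignments below target the module scope
    call_count += 1


def record_call_without_global():
    call_count = 100  # Defines a new local; the module's value is untouched
    return call_count


record_call()
record_call()
record_call_without_global()
print(call_count)  # 2
```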
However, much as with the anti-pattern of global variables, I caution against using nonlocal for anything beyond simple functions. The side effects of nonlocal can be hard to follow. It’s especially hard to understand in long functions where the nonlocal statements and assignments to associated variables are far apart.
When your usage of nonlocal starts getting complicated, it’s better to wrap your state in a helper class. Here, I define a class that can be called like a function; it achieves the same result as the nonlocal approach by assigning an object’s attribute during sorting (see Item 55: “Prefer Public Attributes over Private Ones”):
class Sorter:
    def __init__(self, group):
        self.group = group
        self.found = False

    def __call__(self, x):
        if x in self.group:
            self.found = True
            return (0, x)
        return (1, x)
It’s a little longer than before, but it’s much easier to reason about and extend if needed (see Item 48: “Accept Functions Instead of Classes for Simple Interfaces” for details on the __call__ special method). I can access the found attribute on the Sorter instance to get the result:
sorter = Sorter(group)
numbers.sort(key=sorter)
print("Found:", sorter.found)
print(numbers)

>>>
Found: True
[2, 3, 5, 7, 1, 4, 6, 8]
Things to Remember
- Closure functions can refer to variables from any of the enclosing scopes in which they were defined.
- By default, closures can’t affect enclosing scopes by assigning variables.
- Use the nonlocal statement to indicate when a closure can modify a variable in its enclosing scopes. Use the global statement to do the same thing for module-level names.
- Avoid using nonlocal statements for anything beyond simple functions.