Closures and Higher-Order Functions
Why Closures Matter for Scientific Code
In scientific computing, you frequently need families of functions that differ only in their parameters:
- A noise generator for different SNR levels
- A regularizer with varying strength
- A kernel function parameterized by bandwidth
You could use classes with __call__, but closures are often simpler,
more Pythonic, and produce cleaner APIs. Understanding closures is
also the key to understanding decorators (Section 2.3).
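A decorator is itself a closure: the wrapper below captures `func` from its enclosing scope. This is a minimal sketch of that connection; `timed` and `slow_square` are illustrative names, not part of this section's examples:

```python
import functools
import time

def timed(func):
    """Decorator: returns a closure that captures `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print(f"{func.__name__} took {elapsed:.6f} s")
        return result
    return wrapper

@timed
def slow_square(x):
    return x * x

slow_square(4)  # prints the timing, returns 16
```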
Definition: Higher-Order Function
Higher-Order Function
A higher-order function is a function that does at least one of:
- Takes a function as an argument (e.g., map, filter, sorted)
- Returns a function as its result (e.g., function factories)
# Takes a function as argument
sorted(data, key=lambda x: x.snr_db)

# Returns a function
def make_adder(n):
    def adder(x):
        return x + n
    return adder

add5 = make_adder(5)
add5(10)  # 15
Higher-order functions enable powerful abstractions: callbacks, strategies, and function composition.
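Function composition, for instance, takes only a few lines with a higher-order helper. This is a sketch; `compose` is an illustrative name, not a built-in:

```python
import math

def compose(f, g):
    """Return a new function computing f(g(x))."""
    def composed(x):
        return f(g(x))
    return composed

log_abs = compose(math.log, abs)
print(log_abs(-math.e))  # log(|-e|) = log(e), approximately 1.0
```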
Definition: Closure
Closure
A closure is a function that captures and retains access to variables from its enclosing scope, even after that scope has finished executing. The captured variables are called free variables.
def make_scaler(factor):
    # `factor` is a free variable captured by the inner function
    def scale(x):
        return x * factor
    return scale

double = make_scaler(2.0)
triple = make_scaler(3.0)

print(double(5))  # 10.0
print(triple(5))  # 15.0
print(double.__closure__[0].cell_contents)  # 2.0
The inner function scale is a closure because it "closes over"
the variable factor from the enclosing make_scaler scope.
Definition: Function Factory
Function Factory
A function factory is a higher-order function that returns a new function configured by its arguments. This is one of the most common uses of closures in scientific Python:
import numpy as np

def make_gaussian_kernel(sigma: float):
    """Create a Gaussian kernel function with fixed bandwidth."""
    coeff = 1 / (sigma * np.sqrt(2 * np.pi))
    denom = 2 * sigma ** 2
    def kernel(x: np.ndarray, center: float = 0.0) -> np.ndarray:
        return coeff * np.exp(-((x - center) ** 2) / denom)
    return kernel

narrow = make_gaussian_kernel(sigma=0.5)
wide = make_gaussian_kernel(sigma=2.0)
The factory pattern avoids re-computing constants (coeff, denom)
on every call: a form of precomputation via closures.
Theorem: Closure Variable Binding Rule (Late Binding)
Closures in Python capture references to variables, not their values at the time of closure creation. If the captured variable is later modified, the closure sees the updated value. This is called late binding.
Think of the closure as holding a pointer to the variable's memory cell, not a snapshot of its value. This is why loop-variable closures are a classic Python trap.
Demonstration of late binding
functions = []
for i in range(5):
    def f():
        return i  # captures reference to `i`, not its value
    functions.append(f)

# All functions return 4 (the final value of i)
print([f() for f in functions])  # [4, 4, 4, 4, 4]
Fix with default argument (early binding)
functions = []
for i in range(5):
    def f(i=i):  # default arg evaluated at definition time
        return i
    functions.append(f)

print([f() for f in functions])  # [0, 1, 2, 3, 4]
Why this happens
The LEGB rule (Local, Enclosing, Global, Built-in) means that
i inside each closure resolves to the variable in its defining
scope, which is rebound on every loop iteration. The default
argument trick works because default values are evaluated once,
at function definition time, creating a snapshot.
Theorem: Closure-Class Equivalence
Any closure can be rewritten as a callable class (using __call__),
and vice versa. The two representations are semantically equivalent:
closures capture free variables in __closure__ cells, while
callable classes store them as instance attributes.
A closure is essentially a lightweight class with one method and some state. Use closures when you need a simple function factory; use classes when you need multiple methods or complex state.
Closure version
def make_regularizer(alpha: float):
    def regularize(weights: np.ndarray) -> float:
        return alpha * np.sum(weights ** 2)
    return regularize
Equivalent callable class
class Regularizer:
    def __init__(self, alpha: float):
        self.alpha = alpha

    def __call__(self, weights: np.ndarray) -> float:
        return self.alpha * np.sum(weights ** 2)
When to prefer which
- Closure: simple, one function, few captured variables
- Class: multiple methods, complex state, needs __repr__, serialization, or introspection
- Both are valid; choose based on complexity and readability
Theorem: Map/Filter vs Comprehension Equivalence
For any map(f, iterable) or filter(pred, iterable), there exists
an equivalent list comprehension (or generator expression) that
produces the same result. The comprehension form is generally
preferred in Python for readability.
map and filter are inherited from functional programming
traditions. Python's comprehensions achieve the same result with
less syntactic overhead.
Map equivalence
# These are equivalent:
result_map = list(map(lambda x: x**2, data))
result_comp = [x**2 for x in data]
# For NumPy arrays, vectorized operations are even better:
result_numpy = data ** 2
Filter equivalence
# These are equivalent:
result_filter = list(filter(lambda x: x > 0, data))
result_comp = [x for x in data if x > 0]
# NumPy boolean indexing:
result_numpy = data[data > 0]
When map/filter still wins
When you already have a named function (not a lambda), map can
be more concise:
# map is cleaner here: no need for a redundant lambda
results = list(map(np.sqrt, matrices))
Use functools.reduce for fold operations (sum, product) but
prefer built-in sum() or np.sum() when available.
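As a sketch of a fold that the built-ins do not cover as directly, here is a running product via functools.reduce (note that math.prod, available since Python 3.8, is the preferred built-in for this case):

```python
from functools import reduce
import math
import operator

data = [1.5, 2.0, 4.0]

# Fold with reduce: ((1.0 * 1.5) * 2.0) * 4.0
product = reduce(operator.mul, data, 1.0)
print(product)  # 12.0

# Prefer the dedicated built-in when one exists
assert product == math.prod(data)
```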
Common Mistake: The Loop-Variable Closure Trap
Mistake:
Creating closures inside a loop that all capture the same variable:
callbacks = []
for snr in [0, 5, 10, 15, 20]:
    callbacks.append(lambda: run_simulation(snr_db=snr))

# All callbacks use snr=20 (the final loop value)!
for cb in callbacks:
    cb()  # all run with snr_db=20
Correction:
Use a default argument to capture the current value:
callbacks = []
for snr in [0, 5, 10, 15, 20]:
    callbacks.append(lambda snr=snr: run_simulation(snr_db=snr))

# Or use functools.partial:
from functools import partial
callbacks = [partial(run_simulation, snr_db=snr) for snr in [0, 5, 10, 15, 20]]
Example: Closure Factory for Noise Generators
Create a function factory that produces noise generator functions for different noise types and power levels. Each generator should accept a signal shape and return noise samples.
The factory function
import numpy as np

def make_noise_generator(
    noise_type: str = "gaussian",
    power_dbw: float = 0.0,
    seed: int | None = None,
):
    """Create a noise generator closure.

    Parameters
    ----------
    noise_type : str
        'gaussian', 'uniform', or 'laplacian'.
    power_dbw : float
        Noise power in dBW.
    seed : int or None
        Random seed for reproducibility.

    Returns
    -------
    callable
        Function that takes a shape tuple and returns a noise array.
    """
    rng = np.random.default_rng(seed)
    variance = 10 ** (power_dbw / 10)
    std = np.sqrt(variance)

    if noise_type == "gaussian":
        def generate(shape):
            return rng.normal(0, std, shape)
    elif noise_type == "uniform":
        def generate(shape):
            half = std * np.sqrt(3)
            return rng.uniform(-half, half, shape)
    elif noise_type == "laplacian":
        def generate(shape):
            scale = std / np.sqrt(2)
            return rng.laplace(0, scale, shape)
    else:
        raise ValueError(f"Unknown noise type: {noise_type}")

    return generate
Usage in a simulation
# Create specialized generators
thermal_noise = make_noise_generator("gaussian", power_dbw=-10, seed=42)
impulse_noise = make_noise_generator("laplacian", power_dbw=-5, seed=42)

# Use them
signal = np.ones(1000)
noisy_thermal = signal + thermal_noise((1000,))
noisy_impulse = signal + impulse_noise((1000,))

# The RNG state is encapsulated: each generator has its own
print(thermal_noise((3,)))  # reproducible sequence
Example: Function Factory for Regularization
Build a function factory that creates regularization penalty functions parameterized by type (L1, L2, ElasticNet) and strength alpha. Use it to sweep over regularization strengths in an experiment.
Factory implementation
def make_regularizer(
    reg_type: str = "l2",
    alpha: float = 1.0,
    l1_ratio: float = 0.5,
):
    """Create a regularization penalty function.

    Parameters
    ----------
    reg_type : str
        'l1', 'l2', or 'elasticnet'.
    alpha : float
        Regularization strength.
    l1_ratio : float
        Mixing parameter for ElasticNet (0 = pure L2, 1 = pure L1).
    """
    if reg_type == "l1":
        def penalty(w: np.ndarray) -> float:
            return alpha * np.sum(np.abs(w))
    elif reg_type == "l2":
        def penalty(w: np.ndarray) -> float:
            return alpha * np.sum(w ** 2)
    elif reg_type == "elasticnet":
        def penalty(w: np.ndarray) -> float:
            l1 = np.sum(np.abs(w))
            l2 = np.sum(w ** 2)
            return alpha * (l1_ratio * l1 + (1 - l1_ratio) * l2)
    else:
        raise ValueError(f"Unknown type: {reg_type}")

    # Attach metadata for introspection
    penalty.reg_type = reg_type
    penalty.alpha = alpha
    return penalty
Parameter sweep using the factory
# Sweep over regularization strengths
alphas = np.logspace(-4, 2, 20)
results = {}
for alpha in alphas:
    reg = make_regularizer("l2", alpha=alpha)
    loss = train_model(X, y, regularizer=reg)  # train_model, X, y defined elsewhere
    results[alpha] = loss
# The factory made it trivial to parameterize the sweep
Example: functools.partial for Scientific APIs
Use functools.partial to create specialized versions of a general
simulation function without writing full wrapper functions.
Using partial to pre-fill parameters
from functools import partial

def simulate_channel(
    signal: np.ndarray,
    snr_db: float,
    channel_type: str = "awgn",
    num_taps: int = 1,
    seed: int = 42,
) -> np.ndarray:
    """General channel simulation function."""
    rng = np.random.default_rng(seed)
    noise_power = 10 ** (-snr_db / 10)
    noise = rng.normal(0, np.sqrt(noise_power), signal.shape)
    if channel_type == "rayleigh":
        h = (rng.normal(0, 1, num_taps) +
             1j * rng.normal(0, 1, num_taps)) / np.sqrt(2)
        signal = np.convolve(signal, h, mode='same')
    return signal + noise
# Create specialized versions
awgn_10db = partial(simulate_channel, snr_db=10.0, channel_type="awgn")
rayleigh_fading = partial(simulate_channel, channel_type="rayleigh", num_taps=4)

# Use them: cleaner than lambda, preserves introspection
result = awgn_10db(my_signal)
print(awgn_10db.func.__name__)  # 'simulate_channel'
print(awgn_10db.keywords)       # {'snr_db': 10.0, 'channel_type': 'awgn'}
partial vs lambda vs closure
# All three create a specialized function:

# 1. partial: preserves metadata, picklable
f1 = partial(simulate_channel, snr_db=10.0)

# 2. lambda: concise, but no metadata
f2 = lambda sig: simulate_channel(sig, snr_db=10.0)

# 3. closure: most flexible, can add logic
def make_sim(snr):
    def sim(sig):
        return simulate_channel(sig, snr_db=snr)
    return sim

f3 = make_sim(10.0)
Closure Factory: Parameterized Functions
Explore how closures create families of functions. Adjust the parameters to see how the captured values affect the output function's behavior (e.g., Gaussian kernels with different bandwidths, regularizers with different strengths).
Closures vs Callable Classes vs functools.partial
| Feature | Closure | Callable Class | functools.partial |
|---|---|---|---|
| Syntax overhead | Low (nested def) | Medium (__init__ + __call__) | Minimal (one line) |
| State access | Via closure cells | Via self.attr | Via .func, .args, .keywords |
| Multiple methods | Not supported | Full support | Not supported |
| Serialization (pickle) | Usually fails | Works if defined at module level | Works |
| Introspection | Limited | Full (attrs, repr) | Good (.func, .keywords) |
| Use case | Simple factories | Complex stateful callables | Parameter pre-filling |
| Performance | Fastest | Slightly slower | Same as direct call |
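The serialization row can be checked directly. A minimal sketch, reusing make_scaler from above; `scale_by` is an illustrative module-level function, and the exact exception type varies across Python versions, so the example catches broadly:

```python
import pickle
from functools import partial

def make_scaler(factor):
    def scale(x):
        return x * factor
    return scale

def scale_by(x, factor):
    return x * factor

# Closure: the inner function cannot be looked up by qualified name,
# so pickling it fails
try:
    pickle.dumps(make_scaler(2.0))
    print("closure pickled")
except Exception as exc:
    print(f"closure failed to pickle: {type(exc).__name__}")

# partial of a module-level function round-trips fine
restored = pickle.loads(pickle.dumps(partial(scale_by, factor=2.0)))
print(restored(5))  # 10.0
```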
Closure Factories for Scientific Computing
# Code from: ch02/python/closure_factories.py

functools Patterns: partial, reduce, lru_cache
# Code from: ch02/python/functools_patterns.py

Why This Matters: Closures in Research: Hyperparameter Sweeps
In machine learning and signal processing research, function factories are the natural way to parameterize experiments. Instead of passing a dozen parameters through every function call, create specialized functions via closures:
# Create a family of loss functions for a hyperparameter sweep
losses = {
    f"l2_alpha={a:.1e}": make_regularizer("l2", alpha=a)
    for a in np.logspace(-4, 2, 20)
}

# Run experiments in parallel. Caveat: locally defined closures do not
# pickle (see the comparison table), so with ProcessPoolExecutor build
# the regularizer inside a module-level worker instead of submitting
# the closure directly.
from concurrent.futures import ProcessPoolExecutor

def run_one(alpha: float):
    return train(reg=make_regularizer("l2", alpha=alpha))  # train defined elsewhere

with ProcessPoolExecutor() as executor:
    results = {a: executor.submit(run_one, a) for a in np.logspace(-4, 2, 20)}
This pattern appears in PyTorch's learning rate schedulers, scikit-learn's custom scorers, and Optuna's objective functions.
See full treatment in Chapter 6
closure
A function that retains access to variables from its enclosing
scope after that scope has finished executing. The captured
variables are stored in the function's __closure__ attribute.
Related: free variable
free variable
A variable used inside a function that is not defined in that function's local scope. In closures, free variables are captured from the enclosing scope and stored in closure cells.
Related: closure
Historical Note: Closures: From Scheme to Python
1975-2006
Closures were first implemented in Scheme (1975), a dialect of Lisp
designed by Guy Steele and Gerald Sussman. The concept was formalized
in their famous "Lambda Papers" (1975-1980). Python gained proper
closure support gradually: nested scopes were added in Python 2.1
(PEP 227, 2001), and the nonlocal keyword for mutable closures
arrived in Python 3.0 (PEP 3104, 2006). Before nonlocal, Python
programmers used the ugly "mutable container" hack (count = [0])
to work around the limitation.
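The difference between nonlocal and the old container hack is easy to show with a counter closure (a minimal sketch; `make_counter` is an illustrative name):

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count  # rebind the enclosing variable (Python 3+)
        count += 1
        return count
    return increment

# Pre-nonlocal workaround: mutate a container instead of rebinding
def make_counter_legacy():
    count = [0]
    def increment():
        count[0] += 1  # mutation, not rebinding, so no nonlocal needed
        return count[0]
    return increment

tick = make_counter()
print(tick(), tick(), tick())  # 1 2 3
```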
Key Takeaway
Closures create parameterized function families. Use function
factories to generate specialized functions (noise generators,
regularizers, kernels) that capture their configuration at creation
time. Remember that Python closures use late binding: use default
arguments or functools.partial when creating closures in loops.