The __array__ and __array_ufunc__ Protocols

Making Custom Objects Play Nicely with NumPy

NumPy defines several protocols that let custom objects integrate with NumPy functions. When you call np.sin(my_object), NumPy doesn't need to know about your class; it asks your object how to handle the operation through well-defined dunder methods.

Libraries such as CuPy, JAX, and Dask rely on these protocols, to varying degrees, to provide NumPy-compatible interfaces without being NumPy.

Definition:

The __array__ Protocol

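The __array__ method tells NumPy how to convert your object to an ndarray. NumPy calls it whenever it has to coerce your object to an array, for example in np.asarray(obj) or when the object is passed to a function that expects array input: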

import numpy as np

class PhysicalQuantity:
    """A value with units that converts to a raw NumPy array."""

    def __init__(self, value, unit: str = "V"):
        self._value = np.asarray(value)
        self.unit = unit

    def __array__(self, dtype=None, copy=None) -> np.ndarray:
        if copy:
            # copy=True (NumPy 2.0+): the caller requires a fresh array
            return np.array(self._value, dtype=dtype, copy=True)
        if dtype is not None:
            return self._value.astype(dtype)
        return self._value

    def __repr__(self) -> str:
        return f"PhysicalQuantity({self._value}, unit='{self.unit}')"

Now np.array(PhysicalQuantity([1, 2, 3])) works, and so does np.mean(PhysicalQuantity([1, 2, 3])). In both cases the result is computed from the raw values, so the unit is dropped.
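A minimal usage sketch may make the conversion concrete. It repeats a condensed copy of the class so the snippet runs on its own; the variable names are illustrative only.

```python
import numpy as np

class PhysicalQuantity:
    """Condensed copy of the class above, repeated so the snippet is self-contained."""

    def __init__(self, value, unit="V"):
        self._value = np.asarray(value)
        self.unit = unit

    def __array__(self, dtype=None, copy=None):
        if copy:
            # np.array with dtype=None keeps the stored dtype; copy=True forces a copy
            return np.array(self._value, dtype=dtype, copy=True)
        if dtype is not None:
            return self._value.astype(dtype)
        return self._value

q = PhysicalQuantity([1, 2, 3])
as_array = np.asarray(q)                # plain ndarray, the unit is gone
as_float = np.asarray(q, dtype=float)   # the dtype request reaches __array__
mean_value = np.mean(q)                 # NumPy converts q first, then averages
print(type(as_array).__name__, as_float.dtype, mean_value)
```

Note that nothing here preserves the unit: __array__ is a one-way conversion, which is why the protocols in the following sections exist.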

Definition:

The __array_ufunc__ Protocol

__array_ufunc__ intercepts NumPy universal function (ufunc) calls on custom objects. It gives your class full control over how operations like np.add, np.multiply, and np.sin are handled:

class PhysicalQuantity:
    # ... (previous code) ...

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        """Intercept NumPy ufunc calls to preserve units."""
        # Convert all PhysicalQuantity inputs to raw arrays
        raw_inputs = []
        units = set()
        for inp in inputs:
            if isinstance(inp, PhysicalQuantity):
                raw_inputs.append(inp._value)
                units.add(inp.unit)
            else:
                raw_inputs.append(inp)

        # Apply the ufunc to raw arrays
        result = getattr(ufunc, method)(*raw_inputs, **kwargs)

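        # Wrap result back in PhysicalQuantity, reusing the input unit as-is.
        # This is only physically right for unit-preserving ufuncs such as np.add.
        if len(units) == 1:
            return PhysicalQuantity(result, unit=units.pop())
        return result  # mixed units: return raw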

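The signature is always (self, ufunc, method, *inputs, **kwargs), where method is a string naming how the ufunc was invoked: '__call__' for a direct call, or 'reduce', 'reduceat', 'accumulate', 'outer', or 'at' for the corresponding ufunc method.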
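To see which method string arrives for which call, the following sketch uses a hypothetical Recorder class (not part of the chapter's code) that logs every dispatch and then forwards to the raw array.

```python
import numpy as np

class Recorder:
    """Hypothetical helper: records which ufunc method NumPy dispatched."""

    def __init__(self, data):
        self.data = np.asarray(data)
        self.calls = []

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        self.calls.append((ufunc.__name__, method))
        # Unwrap Recorder inputs, then run the same ufunc method on raw arrays
        raw = [x.data if isinstance(x, Recorder) else x for x in inputs]
        return getattr(ufunc, method)(*raw, **kwargs)

r = Recorder([1, 2, 3])
total = np.add.reduce(r)           # dispatched with method == "reduce"
running = np.add.accumulate(r)     # dispatched with method == "accumulate"
table = np.multiply.outer(r, r)    # dispatched with method == "outer"
doubled = np.add(r, r)             # dispatched with method == "__call__"
print(r.calls)
```

Because the handler calls getattr(ufunc, method), one implementation covers direct calls and the ufunc methods alike.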

Definition:

The __array_function__ Protocol

While __array_ufunc__ handles element-wise operations (ufuncs), __array_function__ (NEP 18) intercepts most other functions in NumPy's public API, including np.concatenate, np.linalg.norm, np.fft.fft, etc.:

import numpy as np

HANDLED_FUNCTIONS = {}

def implements(np_function):
    """Register an implementation for a NumPy function."""
    def decorator(func):
        HANDLED_FUNCTIONS[np_function] = func
        return func
    return decorator

class PhysicalQuantity:
    # ... (previous code) ...

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED_FUNCTIONS:
            # Unregistered functions (e.g. np.mean) now make NumPy raise TypeError
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)

@implements(np.concatenate)
def concatenate(arrays, axis=0):
    raw = [np.asarray(a) for a in arrays]
    units = {a.unit for a in arrays if isinstance(a, PhysicalQuantity)}
    result = np.concatenate(raw, axis=axis)
    if len(units) == 1:
        return PhysicalQuantity(result, unit=units.pop())
    return result

Libraries like pint and astropy.units use these protocols to provide transparent NumPy integration.
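The registry pattern above can be exercised end to end as follows. This is a self-contained sketch that repeats a condensed PhysicalQuantity; it also shows what happens for a function that was never registered.

```python
import numpy as np

HANDLED_FUNCTIONS = {}

def implements(np_function):
    """Register an implementation for a NumPy function."""
    def decorator(func):
        HANDLED_FUNCTIONS[np_function] = func
        return func
    return decorator

class PhysicalQuantity:
    def __init__(self, value, unit="V"):
        self._value = np.asarray(value)
        self.unit = unit

    def __array__(self, dtype=None, copy=None):
        if copy:
            return np.array(self._value, dtype=dtype, copy=True)
        return self._value if dtype is None else self._value.astype(dtype)

    def __array_function__(self, func, types, args, kwargs):
        if func not in HANDLED_FUNCTIONS:
            return NotImplemented
        return HANDLED_FUNCTIONS[func](*args, **kwargs)

@implements(np.concatenate)
def concatenate(arrays, axis=0):
    raw = [np.asarray(a) for a in arrays]   # plain ndarrays, so no re-dispatch
    units = {a.unit for a in arrays if isinstance(a, PhysicalQuantity)}
    result = np.concatenate(raw, axis=axis)
    return PhysicalQuantity(result, unit=units.pop()) if len(units) == 1 else result

a = PhysicalQuantity([1.0, 2.0], unit="V")
b = PhysicalQuantity([3.0], unit="V")
joined = np.concatenate([a, b])             # routed to our concatenate()

try:
    np.mean(a)                              # np.mean was never registered
    unhandled_error = None
except TypeError as exc:
    unhandled_error = exc
print(joined.unit, type(unhandled_error).__name__)
```

The second half illustrates a trade-off of the protocol: once a class defines __array_function__, functions it does not register stop falling back to __array__ and raise TypeError instead.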

Theorem: NumPy Ufunc Dispatch Priority

When a NumPy ufunc is called with mixed types (e.g., np.add(custom_obj, ndarray)), NumPy collects the inputs and outputs whose class overrides __array_ufunc__ and calls those overrides one at a time. If an override returns NotImplemented, NumPy tries the next one. The dispatch priority is:

  1. An operand whose class is a subclass of another operand's class is tried before that other operand
  2. Otherwise, operands are tried left to right
  3. If every override returns NotImplemented, NumPy raises TypeError
  4. Setting __array_ufunc__ = None opts a class out entirely, forcing TypeError immediately

This ensures that custom types can always override default NumPy behavior without monkey-patching.

Think of it like Python's __radd__ mechanism but generalized to all ufuncs. The key design principle is that custom types always get a chance to handle the operation before NumPy applies its default array handling.
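The dispatch rules can be observed directly with a few toy classes. The class names below are hypothetical and exist only for this sketch; each override returns a marker string instead of doing real arithmetic.

```python
import numpy as np

class Declines:
    """Override that always defers to the next candidate."""
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return NotImplemented

class Handles:
    """Override that handles every ufunc by returning a marker."""
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return "handled by Handles"

class Base:
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return "Base"

class Derived(Base):
    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        return "Derived"

class OptedOut:
    __array_ufunc__ = None  # opt out of ufuncs entirely

def raises_type_error(*operands):
    try:
        np.add(*operands)
    except TypeError:
        return True
    return False

arr = np.arange(3)

# Left to right: Declines is asked first, defers, then Handles takes over.
left_to_right = np.add(Declines(), Handles())
# Subclass first: Derived is asked before Base although it appears second.
subclass_first = np.add(Base(), Derived())
# The only override defers, so NumPy raises TypeError.
all_declined = raises_type_error(Declines(), arr)
# __array_ufunc__ = None makes the ufunc refuse the operand.
opted_out = raises_type_error(OptedOut(), arr)
print(left_to_right, subclass_first, all_declined, opted_out)
```

In the all_declined case the plain ndarray does not count as another candidate, because ndarray's own __array_ufunc__ is the default implementation rather than an override.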

Example: Building a Unit-Aware Array Class

Create a PhysicalQuantity class that wraps a NumPy array with a unit string. It should work transparently with NumPy ufuncs (addition, multiplication, trigonometric functions) while preserving or computing the correct units.
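One possible sketch of such a class follows. It is not a reference solution: the unit algebra is deliberately simplistic (units are plain strings, products are joined with "*"), only a handful of ufuncs are supported, and the operator methods and error messages are assumptions made for illustration.

```python
import numpy as np

class PhysicalQuantity:
    """Sketch of a unit-aware array with string-based unit rules."""

    def __init__(self, value, unit=""):
        self._value = np.asarray(value)
        self.unit = unit

    def __array__(self, dtype=None, copy=None):
        if copy:
            return np.array(self._value, dtype=dtype, copy=True)
        return self._value if dtype is None else self._value.astype(dtype)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        if method != "__call__" or "out" in kwargs:
            return NotImplemented  # reductions and out= are outside this sketch
        values = [x._value if isinstance(x, PhysicalQuantity) else x for x in inputs]
        units = [x.unit if isinstance(x, PhysicalQuantity) else "" for x in inputs]

        if ufunc in (np.add, np.subtract):
            if units[0] != units[1]:
                raise ValueError(f"unit mismatch: {units[0]!r} vs {units[1]!r}")
            out_unit = units[0]                       # unit is preserved
        elif ufunc is np.multiply:
            out_unit = "*".join(u for u in units if u)  # unit is computed
        elif ufunc in (np.sin, np.cos, np.tan):
            if units[0] not in ("", "rad"):
                raise ValueError("trigonometric functions need an angle in radians")
            out_unit = ""                             # result is dimensionless
        else:
            return NotImplemented
        return PhysicalQuantity(ufunc(*values, **kwargs), unit=out_unit)

    # Operators forward to the ufuncs so q1 + q2 and np.add(q1, q2) share one path.
    def __add__(self, other):
        return np.add(self, other)

    def __mul__(self, other):
        return np.multiply(self, other)

    def __repr__(self):
        return f"PhysicalQuantity({self._value}, unit={self.unit!r})"

voltage = PhysicalQuantity([1.0, 2.0], unit="V")
current = PhysicalQuantity([0.5, 0.5], unit="A")

power = voltage * current
doubled = voltage + voltage
sine = np.sin(PhysicalQuantity([0.0, np.pi / 2], unit="rad"))
print(power, doubled, sine)
```

A production library would replace the string handling with a proper unit representation; the point here is only where the unit logic hooks into __array_ufunc__.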

NumPy Interop Benchmark

Compare the performance overhead of custom array classes using __array_ufunc__ vs. plain NumPy arrays. See how the overhead scales with array size and operation complexity.
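A rough way to run this comparison yourself is sketched below with timeit. The Wrapped class, the array sizes, and the repeat counts are arbitrary choices for illustration, and absolute numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

class Wrapped:
    """Hypothetical thin wrapper that unwraps inputs and rewraps the result."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        raw = [x.data if isinstance(x, Wrapped) else x for x in inputs]
        return Wrapped(getattr(ufunc, method)(*raw, **kwargs))

def time_per_call(func, repeats=200):
    """Best-of-3 average seconds per call."""
    return min(timeit.repeat(func, number=repeats, repeat=3)) / repeats

timings = {}
for size in (10, 1_000, 100_000):
    plain = np.ones(size)
    wrapped = Wrapped(plain)
    timings[size] = (
        time_per_call(lambda: np.add(plain, plain)),
        time_per_call(lambda: np.add(wrapped, wrapped)),
    )

for size, (t_plain, t_wrapped) in timings.items():
    print(f"n={size:>7}: plain {t_plain * 1e6:8.2f} us, wrapped {t_wrapped * 1e6:8.2f} us")
```

Since the dispatch work happens once per call rather than once per element, one would expect the relative overhead to matter most for small arrays; the printed timings let you check that on your own setup.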

NumPy Protocol Dispatch Flow

How NumPy dispatches operations to custom objects through the __array__, __array_ufunc__, and __array_function__ protocols.

Common Mistake: Forgetting copy Semantics in __array__

Mistake:

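import numpy as np

class MyArray:
    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=None):
        return self._data  # Returns a reference, not a copy!

obj = MyArray(np.array([1, 2, 3]))
arr = np.array(obj)  # copy=True by default
arr[0] = 999  # On NumPy 2.0+, which trusts __array__ to honor copy=True, this modifies obj._data too!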

Correction:

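class MyArray:
    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=None):
        if copy is False:
            # NumPy 2.0+: caller explicitly forbids copying
            return self._data.view()
        # Safe default: np.array(..., copy=True) always returns a fresh array
        return np.array(self._data, dtype=dtype, copy=True)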

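NumPy 2.0 added the copy parameter to __array__. copy=True means the method must return a copy, copy=False means it must never copy (raise ValueError if a copy is unavoidable, for example for a dtype conversion), and copy=None means copy only if needed. Handle all three to support both zero-copy views and safe copies.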
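For comparison, here is a sketch that distinguishes all three copy values instead of copying by default. It calls __array__ directly so the behavior does not depend on the installed NumPy version; the error message and the needs_cast helper variable are illustrative choices.

```python
import numpy as np

class MyArray:
    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        needs_cast = dtype is not None and np.dtype(dtype) != self._data.dtype
        if copy is False:
            if needs_cast:
                raise ValueError("dtype conversion requires a copy, but copy=False")
            return self._data
        if copy is True or needs_cast:
            # astype copies by default, so the result never aliases self._data
            target = dtype if dtype is not None else self._data.dtype
            return self._data.astype(target)
        return self._data  # copy=None and no cast needed: share memory

obj = MyArray(np.array([1, 2, 3]))
shared = obj.__array__(copy=None)            # zero-copy
private = obj.__array__(copy=True)           # independent buffer
converted = obj.__array__(dtype=np.float64)  # cast forces a copy
private[0] = 999                             # does not touch obj._data
print(obj._data, private, converted.dtype)
```

Compared with the correction above, this variant avoids a copy when the caller passes copy=None, at the cost of letting callers of np.asarray mutate the wrapped data.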

The __cuda_array_interface__ for GPU Interop

Just as __array__ enables NumPy interop, __cuda_array_interface__ enables GPU array interop between CuPy, PyTorch, Numba CUDA, and other GPU libraries:

class GPUSignal:
    """Signal stored on GPU with CuPy interop."""

    def __init__(self, data_gpu):
        self._data = data_gpu  # CuPy array

    @property
    def __cuda_array_interface__(self):
        return self._data.__cuda_array_interface__

This protocol exposes the GPU memory pointer, shape, dtype, and strides, enabling zero-copy data exchange between GPU libraries.

Universal Function (ufunc)

A NumPy function that operates element-wise on arrays (e.g., np.add, np.sin, np.exp). Custom classes can intercept ufunc calls via __array_ufunc__.

Related: Array Protocol

Array Protocol

A set of dunder methods (__array__, __array_ufunc__, __array_function__) that let custom objects integrate with NumPy's dispatch system.

Related: Universal Function (ufunc)

Quick Check

What happens when you call np.add(custom_obj, np_array) and custom_obj.__array_ufunc__ returns NotImplemented?

NumPy raises TypeError immediately

NumPy falls back to calling __array__ on custom_obj

NumPy checks np_array's __array_ufunc__ next

The operation returns NotImplemented

NumPy Interop Protocols

Complete PhysicalQuantity class with __array__, __array_ufunc__, and __array_function__ protocols.
# Code from: ch03/python/numpy_interop.py

Key Takeaway

NumPy's __array__, __array_ufunc__, and __array_function__ protocols let custom classes integrate seamlessly with the NumPy ecosystem. Implement __array__ for basic conversion, __array_ufunc__ for element-wise operations, and __array_function__ for full NumPy API coverage. The __cuda_array_interface__ extends this pattern to GPU interop.