Type Annotations for Scientific Code

Why Type Annotations Matter for Scientific Code

Scientific Python code is often written quickly and read months later. Type annotations serve as machine-checked documentation: they tell you that a function expects a (N, M) complex array, not a flat real vector, without you reading the implementation. Combined with a type checker like mypy, they catch entire categories of bugs before runtime.

Unlike statically typed languages, Python's type hints are optional and gradual — you can annotate the critical 20% of your codebase that causes 80% of the confusion.

Historical Note: The Road to Type Hints in Python

2006-2024

Python was dynamically typed from its creation in 1991. The first formal proposal for type hints came in PEP 3107 (2006), which added function annotation syntax but no semantics. It took until PEP 484 (2014, Guido van Rossum, Jukka Lehtosalo, Lukasz Langa) to standardize a type system for Python, inspired by mypy — a project Lehtosalo started as his PhD work at Cambridge. Since then, the type system has evolved rapidly: TypeVar (PEP 484), Literal (PEP 586), TypeAlias (PEP 613), ParamSpec (PEP 612), and the X | Y union syntax (PEP 604, Python 3.10).

Definition:
Type Annotation

A type annotation is a syntactic hint attached to a variable, function parameter, or return value that indicates the expected type:

x: int = 42
name: str = "MIMO"

def snr_to_linear(snr_db: float) -> float:
    return 10 ** (snr_db / 10)

Annotations have no runtime effect by default — Python does not enforce them. They are consumed by static analysis tools (mypy, pyright) and IDEs.

Definition:
Built-in Generic Types

Since Python 3.9, built-in collection types support subscript notation for element types:

snr_values: list[float] = [0.0, 5.0, 10.0, 15.0]
channel_params: dict[str, float] = {"delay_spread": 1e-6, "doppler": 100.0}
antenna_pair: tuple[int, int] = (4, 4)
active_users: set[int] = {0, 2, 5}

Before Python 3.9, you needed from typing import List, Dict, Tuple, Set. The lowercase built-in syntax is now preferred.

Definition:
Union and Optional Types

When a value can be one of several types, use a union:

# Python 3.10+ syntax (preferred)
def normalize(x: np.ndarray | list[float]) -> np.ndarray:
    return np.asarray(x) / np.max(np.abs(x))

# Optional is shorthand for X | None
def find_peak(signal: np.ndarray) -> int | None:
    if len(signal) == 0:
        return None
    return int(np.argmax(signal))

Optional[X] from the typing module is equivalent to X | None. The | syntax is cleaner and preferred in modern code.

Definition:
Literal Types

Literal restricts a parameter to specific constant values:

from typing import Literal

def apply_window(
    signal: np.ndarray,
    window: Literal["hann", "hamming", "blackman", "rectangular"]
) -> np.ndarray:
    ...

This is more precise than str — a type checker will flag apply_window(sig, "haning") as an error (typo caught at analysis time, not as a silent bug at runtime).

Definition:
TypeAlias for Readability

Complex type expressions become unreadable without aliases:

from typing import TypeAlias

# Clear, domain-specific names
ComplexArray: TypeAlias = np.ndarray       # (N,) complex128
ChannelMatrix: TypeAlias = np.ndarray      # (Nr, Nt) complex128
SimulationResult: TypeAlias = dict[str, np.ndarray]

def estimate_channel(
    pilot: ComplexArray,
    received: ComplexArray,
) -> ChannelMatrix:
    ...

TypeAlias (PEP 613) makes the intent explicit — without it, ComplexArray = np.ndarray looks like a regular variable assignment.

In Python 3.12+, you can use the type statement instead: type ComplexArray = np.ndarray.

Definition:
TypeVar and Generic Functions

A TypeVar creates a placeholder for a type that is determined at call time:

from typing import TypeVar

T = TypeVar("T", bound=np.generic)

def normalize_array(arr: np.ndarray, dtype: type[T]) -> np.ndarray:
    """Normalize and cast to the specified dtype."""
    return (arr / np.max(np.abs(arr))).astype(dtype)

# T is inferred as np.float32
result = normalize_array(data, np.float32)

Generic types let you write functions that are type-safe without restricting them to a single concrete type.

Theorem: Gradual Typing Theorem

In a gradually typed system, any program without type annotations is well-typed. Adding annotations can only reveal errors — it can never introduce new ones. Formally, if a program $P$ type-checks with annotations $A$ , then removing any subset of annotations from $A$ still produces a well-typed program.

Think of un-annotated code as having every variable typed as Any. Adding a concrete type annotation replaces Any with something stricter, which can only surface additional mismatches.

Theorem: Covariance and Contravariance of Generic Types

Let Sub be a subtype of Super. Then:

A generic type G[T] is covariant in T if G[Sub] is a subtype of G[Super]. Read-only containers (e.g., Sequence, tuple) are covariant.
A generic type G[T] is contravariant in T if G[Super] is a subtype of G[Sub]. Write-only sinks (e.g., Callable parameter types) are contravariant.
A generic type is invariant if neither relationship holds. Mutable containers (list, dict) are invariant.

A list[int] is not a list[float] because you could append a 3.14 to the "list of floats," violating the int constraint. But a tuple[int, ...] is a Sequence[float] because you cannot modify it — reading integers as floats is always safe.

Theorem: Structural vs. Nominal Subtyping

In nominal subtyping, B is a subtype of A only if B explicitly inherits from A. In structural subtyping, B is a subtype of A if B has all the attributes and methods required by A, regardless of inheritance. Python's Protocol (PEP 544) implements structural subtyping.

Nominal subtyping is like checking someone's ID card. Structural subtyping is like checking whether they can do the job — if they have the right skills, it does not matter which school they attended.

Example: Type-Annotating a MIMO Channel Simulation

Write a function that generates a Rayleigh fading MIMO channel matrix with full type annotations, including the helper types.

Solution

Define type aliases

from typing import TypeAlias
import numpy as np

ChannelMatrix: TypeAlias = np.ndarray   # shape (Nr, Nt), complex128
NoiseVector: TypeAlias = np.ndarray      # shape (Nr,), complex128

def make_rayleigh_channel(
    n_rx: int,
    n_tx: int,
    rng: np.random.Generator | None = None,
) -> ChannelMatrix:
    rng = rng or np.random.default_rng()
    H = (rng.standard_normal((n_rx, n_tx))
         + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
    return H

Use in a detector function

def zf_detect(
    H: ChannelMatrix,
    y: NoiseVector,
) -> np.ndarray:
    """Zero-forcing detector: x_hat = H^+ y."""
    H_pinv = np.linalg.pinv(H)
    return H_pinv @ y

The type aliases make it clear that H is a 2-D complex matrix and y is a 1-D complex vector, even though both are np.ndarray at runtime.

Example: A Generic Data Loader

Write a generic function that loads data from a file and returns it as a specified NumPy dtype, using TypeVar for type safety.

Solution

Define the generic loader

from typing import TypeVar, overload
import numpy as np

DType = TypeVar("DType", np.float32, np.float64, np.complex128)

def load_signal(
    path: str,
    dtype: type[DType] = np.float64,
    max_samples: int | None = None,
) -> np.ndarray:
    raw = np.load(path)
    if max_samples is not None:
        raw = raw[:max_samples]
    return raw.astype(dtype)

Call with explicit types

# DType is inferred as np.float32
signal_f32 = load_signal("iq_samples.npy", np.float32)

# DType is inferred as np.complex128
signal_c128 = load_signal("iq_samples.npy", np.complex128, max_samples=1024)

Example: Protocol-Based Type Hints for Array-Like Objects

Define a Protocol for any object that supports NumPy-style shape and dtype attributes, so functions can accept NumPy arrays, PyTorch tensors, or custom array classes interchangeably.

Solution

Define the Protocol

from typing import Protocol, runtime_checkable

@runtime_checkable
class ArrayLike(Protocol):
    @property
    def shape(self) -> tuple[int, ...]: ...
    @property
    def dtype(self) -> object: ...
    def __getitem__(self, key: object) -> object: ...

def print_info(arr: ArrayLike) -> None:
    print(f"Shape: {arr.shape}, dtype: {arr.dtype}")

Use with different backends

import numpy as np

a = np.zeros((3, 4))
print_info(a)  # Works: np.ndarray satisfies ArrayLike

# If PyTorch is available:
import torch
t = torch.zeros(3, 4)
print_info(t)  # Works: torch.Tensor also satisfies ArrayLike

Type Coverage Explorer

Visualize how adding type annotations to a codebase changes the type coverage percentage and the number of errors caught by mypy.

Parameters

Quick Check

What is Optional[int] equivalent to in Python 3.10+ syntax?

int | float

int | None

int | str

int | Any

Correction:

int | None

Optional[X] means the value can be X or None.

Quick Check

Why is list[int] NOT a subtype of list[float]?

Because int is not a subtype of float

Because list is invariant — you could append a float to a list[int]

Because Python does not support generic types

Because lists cannot hold integers

Correction:

Because list is invariant — you could append a float to a list[int]

Mutable containers are invariant. If list[int] were a subtype of list[float], you could append 3.14 through the list[float] reference, violating the int constraint.

Common Mistake: np.ndarray Does Not Encode Shape or Dtype

Mistake:

Assuming that np.ndarray type hints convey shape or dtype information:

def fft(x: np.ndarray) -> np.ndarray:  # What shape? What dtype?
    ...

Correction:

Use TypeAlias names to document shape and dtype conventions:

ComplexSignal: TypeAlias = np.ndarray  # (N,) complex128
Spectrum: TypeAlias = np.ndarray       # (N,) complex128

def fft(x: ComplexSignal) -> Spectrum:
    return np.fft.fft(x)

For full shape-aware typing, explore numpy.typing.NDArray[np.float64] or the nptyping library.

Common Mistake: Mutable Default Arguments in Typed Functions

Mistake:

Using mutable defaults even with type hints:

def add_noise(signals: list[np.ndarray] = []) -> list[np.ndarray]:
    ...  # The default list is shared across all calls!

Correction:

Use None as the default and create a new list inside:

def add_noise(signals: list[np.ndarray] | None = None) -> list[np.ndarray]:
    if signals is None:
        signals = []
    ...

Why This Matters: Type Annotations in Wireless System Design

In wireless communications, the distinction between a channel matrix $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ and a signal vector $\mathbf{x} \in \mathbb{C}^{N_t}$ is critical — multiplying them in the wrong order crashes or silently produces garbage. Type aliases like ChannelMatrix and SignalVector catch these mismatches at analysis time. In production 5G codebases (e.g., Open RAN software), typed interfaces between modules (PHY, MAC, RLC) prevent integration bugs that would otherwise require expensive field testing to find.

See full treatment in Chapter 12

type hint

A syntactic annotation on a variable, parameter, or return value indicating its expected type. Not enforced at runtime by default.

Related: mypy

mypy

A static type checker for Python that reads type annotations and reports type errors without running the code.

Related: type hint, pyright

pyright

A fast static type checker for Python developed by Microsoft, used as the backend for Pylance in VS Code.

Related: mypy

TypeVar

A placeholder type variable used in generic functions and classes. The actual type is determined when the generic is instantiated.

Related: type hint

Literal type

A type that restricts values to specific constants, e.g., Literal["hann", "hamming"] accepts only those two strings.

Related: type hint

Type Annotations for Scientific Code

python

Complete examples of type annotations for NumPy arrays, PyTorch tensors, generic functions, TypeAlias, Literal, and Protocol-based typing.

# Code from: ch04/python/type_hints_demo.py
# Load from backend supplements endpoint

Key Takeaway

Type annotations are gradual and optional. Start by annotating function signatures in your most-used modules. You do not need to annotate everything at once — even partial annotations dramatically improve IDE support and catch bugs early.

Key Takeaway

Use TypeAlias to encode domain semantics. ChannelMatrix is clearer than np.ndarray even though both resolve to the same runtime type. The alias serves as documentation that the type checker can verify is used consistently.

Prerequisites & Notation Runtime Validation

Type Annotations for Scientific Code

Why Type Annotations Matter for Scientific Code

Historical Note: The Road to Type Hints in Python

Definition: Type Annotation

Definition: Built-in Generic Types

Definition: Union and Optional Types

Definition: Literal Types

Definition: TypeAlias for Readability

Definition: TypeVar and Generic Functions

Theorem: Gradual Typing Theorem

Theorem: Covariance and Contravariance of Generic Types

Theorem: Structural vs. Nominal Subtyping

Example: Type-Annotating a MIMO Channel Simulation

Define type aliases

Use in a detector function

Example: A Generic Data Loader

Define the generic loader

Call with explicit types

Example: Protocol-Based Type Hints for Array-Like Objects

Define the Protocol

Use with different backends

Type Coverage Explorer

Parameters

Quick Check

Quick Check

Common Mistake: np.ndarray Does Not Encode Shape or Dtype

Common Mistake: Mutable Default Arguments in Typed Functions

Why This Matters: Type Annotations in Wireless System Design

type hint

mypy

pyright

TypeVar

Literal type

Type Annotations for Scientific Code

Key Takeaway

Key Takeaway

Definition:
Type Annotation

Definition:
Built-in Generic Types

Definition:
Union and Optional Types

Definition:
Literal Types

Definition:
TypeAlias for Readability

Definition:
TypeVar and Generic Functions