Chapter Summary
Key Points
1. ndarray internals determine performance. Every NumPy array is a thin wrapper around a data buffer, described by shape, strides, and dtype metadata. C-contiguous (row-major) layout is the default, so iterating along the last axis is cache-friendly. Understanding strides explains why basic slicing creates views while fancy indexing creates copies.
2. Advanced indexing unlocks expressive data selection. Boolean indexing provides SQL-like `WHERE` filtering. `np.ix_` creates open meshes for rectangular sub-array selection. `np.einsum` expresses most tensor contractions (dot products, traces, transposes, batched matrix products) in a single line of subscript notation; master its rules and you can replace pages of loop-based code.
3. Broadcasting eliminates explicit loops and data replication. Dimensions align from the right; size-1 axes stretch to match. The pattern `a[:, None] op b[None, :]` computes outer operations without materializing expanded copies of the inputs. Broadcasting replaces `np.tile`/`np.repeat` in virtually every case.
4. Vectorization typically delivers 50-200x speedups. Python loops over arrays pay ~100 ns per element in interpreter overhead. Vectorized NumPy operations run in compiled C at ~1-5 ns per element. Use `np.where` for conditionals and `np.select` for multi-branch logic, and avoid `np.vectorize` (it is just a loop in disguise).
5. The modern RNG API provides reproducible, independent random streams. Use `np.random.default_rng(seed)` instead of the legacy `np.random.seed()`. Construct complex Gaussian noise from two independent real Gaussians. Use `SeedSequence.spawn()` for parallel-safe independent streams.
6. Structured arrays and memory-mapped files extend NumPy to heterogeneous and out-of-core data. Structured dtypes store mixed-type records; `np.memmap` enables array operations on files larger than RAM; HDF5 and zarr add metadata, compression, and cloud support.
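The view-versus-copy distinction in the first key point can be checked directly. The sketch below is a minimal illustration on a small `int64` array; the array contents and variable names are arbitrary choices, not taken from the chapter.

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)  # C-contiguous by default

# Strides are in bytes: 32 bytes (4 elements x 8 bytes) to the next row,
# 8 bytes to the next column.
row_stride, col_stride = a.strides

# Basic slicing only adjusts shape and strides, so the result shares a's buffer.
view = a[:, ::2]
view_shares = np.shares_memory(a, view)

# Fancy (integer-array) indexing gathers elements into a fresh buffer.
copy = a[:, [0, 2]]
copy_shares = np.shares_memory(a, copy)

# Writing through the view changes a; writing to the copy does not.
view[0, 0] = 100
copy[0, 1] = -1
```

After running this, `a[0, 0]` holds the value written through the view, while `a[0, 2]` is unchanged by the write to the copy.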
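For the advanced-indexing key point, the following sketch shows one small instance each of a boolean mask, `np.ix_`, and `np.einsum`. The data and the chosen row and column indices are illustrative assumptions.

```python
import numpy as np

data = np.arange(20).reshape(4, 5)

# Boolean mask: keep rows whose first column is >= 10 (a WHERE-style filter).
mask = data[:, 0] >= 10
filtered = data[mask]

# np.ix_ builds an open mesh, so rows [0, 2] and columns [1, 3, 4]
# select a 2x3 rectangular block.
block = data[np.ix_([0, 2], [1, 3, 4])]

# einsum: a matrix product and a trace written as subscript strings.
A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
matmul = np.einsum("ij,jk->ik", A, B)
trace = np.einsum("ii->", np.eye(3))
```

Without `np.ix_`, `data[[0, 2], [1, 3, 4]]` would try to pair the two index lists element by element and fail because their lengths differ.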
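The broadcasting pattern `a[:, None] op b[None, :]` from the key points can be sketched with addition as the operation. The `np.tile` version is included only as a comparison, and the input values are arbitrary.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])   # shape (3,)
b = np.array([10.0, 20.0])      # shape (2,)

# (3, 1) + (1, 2) broadcasts to (3, 2); neither input is tiled in memory.
outer_sum = a[:, None] + b[None, :]

# Same result with explicit replication, which allocates two extra (3, 2) arrays.
tiled = np.tile(a[:, None], (1, 2)) + np.tile(b[None, :], (3, 1))

# broadcast_to exposes the stretched axis as a read-only view with stride 0.
stretched = np.broadcast_to(b, (3, 2))
```

The zero stride on the first axis of `stretched` is how a size-1 axis is "stretched" without copying: every row reads the same two values of `b`.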
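As a small illustration of the vectorization key point, the sketch below replaces a Python-level conditional loop with `np.where` and `np.select`. The piecewise function and the helper `piecewise_scalar` are hypothetical examples, not from the chapter.

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Two-way conditional: keep positive values, zero out the rest.
relu = np.where(x > 0, x, 0.0)

# Multi-branch logic: the first matching condition wins, default covers the rest.
# Here: -x where x < 0, x**2 where x >= 1, and 0.0 otherwise.
piecewise = np.select([x < 0, x >= 1], [-x, x**2], default=0.0)


def piecewise_scalar(v):
    """Loop-based reference for the same (hypothetical) piecewise function."""
    if v < 0:
        return -v
    if v >= 1:
        return v**2
    return 0.0


# Element-by-element Python loop, shown only to confirm the results agree.
looped = np.array([piecewise_scalar(v) for v in x])
```

Note that `np.select` evaluates every choice array in full before selecting, so each branch must be safe to compute for all elements.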
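The RNG key point can be sketched as follows. The seed value, sample count, and the convention of splitting the total variance `sigma2` equally between the real and imaginary parts are assumptions made for this example.

```python
import numpy as np

seed = 12345  # arbitrary illustrative seed
rng = np.random.default_rng(seed)

# Complex Gaussian noise with total variance sigma2 (assumed convention):
# each independent real component carries variance sigma2 / 2.
sigma2 = 2.0
n = 100_000
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
measured_power = np.mean(np.abs(noise) ** 2)

# Re-creating a generator from the same seed reproduces the same stream.
first = np.random.default_rng(seed).standard_normal(3)
repeat = np.random.default_rng(seed).standard_normal(3)

# Independent child streams, for example one per worker process.
children = np.random.SeedSequence(seed).spawn(4)
worker_rngs = [np.random.default_rng(child) for child in children]
draws = [r.standard_normal(3) for r in worker_rngs]
```

Passing each spawned `SeedSequence` to `default_rng` gives every worker its own generator, which avoids the overlapping streams that can result from ad hoc schemes such as `seed + worker_id`.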
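For the last key point, here is a minimal sketch of a structured dtype and a memory-mapped file. The field names, the file name `samples.dat`, and the array sizes are hypothetical; a real out-of-core workload would use a file far larger than this one.

```python
import os
import tempfile

import numpy as np

# Structured dtype: one record mixes an int, a float, and a bool field.
record = np.dtype([("id", np.int32), ("snr_db", np.float64), ("ok", np.bool_)])
table = np.zeros(3, dtype=record)
table["id"] = [1, 2, 3]
table["snr_db"] = [3.5, 12.0, 7.25]
table["ok"] = table["snr_db"] > 5.0  # field access returns a view of that column

# Memory-mapped file: create it on disk, write through the mapping, then flush.
path = os.path.join(tempfile.mkdtemp(), "samples.dat")
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(1000, 8))
mm[:] = 1.0
mm[10, :] = np.arange(8)
mm.flush()
del mm  # release the writable mapping

# Reopen read-only; only the pages actually touched are read from disk.
ro = np.memmap(path, dtype=np.float32, mode="r", shape=(1000, 8))
row_sum = float(ro[10].sum())
```

A raw `np.memmap` file stores no shape or dtype information, so both must be supplied again when reopening; that gap is one reason formats such as HDF5 and zarr carry metadata alongside the data.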
Looking Ahead
Chapter 6 takes the NumPy foundation and builds on it with SciPy: linear algebra routines, sparse matrices, optimization, signal processing, and statistical functions. Every SciPy function operates on NumPy arrays — the ndarray is the universal currency of scientific Python.