Indexing and Slicing β€” Advanced

Beyond Basic Slicing

Basic indexing (a[1:5], a[:, 0]) covers 80% of use cases. But scientific computing demands more: selecting scattered elements, masking invalid data, building open meshes for multi-dimensional lookups, and expressing tensor contractions concisely.

This section covers the advanced indexing tools that make NumPy a language for array manipulation, not just a container.

Definition:

Basic Indexing (Produces Views)

Basic indexing uses integers, slices (start:stop:step), or np.newaxis / ... (Ellipsis). It always returns a view.

a = np.arange(20).reshape(4, 5)
a[1, 3]          # scalar, integer indexing
a[1:3]           # rows 1-2, view
a[:, ::2]        # every other column, view
a[..., -1]       # last column, Ellipsis expands remaining axes
a[np.newaxis, :] # shape (1, 4, 5), adds axis

Key property: basic indexing never copies data.

Definition:

Fancy (Advanced) Indexing (Produces Copies)

Fancy indexing uses arrays of integers or booleans as indices. It always returns a copy.

a = np.arange(20).reshape(4, 5)

# Integer array indexing β€” select specific rows
rows = np.array([0, 2, 3])
print(a[rows])            # shape (3, 5), COPY

# Multi-dimensional fancy indexing
row_idx = np.array([0, 1, 3])
col_idx = np.array([2, 4, 0])
print(a[row_idx, col_idx])  # elements (0,2), (1,4), (3,0)

Fancy indexing can select arbitrary subsets, but beware: the result is always a new array.

Definition:

Boolean Indexing (Masking)

Boolean indexing selects elements where a boolean array is True. It always returns a copy (a 1-D array of selected elements).

a = np.array([1.0, -2.5, 3.7, -0.1, 4.2])
mask = a > 0
print(a[mask])           # [1.0, 3.7, 4.2]

# Common pattern: replace negative values with zero
a[a < 0] = 0.0           # modifies a in place via boolean assignment

Boolean indexing is the NumPy equivalent of SQL's WHERE clause.

Definition:

np.ix_ β€” Open Mesh Indexing

np.ix_ creates an open mesh from multiple 1-D index arrays, enabling selection of rectangular sub-arrays via fancy indexing:

a = np.arange(20).reshape(4, 5)
rows = [0, 2, 3]
cols = [1, 4]

# Without np.ix_: gets 3 elements (paired indexing)
# a[rows, cols]  -> ERROR: shape mismatch

# With np.ix_: gets 3x2 sub-matrix
sub = a[np.ix_(rows, cols)]
print(sub.shape)  # (3, 2)
# sub = [[1, 4], [11, 14], [16, 19]]

np.ix_ reshapes each index array to broadcast correctly: rows becomes shape (3, 1), cols becomes shape (1, 2).

Definition:

np.einsum β€” Einstein Summation

np.einsum expresses tensor operations using Einstein summation notation. Repeated indices are summed over; free indices define the output shape.

A = np.random.randn(3, 4)
B = np.random.randn(4, 5)

# Matrix multiply: C_ik = sum_j A_ij B_jk
C = np.einsum('ij,jk->ik', A, B)

# Trace: sum of diagonal
tr = np.einsum('ii->', np.eye(4))   # 4.0

# Batch matrix multiply: (batch, m, k) @ (batch, k, n) -> (batch, m, n)
X = np.random.randn(10, 3, 4)
Y = np.random.randn(10, 4, 5)
Z = np.einsum('bij,bjk->bik', X, Y)  # shape (10, 3, 5)

# Outer product
u = np.array([1, 2, 3])
v = np.array([4, 5])
outer = np.einsum('i,j->ij', u, v)  # shape (3, 2)

Theorem: Fancy Indexing Shape Rule

When indexing an array a with integer arrays idx0, idx1, ..., all index arrays must be broadcastable to a common shape, and the output shape is that common broadcast shape (plus any non-indexed axes from a).

Think of fancy indexing as a mapping: for each position in the output, the index arrays tell NumPy which element of a to fetch. The index arrays are broadcast together to define the mapping grid.

Theorem: Einstein Summation Rules

In np.einsum('subscripts', *operands):

  1. Each operand gets a subscript string of single-letter indices
  2. Indices appearing in two or more operands but not in the output are summed over (contracted)
  3. Indices appearing in exactly one operand and in the output are free indices (define output axes)
  4. The output shape is determined by the free indices in the order they appear in the output subscript

Repeated index = contraction (generalized dot product). Free index = that axis survives in the output. This notation can express any linear algebra operation.

Example: Masking Invalid Data

Given a signal array with some NaN values, compute the mean of valid samples using boolean indexing.

Example: Extracting Sub-matrices with np.ix_

From a 6x6 correlation matrix, extract the 3x3 sub-matrix corresponding to variables 0, 2, and 5.

Example: Pairwise Distances with einsum

Compute the pairwise Euclidean distance matrix for nn points in dd dimensions using np.einsum.

Einsum Operation Explorer

Visualize different einsum operations: see the input shapes, subscript notation, contracted indices, and output shape.

Parameters

Quick Check

What does a[np.array([True, False, True, False])] return for a = np.array([10, 20, 30, 40])?

np.array([10, 30])

np.array([20, 40])

np.array([10, 20, 30, 40])

A view of a

Quick Check

What does np.einsum('ij,ij->', A, B) compute for matrices A and B of the same shape?

The sum of element-wise products (Frobenius inner product)

Matrix multiplication A @ B

Element-wise product (Hadamard)

The trace of A @ B

Common Mistake: Fancy Index Shape Mismatch

Mistake:

Using two index arrays of different shapes expecting paired indexing:

a = np.arange(20).reshape(4, 5)
rows = [0, 2, 3]
cols = [1, 4]
a[rows, cols]  # ValueError: shape mismatch

Correction:

Use np.ix_ for rectangular sub-array selection, or ensure index arrays have the same length for paired element selection:

a[np.ix_(rows, cols)]  # 3x2 sub-array

fancy indexing

Indexing an array with integer or boolean arrays. Always produces a copy.

Related: basic indexing, boolean indexing

einsum

Einstein summation: a compact notation for expressing tensor contractions, dot products, outer products, and other linear algebra operations.

Related: Broadcasting, tensor contraction

Advanced Indexing

python
Fancy indexing, boolean masking, np.ix_, multi-dimensional indexing patterns.
# Code from: ch05/python/advanced_indexing.py
# Load from backend supplements endpoint

Einsum Cookbook

python
np.einsum recipes: matrix multiply, trace, batch operations, outer products.
# Code from: ch05/python/einsum_cookbook.py
# Load from backend supplements endpoint

Key Takeaway

Basic slicing returns views; fancy and boolean indexing return copies. Use np.ix_ for rectangular sub-array selection. np.einsum can express virtually any linear algebra operation in a single line.