Exercises
ex-sp-ch05-01
Easy: Create a 5x4 array of float64 zeros. Print its shape, dtype, strides, and total number of bytes. Verify it is C-contiguous.
Use np.zeros((5, 4)) and inspect .flags["C_CONTIGUOUS"].
Solution
import numpy as np
a = np.zeros((5, 4), dtype=np.float64)
print(f"shape: {a.shape}") # (5, 4)
print(f"dtype: {a.dtype}") # float64
print(f"strides: {a.strides}") # (32, 8)
print(f"nbytes: {a.nbytes}") # 160
print(f"C-contig: {a.flags['C_CONTIGUOUS']}") # True
ex-sp-ch05-02
Easy: Given a = np.arange(20).reshape(4, 5), use boolean indexing to
extract all elements greater than 10 and less than 18.
Combine conditions with & and parentheses.
Solution
a = np.arange(20).reshape(4, 5)
mask = (a > 10) & (a < 18)
result = a[mask]
print(result) # [11 12 13 14 15 16 17]
ex-sp-ch05-03
Easy: Add a bias vector b = np.array([1, 2, 3]) to every row of a
matrix X = np.ones((5, 3)) using broadcasting (no loops).
Broadcasting aligns shapes from the right: (5, 3) + (3,) works directly.
Solution
X = np.ones((5, 3))
b = np.array([1, 2, 3])
result = X + b # shape (5, 3)
print(result)
# [[2. 3. 4.]
# [2. 3. 4.]
# ...]
ex-sp-ch05-04
Easy: Replace a Python loop that computes the ReLU function
(max(0, x) for each element) with a single vectorized call.
Use np.maximum(x, 0) or np.where(x > 0, x, 0).
Solution
x = np.array([-2, -1, 0, 1, 2, 3], dtype=np.float64)
relu = np.maximum(x, 0)
print(relu) # [0. 0. 0. 1. 2. 3.]
ex-sp-ch05-05
Easy: Generate 1000 samples from a Gaussian distribution with mean 5 and standard deviation 2 using the modern Generator API. Compute and print the sample mean and standard deviation.
Use np.random.default_rng(42).normal(5, 2, 1000).
Solution
rng = np.random.default_rng(42)
samples = rng.normal(loc=5.0, scale=2.0, size=1000)
print(f"mean: {samples.mean():.3f}") # ~5.0
print(f"std: {samples.std():.3f}") # ~2.0
ex-sp-ch05-06
Medium: For a = np.arange(24).reshape(2, 3, 4), compute the strides by hand
for both C-order and Fortran-order. Verify with NumPy.
C-order stride for axis k: product of all subsequent dimensions times element size.
F-order stride for axis k: product of all preceding dimensions times element size.
C-order strides
Element size: 8 bytes (int64 on most systems).
- Axis 0: 3 × 4 × 8 = 96 bytes
- Axis 1: 4 × 8 = 32 bytes
- Axis 2: 8 bytes
Strides: (96, 32, 8).
F-order strides
- Axis 0: 8 bytes
- Axis 1: 2 × 8 = 16 bytes
- Axis 2: 2 × 3 × 8 = 48 bytes
Strides: (8, 16, 48).
Verification
a_c = np.arange(24).reshape(2, 3, 4)
a_f = np.asfortranarray(a_c)
print(a_c.strides) # (96, 32, 8)
print(a_f.strides) # (8, 16, 48)
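The stride formulas can also be evaluated directly from the shape; a small sketch, assuming 8-byte (int64) elements:

```python
import numpy as np

shape = (2, 3, 4)
itemsize = 8  # bytes per int64 element

# C-order: stride for axis k = product of all later dims * itemsize
c_strides = tuple(int(np.prod(shape[k + 1:])) * itemsize for k in range(3))
# F-order: stride for axis k = product of all earlier dims * itemsize
f_strides = tuple(int(np.prod(shape[:k])) * itemsize for k in range(3))

print(c_strides)  # (96, 32, 8)
print(f_strides)  # (8, 16, 48)
```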
ex-sp-ch05-07
Medium: Use np.ix_ to extract a 3x3 sub-matrix from a 6x6 matrix,
selecting rows [0, 2, 5] and columns [1, 3, 4]. Verify the result
is a copy.
np.ix_([0,2,5], [1,3,4]) creates the open mesh.
Solution
a = np.arange(36).reshape(6, 6)
sub = a[np.ix_([0, 2, 5], [1, 3, 4])]
print(sub.shape) # (3, 3)
print(sub)
# [[ 1  3  4]
#  [13 15 16]
#  [31 33 34]]
print(np.shares_memory(a, sub)) # False — copy
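For contrast, basic slicing of the same matrix returns a view rather than a copy, while integer (fancy) indexing always copies; a quick illustrative check:

```python
import numpy as np

a = np.arange(36).reshape(6, 6)
view = a[0:3, 1:4]       # basic slicing: a view into a's buffer
fancy = a[[0, 2, 5]]     # integer (fancy) indexing: always a copy
print(np.shares_memory(a, view))   # True
print(np.shares_memory(a, fancy))  # False
```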
ex-sp-ch05-08
Medium: Compute the outer product of two vectors u = [1, 2, 3] and
v = [4, 5, 6, 7] using: (a) broadcasting, (b) np.outer,
and (c) np.einsum. Verify all three give the same result.
Broadcasting: u[:, None] * v[None, :].
einsum: np.einsum("i,j->ij", u, v).
Three approaches
u = np.array([1, 2, 3])
v = np.array([4, 5, 6, 7])
r1 = u[:, None] * v[None, :] # broadcasting
r2 = np.outer(u, v) # explicit
r3 = np.einsum('i,j->ij', u, v) # einsum
assert np.array_equal(r1, r2)
assert np.array_equal(r1, r3)
print(r1.shape) # (3, 4)
ex-sp-ch05-09
Medium: Replace the following loop with a vectorized version using np.select:
result = np.empty(len(x))
for i in range(len(x)):
    if x[i] < -1:
        result[i] = -1
    elif x[i] > 1:
        result[i] = 1
    else:
        result[i] = x[i] ** 3
Define three conditions and three choices for np.select.
Vectorized solution
x = np.linspace(-3, 3, 1000)
conditions = [x < -1, x > 1]
choices = [-1, 1]
result = np.select(conditions, choices, default=x ** 3)
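To confirm the vectorized version reproduces the loop exactly, the two can be run side by side; an illustrative check:

```python
import numpy as np

x = np.linspace(-3, 3, 1000)

# Original loop
expected = np.empty(len(x))
for i in range(len(x)):
    if x[i] < -1:
        expected[i] = -1
    elif x[i] > 1:
        expected[i] = 1
    else:
        expected[i] = x[i] ** 3

# Vectorized version
result = np.select([x < -1, x > 1], [-1, 1], default=x ** 3)
assert np.array_equal(result, expected)
```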
ex-sp-ch05-10
Medium: Create a structured array with fields (id: int32, x: float64, y: float64, label: U5) and populate it with 10 random points. Sort by the x field.
Use np.dtype with a list of (name, type) tuples.
Solution
dt = np.dtype([('id', np.int32), ('x', np.float64),
               ('y', np.float64), ('label', 'U5')])
rng = np.random.default_rng(42)
points = np.zeros(10, dtype=dt)
points['id'] = np.arange(10)
points['x'] = rng.uniform(-5, 5, 10)
points['y'] = rng.uniform(-5, 5, 10)
points['label'] = [f"P{i}" for i in range(10)]
sorted_pts = np.sort(points, order='x')
print(sorted_pts[['id', 'x']])
ex-sp-ch05-11
Medium: Generate 4 independent streams of 1000 Gaussian samples using
SeedSequence.spawn(). Verify that the streams are uncorrelated
by computing pairwise correlation coefficients.
Use np.corrcoef on the stacked streams.
Solution
ss = np.random.SeedSequence(42)
rngs = [np.random.default_rng(s) for s in ss.spawn(4)]
streams = np.array([rng.standard_normal(1000) for rng in rngs])
corr = np.corrcoef(streams)
print(corr)
# Diagonal is 1.0; off-diagonal values should be close to 0
off_diag = corr[~np.eye(4, dtype=bool)]
print(f"Max off-diagonal correlation: {np.abs(off_diag).max():.4f}")
ex-sp-ch05-12
Hard: Implement a function sliding_window_view(a, window_size) that returns
a 2-D array where row i contains a[i:i+window_size], using only
stride manipulation (no loops, no copying).
Hint: np.lib.stride_tricks.as_strided can create a view with custom
strides.
The output shape is (len(a) - window_size + 1, window_size).
The strides should be (a.strides[0], a.strides[0]) — same stride along both axes.
Stride trick implementation
from numpy.lib.stride_tricks import as_strided
def sliding_window_view(a, window_size):
    shape = (len(a) - window_size + 1, window_size)
    strides = (a.strides[0], a.strides[0])
    return as_strided(a, shape=shape, strides=strides)
# Or use the built-in (NumPy >= 1.20):
# np.lib.stride_tricks.sliding_window_view(a, window_size)
a = np.arange(10)
windows = sliding_window_view(a, 3)
print(windows)
# [[0, 1, 2], [1, 2, 3], ..., [7, 8, 9]]
print(np.shares_memory(a, windows)) # True — it is a view!
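The built-in mentioned in the comment can serve as a cross-check; unlike raw as_strided it returns a read-only view by default, which prevents accidental writes through overlapping windows:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view as swv

a = np.arange(10)
windows = swv(a, 3)
print(windows.shape)                  # (8, 3)
print(np.shares_memory(a, windows))   # True — still a view
print(windows.flags.writeable)        # False — read-only by default
```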
ex-sp-ch05-13
Hard: Compute the pairwise Euclidean distance matrix for n points in d
dimensions using three different methods: (a) broadcasting, (b) einsum,
and (c) scipy.spatial.distance.cdist. Benchmark all three for
n=1000, d=3.
Broadcasting: diff = X[:, None, :] - X[None, :, :].
einsum for squared norms: np.einsum("ij,ij->i", X, X).
Three implementations
import time
from scipy.spatial.distance import cdist
rng = np.random.default_rng(42)
X = rng.standard_normal((1000, 3))
# (a) Broadcasting
t0 = time.perf_counter()
diff = X[:, None, :] - X[None, :, :]
D1 = np.sqrt((diff ** 2).sum(axis=-1))
print(f"Broadcasting: {time.perf_counter()-t0:.4f}s")
# (b) einsum
t0 = time.perf_counter()
sq = np.einsum('ij,ij->i', X, X)
cross = np.einsum('ij,kj->ik', X, X)
D2 = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2*cross, 0))
print(f"einsum: {time.perf_counter()-t0:.4f}s")
# (c) cdist
t0 = time.perf_counter()
D3 = cdist(X, X, metric='euclidean')
print(f"cdist: {time.perf_counter()-t0:.4f}s")
assert np.allclose(D1, D2, atol=1e-10)
assert np.allclose(D1, D3, atol=1e-10)
ex-sp-ch05-14
Hard: Use np.einsum to implement batch matrix multiplication:
given A of shape (batch, m, k) and B of shape (batch, k, n),
compute C of shape (batch, m, n) where C[i] = A[i] @ B[i].
Verify against np.matmul.
einsum subscripts: "bij,bjk->bik".
Solution
rng = np.random.default_rng(42)
batch, m, k, n = 32, 4, 5, 3
A = rng.standard_normal((batch, m, k))
B = rng.standard_normal((batch, k, n))
C_einsum = np.einsum('bij,bjk->bik', A, B)
C_matmul = np.matmul(A, B) # or A @ B
assert np.allclose(C_einsum, C_matmul)
print(f"Shape: {C_einsum.shape}") # (32, 4, 3)
ex-sp-ch05-15
Hard: Write a function that generates circularly-symmetric complex Gaussian
noise with a given spatial covariance matrix C (shape
(n_antennas, n_antennas)). Generate 10,000 samples and verify that the
sample covariance matches C.
Use the Cholesky decomposition: C = L L^H.
Generate i.i.d. CN(0, 1) samples and color them with L.
Solution
def correlated_complex_gaussian(rng, C, n_samples):
    n = C.shape[0]
    L = np.linalg.cholesky(C)
    # i.i.d. CN(0, 1) samples: unit variance split between real and imag parts
    z = (rng.standard_normal((n_samples, n))
         + 1j * rng.standard_normal((n_samples, n))) / np.sqrt(2)
    # Right-multiply by L^H so that E[X^H X] / N = L L^H = C
    return z @ L.conj().T  # shape (n_samples, n)
rng = np.random.default_rng(42)
n_ant = 4
# Create a valid positive-definite Hermitian covariance
A = rng.standard_normal((n_ant, n_ant)) + 1j * rng.standard_normal((n_ant, n_ant))
C = A @ A.conj().T / n_ant
samples = correlated_complex_gaussian(rng, C, 10_000)
C_est = samples.conj().T @ samples / 10_000
print(f"Max error: {np.abs(C - C_est).max():.4f}") # should be small
ex-sp-ch05-16
Hard: Create a memory-mapped file containing a 10,000 x 100 float32 array. Write data in chunks of 1,000 rows, then memory-map it read-only and compute column-wise statistics without loading the entire file.
Use np.memmap with mode="w+" for writing.
Solution
import tempfile, os
rng = np.random.default_rng(42)
path = os.path.join(tempfile.gettempdir(), 'test_memmap.bin')
# Write in chunks
mm_w = np.memmap(path, dtype=np.float32, mode='w+',
                 shape=(10_000, 100))
for start in range(0, 10_000, 1000):
    mm_w[start:start+1000] = rng.standard_normal((1000, 100)).astype(np.float32)
mm_w.flush()
del mm_w
# Read-only access
mm_r = np.memmap(path, dtype=np.float32, mode='r',
                 shape=(10_000, 100))
col_means = mm_r.mean(axis=0)  # the OS pages data in on demand
col_stds = mm_r.std(axis=0)
print(f"Mean range: [{col_means.min():.3f}, {col_means.max():.3f}]")
print(f"Std range: [{col_stds.min():.3f}, {col_stds.max():.3f}]")
os.remove(path)
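If the reduction itself must stay memory-bounded, rather than relying on the OS to page the mapping in and out, the column statistics can be accumulated explicitly chunk by chunk. A self-contained sketch with a smaller file (file name and sizes are illustrative):

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.gettempdir(), 'demo_chunked.bin')
rows, cols, chunk = 2_000, 100, 500
rng = np.random.default_rng(0)

# Write a small float32 file in chunks
mm = np.memmap(path, dtype=np.float32, mode='w+', shape=(rows, cols))
for start in range(0, rows, chunk):
    mm[start:start + chunk] = rng.standard_normal((chunk, cols)).astype(np.float32)
mm.flush()

# Accumulate column sums in float64, one chunk resident at a time
s = np.zeros(cols)
sq = np.zeros(cols)
for start in range(0, rows, chunk):
    block = mm[start:start + chunk].astype(np.float64)
    s += block.sum(axis=0)
    sq += (block ** 2).sum(axis=0)
means = s / rows
stds = np.sqrt(sq / rows - means ** 2)

# Cross-check against a full in-memory reduction (fine at this size)
assert np.allclose(means, np.asarray(mm, dtype=np.float64).mean(axis=0))
del mm
os.remove(path)
```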
ex-sp-ch05-17
Challenge: Implement a vectorized 2-D convolution (no loops) using stride tricks.
Given an image of shape (H, W) and a kernel of shape (kH, kW),
create a view of all patches using as_strided, then compute the
convolution with a single einsum call.
Output shape: (H - kH + 1, W - kW + 1).
Use as_strided to create shape (out_H, out_W, kH, kW) with appropriate strides.
Then np.einsum("ijkl,kl->ij", patches, kernel).
Solution
from numpy.lib.stride_tricks import as_strided
def conv2d_strided(image, kernel):
    # Without flipping the kernel this is strictly cross-correlation,
    # the convention used by most ML libraries.
    H, W = image.shape
    kH, kW = kernel.shape
    out_H, out_W = H - kH + 1, W - kW + 1
    s0, s1 = image.strides  # bytes per row, bytes per col
    patches = as_strided(
        image,
        shape=(out_H, out_W, kH, kW),
        strides=(s0, s1, s0, s1),
    )
    return np.einsum('ijkl,kl->ij', patches, kernel)
# Test
rng = np.random.default_rng(42)
img = rng.standard_normal((100, 100))
kern = np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=np.float64) # Sobel
edges = conv2d_strided(img, kern)
print(f"Output shape: {edges.shape}") # (98, 98)
ex-sp-ch05-18
Challenge: Benchmark the performance difference between iterating over rows (axis 0) vs columns (axis 1) for both C-contiguous and Fortran-contiguous arrays. Create a 5000x5000 matrix and measure the time to compute row sums and column sums for each layout. Explain the results in terms of cache behavior.
Use np.asfortranarray to create F-contiguous version.
Time a.sum(axis=0) and a.sum(axis=1) for each layout.
Benchmark and analysis
import time
n = 5000
rng = np.random.default_rng(42)
c_arr = rng.standard_normal((n, n))
f_arr = np.asfortranarray(c_arr)
results = {}
for name, arr in [('C', c_arr), ('F', f_arr)]:
    for axis, label in [(0, 'col_sum'), (1, 'row_sum')]:
        t0 = time.perf_counter()
        for _ in range(10):
            _ = arr.sum(axis=axis)
        elapsed = (time.perf_counter() - t0) / 10
        results[f"{name}_{label}"] = elapsed
for k, v in results.items():
print(f"{k}: {v:.4f}s")
# Expected: C_row_sum < C_col_sum and F_col_sum < F_row_sum.
# Summing along the contiguous axis streams memory sequentially, so every
# cache line fetched from RAM is fully used; summing across it strides
# through memory and wastes most of each cache line.
ex-sp-ch05-19
Challenge: Implement a Monte Carlo simulation that estimates pi using vectorized NumPy operations. Run it with 10^7 samples. Then run 100 independent trials with different seeds to estimate the standard error and construct a 95% confidence interval for pi.
The fraction of random points in the unit square that fall inside the unit circle is pi/4.
Use SeedSequence.spawn(100) for independent trials.
Solution
def estimate_pi(rng, n):
    x = rng.uniform(0, 1, n)
    y = rng.uniform(0, 1, n)
    return 4 * np.mean(x**2 + y**2 <= 1)
# Single high-precision estimate
rng = np.random.default_rng(42)
pi_hat = estimate_pi(rng, 10_000_000)
print(f"Pi estimate: {pi_hat:.6f}")
# 100 independent trials
ss = np.random.SeedSequence(42)
rngs = [np.random.default_rng(s) for s in ss.spawn(100)]
estimates = np.array([estimate_pi(rng, 10_000_000) for rng in rngs])
mean = estimates.mean()
se = estimates.std() / np.sqrt(len(estimates))
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(f"Mean: {mean:.6f}, SE: {se:.6f}")
print(f"95% CI: ({ci[0]:.6f}, {ci[1]:.6f})")
ex-sp-ch05-20
Challenge: Use np.einsum to compute the following tensor operations without
any loops:
- Trace of a matrix: tr(A) = sum_i A[i, i]
- Matrix-vector product: y = A x
- Quadratic form: x^T A x
- Batch outer products: given vectors u[i] and v[i] (both shape (batch, d)),
  compute all outer products u[i] v[i]^T
- Tensor contraction: contract a 4-D tensor T[i, j, k, l] with a matrix
  M[k, m] along axis k
Verify each result against explicit NumPy operations.
Trace: "ii->"; matrix-vector: "ij,j->i"; quadratic: "i,ij,j->".
All five operations
rng = np.random.default_rng(42)
n, d, batch = 5, 4, 10
A = rng.standard_normal((n, n))
x = rng.standard_normal(n)
# 1. Trace
tr_ein = np.einsum('ii->', A)
assert np.isclose(tr_ein, np.trace(A))
# 2. Matrix-vector
mv_ein = np.einsum('ij,j->i', A, x)
assert np.allclose(mv_ein, A @ x)
# 3. Quadratic form
qf_ein = np.einsum('i,ij,j->', x, A, x)
assert np.isclose(qf_ein, x @ A @ x)
# 4. Batch outer products
u = rng.standard_normal((batch, d))
v = rng.standard_normal((batch, d))
bo_ein = np.einsum('bi,bj->bij', u, v)
bo_exp = u[:, :, None] * v[:, None, :]
assert np.allclose(bo_ein, bo_exp)
# 5. Tensor contraction
T = rng.standard_normal((3, 4, 5, 6))
M = rng.standard_normal((5, 7))
tc_ein = np.einsum('ijkl,km->ijml', T, M)
tc_exp = np.tensordot(T, M, axes=([2], [0]))
# tensordot puts contracted result at end; transpose to match
assert np.allclose(tc_ein, tc_exp.transpose(0, 1, 3, 2))