Ferkans — Interactive Telecom Tutor

ex-ch02-01

Easy

A BPSK receiver observes a signal $r$ and must decide between hypotheses $H_0$ (bit $0$ sent) and $H_1$ (bit $1$ sent). The channel model is $r = (-1)^{b}\,A + n,$ where $A = 1$ , $b \in \{0,1\}$ is the transmitted bit, and $n \sim \mathcal{N}(0,\sigma^2)$ with $\sigma^2 = 1$ . The prior probabilities are $P(H_0) = 0.6$ and $P(H_1) = 0.4$ .

Given an observation $r = 0.5$ , use Bayes' theorem to compute the posterior probabilities $P(H_0 \mid r = 0.5)$ and $P(H_1 \mid r = 0.5)$ .

Show Hint

Under $H_0$ the transmitted signal is $+A = 1$ , so $r \mid H_0 \sim \mathcal{N}(1, 1)$ . Under $H_1$ the transmitted signal is $-A = -1$ , so $r \mid H_1 \sim \mathcal{N}(-1, 1)$ .

Evaluate the likelihoods: $f(r=0.5 \mid H_0) = \frac{1}{\sqrt{2\pi}} e^{-0.125}$ and $f(r=0.5 \mid H_1) = \frac{1}{\sqrt{2\pi}} e^{-1.125}$ .

Apply Bayes' rule: $P(H_i \mid r) = \frac{f(r \mid H_i)\,P(H_i)}{f(r \mid H_0)\,P(H_0) + f(r \mid H_1)\,P(H_1)}$ .

Solution

Likelihoods

Under $H_0$ : $r \mid H_0 \sim \mathcal{N}(1, 1)$ , so $f(r{=}0.5 \mid H_0) = \frac{1}{\sqrt{2\pi}}\exp\!\Bigl(-\frac{(0.5 - 1)^2}{2}\Bigr) = \frac{1}{\sqrt{2\pi}}\,e^{-0.125}.$

Under $H_1$ : $r \mid H_1 \sim \mathcal{N}(-1, 1)$ , so $f(r{=}0.5 \mid H_1) = \frac{1}{\sqrt{2\pi}}\exp\!\Bigl(-\frac{(0.5 + 1)^2}{2}\Bigr) = \frac{1}{\sqrt{2\pi}}\,e^{-1.125}.$

Evidence (marginal likelihood)

$f(r{=}0.5) = f(r \mid H_0)\,P(H_0) + f(r \mid H_1)\,P(H_1) = \frac{1}{\sqrt{2\pi}}\bigl(0.6\,e^{-0.125} + 0.4\,e^{-1.125}\bigr).KATEXPLACEHOLDER0ENDf(r{=}0.5) \approx \frac{1}{\sqrt{2\pi}}(0.5295 + 0.1299) = \frac{0.6594}{\sqrt{2\pi}}.$ $

Posterior probabilities via Bayes' theorem

$P(H_0 \mid r{=}0.5) = \frac{0.6 \times 0.8825}{0.6594} = \frac{0.5295}{0.6594} \approx 0.8031.KATEXPLACEHOLDER0ENDP(H_1 \mid r{=}0.5) = \frac{0.4 \times 0.3247}{0.6594} = \frac{0.1299}{0.6594} \approx 0.1969.$ $Sanity check:$ 0.8031 + 0.1969 = 1.0000 $. The observation$ r = 0.5 $is closer to$ +1 $than to$ -1 $, so the posterior strongly favors$ H_0 $.$ \square$

ex-ch02-02

Easy

The inter-arrival time of packets at a network node is modeled as an exponential random variable $X$ with rate parameter $\lambda = 2$ packets/ms.

(a) Write the PDF $f_X(x)$ and CDF $F_X(x)$ .

(b) Compute the mean $E[X]$ and variance $\operatorname{Var}(X)$ .

(c) Find $P(X > 1\;\text{ms})$ .

Show Hint

For an exponential RV with rate $\lambda$ , the PDF is $f_X(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ .

The CDF is $F_X(x) = 1 - e^{-\lambda x}$ . The mean is $1/\lambda$ and the variance is $1/\lambda^2$ .

Use the survival function: $P(X > t) = 1 - F_X(t) = e^{-\lambda t}$ .

Solution

PDF and CDF

For an exponential random variable with rate $\lambda = 2$ : $f_X(x) = 2\,e^{-2x}, \quad x \ge 0,$ $F_X(x) = 1 - e^{-2x}, \quad x \ge 0.$

Mean and variance

The mean of an $\text{Exp}(\lambda)$ random variable is $E[X] = \frac{1}{\lambda} = \frac{1}{2} = 0.5\;\text{ms}.$

The variance is $\operatorname{Var}(X) = \frac{1}{\lambda^2} = \frac{1}{4} = 0.25\;\text{ms}^2.$

Tail probability

$P(X > 1) = 1 - F_X(1) = e^{-2 \cdot 1} = e^{-2} \approx 0.1353.$ $Thus there is approximately a$ 13.5% $chance that the inter-arrival time exceeds$ 1 $ms.$ \square$

ex-ch02-03

Easy

Let $X \sim \mathcal{N}(0, \sigma^2)$ model the noise voltage at a receiver front-end, with $\sigma^2 = 4$ . Define the instantaneous power as $Y = X^2$ .

Using the law of the unconscious statistician (LOTUS), compute $E[Y]$ and $E[Y^2]$ , then find $\operatorname{Var}(Y)$ .

Show Hint

LOTUS states $E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$ . Here $g(x) = x^2$ .

For a zero-mean Gaussian, $E[X^2] = \sigma^2$ and $E[X^4] = 3\sigma^4$ (the fourth moment).

$\operatorname{Var}(Y) = E[Y^2] - (E[Y])^2 = E[X^4] - (E[X^2])^2$ .

Solution

Compute $E[Y] = E[X^2]$ via LOTUS

By LOTUS: $E[Y] = E[X^2] = \int_{-\infty}^{\infty} x^2 \cdot \frac{1}{\sqrt{2\pi \cdot 4}}\, e^{-x^2/8}\,dx = \sigma^2 = 4.$

This follows directly from the definition of variance for a zero-mean RV: $\operatorname{Var}(X) = E[X^2] - (E[X])^2 = E[X^2] = \sigma^2$ .

Compute $E[Y^2] = E[X^4]$

The fourth central moment of a Gaussian $\mathcal{N}(0,\sigma^2)$ is $E[X^4] = 3\sigma^4.$

This can be derived via LOTUS and integration by parts, or by differentiating the moment generating function twice. With $\sigma^2 = 4$ : $E[Y^2] = E[X^4] = 3 \cdot 16 = 48.$

Variance of the instantaneous power

$\operatorname{Var}(Y) = E[Y^2] - (E[Y])^2 = 48 - 16 = 32.$ $Equivalently, since$ Y = X^2 $and$ X \sim \mathcal{N}(0,4) $, we have$ Y/4 \sim \chi^2_1 $(chi-squared with one degree of freedom), whose variance is$ 2 $. Therefore$ \operatorname{Var}(Y) = 16 \cdot 2 = 32 $.$ \square$

ex-ch02-04

Easy

Let $(X, Y)$ be a jointly Gaussian random vector with joint PDF $f_{X,Y}(x,y) = \frac{1}{2\pi\sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp\!\biggl(-\frac{1}{2(1-\rho^2)} \Bigl[\frac{x^2}{\sigma_X^2} - \frac{2\rho\,x\,y}{\sigma_X \sigma_Y} + \frac{y^2}{\sigma_Y^2}\Bigr]\biggr),$ where $\sigma_X = 1$ , $\sigma_Y = 2$ , $\rho = 0.5$ , and both means are zero.

Obtain the marginal PDF $f_X(x)$ by integrating over $y$ .

Show Hint

Write the exponent as a function of $y$ by completing the square: group terms involving $y$ and separate the $x$ -only part.

After completing the square, the $y$ -integral is a Gaussian integral of the form $\int_{-\infty}^{\infty} e^{-\alpha(y - \mu)^2}\,dy = \sqrt{\pi/\alpha}$ .

The remaining $x$ -dependent factor should yield $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X}\,e^{-x^2/(2\sigma_X^2)}$ .

Solution

Set up the marginal integral

$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy = \frac{1}{2\pi \cdot 1 \cdot 2 \cdot \sqrt{1-0.25}} \int_{-\infty}^{\infty} \exp\!\Bigl(-\frac{1}{2 \cdot 0.75} \Bigl[x^2 - \frac{2 \cdot 0.5\,x\,y}{2} + \frac{y^2}{4}\Bigr]\Bigr)\,dy.$ $Simplifying the prefactor:$ 2\pi \cdot 2\sqrt{0.75} = 4\pi\sqrt{3}/2 = 2\pi\sqrt{3}$.

Complete the square in $y$

Inside the exponent, group terms involving $y$ : $\frac{y^2}{4} - \frac{0.5\,x\,y}{1} = \frac{1}{4}\Bigl(y^2 - 2xy\Bigr) = \frac{1}{4}\Bigl[(y - x)^2 - x^2\Bigr].$

The full exponent becomes $-\frac{1}{1.5}\Bigl[x^2 + \frac{(y-x)^2}{4} - \frac{x^2}{4}\Bigr] = -\frac{1}{1.5}\cdot\frac{3x^2}{4} - \frac{(y - x)^2}{6}.$

That is, $-\frac{x^2}{2} - \frac{(y - x)^2}{6}$ .

Evaluate the Gaussian integral and obtain $f_X(x)$

$f_X(x) = \frac{e^{-x^2/2}}{2\pi\sqrt{3}} \int_{-\infty}^{\infty} e^{-(y-x)^2/6}\,dy.KATEXPLACEHOLDER0ENDf_X(x) = \frac{e^{-x^2/2}}{2\pi\sqrt{3}} \cdot \sqrt{6\pi} = \frac{e^{-x^2/2}}{2\pi\sqrt{3}} \cdot \sqrt{6\pi} = \frac{e^{-x^2/2}}{\sqrt{2\pi}}.$ $Therefore$ f_X(x) = \frac{1}{\sqrt{2\pi}},e^{-x^2/2} $, confirming that$ X \sim \mathcal{N}(0, 1) $, consistent with the marginal parameters.$ \square$

ex-ch02-05

Easy

Packets arrive at a base station according to a Poisson process with rate $\lambda = 5$ packets per second.

(a) What is the probability that exactly $3$ packets arrive in a $1$ -second interval?

(b) What is the probability that no packets arrive in a $0.5$ -second interval?

(c) Find the expected number of packets in a $10$ -second window and its variance.

Show Hint

For a Poisson process with rate $\lambda$ , the number of arrivals in $[0,t]$ is $N(t) \sim \text{Poisson}(\lambda t)$ .

$P(N(t) = k) = \frac{(\lambda t)^k}{k!}\,e^{-\lambda t}$ .

Both the mean and variance of a Poisson random variable equal $\lambda t$ .

Solution

Part (a): $P(N(1) = 3)$

With $\lambda t = 5 \cdot 1 = 5$ : $P(N(1) = 3) = \frac{5^3}{3!}\,e^{-5} = \frac{125}{6}\,e^{-5} \approx 20.833 \times 0.006738 \approx 0.1404.$

Part (b): $P(N(0.5) = 0)$

With $\lambda t = 5 \cdot 0.5 = 2.5$ : $P(N(0.5) = 0) = \frac{2.5^0}{0!}\,e^{-2.5} = e^{-2.5} \approx 0.0821.$

Part (c): mean and variance over 10 seconds

For a Poisson process, $E[N(t)] = \operatorname{Var}(N(t)) = \lambda t$ . With $t = 10$ : $E[N(10)] = 5 \times 10 = 50\;\text{packets},$ $\operatorname{Var}(N(10)) = 50\;\text{packets}^2.$

The standard deviation is $\sqrt{50} \approx 7.07$ packets. $\square$

ex-ch02-06

Medium

A multi-stage detection system processes a received signal through two successive classifiers. The first classifier $C_1$ declares "signal present" ( $D_1$ ) with probability $P(D_1 \mid H_1) = 0.95$ when a signal is truly present, and has a false alarm rate $P(D_1 \mid H_0) = 0.10$ . The second classifier $C_2$ takes the output of $C_1$ and refines it: $P(D_2 \mid D_1, H_1) = 0.98$ and $P(D_2 \mid D_1, H_0) = 0.05$ .

If $C_1$ declares "no signal" ( $\bar{D}_1$ ), the system outputs $\bar{D}_2$ regardless. The prior probability of signal presence is $P(H_1) = 0.01$ .

Using the total probability theorem and Bayes' theorem, find:

(a) $P(D_2)$ , the overall probability that the system declares "signal present."

(b) $P(H_1 \mid D_2)$ , the posterior probability that a signal is truly present given a positive declaration.

Show Hint

First compute $P(D_2 \mid H_1)$ and $P(D_2 \mid H_0)$ by conditioning on $D_1$ : $P(D_2 \mid H_i) = P(D_2 \mid D_1, H_i)\,P(D_1 \mid H_i)$ .

Apply the total probability theorem: $P(D_2) = P(D_2 \mid H_1)\,P(H_1) + P(D_2 \mid H_0)\,P(H_0)$ .

Finally, use Bayes' theorem: $P(H_1 \mid D_2) = P(D_2 \mid H_1)\,P(H_1) / P(D_2)$ .

Solution

Conditional detection probabilities

Since $C_2$ only activates when $C_1$ declares $D_1$ : $P(D_2 \mid H_1) = P(D_2 \mid D_1, H_1)\,P(D_1 \mid H_1) = 0.98 \times 0.95 = 0.931.$

$P(D_2 \mid H_0) = P(D_2 \mid D_1, H_0)\,P(D_1 \mid H_0) = 0.05 \times 0.10 = 0.005.$

Total probability of detection

By the total probability theorem: $P(D_2) = P(D_2 \mid H_1)\,P(H_1) + P(D_2 \mid H_0)\,P(H_0) = 0.931 \times 0.01 + 0.005 \times 0.99 = 0.00931 + 0.00495 = 0.01426.$

Posterior probability via Bayes' theorem

$P(H_1 \mid D_2) = \frac{P(D_2 \mid H_1)\,P(H_1)}{P(D_2)} = \frac{0.00931}{0.01426} \approx 0.6529.$ $The two-stage system raises the posterior from the prior of$ 1% $to approximately$ 65% $. The cascaded detection dramatically reduces the false alarm rate from$ 10% $to$ 0.5% $, which is essential in radar and cognitive radio applications.$ \square$

ex-ch02-07

Medium

Let $X$ and $Y$ be independent, zero-mean Gaussian random variables, each with variance $\sigma^2$ , representing the in-phase and quadrature components of narrowband noise. Define the envelope $R = \sqrt{X^2 + Y^2}.$

Derive the PDF of $R$ (the Rayleigh distribution) from first principles.

Show Hint

First find the CDF: $F_R(r) = P(R \le r) = P(X^2 + Y^2 \le r^2)$ .

Switch to polar coordinates $(x, y) = (\rho \cos\theta, \rho \sin\theta)$ in the joint PDF of $(X, Y)$ .

Differentiate the CDF with respect to $r$ to obtain the PDF.

Solution

Joint PDF in Cartesian coordinates

Since $X$ and $Y$ are independent $\mathcal{N}(0, \sigma^2)$ : $f_{X,Y}(x,y) = f_X(x)\,f_Y(y) = \frac{1}{2\pi\sigma^2} \exp\!\Bigl(-\frac{x^2 + y^2}{2\sigma^2}\Bigr).$

Transform to polar coordinates

Let $x = \rho\cos\theta$ and $y = \rho\sin\theta$ , with Jacobian $|J| = \rho$ . The joint PDF in polar form is $f_{R,\Theta}(\rho,\theta) = \frac{\rho}{2\pi\sigma^2} \exp\!\Bigl(-\frac{\rho^2}{2\sigma^2}\Bigr), \quad \rho \ge 0,\; 0 \le \theta < 2\pi.$

Marginalize over $\theta$ to get $f_R(r)$

$f_R(r) = \int_0^{2\pi} f_{R,\Theta}(r,\theta)\,d\theta = \frac{r}{2\pi\sigma^2}\,e^{-r^2/(2\sigma^2)} \cdot 2\pi = \frac{r}{\sigma^2}\,e^{-r^2/(2\sigma^2)}, \quad r \ge 0.$ $This is the **Rayleigh PDF** with parameter$ \sigma $. Its mean is$ E[R] = \sigma\sqrt{\pi/2} $and its second moment is$ E[R^2] = 2\sigma^2 $. The Rayleigh distribution is fundamental in wireless communications for modeling the envelope of multipath fading.$ \square$

ex-ch02-08

Medium

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed exponential random variables with rate $\lambda$ , modeling the inter-arrival times of packets. Define the total waiting time for $n$ arrivals as $S_n = \sum_{i=1}^n X_i$ .

(a) Derive the moment generating function (MGF) $M_{S_n}(t)$ .

(b) Identify the distribution of $S_n$ from the MGF.

(c) Specialize to $n = 3$ , $\lambda = 2$ and compute $E[S_3]$ and $\operatorname{Var}(S_3)$ .

Show Hint

The MGF of a single $\text{Exp}(\lambda)$ RV is $M_X(t) = \frac{\lambda}{\lambda - t}$ for $t < \lambda$ .

For independent RVs, the MGF of the sum is the product of individual MGFs.

The resulting MGF $\left(\frac{\lambda}{\lambda - t}\right)^n$ is the MGF of a Gamma $(n, \lambda)$ (or Erlang- $n$ ) distribution.

Solution

MGF of a single exponential RV

For $X_i \sim \text{Exp}(\lambda)$ : $M_{X_i}(t) = E[e^{tX_i}] = \int_0^{\infty} e^{tx}\,\lambda e^{-\lambda x}\,dx = \lambda \int_0^{\infty} e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t}, \quad t < \lambda.$

MGF of the sum $S_n$

Since the $X_i$ are independent, the MGF of the sum factorizes: $M_{S_n}(t) = \prod_{i=1}^n M_{X_i}(t) = \left(\frac{\lambda}{\lambda - t}\right)^n, \quad t < \lambda.$

This is the MGF of the Erlang- $n$ distribution (equivalently, $\text{Gamma}(n, \lambda)$ ), with PDF $f_{S_n}(s) = \frac{\lambda^n\,s^{n-1}}{(n-1)!}\,e^{-\lambda s}, \quad s \ge 0.$

Numerical evaluation for $n=3$, $\lambda=2$

The Erlang- $n$ distribution has mean $n/\lambda$ and variance $n/\lambda^2$ : $E[S_3] = \frac{3}{2} = 1.5\;\text{time units},$ $\operatorname{Var}(S_3) = \frac{3}{4} = 0.75\;\text{time units}^2.$

In a Poisson process context, $S_3$ is the arrival time of the third event — this is the connection between inter-arrival times and the counting process. $\square$

ex-ch02-09

Medium

A random vector $\mathbf{X} = [X_1, X_2]^T$ has zero mean and covariance matrix $\mathbf{C}_{\mathbf{X}} = \begin{bmatrix} 5 & 3 \\ 3 & 5 \end{bmatrix}.$

(a) Find the eigenvalues and eigenvectors of $\mathbf{C}_{\mathbf{X}}$ .

(b) Define the Karhunen-Lo`eve transform (KLT) $\mathbf{Y} = \mathbf{Q}^T \mathbf{X}$ where $\mathbf{Q}$ is the matrix of eigenvectors. Show that the components of $\mathbf{Y}$ are uncorrelated.

(c) Interpret the result in the context of decorrelating received signal components in a $2 \times 2$ MIMO system.

Show Hint

The eigenvalues of $\mathbf{C}_{\mathbf{X}}$ are found from $(5-\lambda)^2 - 9 = 0$ .

The KLT diagonalizes the covariance: $\mathbf{C}_{\mathbf{Y}} = \mathbf{Q}^T \mathbf{C}_{\mathbf{X}} \mathbf{Q} = \mathbf{\Lambda}$ .

Uncorrelated Gaussian components become independent, enabling per-stream detection.

Solution

Eigendecomposition of $\mathbf{C}_{\mathbf{X}}$

The characteristic equation is $\det(\mathbf{C}_{\mathbf{X}} - \lambda \mathbf{I}) = (5-\lambda)^2 - 9 = \lambda^2 - 10\lambda + 16 = (\lambda - 8)(\lambda - 2) = 0.$

Eigenvalues: $\lambda_1 = 8$ , $\lambda_2 = 2$ .

For $\lambda_1 = 8$ : $(5-8)v_1 + 3v_2 = 0 \Rightarrow v_1 = v_2$ , giving $\mathbf{q}_1 = \frac{1}{\sqrt{2}}[1, 1]^T$ .

For $\lambda_2 = 2$ : $(5-2)v_1 + 3v_2 = 0 \Rightarrow v_1 = -v_2$ , giving $\mathbf{q}_2 = \frac{1}{\sqrt{2}}[1, -1]^T$ .

KLT decorrelation

With $\mathbf{Q} = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ and $\mathbf{Y} = \mathbf{Q}^T \mathbf{X}$ : $\mathbf{C}_{\mathbf{Y}} = E[\mathbf{Y}\mathbf{Y}^T] = \mathbf{Q}^T \mathbf{C}_{\mathbf{X}} \mathbf{Q} = \mathbf{Q}^T \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^T \mathbf{Q} = \mathbf{\Lambda} = \begin{bmatrix} 8 & 0 \\ 0 & 2 \end{bmatrix}.$

Since $\mathbf{C}_{\mathbf{Y}}$ is diagonal, $\operatorname{Cov}(Y_1, Y_2) = 0$ : the components are uncorrelated.

MIMO interpretation

In a $2 \times 2$ MIMO system, the received signals may be correlated due to insufficient antenna spacing or scattering geometry. The KLT rotates the coordinate system to align with the principal axes of the covariance ellipsoid.

After the transform, $Y_1$ (with variance $8$ ) carries the dominant signal energy along the sum direction $(\mathbf{q}_1)$ , while $Y_2$ (variance $2$ ) captures the weaker difference component. This is equivalent to the whitening step in MIMO receivers, and when $\mathbf{X}$ is jointly Gaussian, uncorrelatedness implies independence — enabling independent per-stream detection. $\square$

ex-ch02-10

Medium

Let $X_1, X_2, \ldots, X_n$ be i.i.d. uniform random variables on $[0, 1]$ . Define $S_n = \sum_{i=1}^n X_i$ .

(a) Compute the exact mean and variance of $S_n$ .

(b) Using the Central Limit Theorem, approximate $P(S_{12} > 7)$ .

(c) Compare with the exact value (noting that $S_{12}$ has an Irwin-Hall distribution) and comment on the accuracy of the CLT approximation.

Show Hint

For $X_i \sim \text{Uniform}(0,1)$ : $E[X_i] = 1/2$ , $\operatorname{Var}(X_i) = 1/12$ .

By the CLT, $\frac{S_n - n/2}{\sqrt{n/12}} \xrightarrow{d} \mathcal{N}(0,1)$ as $n \to \infty$ .

For $n = 12$ : $E[S_{12}] = 6$ , $\operatorname{Var}(S_{12}) = 1$ , so $S_{12} \approx \mathcal{N}(6, 1)$ .

Solution

Exact mean and variance

For each $X_i \sim \text{Uniform}(0,1)$ : $E[X_i] = \frac{1}{2}, \quad \operatorname{Var}(X_i) = \frac{1}{12}.$

By linearity of expectation and independence: $E[S_n] = \frac{n}{2}, \quad \operatorname{Var}(S_n) = \frac{n}{12}.$

CLT approximation for $P(S_{12} > 7)$

For $n = 12$ : $E[S_{12}] = 6$ and $\operatorname{Var}(S_{12}) = 1$ , so $\sigma_{S_{12}} = 1$ .

By the CLT: $P(S_{12} > 7) \approx P\!\left(\frac{S_{12} - 6}{1} > \frac{7 - 6}{1}\right) = P(Z > 1) = 1 - \Phi(1) \approx 0.1587.$

Comparison with exact value

The exact distribution of $S_{12}$ is the Irwin-Hall distribution with $n = 12$ , which has PDF given by the inclusion-exclusion formula: $f_{S_{12}}(s) = \frac{1}{11!}\sum_{k=0}^{\lfloor s \rfloor} (-1)^k \binom{12}{k}(s - k)^{11}.$

The exact value of $P(S_{12} > 7)$ is approximately $0.1573$ .

The CLT approximation of $0.1587$ has a relative error of about $0.9\%$ , which is remarkably accurate for $n = 12$ . This is partly because the uniform distribution is symmetric and bounded, so higher-order cumulants vanish or are small, accelerating CLT convergence. In telecommunications, the sum of $12$ uniform RVs is a classic method for generating approximate Gaussian samples. $\square$

ex-ch02-11

Medium

A wide-sense stationary (WSS) random process $X(t)$ with autocorrelation $R_X(\tau) = \sigma^2 e^{-\alpha|\tau|}$ is passed through a causal LTI filter with impulse response $h(t) = \beta\,e^{-\beta t}\,u(t)$ , where $u(t)$ is the unit step function.

Let $Y(t)$ denote the output process. Set $\sigma^2 = 1$ , $\alpha = 1$ , and $\beta = 2$ .

(a) Find the power spectral density $S_X(f)$ of the input.

(b) Compute the transfer function $H(f)$ and the output PSD $S_Y(f)$ .

(c) Find the output power $E[Y^2(t)]$ .

Show Hint

The PSD is the Fourier transform of the autocorrelation: $S_X(f) = \mathcal{F}\{R_X(\tau)\}$ .

For an LTI system, $S_Y(f) = |H(f)|^2\,S_X(f)$ .

The output power is $E[Y^2(t)] = R_Y(0) = \int_{-\infty}^{\infty} S_Y(f)\,df$ .

Solution

Input PSD

The Fourier transform of $R_X(\tau) = e^{-|\tau|}$ is $S_X(f) = \int_{-\infty}^{\infty} e^{-|\tau|}\,e^{-j2\pi f \tau}\,d\tau = \frac{2\alpha}{\alpha^2 + (2\pi f)^2}.$

With $\alpha = 1$ : $S_X(f) = \frac{2}{1 + 4\pi^2 f^2}.$

Transfer function and output PSD

The Fourier transform of $h(t) = 2e^{-2t}u(t)$ is $H(f) = \int_0^{\infty} 2\,e^{-2t}\,e^{-j2\pi f t}\,dt = \frac{2}{2 + j2\pi f} = \frac{\beta}{\beta + j2\pi f}.$

The squared magnitude: $|H(f)|^2 = \frac{4}{4 + 4\pi^2 f^2}.$

The output PSD: $S_Y(f) = |H(f)|^2\,S_X(f) = \frac{4}{4 + 4\pi^2 f^2} \cdot \frac{2}{1 + 4\pi^2 f^2} = \frac{8}{(4 + 4\pi^2 f^2)(1 + 4\pi^2 f^2)}.$

Output power via partial fractions

Let $u = 4\pi^2 f^2$ . We use partial fractions: $\frac{8}{(4+u)(1+u)} = \frac{8}{3}\left(\frac{1}{1+u} - \frac{1}{4+u}\right).$

The output power is $E[Y^2(t)] = \int_{-\infty}^{\infty} S_Y(f)\,df = \frac{8}{3}\left[\int_{-\infty}^{\infty}\frac{df}{1+4\pi^2 f^2} - \int_{-\infty}^{\infty}\frac{df}{4+4\pi^2 f^2}\right].$

Using $\int_{-\infty}^{\infty}\frac{df}{a^2+4\pi^2 f^2} = \frac{1}{2a}$ :

First integral ( $a=1$ ): $1/2$ .
Second integral ( $a=2$ ): $1/4$ .

$E[Y^2(t)] = \frac{8}{3}\left(\frac{1}{2} - \frac{1}{4}\right) = \frac{8}{3} \cdot \frac{1}{4} = \frac{2}{3} \approx 0.667.$

The filter reduces the input power from $R_X(0) = 1$ to $2/3$ , acting as a low-pass filter that attenuates high-frequency noise components. $\square$

ex-ch02-12

Medium

A wireless channel alternates among three states: Good ( $G$ ), Fair ( $F$ ), and Bad ( $B$ ). The one-step transition probability matrix is $\mathbf{P} = \begin{bmatrix} 0.7 & 0.2 & 0.1 \\ 0.3 & 0.5 & 0.2 \\ 0.1 & 0.3 & 0.6 \end{bmatrix},$ where rows and columns are ordered $(G, F, B)$ .

(a) Verify that $\mathbf{P}$ is a valid stochastic matrix.

(b) Find the stationary distribution $\boldsymbol{\pi} = [\pi_G, \pi_F, \pi_B]$ by solving $\boldsymbol{\pi}\mathbf{P} = \boldsymbol{\pi}$ together with $\pi_G + \pi_F + \pi_B = 1$ .

(c) Interpret the stationary distribution in terms of long-run channel availability.

Show Hint

A valid stochastic matrix has non-negative entries and each row sums to $1$ .

Write $\boldsymbol{\pi}\mathbf{P} = \boldsymbol{\pi}$ as three equations: $\pi_G = 0.7\pi_G + 0.3\pi_F + 0.1\pi_B$ , etc.

Use two of the three balance equations plus the normalization constraint $\pi_G + \pi_F + \pi_B = 1$ .

Solution

Verify stochastic matrix

All entries are non-negative. Row sums:

Row $G$ : $0.7 + 0.2 + 0.1 = 1.0$ $\checkmark$
Row $F$ : $0.3 + 0.5 + 0.2 = 1.0$ $\checkmark$
Row $B$ : $0.1 + 0.3 + 0.6 = 1.0$ $\checkmark$

Therefore $\mathbf{P}$ is a valid (right) stochastic matrix.

Set up balance equations

The equation $\boldsymbol{\pi}\mathbf{P} = \boldsymbol{\pi}$ gives: $\pi_G = 0.7\pi_G + 0.3\pi_F + 0.1\pi_B, \tag{1}$ $\pi_F = 0.2\pi_G + 0.5\pi_F + 0.3\pi_B, \tag{2}$ $\pi_B = 0.1\pi_G + 0.2\pi_F + 0.6\pi_B. \tag{3}$

From (1): $0.3\pi_G = 0.3\pi_F + 0.1\pi_B$ , i.e., $3\pi_G = 3\pi_F + \pi_B$ . $\quad(1')$

From (3): $0.4\pi_B = 0.1\pi_G + 0.2\pi_F$ , i.e., $4\pi_B = \pi_G + 2\pi_F$ . $\quad(3')$

Solve the system

From $(3')$ : $\pi_G = 4\pi_B - 2\pi_F$ .

Substitute into $(1')$ : $3(4\pi_B - 2\pi_F) = 3\pi_F + \pi_B$ , $12\pi_B - 6\pi_F = 3\pi_F + \pi_B$ , $11\pi_B = 9\pi_F$ , $\pi_F = \frac{11}{9}\pi_B$ .

Then $\pi_G = 4\pi_B - \frac{22}{9}\pi_B = \frac{14}{9}\pi_B$ .

Normalization: $\frac{14}{9}\pi_B + \frac{11}{9}\pi_B + \pi_B = 1$ , so $\frac{34}{9}\pi_B = 1$ , giving $\pi_B = \frac{9}{34}$ .

$\pi_G = \frac{14}{34} = \frac{7}{17} \approx 0.4118,$ $\pi_F = \frac{11}{34} \approx 0.3235,$ $\pi_B = \frac{9}{34} \approx 0.2647.$

Channel availability interpretation

In the long run, the channel is in the Good state about $41\%$ of the time, Fair about $32\%$ , and Bad about $26\%$ . If we define "available" as Good or Fair, the long-run availability is $\pi_G + \pi_F = 25/34 \approx 73.5\%$ .

This stationary distribution is crucial for computing the ergodic capacity of the channel: $\bar{C} = \sum_s \pi_s\,C_s$ , where $C_s$ is the capacity in state $s$ . $\square$

ex-ch02-13

Medium

Two independent Poisson processes model packet arrivals from two user classes at a base station: Class A with rate $\lambda_A = 3$ packets/s and Class B with rate $\lambda_B = 7$ packets/s.

(a) Show that the merged (superposed) arrival process $N(t) = N_A(t) + N_B(t)$ is also a Poisson process, and find its rate.

(b) Given that a packet arrives, what is the probability it belongs to Class A?

(c) Compute the probability that exactly $2$ Class-A packets and exactly $3$ Class-B packets arrive in a $1$ -second interval.

Show Hint

The superposition of independent Poisson processes is Poisson with rate $\lambda_A + \lambda_B$ .

By the decomposition property, each arrival is independently classified as Class A with probability $\lambda_A / (\lambda_A + \lambda_B)$ .

Since $N_A(1)$ and $N_B(1)$ are independent Poisson RVs, the joint probability factors.

Solution

Superposition is Poisson

The MGF of the merged count in $[0,t]$ is $M_{N(t)}(s) = M_{N_A(t)}(s)\,M_{N_B(t)}(s) = e^{\lambda_A t(e^s - 1)}\,e^{\lambda_B t(e^s - 1)} = e^{(\lambda_A + \lambda_B)t(e^s - 1)},$ which is the MGF of a $\text{Poisson}((\lambda_A + \lambda_B)\,t)$ random variable.

Therefore $N(t) \sim \text{Poisson}(10t)$ with merged rate $\lambda = \lambda_A + \lambda_B = 10$ packets/s.

Classification probability

Given an arrival in the merged process, the probability it is Class A is $P(\text{Class A}) = \frac{\lambda_A}{\lambda_A + \lambda_B} = \frac{3}{10} = 0.3.$

This follows from the independent thinning property of Poisson processes: each arrival is independently "marked" as Class A with probability $\lambda_A/\lambda$ and Class B with probability $\lambda_B/\lambda$ .

Joint probability of specific counts

Since $N_A(1) \sim \text{Poisson}(3)$ and $N_B(1) \sim \text{Poisson}(7)$ are independent: $P(N_A(1)=2,\; N_B(1)=3) = P(N_A(1)=2)\,P(N_B(1)=3).$

$P(N_A(1) = 2) = \frac{3^2}{2!}\,e^{-3} = \frac{9}{2}\,e^{-3} \approx 0.2240.$

$P(N_B(1) = 3) = \frac{7^3}{3!}\,e^{-7} = \frac{343}{6}\,e^{-7} \approx 0.05213.$

$P(N_A(1)=2,\; N_B(1)=3) \approx 0.2240 \times 0.05213 \approx 0.01168.$

This is a relatively rare event since the mean total is $10$ but we only see $5$ arrivals. $\square$

ex-ch02-14

Hard

Consider binary hypothesis testing for detecting a signal in noise: $H_0: r = n, \qquad H_1: r = A + n,$ where $A > 0$ is a known amplitude and $n \sim \mathcal{N}(0, \sigma^2)$ . The prior probabilities are $P(H_0) = 1 - p$ and $P(H_1) = p$ with $p \neq 0.5$ .

(a) Derive the MAP (Maximum A Posteriori) decision rule and show that the optimal threshold is $\gamma_{\text{MAP}} = \frac{A}{2} + \frac{\sigma^2}{A}\ln\frac{1-p}{p}.$

(b) Show that when $p = 0.5$ the MAP rule reduces to the ML rule with threshold $A/2$ .

(c) For $A = 2$ , $\sigma^2 = 1$ , and $p = 0.3$ , compute the MAP threshold and the resulting probability of error $P_e$ .

Show Hint

The MAP rule decides $H_1$ when $\frac{f(r \mid H_1)}{f(r \mid H_0)} > \frac{P(H_0)}{P(H_1)}$ .

Take the log of both sides of the likelihood ratio test. The log-likelihood ratio for Gaussian is linear in $r$ .

The error probability is $P_e = p\,P(\text{miss}) + (1-p)\,P(\text{FA})$ where each involves a $Q$ -function.

Solution

Derive the MAP decision rule

The MAP rule decides $H_1$ if $\frac{f(r \mid H_1)}{f(r \mid H_0)} > \frac{P(H_0)}{P(H_1)} = \frac{1-p}{p}.$

The likelihoods are: $f(r \mid H_0) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-r^2/(2\sigma^2)},$ $f(r \mid H_1) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(r-A)^2/(2\sigma^2)}.$

The log-likelihood ratio is $\Lambda(r) = \ln\frac{f(r \mid H_1)}{f(r \mid H_0)} = \frac{-(r-A)^2 + r^2}{2\sigma^2} = \frac{2Ar - A^2}{2\sigma^2} = \frac{A}{\sigma^2}\left(r - \frac{A}{2}\right).$

Solve for the optimal threshold

The MAP rule becomes: decide $H_1$ if $\frac{A}{\sigma^2}\left(r - \frac{A}{2}\right) > \ln\frac{1-p}{p}.$

Since $A > 0$ , dividing by $A/\sigma^2 > 0$ : $r > \frac{A}{2} + \frac{\sigma^2}{A}\ln\frac{1-p}{p} \triangleq \gamma_{\text{MAP}}.$

When $p = 0.5$ : $\ln\frac{1-p}{p} = \ln 1 = 0$ , so $\gamma_{\text{MAP}} = A/2$ , which is the ML threshold. The MAP rule reduces to the midpoint decision boundary, as expected by symmetry of the priors.

Numerical computation

With $A = 2$ , $\sigma^2 = 1$ , $p = 0.3$ : $\gamma_{\text{MAP}} = \frac{2}{2} + \frac{1}{2}\ln\frac{0.7}{0.3} = 1 + \frac{1}{2}\ln\frac{7}{3} = 1 + \frac{1}{2}(0.8473) = 1.4237.$

The threshold shifts right (toward $H_1$ ) because $P(H_1) = 0.3 < 0.5$ , reflecting the prior bias toward $H_0$ .

Probability of error

$P(\text{FA}) = P(r > \gamma \mid H_0) = Q\!\left(\frac{\gamma}{\sigma}\right) = Q(1.4237) \approx 0.0773.KATEXPLACEHOLDER0ENDP(\text{miss}) = P(r \le \gamma \mid H_1) = 1 - Q\!\left(\frac{\gamma - A}{\sigma}\right) = 1 - Q(-0.5763) = Q(0.5763) \approx 0.2822.KATEXPLACEHOLDER1ENDP_e = (1-p)\,P(\text{FA}) + p\,P(\text{miss}) = 0.7 \times 0.0773 + 0.3 \times 0.2822 = 0.0541 + 0.0847 = 0.1388.$ $The MAP detector minimizes$ P_e $over all possible decision rules.$ \square$

ex-ch02-15

Hard

Let $X_1, X_2, \ldots, X_n$ be i.i.d. random variables with MGF $M_X(t) = E[e^{tX}]$ . Consider the tail probability $P(S_n \ge n\,a)$ where $S_n = \sum_{i=1}^n X_i$ and $a > E[X]$ .

(a) Derive the Chernoff bound: $P(S_n \ge n\,a) \le e^{-n\,\Lambda^*(a)},$ where $\Lambda^*(a) = \sup_{t \ge 0}\{ta - \ln M_X(t)\}$ is the Fenchel-Legendre (rate function) of the log-MGF.

(b) Specialize to $X_i \sim \mathcal{N}(0, \sigma^2)$ and show that the Chernoff bound on $P(\bar{X}_n \ge a)$ (where $\bar{X}_n = S_n/n$ ) yields $P(\bar{X}_n \ge a) \le e^{-na^2/(2\sigma^2)}.$

(c) Apply to BER analysis: if $X_i$ represents the decision statistic error for the $i$ -th bit with $\sigma^2 = 1$ , find the bound on $P(\bar{X}_{100} \ge 0.5)$ .

Show Hint

Start with Markov's inequality: $P(S_n \ge na) = P(e^{tS_n} \ge e^{tna}) \le \frac{E[e^{tS_n}]}{e^{tna}}$ for any $t \ge 0$ .

Use independence to factor $E[e^{tS_n}] = [M_X(t)]^n$ . Then optimize over $t$ .

For Gaussian $X_i$ , $M_X(t) = e^{\sigma^2 t^2/2}$ , so $\ln M_X(t) = \sigma^2 t^2/2$ .

Solution

Derive the Chernoff bound from Markov's inequality

For any $t \ge 0$ , by Markov's inequality applied to the non-negative random variable $e^{tS_n}$ : $P(S_n \ge na) = P(e^{tS_n} \ge e^{tna}) \le \frac{E[e^{tS_n}]}{e^{tna}}.$

Since the $X_i$ are i.i.d.: $E[e^{tS_n}] = E\!\left[\prod_{i=1}^n e^{tX_i}\right] = \prod_{i=1}^n E[e^{tX_i}] = [M_X(t)]^n.$

Therefore: $P(S_n \ge na) \le \frac{[M_X(t)]^n}{e^{tna}} = e^{n[\ln M_X(t) - ta]}.$

Since this holds for all $t \ge 0$ , we optimize: $P(S_n \ge na) \le \inf_{t \ge 0}\,e^{n[\ln M_X(t) - ta]} = e^{-n\sup_{t \ge 0}\{ta - \ln M_X(t)\}} = e^{-n\,\Lambda^*(a)}.$

Specialize to Gaussian

For $X_i \sim \mathcal{N}(0, \sigma^2)$ : $M_X(t) = e^{\sigma^2 t^2/2}, \quad \ln M_X(t) = \frac{\sigma^2 t^2}{2}.$

The rate function is $\Lambda^*(a) = \sup_{t \ge 0}\left\{ta - \frac{\sigma^2 t^2}{2}\right\}.$

Taking the derivative and setting it to zero: $\frac{d}{dt}\left(ta - \frac{\sigma^2 t^2}{2}\right) = a - \sigma^2 t = 0$ , giving $t^* = a/\sigma^2$ .

The second derivative is $-\sigma^2 < 0$ , confirming a maximum.

$\Lambda^*(a) = \frac{a^2}{\sigma^2} - \frac{\sigma^2}{2}\cdot\frac{a^2}{\sigma^4} = \frac{a^2}{2\sigma^2}.$

Therefore $P(\bar{X}_n \ge a) = P(S_n \ge na) \le e^{-na^2/(2\sigma^2)}$ .

BER application

With $n = 100$ , $a = 0.5$ , $\sigma^2 = 1$ : $P(\bar{X}_{100} \ge 0.5) \le e^{-100 \cdot 0.25 / 2} = e^{-12.5} \approx 3.73 \times 10^{-6}.$

For comparison, the exact Gaussian probability is $Q(0.5\sqrt{100}) = Q(5) \approx 2.87 \times 10^{-7}$ .

The Chernoff bound is about $13\times$ looser than the exact value, but it captures the correct exponential decay rate $-a^2/(2\sigma^2)$ in the exponent. This exponential tightness is the key property that makes Chernoff bounds invaluable in error probability analysis — they correctly predict how BER scales with SNR in the large-deviation regime. $\square$

ex-ch02-16

Hard

Let $X(t)$ be a wide-sense stationary (WSS) process with mean $\mu_X$ and autocorrelation $R_X(\tau)$ . The process is passed through a stable LTI system with impulse response $h(t)$ to produce $Y(t) = \int_{-\infty}^{\infty} h(\alpha)\,X(t - \alpha)\,d\alpha$ .

Prove that $Y(t)$ is also WSS by showing:

(a) $E[Y(t)]$ is constant (independent of $t$ ).

(b) $R_Y(t_1, t_2)$ depends only on $\tau = t_1 - t_2$ .

(c) Derive the relationship $R_Y(\tau) = h(\tau) * h(-\tau) * R_X(\tau)$ , or equivalently $S_Y(f) = |H(f)|^2 S_X(f)$ .

Show Hint

Use linearity of expectation and the WSS property $E[X(t)] = \mu_X$ for part (a).

For part (b), write $R_Y(t_1,t_2) = E[Y(t_1)Y(t_2)]$ as a double integral involving $R_X(t_1 - \alpha_1 - t_2 + \alpha_2)$ .

Recognize that the double integral depends only on $\tau = t_1 - t_2$ because $R_X$ depends only on its argument.

Solution

Part (a): Constant mean

$E[Y(t)] = E\!\left[\int_{-\infty}^{\infty} h(\alpha)\,X(t-\alpha)\,d\alpha\right] = \int_{-\infty}^{\infty} h(\alpha)\,E[X(t-\alpha)]\,d\alpha.KATEXPLACEHOLDER0ENDE[Y(t)] = \mu_X \int_{-\infty}^{\infty} h(\alpha)\,d\alpha = \mu_X\,H(0),$ $where$ H(0) = \int_{-\infty}^{\infty} h(\alpha),d\alpha $is the DC gain of the filter. This is a constant independent of$ t $.$ \checkmark$

Part (b): Autocorrelation depends only on $\tau$

$R_Y(t_1, t_2) = E[Y(t_1)\,Y(t_2)] = E\!\left[\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} h(\alpha_1)\,h(\alpha_2)\,X(t_1 - \alpha_1)\,X(t_2 - \alpha_2)\, d\alpha_1\,d\alpha_2\right].KATEXPLACEHOLDER0ENDR_Y(t_1, t_2) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} h(\alpha_1)\,h(\alpha_2)\, R_X\bigl((t_1 - \alpha_1) - (t_2 - \alpha_2)\bigr)\, d\alpha_1\,d\alpha_2.$ $Since$ R_X $is evaluated at$ (t_1 - t_2) - \alpha_1 + \alpha_2 = \tau - \alpha_1 + \alpha_2 $, the entire expression depends on$ t_1 $and$ t_2 $only through$ \tau = t_1 - t_2 $.$ \checkmark$

Part (c): Derive the filtering relationship

Denoting $\tau = t_1 - t_2$ : $R_Y(\tau) = \int_{-\infty}^{\infty} h(\alpha_1) \left[\int_{-\infty}^{\infty} h(\alpha_2)\, R_X(\tau - \alpha_1 + \alpha_2)\,d\alpha_2\right]\,d\alpha_1.$

The inner integral is a convolution. Define $\tilde{h}(\alpha) = h(-\alpha)$ . Then: $\int_{-\infty}^{\infty} h(\alpha_2)\,R_X(\tau - \alpha_1 + \alpha_2)\,d\alpha_2 = [\tilde{h} * R_X](\tau - \alpha_1).$

The outer integral becomes: $R_Y(\tau) = \int_{-\infty}^{\infty} h(\alpha_1)\,[\tilde{h} * R_X](\tau - \alpha_1)\,d\alpha_1 = h * \tilde{h} * R_X \;\text{ evaluated at } \tau.$

Taking the Fourier transform: $S_Y(f) = H(f)\,H^*(f)\,S_X(f) = |H(f)|^2\,S_X(f).$

This is the Wiener-Khinchin filtering theorem: the output PSD is the input PSD scaled by the squared magnitude of the transfer function. This fundamental result confirms that LTI filtering preserves the WSS property and provides the spectral input-output relationship used throughout communications system analysis. $\square$

ex-ch02-17

Hard

The Gilbert-Elliott channel is a two-state Markov chain with states $G$ (Good) and $B$ (Bad). The transition matrix is $\mathbf{P} = \begin{bmatrix} 1 - b & b \\ g & 1 - g \end{bmatrix},$ where $b = P(G \to B)$ and $g = P(B \to G)$ . Set $b = 0.1$ and $g = 0.4$ .

(a) Find the stationary distribution $[\pi_G, \pi_B]$ .

(b) Verify the Chapman-Kolmogorov equation by computing $\mathbf{P}^{(2)}$ directly and also as $\mathbf{P}^2$ .

(c) Compute $\mathbf{P}^{(n)}$ for general $n$ using eigendecomposition and verify convergence to the stationary distribution.

Show Hint

The stationary distribution satisfies $\pi_G = \frac{g}{b+g}$ and $\pi_B = \frac{b}{b+g}$ .

The Chapman-Kolmogorov equation states $P^{(m+n)}_{ij} = \sum_k P^{(m)}_{ik}\,P^{(n)}_{kj}$ , which is matrix multiplication.

Eigendecompose $\mathbf{P} = \mathbf{V}\mathbf{D}\mathbf{V}^{-1}$ . The eigenvalues of a $2 \times 2$ stochastic matrix are $1$ and $1 - b - g$ .

Solution

Stationary distribution

From $\boldsymbol{\pi}\mathbf{P} = \boldsymbol{\pi}$ with $\pi_G + \pi_B = 1$ :

$\pi_G(1-b) + \pi_B\,g = \pi_G \Rightarrow \pi_B\,g = \pi_G\,b$ ,

$\Rightarrow \pi_G = \frac{g}{b+g} = \frac{0.4}{0.5} = 0.8$ , $\quad \pi_B = \frac{b}{b+g} = \frac{0.1}{0.5} = 0.2$ .

The channel is in the Good state $80\%$ of the time in steady state.

Chapman-Kolmogorov verification

Direct computation: $\mathbf{P}^2 = \begin{bmatrix} 0.9 & 0.1 \\ 0.4 & 0.6 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.4 & 0.6 \end{bmatrix} = \begin{bmatrix} 0.81 + 0.04 & 0.09 + 0.06 \\ 0.36 + 0.24 & 0.04 + 0.36 \end{bmatrix} = \begin{bmatrix} 0.85 & 0.15 \\ 0.60 & 0.40 \end{bmatrix}.$

The Chapman-Kolmogorov equation for $m = n = 1$ : $P^{(2)}_{GG} = P_{GG}P_{GG} + P_{GB}P_{BG} = 0.81 + 0.04 = 0.85$ . $\checkmark$

Similarly for all other entries: matrix multiplication implements Chapman-Kolmogorov. $\checkmark$

Eigendecomposition and $\mathbf{P}^n$

The eigenvalues of $\mathbf{P}$ are $\lambda_1 = 1$ and $\lambda_2 = 1 - b - g = 1 - 0.5 = 0.5$ .

Right eigenvector for $\lambda_1 = 1$ : $\mathbf{v}_1 = [1, 1]^T$ (all-ones, as for any stochastic matrix).

Right eigenvector for $\lambda_2 = 0.5$ : from $(0.9 - 0.5)v_1 + 0.1\,v_2 = 0$ , giving $\mathbf{v}_2 = [1, -4]^T$ .

Thus $\mathbf{P} = \mathbf{V}\mathbf{D}\mathbf{V}^{-1}$ with $\mathbf{V} = \begin{bmatrix} 1 & 1 \\ 1 & -4 \end{bmatrix}, \quad \mathbf{D} = \begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix}, \quad \mathbf{V}^{-1} = \frac{1}{5}\begin{bmatrix} 4 & 1 \\ 1 & -1 \end{bmatrix}.$

Therefore: $\mathbf{P}^n = \mathbf{V}\mathbf{D}^n\mathbf{V}^{-1} = \frac{1}{5}\begin{bmatrix} 1 & 1 \\ 1 & -4 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0.5^n \end{bmatrix} \begin{bmatrix} 4 & 1 \\ 1 & -1 \end{bmatrix}.$

Expanding: $\mathbf{P}^n = \frac{1}{5}\begin{bmatrix} 4 + 0.5^n & 1 - 0.5^n \\ 4 - 4\cdot 0.5^n & 1 + 4\cdot 0.5^n \end{bmatrix}.$

As $n \to \infty$ , $0.5^n \to 0$ and every row converges to $[4/5,\; 1/5] = [0.8,\; 0.2] = [\pi_G, \pi_B]$ . $\square$

ex-ch02-18

Hard

Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly Gaussian random vectors with $\mathbf{X} \sim \mathcal{N}(\boldsymbol{\mu}_X, \mathbf{C}_{XX}), \quad \mathbf{Y} \sim \mathcal{N}(\boldsymbol{\mu}_Y, \mathbf{C}_{YY}),$ and cross-covariance $\mathbf{C}_{XY} = E[(\mathbf{X} - \boldsymbol{\mu}_X) (\mathbf{Y} - \boldsymbol{\mu}_Y)^T]$ .

Consider the linear MMSE (minimum mean-square error) estimator $\hat{\mathbf{X}} = \mathbf{A}\mathbf{Y} + \mathbf{b}$ that minimizes $E[\|\mathbf{X} - \hat{\mathbf{X}}\|^2]$ .

(a) Derive the optimal $\mathbf{A}$ and $\mathbf{b}$ .

(b) Show that the MMSE is $\text{MMSE} = \operatorname{tr}(\mathbf{C}_{XX} - \mathbf{C}_{XY}\mathbf{C}_{YY}^{-1}\mathbf{C}_{YX})$ .

(c) Specialize to scalar $X$ and $Y$ with $\mu_X = \mu_Y = 0$ , $\sigma_X^2 = 4$ , $\sigma_Y^2 = 1$ , and correlation coefficient $\rho = 0.8$ . Compute the estimator and the MMSE.

Show Hint

Expand $E[\|\mathbf{X} - \mathbf{A}\mathbf{Y} - \mathbf{b}\|^2]$ and use the orthogonality principle: the optimal estimator makes the error orthogonal to the observation.

The orthogonality principle gives $E[(\mathbf{X} - \mathbf{A}\mathbf{Y} - \mathbf{b})\mathbf{Y}^T] = \mathbf{0}$ .

For jointly Gaussian vectors, the linear MMSE estimator equals the conditional mean $E[\mathbf{X} \mid \mathbf{Y}]$ .

Solution

Derive optimal $\mathbf{b}$

Taking the expectation of $\hat{\mathbf{X}} = \mathbf{A}\mathbf{Y} + \mathbf{b}$ and requiring $E[\hat{\mathbf{X}}] = E[\mathbf{X}] = \boldsymbol{\mu}_X$ : $\boldsymbol{\mu}_X = \mathbf{A}\boldsymbol{\mu}_Y + \mathbf{b} \quad\Longrightarrow\quad \mathbf{b} = \boldsymbol{\mu}_X - \mathbf{A}\boldsymbol{\mu}_Y.$

Substituting back: $\hat{\mathbf{X}} = \boldsymbol{\mu}_X + \mathbf{A}(\mathbf{Y} - \boldsymbol{\mu}_Y).$

Derive optimal $\mathbf{A}$ via orthogonality principle

The estimation error is $\mathbf{e} = \mathbf{X} - \hat{\mathbf{X}}$ . The orthogonality principle requires $E[\mathbf{e}(\mathbf{Y} - \boldsymbol{\mu}_Y)^T] = \mathbf{0}$ :

$E[(\mathbf{X} - \boldsymbol{\mu}_X - \mathbf{A}(\mathbf{Y} - \boldsymbol{\mu}_Y)) (\mathbf{Y} - \boldsymbol{\mu}_Y)^T] = \mathbf{0}.$

$\mathbf{C}_{XY} - \mathbf{A}\,\mathbf{C}_{YY} = \mathbf{0}.$

$\boxed{\mathbf{A} = \mathbf{C}_{XY}\,\mathbf{C}_{YY}^{-1}.}$

The LMMSE estimator is therefore $\hat{\mathbf{X}}_{\text{MMSE}} = \boldsymbol{\mu}_X + \mathbf{C}_{XY}\,\mathbf{C}_{YY}^{-1} (\mathbf{Y} - \boldsymbol{\mu}_Y).$

For jointly Gaussian $(\mathbf{X}, \mathbf{Y})$ , this equals the conditional expectation $E[\mathbf{X} \mid \mathbf{Y}]$ .

Derive the MMSE

The error covariance is $\mathbf{C}_e = E[\mathbf{e}\,\mathbf{e}^T] = \mathbf{C}_{XX} - \mathbf{A}\,\mathbf{C}_{YX} = \mathbf{C}_{XX} - \mathbf{C}_{XY}\,\mathbf{C}_{YY}^{-1}\,\mathbf{C}_{YX}.$

This follows because $\mathbf{e} = (\mathbf{X} - \boldsymbol{\mu}_X) - \mathbf{A}(\mathbf{Y} - \boldsymbol{\mu}_Y)$ , and by the orthogonality principle, the cross terms vanish.

The MMSE is the total error variance: $\text{MMSE} = E[\|\mathbf{e}\|^2] = \operatorname{tr}(\mathbf{C}_e) = \operatorname{tr}\!\left(\mathbf{C}_{XX} - \mathbf{C}_{XY}\,\mathbf{C}_{YY}^{-1}\,\mathbf{C}_{YX}\right).$

Scalar specialization

With $\mu_X = \mu_Y = 0$ , $\sigma_X^2 = 4$ , $\sigma_Y^2 = 1$ , $\rho = 0.8$ : $C_{XY} = \rho\,\sigma_X\,\sigma_Y = 0.8 \times 2 \times 1 = 1.6.$

The LMMSE estimator: $\hat{X} = \frac{C_{XY}}{C_{YY}}\,Y = \frac{1.6}{1}\,Y = 1.6\,Y.$

The MMSE: $\text{MMSE} = \sigma_X^2 - \frac{C_{XY}^2}{\sigma_Y^2} = 4 - \frac{2.56}{1} = 1.44.$

Equivalently, $\text{MMSE} = \sigma_X^2(1 - \rho^2) = 4(1 - 0.64) = 1.44$ .

The observation $Y$ reduces the estimation variance from $4$ to $1.44$ , a $64\%$ reduction. The fraction of variance explained equals $\rho^2 = 0.64$ , confirming the operational meaning of the correlation coefficient. $\square$

ex-ch02-19

Challenge

Consider a MIMO channel $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$ where $\mathbf{H} \in \mathbb{C}^{N_r \times N_t}$ is a random channel matrix, $\mathbf{x} \in \mathbb{C}^{N_t}$ is the transmitted signal with covariance $\mathbf{C}_x = E[\mathbf{x}\mathbf{x}^H]$ satisfying $\operatorname{tr}(\mathbf{C}_x) \le P$ , and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I}_{N_r})$ .

Using the probability and linear algebra tools from Chapters 1–2:

(a) Derive the mutual information $I(\mathbf{x}; \mathbf{y} \mid \mathbf{H})$ for a fixed channel realization, expressing it in terms of the eigenvalues of $\mathbf{H}\mathbf{C}_x\mathbf{H}^H$ .

(b) When the transmitter has no CSI, show that the optimal input covariance is $\mathbf{C}_x = \frac{P}{N_t}\mathbf{I}_{N_t}$ (uniform power allocation), and derive the resulting capacity formula $C = \sum_{i=1}^{\min(N_r, N_t)} \log_2\!\left(1 + \frac{P}{N_t \sigma^2}\,\lambda_i\right),$ where $\lambda_i$ are the eigenvalues of $\mathbf{H}\mathbf{H}^H$ .

(c) For a $2 \times 2$ channel with $\mathbf{H} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ , $P = 10$ , and $\sigma^2 = 1$ , compute the capacity numerically.

Show Hint

Use the fact that for jointly Gaussian vectors, $I(\mathbf{x}; \mathbf{y}) = h(\mathbf{y}) - h(\mathbf{y} \mid \mathbf{x}) = h(\mathbf{y}) - h(\mathbf{n})$ .

The differential entropy of a complex Gaussian vector $\mathbf{z} \sim \mathcal{CN}(\boldsymbol{\mu}, \mathbf{C})$ is $h(\mathbf{z}) = \log_2\det(\pi e \mathbf{C})$ .

Connect to Chapter 1: use the SVD of $\mathbf{H}$ to diagonalize the channel into parallel sub-channels.

Solution

Mutual information for fixed $\mathbf{H}$

Given $\mathbf{H}$ , the output $\mathbf{y}$ is Gaussian with covariance $\mathbf{C}_y = \mathbf{H}\mathbf{C}_x\mathbf{H}^H + \sigma^2\mathbf{I}.$

The conditional entropy $h(\mathbf{y} \mid \mathbf{x}, \mathbf{H}) = h(\mathbf{n}) = \log_2\det(\pi e\,\sigma^2\mathbf{I})$ .

The mutual information is $I(\mathbf{x}; \mathbf{y} \mid \mathbf{H}) = h(\mathbf{y} \mid \mathbf{H}) - h(\mathbf{n}) = \log_2\det(\mathbf{C}_y) - \log_2\det(\sigma^2\mathbf{I}) = \log_2\det\!\left(\mathbf{I} + \frac{1}{\sigma^2}\mathbf{H}\mathbf{C}_x\mathbf{H}^H\right).$

Let $\mu_1, \ldots, \mu_r$ be the eigenvalues of $\frac{1}{\sigma^2}\mathbf{H}\mathbf{C}_x\mathbf{H}^H$ . Then $I(\mathbf{x}; \mathbf{y} \mid \mathbf{H}) = \sum_{i=1}^r \log_2(1 + \mu_i).$

Optimal input without CSI

Without CSI, the transmitter cannot adapt $\mathbf{C}_x$ to $\mathbf{H}$ . The ergodic capacity is $C = \max_{\mathbf{C}_x:\,\operatorname{tr}(\mathbf{C}_x) \le P} E_{\mathbf{H}}\!\left[\log_2\det\!\left(\mathbf{I} + \frac{1}{\sigma^2} \mathbf{H}\mathbf{C}_x\mathbf{H}^H\right)\right].$

For i.i.d. Rayleigh fading where $\mathbf{H}$ has i.i.d. $\mathcal{CN}(0,1)$ entries, the distribution of $\mathbf{H}$ is unitarily invariant: $\mathbf{U}\mathbf{H}$ and $\mathbf{H}\mathbf{V}$ have the same distribution for any unitary $\mathbf{U}$ , $\mathbf{V}$ .

By the Hadamard inequality and Jensen's inequality argument, $\mathbf{C}_x = \frac{P}{N_t}\mathbf{I}$ is optimal. Then $\frac{1}{\sigma^2}\mathbf{H}\mathbf{C}_x\mathbf{H}^H = \frac{P}{N_t\sigma^2}\mathbf{H}\mathbf{H}^H,$ whose eigenvalues are $\frac{P}{N_t\sigma^2}\lambda_i$ , yielding $C = \sum_{i=1}^{\min(N_r,N_t)} \log_2\!\left(1 + \frac{P}{N_t\sigma^2}\,\lambda_i\right).$

SVD interpretation (connection to Chapter 1)

Let $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$ be the SVD. Define $\tilde{\mathbf{y}} = \mathbf{U}^H\mathbf{y}$ and $\tilde{\mathbf{x}} = \mathbf{V}^H\mathbf{x}$ . Then $\tilde{\mathbf{y}} = \boldsymbol{\Sigma}\tilde{\mathbf{x}} + \tilde{\mathbf{n}},$ where $\tilde{\mathbf{n}} = \mathbf{U}^H\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I})$ (unitary invariance of the Gaussian distribution).

This diagonalizes the MIMO channel into $r = \operatorname{rank}(\mathbf{H})$ parallel scalar sub-channels: $\tilde{y}_i = \sigma_i\,\tilde{x}_i + \tilde{n}_i$ , each with SNR $\gamma_i = \frac{P}{N_t\sigma^2}\sigma_i^2 = \frac{P}{N_t\sigma^2}\lambda_i$ .

The eigenvalues $\lambda_i$ of $\mathbf{H}\mathbf{H}^H$ are $\sigma_i^2$ , the squared singular values of $\mathbf{H}$ .

Numerical computation for the $2 \times 2$ example

$\mathbf{H}\mathbf{H}^H = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 5 & 4 \\ 4 & 5 \end{bmatrix}.KATEXPLACEHOLDER0ENDC = \log_2(1 + 5 \times 9) + \log_2(1 + 5 \times 1) = \log_2(46) + \log_2(6) \approx 5.524 + 2.585 = 8.109\;\text{bits/channel use}.$ $Compare with a SISO channel at the same total power:$ C_{\text{SISO}} = \log_2(1 + 10 \cdot |\mathbf{h}|^2) \approx \log_2(51) \approx 5.67 $. The MIMO system provides a$ 43% $capacity gain through spatial multiplexing — the two eigenvalues create two parallel data pipes.$ \square$

ex-ch02-20

Challenge

Consider a stationary ergodic random process $X(t)$ that represents the indicator of bit error at time $t$ for a communication link: $X(t) = 1$ if the bit transmitted at time $t$ is received in error, and $X(t) = 0$ otherwise.

The ensemble-average BER is $\bar{p} = E[X(t)]$ and the time-averaged BER measured over an observation window $[0, T]$ is $\hat{p}_T = \frac{1}{T}\int_0^T X(t)\,dt.$

(a) State the conditions under which $\hat{p}_T \to \bar{p}$ as $T \to \infty$ (in mean-square sense). Relate this to the autocorrelation $R_X(\tau)$ .

(b) For a channel whose error process has autocorrelation $R_X(\tau) = \bar{p}^2 + \bar{p}(1 - \bar{p})\,e^{-|\tau|/\tau_c}$ (with correlation time $\tau_c$ ), derive the variance of $\hat{p}_T$ as a function of $T$ and $\tau_c$ .

(c) Determine how large $T$ must be (in terms of $\tau_c$ ) to ensure $\operatorname{Var}(\hat{p}_T) \le \epsilon^2\,\bar{p}^2$ for a prescribed relative accuracy $\epsilon$ .

(d) Discuss when ergodicity fails and the time-averaged BER does not converge to the ensemble average. Give a concrete channel example.

Show Hint

Mean-square ergodicity requires $\operatorname{Var}(\hat{p}_T) \to 0$ as $T \to \infty$ . A sufficient condition involves the integrability of $C_X(\tau) = R_X(\tau) - \bar{p}^2$ .

Use the formula $\operatorname{Var}(\hat{p}_T) = \frac{1}{T^2}\int_0^T\!\int_0^T C_X(t_1 - t_2)\,dt_1\,dt_2 = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right)C_X(\tau)\,d\tau$ .

For exponential $C_X(\tau)$ , evaluate the integral in closed form. For large $T/\tau_c$ , the leading term is $\operatorname{Var}(\hat{p}_T) \approx \frac{2\tau_c}{T}\bar{p}(1-\bar{p})$ .

Solution

Conditions for mean-square ergodicity

Define the autocovariance $C_X(\tau) = R_X(\tau) - \bar{p}^2$ . The variance of the time average is $\operatorname{Var}(\hat{p}_T) = \frac{1}{T^2}\int_0^T\!\int_0^T C_X(t_1 - t_2)\,dt_1\,dt_2.$

Substituting $\tau = t_1 - t_2$ and performing one integration: $\operatorname{Var}(\hat{p}_T) = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right)C_X(\tau)\,d\tau.$

Mean-square ergodicity holds if and only if $\operatorname{Var}(\hat{p}_T) \to 0$ as $T \to \infty$ .

A sufficient condition is that $C_X(\tau)$ is integrable: $\int_{-\infty}^{\infty} |C_X(\tau)|\,d\tau < \infty.$

When this holds, the dominated convergence theorem gives $\operatorname{Var}(\hat{p}_T) \to 0$ , and thus $\hat{p}_T \xrightarrow{\text{m.s.}} \bar{p}$ .

Variance for exponential autocovariance

Given $C_X(\tau) = \bar{p}(1-\bar{p})\,e^{-|\tau|/\tau_c}$ , we compute: $\operatorname{Var}(\hat{p}_T) = \frac{2}{T}\int_0^T \left(1 - \frac{\tau}{T}\right) \bar{p}(1-\bar{p})\,e^{-\tau/\tau_c}\,d\tau.$

Evaluating the two terms: $\int_0^T e^{-\tau/\tau_c}\,d\tau = \tau_c(1 - e^{-T/\tau_c}),$ $\int_0^T \frac{\tau}{T}\,e^{-\tau/\tau_c}\,d\tau = \frac{\tau_c}{T}\left[\tau_c(1 - e^{-T/\tau_c}) - T\,e^{-T/\tau_c}\right] = \frac{\tau_c^2}{T}(1 - e^{-T/\tau_c}) - \tau_c\,e^{-T/\tau_c}.$

Combining: $\operatorname{Var}(\hat{p}_T) = \frac{2\bar{p}(1-\bar{p})}{T} \left[\tau_c(1 - e^{-T/\tau_c}) - \frac{\tau_c^2}{T}(1 - e^{-T/\tau_c}) + \tau_c\,e^{-T/\tau_c}\right].$

Simplifying: $\operatorname{Var}(\hat{p}_T) = \frac{2\bar{p}(1-\bar{p})\,\tau_c}{T} \left[1 - \frac{\tau_c}{T}\bigl(1 - e^{-T/\tau_c}\bigr)\right].$

Required observation window

For $T \gg \tau_c$ , $e^{-T/\tau_c} \approx 0$ and $\tau_c/T \ll 1$ , so $\operatorname{Var}(\hat{p}_T) \approx \frac{2\bar{p}(1-\bar{p})\,\tau_c}{T}.$

Setting $\operatorname{Var}(\hat{p}_T) \le \epsilon^2\,\bar{p}^2$ : $\frac{2\bar{p}(1-\bar{p})\,\tau_c}{T} \le \epsilon^2\,\bar{p}^2,$ $T \ge \frac{2(1-\bar{p})\,\tau_c}{\epsilon^2\,\bar{p}}.$

For example, with $\bar{p} = 10^{-3}$ , $\tau_c = 10$ ms, and $\epsilon = 0.1$ (i.e., $10\%$ relative accuracy): $T \ge \frac{2 \times 0.999 \times 0.01}{0.01 \times 0.001} \approx 2000\;\text{s} \approx 33\;\text{minutes}.$

The effective number of "independent samples" is approximately $T/(2\tau_c)$ . Correlated errors require proportionally longer observation windows.

When ergodicity fails

Ergodicity fails when $C_X(\tau) \not\to 0$ as $|\tau| \to \infty$ , so the integral $\int |C_X(\tau)|\,d\tau$ diverges.

Concrete example: A composite fading channel where the large-scale shadowing component $S$ is constant over all time (drawn once from a log-normal distribution). The BER conditional on $S$ is $p(S) = Q(\sqrt{2S/\sigma^2})$ , and the error process has $\lim_{|\tau| \to \infty} C_X(\tau) = \operatorname{Var}(p(S)) > 0.$

In this case, $\hat{p}_T \to p(S) \neq E[p(S)] = \bar{p}$ for each realization of $S$ . The time average converges to a random limit that depends on the specific shadowing realization, not the ensemble average.

Practical consequence: In non-ergodic channels, one must average BER measurements over multiple independent channel realizations (e.g., different locations or frequency bands), not just over time. This is the distinction between ergodic capacity and outage capacity in wireless system design. $\square$

Exercises

ex-ch02-01

Likelihoods

Evidence (marginal likelihood)

Posterior probabilities via Bayes' theorem

ex-ch02-02

PDF and CDF

Mean and variance

Tail probability

ex-ch02-03

Compute $E[Y] = E[X^2]$ via LOTUS

Compute $E[Y^2] = E[X^4]$

Variance of the instantaneous power

ex-ch02-04

Set up the marginal integral

Complete the square in $y$

Evaluate the Gaussian integral and obtain $f_X(x)$

ex-ch02-05

Part (a): $P(N(1) = 3)$

Part (b): $P(N(0.5) = 0)$

Part (c): mean and variance over 10 seconds

ex-ch02-06

Conditional detection probabilities

Total probability of detection

Posterior probability via Bayes' theorem

ex-ch02-07

Joint PDF in Cartesian coordinates

Transform to polar coordinates

Marginalize over $\theta$ to get $f_R(r)$

ex-ch02-08

MGF of a single exponential RV

MGF of the sum $S_n$

Numerical evaluation for $n=3$, $\lambda=2$

ex-ch02-09

Eigendecomposition of $\mathbf{C}_{\mathbf{X}}$

KLT decorrelation

MIMO interpretation

ex-ch02-10

Exact mean and variance

CLT approximation for $P(S_{12} > 7)$

Comparison with exact value

ex-ch02-11

Input PSD

Transfer function and output PSD

Output power via partial fractions

ex-ch02-12

Verify stochastic matrix

Set up balance equations

Solve the system

Channel availability interpretation

ex-ch02-13

Superposition is Poisson

Classification probability

Joint probability of specific counts

ex-ch02-14

Derive the MAP decision rule

Solve for the optimal threshold

Numerical computation

Probability of error

ex-ch02-15

Derive the Chernoff bound from Markov's inequality

Specialize to Gaussian

BER application

ex-ch02-16

Part (a): Constant mean

Part (b): Autocorrelation depends only on $\tau$

Part (c): Derive the filtering relationship

ex-ch02-17

Stationary distribution

Chapman-Kolmogorov verification

Eigendecomposition and $\mathbf{P}^n$

ex-ch02-18

Derive optimal $\mathbf{b}$

Derive optimal $\mathbf{A}$ via orthogonality principle

Derive the MMSE

Scalar specialization

ex-ch02-19

Mutual information for fixed $\mathbf{H}$

Optimal input without CSI

SVD interpretation (connection to Chapter 1)