Exercises
ex-ch02-01
(Easy) A BPSK receiver observes a signal and must decide between hypotheses $H_0$ (bit 0 sent) and $H_1$ (bit 1 sent). The channel model is $r = s + n$, where $s \in \{+A, -A\}$ is the transmitted signal with $A = 1$, and $n \sim \mathcal{N}(0, \sigma^2)$ with $\sigma^2 = 1$. The prior probabilities are $P(H_0) = 0.6$ and $P(H_1) = 0.4$.
Given an observation $r = 0.5$, use Bayes' theorem to compute the posterior probabilities $P(H_0 \mid r)$ and $P(H_1 \mid r)$.
Under $H_0$ the transmitted signal is $s = +1$, so $r \mid H_0 \sim \mathcal{N}(+1, 1)$. Under $H_1$ the transmitted signal is $s = -1$, so $r \mid H_1 \sim \mathcal{N}(-1, 1)$.
Evaluate the likelihoods $f(r \mid H_0)$ and $f(r \mid H_1)$.
Apply Bayes' rule: $P(H_i \mid r) = \frac{f(r \mid H_i)\,P(H_i)}{f(r)}$.
Likelihoods
Under $H_0$: $r \sim \mathcal{N}(+1, 1)$, so $f(0.5 \mid H_0) = \frac{1}{\sqrt{2\pi}} e^{-(0.5-1)^2/2} = \frac{1}{\sqrt{2\pi}} e^{-0.125} \approx 0.3521$.
Under $H_1$: $r \sim \mathcal{N}(-1, 1)$, so $f(0.5 \mid H_1) = \frac{1}{\sqrt{2\pi}} e^{-(0.5+1)^2/2} = \frac{1}{\sqrt{2\pi}} e^{-1.125} \approx 0.1295$.
Evidence (marginal likelihood)
$f(0.5) = P(H_0) f(0.5 \mid H_0) + P(H_1) f(0.5 \mid H_1) = 0.6 \cdot 0.3521 + 0.4 \cdot 0.1295 \approx 0.2631.$
Posterior probabilities via Bayes' theorem
$P(H_0 \mid 0.5) = \frac{0.6 \cdot 0.3521}{0.2631} \approx 0.8031, \qquad P(H_1 \mid 0.5) = \frac{0.4 \cdot 0.1295}{0.2631} \approx 0.1969.$
Sanity check: $0.8031 + 0.1969 = 1.0000$. Since the observation $r = 0.5$ lies closer to $+1$ than to $-1$ and $H_0$ also has the larger prior, the MAP decision is $H_0$. $\square$
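The Bayes computation can be cross-checked numerically. This is a minimal sketch assuming the reconstructed parameter values ($A = 1$, $\sigma^2 = 1$, priors $0.6/0.4$, observation $r = 0.5$); the constant $1/\sqrt{2\pi}$ cancels in the posterior but is kept for clarity:

```python
import math

def gaussian_pdf(r, mean, var):
    """N(mean, var) density evaluated at r."""
    return math.exp(-(r - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Assumed values: s = +1 under H0, s = -1 under H1, sigma^2 = 1,
# priors 0.6 / 0.4, observation r = 0.5.
priors = {"H0": 0.6, "H1": 0.4}
means = {"H0": +1.0, "H1": -1.0}
r = 0.5

likelihoods = {h: gaussian_pdf(r, means[h], 1.0) for h in priors}
evidence = sum(priors[h] * likelihoods[h] for h in priors)   # total probability
posteriors = {h: priors[h] * likelihoods[h] / evidence for h in priors}
```

The posteriors sum to one by construction, which is a useful sanity check whenever Bayes' rule is applied numerically.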
ex-ch02-02
(Easy) The inter-arrival time $T$ of packets at a network node is modeled as an exponential random variable with rate parameter $\lambda = 2$ packets/ms.
(a) Write the PDF $f_T(t)$ and CDF $F_T(t)$.
(b) Compute the mean $E[T]$ and variance $\operatorname{Var}(T)$.
(c) Find $P(T > 1\ \text{ms})$.
For an exponential RV with rate $\lambda$, the PDF is $f_T(t) = \lambda e^{-\lambda t}$ for $t \ge 0$.
The CDF is $F_T(t) = 1 - e^{-\lambda t}$. The mean is $1/\lambda$ and the variance is $1/\lambda^2$.
Use the survival function: $P(T > t) = 1 - F_T(t) = e^{-\lambda t}$.
PDF and CDF
For an exponential random variable with rate $\lambda = 2$: $f_T(t) = 2e^{-2t}$ and $F_T(t) = 1 - e^{-2t}$ for $t \ge 0$.
Mean and variance
The mean of an $\text{Exp}(\lambda)$ random variable is $E[T] = 1/\lambda = 0.5$ ms.
The variance is $\operatorname{Var}(T) = 1/\lambda^2 = 0.25\ \text{ms}^2$.
Tail probability
$P(T > 1) = e^{-2 \cdot 1} = e^{-2} \approx 0.1353$: there is roughly a 13.5% chance that more than 1 ms elapses between consecutive packets. $\square$
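The closed-form answers can be confirmed with a few lines of Python (rate $\lambda = 2$ per ms, as in the exercise):

```python
import math

lam = 2.0  # rate, packets/ms

def pdf(t):
    return lam * math.exp(-lam * t) if t >= 0 else 0.0

def cdf(t):
    return 1.0 - math.exp(-lam * t) if t >= 0 else 0.0

mean = 1.0 / lam           # 0.5 ms
variance = 1.0 / lam ** 2  # 0.25 ms^2
tail = 1.0 - cdf(1.0)      # P(T > 1 ms) = e^{-2} ~ 0.1353
```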
ex-ch02-03
(Easy) Let $X \sim \mathcal{N}(0, 4)$ model the noise voltage at a receiver front-end, with $\sigma^2 = 4$. Define the instantaneous power as $Y = X^2$.
Using the law of the unconscious statistician (LOTUS), compute $E[Y]$ and $E[Y^2]$, then find $\operatorname{Var}(Y)$.
LOTUS states $E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx$. Here $g(x) = x^2$.
For a zero-mean Gaussian, $E[X^2] = \sigma^2$ and $E[X^4] = 3\sigma^4$ (the fourth moment).
$\operatorname{Var}(Y) = E[Y^2] - (E[Y])^2$.
Compute $E[Y] = E[X^2]$ via LOTUS
By LOTUS: $E[Y] = E[X^2] = \sigma^2 = 4$.
This follows directly from the definition of variance for a zero-mean RV: $\operatorname{Var}(X) = E[X^2] - (E[X])^2 = E[X^2]$.
Compute $E[Y^2] = E[X^4]$
The fourth central moment of a Gaussian is $E[X^4] = 3\sigma^4$.
This can be derived via LOTUS and integration by parts, or by repeatedly differentiating the moment generating function. With $\sigma^2 = 4$: $E[Y^2] = E[X^4] = 3 \cdot 16 = 48$.
Variance of the instantaneous power
$\operatorname{Var}(Y) = E[Y^2] - (E[Y])^2 = 48 - 16 = 32.$
As a check: since $Y = X^2$ with $X \sim \mathcal{N}(0, 4)$, we have $Y/4 \sim \chi^2_1$, whose variance is $2$, so $\operatorname{Var}(Y) = 16 \cdot 2 = 32$. $\square$
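The LOTUS integrals can be evaluated numerically to confirm the moments, assuming $X \sim \mathcal{N}(0, 4)$ as above. A simple midpoint-rule sketch:

```python
import math

sigma2 = 4.0  # Var(X) for X ~ N(0, 4)

def normal_pdf(x):
    return math.exp(-x * x / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def lotus(g, lo=-30.0, hi=30.0, n=200_000):
    """Numerical LOTUS: E[g(X)] = integral of g(x) f_X(x) dx (midpoint rule)."""
    h = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * h) * normal_pdf(lo + (i + 0.5) * h)
               for i in range(n)) * h

ey = lotus(lambda x: x ** 2)   # E[Y]  = sigma^2   = 4
ey2 = lotus(lambda x: x ** 4)  # E[Y^2] = 3 sigma^4 = 48
var_y = ey2 - ey ** 2          # 32
```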
ex-ch02-04
(Easy) Let $(X, Y)$ be a jointly Gaussian random vector with joint PDF $f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\!\left(-\frac{q(x, y)}{2(1-\rho^2)}\right)$, where $q(x, y) = \frac{x^2}{\sigma_X^2} - \frac{2\rho x y}{\sigma_X \sigma_Y} + \frac{y^2}{\sigma_Y^2}$, $\rho = 0.5$, $\sigma_X = 1$, $\sigma_Y = 2$, and both means are zero.
Obtain the marginal PDF $f_X(x)$ by integrating over $y$.
Write the exponent as a function of $y$ by completing the square: group terms involving $y$ and separate the $x$-only part.
After completing the square, the $y$-integral is a Gaussian integral of the form $\int_{-\infty}^{\infty} e^{-(y-m)^2/(2s^2)}\,dy = \sqrt{2\pi s^2}$.
The remaining $x$-dependent factor should yield $X \sim \mathcal{N}(0, 1)$.
Set up the marginal integral
$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$, with normalization constant $2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2} = 2\pi \cdot 2\sqrt{0.75} = 2\pi\sqrt{3}$.
Complete the square in $y$
Inside the exponent, group terms involving $y$: with the given parameters, $q(x, y) = x^2 - \frac{xy}{2} + \frac{y^2}{4} = \left(\frac{y - x}{2}\right)^2 + \frac{3x^2}{4}$.
The full exponent becomes $-\frac{q(x, y)}{2(1-\rho^2)} = -\frac{(y - x)^2}{6} - \frac{x^2}{2}$.
That is, the $y$-dependence is a Gaussian kernel centered at $m = x$ with variance $s^2 = 3$.
Evaluate the Gaussian integral and obtain $f_X(x)$
$f_X(x) = \frac{1}{2\pi\sqrt{3}}\, e^{-x^2/2} \int_{-\infty}^{\infty} e^{-(y-x)^2/6}\,dy = \frac{\sqrt{6\pi}}{2\pi\sqrt{3}}\, e^{-x^2/2} = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2},$ i.e., $X \sim \mathcal{N}(0, 1)$: the marginal of a jointly Gaussian vector is Gaussian with the corresponding mean and variance. $\square$
ex-ch02-05
(Easy) Packets arrive at a base station according to a Poisson process with rate $\lambda$ packets per second.
(a) What is the probability that exactly 3 packets arrive in a 1-second interval?
(b) What is the probability that no packets arrive in a 0.5-second interval?
(c) Find the expected number of packets in a 10-second window and its variance.
For a Poisson process with rate $\lambda$, the number of arrivals in an interval of length $t$ is $N(t) \sim \text{Poisson}(\lambda t)$.
$P(N(t) = k) = e^{-\lambda t}\,\frac{(\lambda t)^k}{k!}$.
Both the mean and variance of a Poisson random variable equal its parameter $\lambda t$.
Part (a): $P(N(1) = 3)$
With $t = 1$: $P(N(1) = 3) = e^{-\lambda}\,\frac{\lambda^3}{3!}$.
Part (b): $P(N(0.5) = 0)$
With $t = 0.5$: $P(N(0.5) = 0) = e^{-\lambda/2}$.
Part (c): mean and variance over 10 seconds
For a Poisson process, $E[N(t)] = \operatorname{Var}(N(t)) = \lambda t$. With $t = 10$: $E[N(10)] = \operatorname{Var}(N(10)) = 10\lambda$.
The standard deviation is $\sqrt{10\lambda}$ packets.
ex-ch02-06
(Medium) A multi-stage detection system processes a received signal through two successive classifiers. The first classifier $C_1$ declares "signal present" ($D_1 = 1$) with probability $P(D_1 = 1 \mid H_1) = 0.95$ when a signal is truly present, and has a false alarm rate $P(D_1 = 1 \mid H_0) = 0.10$. The second classifier $C_2$ takes the output of $C_1$ and refines it: $P(D_2 = 1 \mid D_1 = 1, H_1) = 0.97$ and $P(D_2 = 1 \mid D_1 = 1, H_0) = 0.05$.
If $C_1$ declares "no signal" ($D_1 = 0$), the system outputs $D_2 = 0$ regardless. The prior probability of signal presence is $P(H_1) = 0.01$.
Using the total probability theorem and Bayes' theorem, find:
(a) $P(D_2 = 1)$, the overall probability that the system declares "signal present."
(b) $P(H_1 \mid D_2 = 1)$, the posterior probability that a signal is truly present given a positive declaration.
First compute $P(D_2 = 1 \mid H_1)$ and $P(D_2 = 1 \mid H_0)$ by conditioning on $D_1$: $P(D_2 = 1 \mid H_i) = P(D_2 = 1 \mid D_1 = 1, H_i)\,P(D_1 = 1 \mid H_i)$.
Apply the total probability theorem: $P(D_2 = 1) = P(D_2 = 1 \mid H_1)P(H_1) + P(D_2 = 1 \mid H_0)P(H_0)$.
Finally, use Bayes' theorem: $P(H_1 \mid D_2 = 1) = \frac{P(D_2 = 1 \mid H_1)\,P(H_1)}{P(D_2 = 1)}$.
Conditional detection probabilities
Since $C_2$ only activates when $C_1$ declares $D_1 = 1$: $P(D_2 = 1 \mid H_1) = 0.95 \cdot 0.97 = 0.9215$ and $P(D_2 = 1 \mid H_0) = 0.10 \cdot 0.05 = 0.005$.
Total probability of detection
By the total probability theorem: $P(D_2 = 1) = 0.01 \cdot 0.9215 + 0.99 \cdot 0.005 = 0.009215 + 0.00495 = 0.014165.$
Posterior probability via Bayes' theorem
$P(H_1 \mid D_2 = 1) = \frac{0.01 \cdot 0.9215}{0.014165} \approx 0.65.$
Even though the prior probability of signal presence is only 1%, cascading the two classifiers reduces the overall false-alarm rate from the first stage's 10% to 0.5%, so a positive declaration raises the posterior to about 65%. $\square$
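The cascade computation is short enough to script directly. This sketch assumes the reconstructed stage probabilities (0.95/0.10 for stage 1, 0.97/0.05 for stage 2, prior 0.01), which should be checked against the original problem statement:

```python
p_h1 = 0.01               # prior P(H1) of signal presence
pd1, pfa1 = 0.95, 0.10    # stage-1 detection / false-alarm probabilities
pd2, pfa2 = 0.97, 0.05    # stage-2 probabilities, conditioned on D1 = 1

# A final positive declaration requires both stages to fire
p_pos_h1 = pd1 * pd2      # P(D2 = 1 | H1) = 0.9215
p_pos_h0 = pfa1 * pfa2    # P(D2 = 1 | H0) = 0.005 (overall false-alarm rate)

# Total probability theorem, then Bayes' theorem
p_pos = p_h1 * p_pos_h1 + (1 - p_h1) * p_pos_h0
posterior = p_h1 * p_pos_h1 / p_pos
```

Cascading detectors multiplies the per-stage false-alarm rates, which is exactly why the posterior rises so sharply despite the tiny prior.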
ex-ch02-07
(Medium) Let $X$ and $Y$ be independent, zero-mean Gaussian random variables, each with variance $\sigma^2$, representing the in-phase and quadrature components of narrowband noise. Define the envelope $R = \sqrt{X^2 + Y^2}$.
Derive the PDF of (the Rayleigh distribution) from first principles.
First find the CDF: $F_R(r) = P(R \le r) = P(X^2 + Y^2 \le r^2)$.
Switch to polar coordinates in the joint PDF of $(X, Y)$.
Differentiate the CDF with respect to $r$ to obtain the PDF.
Joint PDF in Cartesian coordinates
Since $X$ and $Y$ are independent $\mathcal{N}(0, \sigma^2)$: $f_{X,Y}(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/(2\sigma^2)}$.
Transform to polar coordinates
Let $x = \rho\cos\theta$ and $y = \rho\sin\theta$, with Jacobian $\rho$. The joint PDF in polar form is $f(\rho, \theta) = \frac{\rho}{2\pi\sigma^2}\, e^{-\rho^2/(2\sigma^2)}$.
Marginalize over $\theta$ to get $f_R(r)$
$f_R(r) = \int_0^{2\pi} \frac{r}{2\pi\sigma^2}\, e^{-r^2/(2\sigma^2)}\,d\theta = \frac{r}{\sigma^2}\, e^{-r^2/(2\sigma^2)}, \quad r \ge 0,$ the Rayleigh PDF with parameter $\sigma$. Its moments are $E[R] = \sigma\sqrt{\pi/2}$ and $E[R^2] = 2\sigma^2$. $\square$
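The derivation can be verified empirically: drawing the envelope from two independent Gaussians should reproduce the Rayleigh moments. A Monte Carlo sketch with $\sigma = 1$:

```python
import math
import random

random.seed(1)
sigma = 1.0
n = 200_000

# Envelope samples R = sqrt(X^2 + Y^2) from independent N(0, sigma^2) components
samples = [math.hypot(random.gauss(0, sigma), random.gauss(0, sigma))
           for _ in range(n)]

mean_r = sum(samples) / n                  # theory: sigma * sqrt(pi/2) ~ 1.2533
mean_r2 = sum(r * r for r in samples) / n  # theory: 2 sigma^2 = 2
```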
ex-ch02-08
(Medium) Let $X_1, \dots, X_n$ be independent and identically distributed exponential random variables with rate $\lambda$, modeling the inter-arrival times of packets. Define the total waiting time for $n$ arrivals as $S_n = X_1 + \cdots + X_n$.
(a) Derive the moment generating function (MGF) $M_{S_n}(s)$.
(b) Identify the distribution of $S_n$ from the MGF.
(c) Specialize to $n = 3$, $\lambda = 2$ and compute $E[S_3]$ and $\operatorname{Var}(S_3)$.
The MGF of a single $\text{Exp}(\lambda)$ RV is $M_X(s) = \frac{\lambda}{\lambda - s}$ for $s < \lambda$.
For independent RVs, the MGF of the sum is the product of the individual MGFs.
The resulting MGF is the MGF of a Gamma (or Erlang-$n$) distribution.
MGF of a single exponential RV
For $X \sim \text{Exp}(\lambda)$: $M_X(s) = E[e^{sX}] = \int_0^{\infty} e^{sx}\,\lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - s}, \quad s < \lambda.$
MGF of the sum $S_n$
Since the $X_i$ are independent, the MGF of the sum factorizes: $M_{S_n}(s) = \prod_{i=1}^{n} M_{X_i}(s) = \left(\frac{\lambda}{\lambda - s}\right)^n.$
This is the MGF of the Erlang-$n$ distribution (equivalently, $\text{Gamma}(n, \lambda)$), with PDF $f_{S_n}(t) = \frac{\lambda^n t^{n-1} e^{-\lambda t}}{(n-1)!}, \quad t \ge 0.$
Numerical evaluation for $n=3$, $\lambda=2$
The Erlang-$n$ distribution has mean $n/\lambda$ and variance $n/\lambda^2$: $E[S_3] = \frac{3}{2} = 1.5$ and $\operatorname{Var}(S_3) = \frac{3}{4} = 0.75$.
In a Poisson process context, $S_3$ is the arrival time of the third event; this is the connection between inter-arrival times and the counting process.
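The Erlang-3 moments can be confirmed by integrating the PDF numerically ($n = 3$, $\lambda = 2$, as in part (c)):

```python
import math

n, lam = 3, 2.0  # Erlang-3 with rate 2

def erlang_pdf(t):
    """PDF of S_n = X_1 + ... + X_n for i.i.d. Exp(lam) inter-arrival times."""
    return lam ** n * t ** (n - 1) * math.exp(-lam * t) / math.factorial(n - 1)

# Numeric moments via the midpoint rule, truncated far into the tail
h, steps = 1e-4, 400_000
m0 = m1 = m2 = 0.0
for i in range(steps):
    t = (i + 0.5) * h
    p = erlang_pdf(t) * h
    m0 += p
    m1 += t * p
    m2 += t * t * p

mean, var = m1, m2 - m1 ** 2   # theory: n/lam = 1.5, n/lam^2 = 0.75
```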
ex-ch02-09
(Medium) A random vector $\mathbf{X} = (X_1, X_2)^T$ has zero mean and covariance matrix $\mathbf{C}_{\mathbf{X}} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$.
(a) Find the eigenvalues and eigenvectors of .
(b) Define the Karhunen-Loève transform (KLT) $\mathbf{Y} = \mathbf{U}^T \mathbf{X}$, where $\mathbf{U}$ is the matrix of eigenvectors. Show that the components of $\mathbf{Y}$ are uncorrelated.
(c) Interpret the result in the context of decorrelating received signal components in a MIMO system.
The eigenvalues of $\mathbf{C}_{\mathbf{X}}$ are found from $\det(\mathbf{C}_{\mathbf{X}} - \lambda\mathbf{I}) = 0$.
The KLT diagonalizes the covariance: $\mathbf{C}_{\mathbf{Y}} = \mathbf{U}^T \mathbf{C}_{\mathbf{X}} \mathbf{U} = \boldsymbol{\Lambda}$.
Uncorrelated Gaussian components become independent, enabling per-stream detection.
Eigendecomposition of $\mathbf{C}_{\mathbf{X}}$
The characteristic equation is $(2 - \lambda)^2 - 1 = 0.$
Eigenvalues: $\lambda_1 = 3$, $\lambda_2 = 1$.
For $\lambda_1 = 3$: $(\mathbf{C}_{\mathbf{X}} - 3\mathbf{I})\mathbf{u} = \mathbf{0}$, giving $\mathbf{u}_1 = \frac{1}{\sqrt{2}}(1, 1)^T$.
For $\lambda_2 = 1$: $(\mathbf{C}_{\mathbf{X}} - \mathbf{I})\mathbf{u} = \mathbf{0}$, giving $\mathbf{u}_2 = \frac{1}{\sqrt{2}}(1, -1)^T$.
KLT decorrelation
With $\mathbf{U} = [\mathbf{u}_1\ \mathbf{u}_2]$ and $\boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \lambda_2)$: $\mathbf{C}_{\mathbf{Y}} = \mathbf{U}^T \mathbf{C}_{\mathbf{X}} \mathbf{U} = \boldsymbol{\Lambda}$.
Since $\mathbf{C}_{\mathbf{Y}}$ is diagonal, $\operatorname{Cov}(Y_1, Y_2) = 0$: the components are uncorrelated.
MIMO interpretation
In a MIMO system, the received signals may be correlated due to insufficient antenna spacing or scattering geometry. The KLT rotates the coordinate system to align with the principal axes of the covariance ellipsoid.
After the transform, $Y_1$ (with variance $\lambda_1$) carries the dominant signal energy along the sum direction $\frac{1}{\sqrt{2}}(1, 1)^T$, while $Y_2$ (variance $\lambda_2$) captures the weaker difference component. This is equivalent to the whitening step in MIMO receivers, and when $\mathbf{X}$ is jointly Gaussian, uncorrelatedness implies independence, enabling independent per-stream detection.
ex-ch02-10
(Medium) Let $U_1, U_2, \dots, U_{12}$ be i.i.d. uniform random variables on $(0, 1)$. Define $S_{12} = \sum_{i=1}^{12} U_i$.
(a) Compute the exact mean and variance of $S_{12}$.
(b) Using the Central Limit Theorem, approximate $P(S_{12} > 7)$.
(c) Compare with the exact value (noting that $S_{12}$ has an Irwin-Hall distribution) and comment on the accuracy of the CLT approximation.
For $U \sim \text{Uniform}(0, 1)$: $E[U] = \frac{1}{2}$, $\operatorname{Var}(U) = \frac{1}{12}$.
By the CLT, $\frac{S_n - n/2}{\sqrt{n/12}} \to \mathcal{N}(0, 1)$ as $n \to \infty$.
For $n = 12$: $E[S_{12}] = 6$, $\operatorname{Var}(S_{12}) = 1$, so $P(S_{12} > 7) \approx Q(1)$.
Exact mean and variance
For each $U_i$: $E[U_i] = \frac{1}{2}$ and $\operatorname{Var}(U_i) = \frac{1}{12}$.
By linearity of expectation and independence: $E[S_{12}] = 12 \cdot \frac{1}{2} = 6, \qquad \operatorname{Var}(S_{12}) = 12 \cdot \frac{1}{12} = 1.$
CLT approximation for $P(S_{12} > 7)$
For $n = 12$: $\mu = 6$ and $\sigma = 1$, so $Z = S_{12} - 6$ is approximately standard normal.
By the CLT: $P(S_{12} > 7) \approx P(Z > 1) = Q(1) \approx 0.1587.$
Comparison with exact value
The exact distribution of $S_{12}$ is the Irwin-Hall distribution with $n = 12$, whose CDF is given by the inclusion-exclusion formula: $F_{S_n}(x) = \frac{1}{n!} \sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x - k)^n.$
The exact value of $P(S_{12} > 7) = 1 - F_{S_{12}}(7)$ is approximately $0.1607$.
The CLT approximation of $0.1587$ has a relative error of about $1.3\%$, which is remarkably accurate for $n = 12$. This is partly because the uniform distribution is symmetric and bounded, so higher-order cumulants vanish or are small, accelerating CLT convergence. In telecommunications, the sum of 12 uniform RVs (minus 6) is a classic method for generating approximate Gaussian samples.
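Both the exact Irwin-Hall tail and the CLT approximation are easy to compute, which makes the comparison concrete:

```python
import math

def irwin_hall_cdf(x, n):
    """F(x) for the sum of n i.i.d. Uniform(0,1) RVs (inclusion-exclusion)."""
    return sum((-1) ** k * math.comb(n, k) * (x - k) ** n
               for k in range(int(math.floor(x)) + 1)) / math.factorial(n)

exact = 1.0 - irwin_hall_cdf(7.0, 12)        # exact P(S_12 > 7) ~ 0.1607

# CLT: S_12 has mean 6 and variance 1, so P(S_12 > 7) ~ Q(1)
clt = 0.5 * math.erfc(1.0 / math.sqrt(2.0))  # Q(1) ~ 0.1587

rel_err = abs(clt - exact) / exact           # ~ 1.3%
```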
ex-ch02-11
(Medium) A wide-sense stationary (WSS) random process $X(t)$ with autocorrelation $R_X(\tau) = \sigma^2 e^{-a|\tau|}$ is passed through a causal LTI filter with impulse response $h(t) = e^{-bt} u(t)$, where $u(t)$ is the unit step function.
Let $Y(t)$ denote the output process. Set $\sigma^2 = 1$, $a = 1$, and $b = 2$.
(a) Find the power spectral density of the input.
(b) Compute the transfer function $H(\omega)$ and the output PSD $S_Y(\omega)$.
(c) Find the output power $P_Y = E[Y^2(t)] = R_Y(0)$.
The PSD is the Fourier transform of the autocorrelation: $S_X(\omega) = \int_{-\infty}^{\infty} R_X(\tau)\, e^{-j\omega\tau}\,d\tau$.
For an LTI system, $S_Y(\omega) = |H(\omega)|^2 S_X(\omega)$.
The output power is $P_Y = R_Y(0) = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_Y(\omega)\,d\omega$.
Input PSD
The Fourier transform of $e^{-a|\tau|}$ is $\frac{2a}{\omega^2 + a^2}$.
With $\sigma^2 = 1$, $a = 1$: $S_X(\omega) = \frac{2}{\omega^2 + 1}$.
Transfer function and output PSD
The Fourier transform of $h(t) = e^{-2t} u(t)$ is $H(\omega) = \frac{1}{j\omega + 2}$.
The squared magnitude: $|H(\omega)|^2 = \frac{1}{\omega^2 + 4}$.
The output PSD: $S_Y(\omega) = \frac{2}{(\omega^2 + 1)(\omega^2 + 4)}$.
Output power via partial fractions
Let $P_Y = \frac{1}{2\pi}\int_{-\infty}^{\infty} S_Y(\omega)\,d\omega$. We use partial fractions: $\frac{2}{(\omega^2 + 1)(\omega^2 + 4)} = \frac{2}{3}\left(\frac{1}{\omega^2 + 1} - \frac{1}{\omega^2 + 4}\right).$
The output power is $P_Y = \frac{1}{2\pi} \cdot \frac{2}{3}\left(\int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 + 1} - \int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 + 4}\right).$
Using $\int_{-\infty}^{\infty} \frac{d\omega}{\omega^2 + c^2} = \frac{\pi}{c}$:
- First integral ($c = 1$): $\pi$.
- Second integral ($c = 2$): $\pi/2$.
$P_Y = \frac{1}{2\pi} \cdot \frac{2}{3} \cdot \frac{\pi}{2} = \frac{1}{6}.$ The filter reduces the input power from $R_X(0) = 1$ to $1/6$, acting as a low-pass filter that attenuates high-frequency noise components.
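The partial-fraction result can be double-checked by integrating the output PSD numerically. This sketch assumes the parameter values used above ($\sigma^2 = 1$, $a = 1$, $b = 2$):

```python
import math

def s_x(w):
    return 2.0 / (w * w + 1.0)     # input PSD, sigma^2 = 1, a = 1

def h2(w):
    return 1.0 / (w * w + 4.0)     # |H(w)|^2 for h(t) = e^{-2t} u(t)

# Output power P_Y = (1/2pi) * integral of S_X(w)|H(w)|^2 dw
# Midpoint rule over [0, 200]; the integrand is even, hence the factor 2.
h, steps = 1e-3, 200_000
integral = 2.0 * sum(s_x((i + 0.5) * h) * h2((i + 0.5) * h)
                     for i in range(steps)) * h
p_y = integral / (2.0 * math.pi)   # theory: 1/6
```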
ex-ch02-12
(Medium) A wireless channel alternates among three states: Good ($G$), Fair ($F$), and Bad ($B$). The one-step transition probability matrix is $\mathbf{P}$, where rows and columns are ordered $(G, F, B)$.
(a) Verify that $\mathbf{P}$ is a valid stochastic matrix.
(b) Find the stationary distribution $\boldsymbol{\pi}$ by solving $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ together with $\sum_i \pi_i = 1$.
(c) Interpret the stationary distribution in terms of long-run channel availability.
A valid stochastic matrix has non-negative entries and each row sums to $1$.
Write $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ as three equations: $\pi_j = \sum_i \pi_i P_{ij}$ for $j \in \{G, F, B\}$.
Use two of the three balance equations plus the normalization constraint $\pi_G + \pi_F + \pi_B = 1$.
Verify stochastic matrix
All entries are non-negative, and each of the three rows ($G$, $F$, $B$) sums to $1$.
Therefore $\mathbf{P}$ is a valid (right) stochastic matrix.
Set up balance equations
The equation $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ gives, for each state $j \in \{G, F, B\}$:
$\pi_j = \pi_G P_{Gj} + \pi_F P_{Fj} + \pi_B P_{Bj}.$
From the first equation (1), $\pi_F$ can be expressed as a multiple of $\pi_G$; from the third equation (3), $\pi_B$ can likewise be expressed in terms of $\pi_G$ and $\pi_F$.
Solve the system
Substituting the relation obtained from (1) into (3) expresses both $\pi_F$ and $\pi_B$ as fixed multiples of $\pi_G$.
Normalization: $\pi_G + \pi_F + \pi_B = 1$ then fixes the scale and yields the stationary distribution $\boldsymbol{\pi} = (\pi_G, \pi_F, \pi_B)$.
Channel availability interpretation
In the long run, the fractions of time the channel spends in the Good, Fair, and Bad states are $\pi_G$, $\pi_F$, and $\pi_B$, respectively. If we define "available" as Good or Fair, the long-run availability is $\pi_G + \pi_F$.
This stationary distribution is crucial for computing the ergodic capacity of the channel: $\bar{C} = \sum_{s \in \{G, F, B\}} \pi_s C_s$, where $C_s$ is the capacity in state $s$.
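The solution procedure can be automated by iterating $\boldsymbol{\pi} \leftarrow \boldsymbol{\pi}\mathbf{P}$ until convergence. The matrix below is purely illustrative (the exercise's numerical entries are not reproduced here):

```python
# Assumed illustrative transition matrix; rows/columns ordered G, F, B
P = [[0.80, 0.15, 0.05],
     [0.30, 0.55, 0.15],
     [0.20, 0.30, 0.50]]

row_sums = [sum(row) for row in P]   # each must equal 1 (stochastic matrix)

# Power iteration: pi <- pi P converges to the stationary distribution
pi = [1 / 3, 1 / 3, 1 / 3]
for _ in range(200):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

residual = max(abs(sum(pi[i] * P[i][j] for i in range(3)) - pi[j])
               for j in range(3))    # ||pi P - pi||_inf, ~0 at the fixed point
availability = pi[0] + pi[1]         # long-run P(Good or Fair)
```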
ex-ch02-13
(Medium) Two independent Poisson processes model packet arrivals from two user classes at a base station: Class A with rate $\lambda_A$ packets/s and Class B with rate $\lambda_B$ packets/s.
(a) Show that the merged (superposed) arrival process is also a Poisson process, and find its rate.
(b) Given that a packet arrives, what is the probability it belongs to Class A?
(c) Compute the probability that exactly $k_A$ Class-A packets and exactly $k_B$ Class-B packets arrive in a $t$-second interval.
The superposition of independent Poisson processes is Poisson with rate $\lambda_A + \lambda_B$.
By the decomposition property, each arrival is independently classified as Class A with probability $\lambda_A / (\lambda_A + \lambda_B)$.
Since $N_A(t)$ and $N_B(t)$ are independent Poisson RVs, the joint probability factors.
Superposition is Poisson
The MGF of the merged count $N(t) = N_A(t) + N_B(t)$ in $(0, t]$ is $M_{N(t)}(s) = e^{\lambda_A t (e^s - 1)}\, e^{\lambda_B t (e^s - 1)} = e^{(\lambda_A + \lambda_B) t (e^s - 1)},$ which is the MGF of a $\text{Poisson}((\lambda_A + \lambda_B)t)$ random variable.
Therefore $N(t) \sim \text{Poisson}((\lambda_A + \lambda_B)t)$ with merged rate $\lambda = \lambda_A + \lambda_B$ packets/s.
Classification probability
Given an arrival in the merged process, the probability it is Class A is $P(\text{Class A}) = \frac{\lambda_A}{\lambda_A + \lambda_B}.$
This follows from the independent thinning property of Poisson processes: each arrival is independently "marked" as Class A with probability $\lambda_A / (\lambda_A + \lambda_B)$ and Class B with the complementary probability.
Joint probability of specific counts
Since $N_A(t)$ and $N_B(t)$ are independent: $P(N_A(t) = k_A,\, N_B(t) = k_B) = e^{-\lambda_A t}\,\frac{(\lambda_A t)^{k_A}}{k_A!} \cdot e^{-\lambda_B t}\,\frac{(\lambda_B t)^{k_B}}{k_B!}.$
Such an event is relatively rare whenever the requested counts $k_A$ and $k_B$ are far from the means $\lambda_A t$ and $\lambda_B t$.
ex-ch02-14
(Hard) Consider binary hypothesis testing for detecting a signal in noise: $H_0: r = n$ versus $H_1: r = A + n$, where $A > 0$ is a known amplitude and $n \sim \mathcal{N}(0, \sigma^2)$. The prior probabilities are $P(H_0)$ and $P(H_1) = 1 - P(H_0)$, with $P(H_0) > P(H_1)$.
(a) Derive the MAP (Maximum A Posteriori) decision rule and show that the optimal threshold is $\gamma^* = \frac{A}{2} + \frac{\sigma^2}{A}\ln\frac{P(H_0)}{P(H_1)}.$
(b) Show that when $P(H_0) = P(H_1)$ the MAP rule reduces to the ML rule with threshold $A/2$.
(c) For the given values of $A$, $\sigma$, and the priors, compute the MAP threshold and the resulting probability of error $P_e$.
The MAP rule decides $H_1$ when $P(H_1 \mid r) > P(H_0 \mid r)$.
Take the log of both sides of the likelihood ratio test. The log-likelihood ratio for Gaussian noise is linear in $r$.
The error probability is $P_e = P(H_0)\,P(\text{decide } H_1 \mid H_0) + P(H_1)\,P(\text{decide } H_0 \mid H_1)$, where each conditional error involves a $Q$-function.
Derive the MAP decision rule
The MAP rule decides $H_1$ if $f(r \mid H_1)\,P(H_1) > f(r \mid H_0)\,P(H_0).$
The likelihoods are: $f(r \mid H_0) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-r^2/(2\sigma^2)}, \qquad f(r \mid H_1) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(r - A)^2/(2\sigma^2)}.$
The log-likelihood ratio is $\ln\frac{f(r \mid H_1)}{f(r \mid H_0)} = \frac{2Ar - A^2}{2\sigma^2} = \frac{A}{\sigma^2}\left(r - \frac{A}{2}\right).$
Solve for the optimal threshold
The MAP rule becomes: decide $H_1$ if $\frac{A}{\sigma^2}\left(r - \frac{A}{2}\right) > \ln\frac{P(H_0)}{P(H_1)}.$
Since $A > 0$, dividing by $A/\sigma^2$: $r > \frac{A}{2} + \frac{\sigma^2}{A}\ln\frac{P(H_0)}{P(H_1)} = \gamma^*.$
When $P(H_0) = P(H_1)$: $\ln 1 = 0$, so $\gamma^* = A/2$, which is the ML threshold. The MAP rule reduces to the midpoint decision boundary, as expected by symmetry of the priors.
Numerical computation
Substituting the given values of $A$, $\sigma$, and the priors into $\gamma^* = \frac{A}{2} + \frac{\sigma^2}{A}\ln\frac{P(H_0)}{P(H_1)}$ yields the numerical threshold.
The threshold shifts right of $A/2$ (toward the $H_1$ signal level $A$) because $P(H_0) > P(H_1)$, reflecting the prior bias toward $H_0$: the receiver demands stronger evidence before declaring $H_1$.
Probability of error
Since $r \mid H_0 \sim \mathcal{N}(0, \sigma^2)$ and $r \mid H_1 \sim \mathcal{N}(A, \sigma^2)$: $P_e = P(H_0)\,Q\!\left(\frac{\gamma^*}{\sigma}\right) + P(H_1)\,Q\!\left(\frac{A - \gamma^*}{\sigma}\right). \qquad \square$
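The threshold and error-probability formulas are easy to evaluate in code. The numbers below ($A = 2$, $\sigma = 1$, $P(H_0) = 0.7$) are illustrative assumptions, not the exercise's values:

```python
import math

def q_func(x):
    """Gaussian tail Q(x) = P(Z > x) for Z ~ N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Assumed illustrative values
A, sigma, p0 = 2.0, 1.0, 0.7
p1 = 1.0 - p0

gamma = A / 2 + (sigma ** 2 / A) * math.log(p0 / p1)  # MAP threshold
gamma_ml = A / 2                                      # ML threshold (equal priors)

# P_e = P(H0) P(decide H1 | H0) + P(H1) P(decide H0 | H1)
pe = p0 * q_func(gamma / sigma) + p1 * q_func((A - gamma) / sigma)
pe_ml = p0 * q_func(gamma_ml / sigma) + p1 * q_func((A - gamma_ml) / sigma)
```

Because the MAP threshold minimizes $P_e$, using the ML threshold under unequal priors can only increase the error probability, which the comparison confirms.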
ex-ch02-15
(Hard) Let $X_1, \dots, X_n$ be i.i.d. random variables with MGF $M_X(s) = E[e^{sX}]$. Consider the tail probability $P(S_n \ge na)$, where $S_n = \sum_{i=1}^n X_i$ and $a > E[X]$.
(a) Derive the Chernoff bound: $P(S_n \ge na) \le e^{-n I(a)}$, where $I(a) = \sup_{s > 0}\,[sa - \ln M_X(s)]$ is the Fenchel-Legendre transform (rate function) of the log-MGF.
(b) Specialize to $X_i \sim \mathcal{N}(\mu, \sigma^2)$ and show that the Chernoff bound on $P(S_n \ge na)$ (where $a > \mu$) yields $P(S_n \ge na) \le \exp\!\left(-\frac{n(a - \mu)^2}{2\sigma^2}\right).$
(c) Apply to BER analysis: if $X_i$ represents the decision statistic error for the $i$-th bit with $X_i \sim \mathcal{N}(\mu, \sigma^2)$, find the bound on $P(S_n \ge na)$ for the given numerical values.
Start with Markov's inequality: $P(S_n \ge na) = P(e^{sS_n} \ge e^{sna}) \le e^{-sna}\,E[e^{sS_n}]$ for any $s > 0$.
Use independence to factor $E[e^{sS_n}] = (M_X(s))^n$. Then optimize over $s$.
For Gaussian $X \sim \mathcal{N}(\mu, \sigma^2)$, $\ln M_X(s) = \mu s + \frac{\sigma^2 s^2}{2}$, so $I(a) = \frac{(a - \mu)^2}{2\sigma^2}$.
Derive the Chernoff bound from Markov's inequality
For any $s > 0$, by Markov's inequality applied to the non-negative random variable $e^{sS_n}$: $P(S_n \ge na) \le e^{-sna}\,E[e^{sS_n}].$
Since the $X_i$ are i.i.d.: $E[e^{sS_n}] = \prod_{i=1}^n E[e^{sX_i}] = (M_X(s))^n.$
Therefore: $P(S_n \ge na) \le \exp\!\left(-n\left[sa - \ln M_X(s)\right]\right).$
Since this holds for all $s > 0$, we optimize: $P(S_n \ge na) \le \exp\!\left(-n\,\sup_{s > 0}\left[sa - \ln M_X(s)\right]\right) = e^{-n I(a)}.$
Specialize to Gaussian
For $X \sim \mathcal{N}(\mu, \sigma^2)$: $M_X(s) = e^{\mu s + \sigma^2 s^2/2}$, so $\ln M_X(s) = \mu s + \frac{\sigma^2 s^2}{2}$.
The rate function is $I(a) = \sup_{s}\left[sa - \mu s - \frac{\sigma^2 s^2}{2}\right].$
Taking the derivative and setting it to zero: $a - \mu - \sigma^2 s = 0$, giving $s^* = \frac{a - \mu}{\sigma^2} > 0$.
The second derivative is $-\sigma^2 < 0$, confirming a maximum.
Therefore $I(a) = \frac{(a - \mu)^2}{2\sigma^2}$ and $P(S_n \ge na) \le e^{-n(a - \mu)^2/(2\sigma^2)}$.
BER application
Substituting the given $n$, $a$, $\mu$, and $\sigma$ into the bound gives $P(S_n \ge na) \le e^{-n(a - \mu)^2/(2\sigma^2)}$.
For comparison, since $S_n \sim \mathcal{N}(n\mu, n\sigma^2)$, the exact Gaussian tail probability is $Q\!\left(\frac{\sqrt{n}(a - \mu)}{\sigma}\right)$.
The Chernoff bound is looser than the exact value by a sub-exponential factor (recall $Q(x) \approx \frac{e^{-x^2/2}}{x\sqrt{2\pi}}$ for large $x$), but it captures the correct exponential decay rate in the exponent. This exponential tightness is the key property that makes Chernoff bounds invaluable in error probability analysis: they correctly predict how BER scales with SNR in the large-deviation regime.
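The bound-versus-exact comparison is worth seeing numerically. The values below ($\mu = 0$, $\sigma = 1$, $a = 0.5$, $n = 100$) are illustrative assumptions:

```python
import math

def q_func(x):
    """Gaussian tail Q(x) = 0.5 erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

# Assumed illustrative numbers: X_i ~ N(0, 1), threshold a = 0.5, n = 100
mu, sigma, a, n = 0.0, 1.0, 0.5, 100

chernoff = math.exp(-n * (a - mu) ** 2 / (2 * sigma ** 2))  # e^{-12.5}
exact = q_func(math.sqrt(n) * (a - mu) / sigma)             # Q(5), since S_n ~ N(0, n)

ratio = chernoff / exact  # > 1: the bound is looser but shares the exponent
```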
ex-ch02-16
(Hard) Let $X(t)$ be a wide-sense stationary (WSS) process with mean $\mu_X$ and autocorrelation $R_X(\tau)$. The process is passed through a stable LTI system with impulse response $h(t)$ to produce $Y(t) = \int_{-\infty}^{\infty} h(\alpha)\,X(t - \alpha)\,d\alpha$.
Prove that $Y(t)$ is also WSS by showing:
(a) $E[Y(t)]$ is constant (independent of $t$).
(b) $R_Y(t_1, t_2)$ depends only on $\tau = t_1 - t_2$.
(c) Derive the relationship $S_Y(\omega) = |H(\omega)|^2 S_X(\omega)$, or equivalently $R_Y(\tau) = (h * R_X * \tilde{h})(\tau)$ with $\tilde{h}(t) = h(-t)$.
Use linearity of expectation and the WSS property for part (a).
For part (b), write $R_Y(t_1, t_2) = E[Y(t_1)Y(t_2)]$ as a double integral involving $R_X$.
Recognize that the double integral depends only on $\tau = t_1 - t_2$ because $R_X$ depends only on its argument.
Part (a): Constant mean
$E[Y(t)] = E\!\left[\int h(\alpha)\,X(t - \alpha)\,d\alpha\right] = \int h(\alpha)\,E[X(t - \alpha)]\,d\alpha = \mu_X \int_{-\infty}^{\infty} h(\alpha)\,d\alpha = \mu_X H(0),$ a constant independent of $t$. $\checkmark$
Part (b): Autocorrelation depends only on $\tau$
$R_Y(t_1, t_2) = \iint h(\alpha_1)\,h(\alpha_2)\,R_X(t_1 - t_2 - \alpha_1 + \alpha_2)\,d\alpha_1\,d\alpha_2$. The argument $t_1 - t_2 - \alpha_1 + \alpha_2 = \tau - \alpha_1 + \alpha_2$ depends on $t_1$ and $t_2$ only through $\tau = t_1 - t_2$. $\checkmark$
Part (c): Derive the filtering relationship
Denoting $\tau = t_1 - t_2$: $R_Y(\tau) = \iint h(\alpha_1)\,h(\alpha_2)\,R_X(\tau - \alpha_1 + \alpha_2)\,d\alpha_1\,d\alpha_2.$
The inner integral is a convolution. Define $g(\tau) = (R_X * \tilde{h})(\tau)$ with $\tilde{h}(t) = h(-t)$; then the integral over $\alpha_2$ equals $g(\tau - \alpha_1)$.
The outer integral becomes: $R_Y(\tau) = \int h(\alpha_1)\,g(\tau - \alpha_1)\,d\alpha_1 = (h * R_X * \tilde{h})(\tau).$
Taking the Fourier transform: $S_Y(\omega) = H(\omega)\,S_X(\omega)\,H^*(\omega) = |H(\omega)|^2 S_X(\omega).$
Together with the Wiener-Khinchin theorem (the PSD is the Fourier transform of the autocorrelation), this establishes the filtering result: the output PSD is the input PSD scaled by the squared magnitude of the transfer function. LTI filtering therefore preserves the WSS property and provides the spectral input-output relationship used throughout communications system analysis.
ex-ch02-17
(Hard) The Gilbert-Elliott channel is a two-state Markov chain with states $G$ (Good) and $B$ (Bad). The transition matrix is $\mathbf{P} = \begin{pmatrix} 1 - p & p \\ q & 1 - q \end{pmatrix},$ where $p = P(G \to B)$ and $q = P(B \to G)$. Set $p = 0.1$ and $q = 0.3$.
(a) Find the stationary distribution $\boldsymbol{\pi} = (\pi_G, \pi_B)$.
(b) Verify the Chapman-Kolmogorov equation by computing the two-step transition matrix $\mathbf{P}^{(2)}$ directly and also as $\mathbf{P} \cdot \mathbf{P}$.
(c) Compute $\mathbf{P}^n$ for general $n$ using eigendecomposition and verify convergence to the stationary distribution.
The stationary distribution satisfies $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ and $\pi_G + \pi_B = 1$.
The Chapman-Kolmogorov equation states $P_{ij}^{(m + n)} = \sum_k P_{ik}^{(m)} P_{kj}^{(n)}$, which is matrix multiplication.
Eigendecompose $\mathbf{P}$. The eigenvalues of a $2 \times 2$ stochastic matrix of this form are $\lambda_1 = 1$ and $\lambda_2 = 1 - p - q$.
Stationary distribution
From $\boldsymbol{\pi} = \boldsymbol{\pi}\mathbf{P}$ with $\pi_G + \pi_B = 1$:
$\pi_G\, p = \pi_B\, q$,
$\pi_G = \frac{q}{p + q} = \frac{0.3}{0.4} = 0.75$, $\pi_B = \frac{p}{p + q} = \frac{0.1}{0.4} = 0.25$.
The channel is in the Good state $75\%$ of the time in steady state.
Chapman-Kolmogorov verification
Direct computation: $\mathbf{P}^2 = \begin{pmatrix} 0.9 & 0.1 \\ 0.3 & 0.7 \end{pmatrix}^2 = \begin{pmatrix} 0.84 & 0.16 \\ 0.48 & 0.52 \end{pmatrix}.$
The Chapman-Kolmogorov equation for $(G, G)$: $P_{GG}^{(2)} = P_{GG}P_{GG} + P_{GB}P_{BG} = 0.9 \cdot 0.9 + 0.1 \cdot 0.3 = 0.84$. $\checkmark$
Similarly for all other entries: matrix multiplication implements Chapman-Kolmogorov.
Eigendecomposition and $\mathbf{P}^n$
The eigenvalues of $\mathbf{P}$ are $\lambda_1 = 1$ and $\lambda_2 = 1 - p - q = 0.6$.
Right eigenvector for $\lambda_1 = 1$: $\mathbf{v}_1 = (1, 1)^T$ (all-ones, as for any stochastic matrix).
Right eigenvector for $\lambda_2$: from $(\mathbf{P} - \lambda_2\mathbf{I})\mathbf{v} = \mathbf{0}$, $q v_1 + p v_2 = 0$, giving $\mathbf{v}_2 = (p, -q)^T = (0.1, -0.3)^T$.
Thus $\mathbf{P}^n = \mathbf{V}\boldsymbol{\Lambda}^n\mathbf{V}^{-1}$ with $\boldsymbol{\Lambda} = \operatorname{diag}(1, \lambda_2)$.
Therefore: $\mathbf{P}^n = \frac{1}{p + q}\begin{pmatrix} q & p \\ q & p \end{pmatrix} + \frac{\lambda_2^n}{p + q}\begin{pmatrix} p & -p \\ -q & q \end{pmatrix}.$
Expanding with $p = 0.1$, $q = 0.3$: $\mathbf{P}^n = \begin{pmatrix} 0.75 & 0.25 \\ 0.75 & 0.25 \end{pmatrix} + 0.6^n\begin{pmatrix} 0.25 & -0.25 \\ -0.75 & 0.75 \end{pmatrix}.$
As $n \to \infty$, $\lambda_2^n = 0.6^n \to 0$ and every row converges to $\boldsymbol{\pi} = (0.75, 0.25)$.
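The Chapman-Kolmogorov check and the convergence of $\mathbf{P}^n$ can both be verified with plain matrix multiplication, assuming the illustrative values $p = 0.1$, $q = 0.3$:

```python
def matmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

p, q = 0.1, 0.3  # assumed transition probabilities
P = [[1 - p, p], [q, 1 - q]]

P2 = matmul(P, P)            # two-step matrix (Chapman-Kolmogorov)

Pn = P                       # repeated squaring would also work; plain powers suffice
for _ in range(50):
    Pn = matmul(Pn, P)       # Pn = P^51; rows approach the stationary distribution

pi = (q / (p + q), p / (p + q))   # (0.75, 0.25)
```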
ex-ch02-18
(Hard) Let $\mathbf{X}$ and $\mathbf{Y}$ be jointly Gaussian random vectors with means $\boldsymbol{\mu}_X$, $\boldsymbol{\mu}_Y$, covariances $\mathbf{C}_X$, $\mathbf{C}_Y$, and cross-covariance $\mathbf{C}_{XY} = E[(\mathbf{X} - \boldsymbol{\mu}_X)(\mathbf{Y} - \boldsymbol{\mu}_Y)^T]$.
Consider the linear MMSE (minimum mean-square error) estimator $\hat{\mathbf{X}} = \mathbf{A}\mathbf{Y} + \mathbf{b}$ that minimizes $E[\|\mathbf{X} - \hat{\mathbf{X}}\|^2]$.
(a) Derive the optimal $\mathbf{A}$ and $\mathbf{b}$.
(b) Show that the MMSE is $\operatorname{tr}\!\left(\mathbf{C}_X - \mathbf{C}_{XY}\mathbf{C}_Y^{-1}\mathbf{C}_{YX}\right)$.
(c) Specialize to scalar $X$ and $Y$ with means $\mu_X$, $\mu_Y$, variances $\sigma_X^2$, $\sigma_Y^2$, and correlation coefficient $\rho$. Compute the estimator and the MMSE.
Expand $E[\|\mathbf{X} - \mathbf{A}\mathbf{Y} - \mathbf{b}\|^2]$ and use the orthogonality principle: the optimal estimator makes the error orthogonal to the observation.
The orthogonality principle gives $E[(\mathbf{X} - \hat{\mathbf{X}})\mathbf{Y}^T] = \mathbf{0}$.
For jointly Gaussian vectors, the linear MMSE estimator equals the conditional mean $E[\mathbf{X} \mid \mathbf{Y}]$.
Derive optimal $\mathbf{b}$
Taking the expectation of $\hat{\mathbf{X}} = \mathbf{A}\mathbf{Y} + \mathbf{b}$ and requiring the estimator to be unbiased, $E[\hat{\mathbf{X}}] = \boldsymbol{\mu}_X$: $\mathbf{b} = \boldsymbol{\mu}_X - \mathbf{A}\boldsymbol{\mu}_Y.$
Substituting back: $\hat{\mathbf{X}} = \boldsymbol{\mu}_X + \mathbf{A}(\mathbf{Y} - \boldsymbol{\mu}_Y).$
Derive optimal $\mathbf{A}$ via orthogonality principle
The estimation error is $\mathbf{e} = \mathbf{X} - \hat{\mathbf{X}}$. The orthogonality principle requires $E[\mathbf{e}(\mathbf{Y} - \boldsymbol{\mu}_Y)^T] = \mathbf{0}$: $\mathbf{C}_{XY} - \mathbf{A}\mathbf{C}_Y = \mathbf{0} \;\Longrightarrow\; \mathbf{A} = \mathbf{C}_{XY}\mathbf{C}_Y^{-1}.$
The LMMSE estimator is therefore $\hat{\mathbf{X}} = \boldsymbol{\mu}_X + \mathbf{C}_{XY}\mathbf{C}_Y^{-1}(\mathbf{Y} - \boldsymbol{\mu}_Y).$
For jointly Gaussian $(\mathbf{X}, \mathbf{Y})$, this equals the conditional expectation $E[\mathbf{X} \mid \mathbf{Y}]$.
Derive the MMSE
The error covariance is $\mathbf{C}_e = E[\mathbf{e}\mathbf{e}^T] = \mathbf{C}_X - \mathbf{C}_{XY}\mathbf{C}_Y^{-1}\mathbf{C}_{YX}.$
This follows because $\mathbf{e} = (\mathbf{X} - \boldsymbol{\mu}_X) - \mathbf{A}(\mathbf{Y} - \boldsymbol{\mu}_Y)$, and by the orthogonality principle the cross terms vanish.
The MMSE is the total error variance: $\text{MMSE} = \operatorname{tr}(\mathbf{C}_e) = \operatorname{tr}\!\left(\mathbf{C}_X - \mathbf{C}_{XY}\mathbf{C}_Y^{-1}\mathbf{C}_{YX}\right).$
Scalar specialization
With scalar quantities, $C_{XY} = \rho\sigma_X\sigma_Y$ and $C_Y = \sigma_Y^2$:
The LMMSE estimator: $\hat{X} = \mu_X + \rho\frac{\sigma_X}{\sigma_Y}(Y - \mu_Y).$
The MMSE: $\text{MMSE} = \sigma_X^2 - \frac{(\rho\sigma_X\sigma_Y)^2}{\sigma_Y^2} = \sigma_X^2(1 - \rho^2).$
Equivalently, $\text{MMSE}/\sigma_X^2 = 1 - \rho^2$.
The observation reduces the estimation variance from $\sigma_X^2$ to $\sigma_X^2(1 - \rho^2)$, a $100\rho^2\%$ reduction. The fraction of variance explained equals $\rho^2$, confirming the operational meaning of the correlation coefficient.
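The scalar LMMSE result can be checked by simulation: construct jointly Gaussian $(X, Y)$ with a prescribed correlation and measure the empirical squared error of $\hat{X}$. The values $\sigma_X = 2$, $\sigma_Y = 1$, $\rho = 0.8$ are illustrative assumptions:

```python
import math
import random

random.seed(7)
mu_x, mu_y = 0.0, 0.0
sigma_x, sigma_y, rho = 2.0, 1.0, 0.8   # assumed illustrative values

a = rho * sigma_x / sigma_y                  # LMMSE gain: 1.6
mmse_theory = sigma_x ** 2 * (1 - rho ** 2)  # 4 * 0.36 = 1.44

# Draw (X, Y) jointly Gaussian via X = a Y + W with W ~ N(0, mmse_theory),
# which gives Var(X) = sigma_x^2 and Corr(X, Y) = rho by construction.
n, se = 200_000, 0.0
for _ in range(n):
    y = random.gauss(mu_y, sigma_y)
    x = mu_x + a * (y - mu_y) + random.gauss(0, math.sqrt(mmse_theory))
    x_hat = mu_x + a * (y - mu_y)            # LMMSE estimate
    se += (x - x_hat) ** 2
mmse_mc = se / n                             # empirical MSE, ~1.44
```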
ex-ch02-19
(Challenge) Consider a MIMO channel $\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$, where $\mathbf{H}$ is a random channel matrix with $M$ transmit and $N$ receive antennas, $\mathbf{x}$ is the transmitted signal with covariance $\mathbf{Q} = E[\mathbf{x}\mathbf{x}^H]$ satisfying $\operatorname{tr}(\mathbf{Q}) \le P$, and $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I})$.
Using the probability and linear algebra tools from Chapters 1 and 2:
(a) Derive the mutual information $I(\mathbf{x}; \mathbf{y}) = \log_2\det\!\left(\mathbf{I} + \frac{1}{\sigma^2}\mathbf{H}\mathbf{Q}\mathbf{H}^H\right)$ for a fixed channel realization, expressing it in terms of the eigenvalues of $\mathbf{H}\mathbf{Q}\mathbf{H}^H$.
(b) When the transmitter has no CSI, show that the optimal input covariance is $\mathbf{Q} = \frac{P}{M}\mathbf{I}$ (uniform power allocation), and derive the resulting capacity formula $C = \sum_i \log_2\!\left(1 + \frac{P}{M\sigma^2}\lambda_i\right)$, where $\lambda_i$ are the eigenvalues of $\mathbf{H}\mathbf{H}^H$.
(c) For a $2 \times 2$ channel with $\mathbf{H} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, $P = 10$, and $\sigma^2 = 1$, compute the capacity numerically.
Use the fact that for jointly Gaussian vectors, $I(\mathbf{x}; \mathbf{y}) = h(\mathbf{y}) - h(\mathbf{y} \mid \mathbf{x})$.
The differential entropy of a complex Gaussian vector $\mathbf{z} \sim \mathcal{CN}(\mathbf{0}, \mathbf{C})$ is $h(\mathbf{z}) = \log_2\det(\pi e\,\mathbf{C})$.
Connect to Chapter 1: use the SVD of $\mathbf{H}$ to diagonalize the channel into parallel sub-channels.
Mutual information for fixed $\mathbf{H}$
Given $\mathbf{H}$, the output is Gaussian with covariance $\mathbf{C}_{\mathbf{y}} = \mathbf{H}\mathbf{Q}\mathbf{H}^H + \sigma^2\mathbf{I}.$
The conditional entropy is $h(\mathbf{y} \mid \mathbf{x}) = h(\mathbf{n}) = \log_2\det(\pi e\,\sigma^2\mathbf{I})$.
The mutual information is $I(\mathbf{x}; \mathbf{y}) = \log_2\frac{\det(\mathbf{H}\mathbf{Q}\mathbf{H}^H + \sigma^2\mathbf{I})}{\det(\sigma^2\mathbf{I})} = \log_2\det\!\left(\mathbf{I} + \frac{1}{\sigma^2}\mathbf{H}\mathbf{Q}\mathbf{H}^H\right).$
Let $\mu_i$ be the eigenvalues of $\mathbf{H}\mathbf{Q}\mathbf{H}^H$. Then $I(\mathbf{x}; \mathbf{y}) = \sum_i \log_2\!\left(1 + \frac{\mu_i}{\sigma^2}\right).$
Optimal input without CSI
Without CSI, the transmitter cannot adapt $\mathbf{Q}$ to $\mathbf{H}$. The ergodic capacity is $C = \max_{\mathbf{Q}:\,\operatorname{tr}(\mathbf{Q}) \le P} E_{\mathbf{H}}\!\left[\log_2\det\!\left(\mathbf{I} + \tfrac{1}{\sigma^2}\mathbf{H}\mathbf{Q}\mathbf{H}^H\right)\right].$
For i.i.d. Rayleigh fading where $\mathbf{H}$ has i.i.d. entries, the distribution of $\mathbf{H}$ is unitarily invariant: $\mathbf{H}$ and $\mathbf{U}\mathbf{H}\mathbf{V}^H$ have the same distribution for any unitary $\mathbf{U}$, $\mathbf{V}$.
By a Hadamard inequality and Jensen's inequality argument, $\mathbf{Q} = \frac{P}{M}\mathbf{I}$ is optimal. Then $\frac{1}{\sigma^2}\mathbf{H}\mathbf{Q}\mathbf{H}^H = \frac{P}{M\sigma^2}\mathbf{H}\mathbf{H}^H$, whose eigenvalues are $\frac{P}{M\sigma^2}\lambda_i$, yielding $C = \sum_i \log_2\!\left(1 + \frac{P}{M\sigma^2}\lambda_i\right).$
SVD interpretation (connection to Chapter 1)
Let $\mathbf{H} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^H$ be the SVD. Define $\tilde{\mathbf{y}} = \mathbf{U}^H\mathbf{y}$ and $\tilde{\mathbf{x}} = \mathbf{V}^H\mathbf{x}$. Then $\tilde{\mathbf{y}} = \boldsymbol{\Sigma}\tilde{\mathbf{x}} + \tilde{\mathbf{n}}$, where $\tilde{\mathbf{n}} = \mathbf{U}^H\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma^2\mathbf{I})$ (unitary invariance of the Gaussian distribution).
This diagonalizes the MIMO channel into parallel scalar sub-channels: $\tilde{y}_i = \sigma_i\tilde{x}_i + \tilde{n}_i$, each with SNR $\frac{P\sigma_i^2}{M\sigma^2}$.
The eigenvalues of $\mathbf{H}\mathbf{H}^H$ are $\lambda_i = \sigma_i^2$, the squared singular values of $\mathbf{H}$.
Numerical computation for the $2 \times 2$ example
Since $\mathbf{H}$ is symmetric with eigenvalues $3$ and $1$, the eigenvalues of $\mathbf{H}\mathbf{H}^T$ are $\lambda_1 = 9$ and $\lambda_2 = 1$. With $\frac{P}{M\sigma^2} = \frac{10}{2} = 5$: $C = \log_2(1 + 5 \cdot 9) + \log_2(1 + 5 \cdot 1) = \log_2 46 + \log_2 6 \approx 5.52 + 2.58 = 8.11\ \text{bits/s/Hz}.$ For comparison, a SISO link using a single transmit antenna (channel $\mathbf{h} = (2, 1)^T$, $\|\mathbf{h}\|^2 = 5$) with the full power achieves $C_{\text{SISO}} = \log_2(1 + 10 \cdot 5) = \log_2(51) \approx 5.67$ bits/s/Hz, so the $2 \times 2$ channel provides roughly a $43\%$ capacity gain. $\square$
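The capacity arithmetic is a one-liner per term. This sketch assumes the reconstructed example $\mathbf{H} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, $P = 10$, $\sigma^2 = 1$:

```python
import math

# Uniform power allocation: P / M = 10 / 2 = 5 per stream (sigma^2 = 1)
snr_per_stream = 5.0
eigs = [9.0, 1.0]   # eigenvalues of H H^T (H is symmetric with eigenvalues 3, 1)

c_mimo = sum(math.log2(1 + snr_per_stream * lam) for lam in eigs)  # ~8.11 bits/s/Hz

# SISO baseline: one transmit antenna (first column of H), full power P = 10
h_sq = 2.0 ** 2 + 1.0 ** 2                # ||h||^2 = 5
c_siso = math.log2(1 + 10.0 * h_sq)       # log2(51) ~ 5.67 bits/s/Hz

gain = c_mimo / c_siso - 1.0              # ~0.43, i.e., a 43% capacity gain
```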
ex-ch02-20
(Challenge) Consider a stationary ergodic random process $e(t)$ that represents the indicator of bit error at time $t$ for a communication link: $e(t) = 1$ if the bit transmitted at time $t$ is received in error, and $e(t) = 0$ otherwise.
The ensemble-average BER is $\bar{p} = E[e(t)]$ and the time-averaged BER measured over an observation window $T$ is $\hat{p}_T = \frac{1}{T}\int_0^T e(t)\,dt.$
(a) State the conditions under which $\hat{p}_T \to \bar{p}$ as $T \to \infty$ (in the mean-square sense). Relate this to the autocovariance $C_e(\tau)$.
(b) For a channel whose error process has autocovariance $C_e(\tau) = \sigma_e^2\, e^{-|\tau|/\tau_c}$ (with correlation time $\tau_c$), derive the variance of $\hat{p}_T$ as a function of $T$ and $\tau_c$.
(c) Determine how large $T$ must be (in terms of $\tau_c$) to ensure $\operatorname{Var}(\hat{p}_T) \le \varepsilon^2\bar{p}^2$ for a prescribed relative accuracy $\varepsilon$.
(d) Discuss when ergodicity fails and the time-averaged BER does not converge to the ensemble average. Give a concrete channel example.
Mean-square ergodicity requires $\operatorname{Var}(\hat{p}_T) \to 0$ as $T \to \infty$. A sufficient condition involves the integrability of $C_e(\tau)$.
Use the formula $\operatorname{Var}(\hat{p}_T) = \frac{2}{T}\int_0^T\left(1 - \frac{\tau}{T}\right) C_e(\tau)\,d\tau$.
For exponential $C_e(\tau)$, evaluate the integral in closed form. For large $T$, the leading term is $\frac{2\sigma_e^2\tau_c}{T}$.
Conditions for mean-square ergodicity
Define the autocovariance $C_e(\tau) = E[(e(t) - \bar{p})(e(t + \tau) - \bar{p})]$. The variance of the time average is $\operatorname{Var}(\hat{p}_T) = \frac{1}{T^2}\int_0^T\!\!\int_0^T C_e(t_1 - t_2)\,dt_1\,dt_2.$
Substituting $\tau = t_1 - t_2$ and performing one integration: $\operatorname{Var}(\hat{p}_T) = \frac{2}{T}\int_0^T\left(1 - \frac{\tau}{T}\right) C_e(\tau)\,d\tau.$
Mean-square ergodicity holds if and only if $\operatorname{Var}(\hat{p}_T) \to 0$ as $T \to \infty$.
A sufficient condition is that $C_e$ is integrable: $\int_0^\infty |C_e(\tau)|\,d\tau < \infty.$
When this holds, $\operatorname{Var}(\hat{p}_T) \le \frac{2}{T}\int_0^\infty |C_e(\tau)|\,d\tau \to 0$, and thus $\hat{p}_T \to \bar{p}$ in mean square.
Variance for exponential autocovariance
Given $C_e(\tau) = \sigma_e^2\, e^{-\tau/\tau_c}$ for $\tau \ge 0$, we compute: $\operatorname{Var}(\hat{p}_T) = \frac{2\sigma_e^2}{T}\int_0^T\left(1 - \frac{\tau}{T}\right) e^{-\tau/\tau_c}\,d\tau.$
Evaluating the two terms: $\int_0^T e^{-\tau/\tau_c}\,d\tau = \tau_c\left(1 - e^{-T/\tau_c}\right), \qquad \int_0^T \tau\, e^{-\tau/\tau_c}\,d\tau = \tau_c^2\left[1 - e^{-T/\tau_c}\left(1 + \frac{T}{\tau_c}\right)\right].$
Combining: $\operatorname{Var}(\hat{p}_T) = \frac{2\sigma_e^2}{T}\left\{\tau_c\left(1 - e^{-T/\tau_c}\right) - \frac{\tau_c^2}{T}\left[1 - e^{-T/\tau_c}\left(1 + \frac{T}{\tau_c}\right)\right]\right\}.$
Simplifying: $\operatorname{Var}(\hat{p}_T) = \frac{2\sigma_e^2\tau_c}{T}\left[1 - \frac{\tau_c}{T}\left(1 - e^{-T/\tau_c}\right)\right].$
Required observation window
For $T \gg \tau_c$, $e^{-T/\tau_c} \approx 0$ and $\tau_c/T \ll 1$, so $\operatorname{Var}(\hat{p}_T) \approx \frac{2\sigma_e^2\tau_c}{T}.$
Setting $\operatorname{Var}(\hat{p}_T) \le \varepsilon^2\bar{p}^2$: $T \ge \frac{2\sigma_e^2\tau_c}{\varepsilon^2\bar{p}^2}.$
For example, with $\bar{p} = 10^{-3}$, $\tau_c = 1$ ms, and $\varepsilon = 0.1$ (i.e., $10\%$ relative accuracy), $\sigma_e^2 = \bar{p}(1 - \bar{p}) \approx 10^{-3}$, so $T \ge \frac{2 \cdot 10^{-3} \cdot 10^{-3}}{10^{-2} \cdot 10^{-6}} = 200\ \text{s}.$
The effective number of "independent samples" is approximately $T/(2\tau_c)$. Correlated errors require proportionally longer observation windows.
When ergodicity fails
Ergodicity fails when $C_e(\tau) \not\to 0$ as $\tau \to \infty$, so the integral $\int_0^\infty |C_e(\tau)|\,d\tau$ diverges and $\operatorname{Var}(\hat{p}_T)$ does not vanish.
Concrete example: a composite fading channel where the large-scale shadowing component $S$ is constant over all time (drawn once from a log-normal distribution). The BER conditional on $S$ is $p(S)$, and the error process has $C_e(\tau) \to \operatorname{Var}(p(S)) > 0$ as $\tau \to \infty$.
In this case, $\hat{p}_T \to p(S)$ for each realization of $S$. The time average converges to a random limit that depends on the specific shadowing realization, not to the ensemble average $\bar{p} = E[p(S)]$.
Practical consequence: In non-ergodic channels, one must average BER measurements over multiple independent channel realizations (e.g., different locations or frequency bands), not just over time. This is the distinction between ergodic capacity and outage capacity in wireless system design.
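The closed-form variance expression can be cross-checked against a direct numerical evaluation of the averaging integral. This sketch assumes the illustrative numbers used in part (c) ($\sigma_e^2 \approx \bar{p} = 10^{-3}$, $\tau_c = 1$ ms, $T = 200$ s):

```python
import math

sigma2, tau_c = 1e-3, 1e-3   # sigma_e^2 ~ p(1 - p) for p = 1e-3; tau_c = 1 ms

def var_exact(T):
    """Closed form: (2 s^2 tau_c / T) [1 - (tau_c/T)(1 - e^{-T/tau_c})]."""
    x = T / tau_c
    return 2 * sigma2 * tau_c / T * (1 - (1 - math.exp(-x)) / x)

def var_numeric(T, n=200_000):
    """Midpoint rule for (2/T) * integral_0^T (1 - tau/T) C_e(tau) dtau."""
    hi = min(T, 50 * tau_c)  # integrand is negligible beyond ~50 correlation times
    h = hi / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += (1 - tau / T) * sigma2 * math.exp(-tau / tau_c)
    return 2.0 / T * total * h

T = 200.0                            # seconds, from the accuracy requirement
v1, v2 = var_exact(T), var_numeric(T)
rel_acc = math.sqrt(v1) / 1e-3       # relative accuracy ~ epsilon = 0.1
```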