Exercises
ex-ch19-01
Easy. State the point-to-point source–channel separation theorem for a DMS with entropy $H(V)$ over a DMC with capacity $C$ at bandwidth ratio $\kappa$ (channel uses per source symbol). What is the necessary and sufficient condition for reliable (lossless) transmission?
The condition involves comparing $H(V)$ to $\kappa C$.
At $\kappa = 1$, there is one channel use per source symbol.
Statement
The source is transmissible losslessly over the DMC if and only if $H(V) < C$ (at $\kappa = 1$). Separation of source and channel coding is optimal: compress to $H(V)$ bits per source symbol, then channel-code at a rate between $H(V)$ and $C$.
General bandwidth ratio
For general $\kappa$, the condition becomes $H(V) < \kappa C$. The bandwidth ratio provides $\kappa$ channel uses per source symbol, so the effective capacity is $\kappa C$ bits per source symbol.
ex-ch19-02
Easy. A uniform binary source is transmitted over a BSC with crossover probability $\varepsilon \in (0, 1/2)$ at bandwidth ratio $\kappa = 1$. Is reliable transmission possible under separation?
Compute $H(V)$ and $C = 1 - h_b(\varepsilon)$.
Compare $H(V)$ to $\kappa C$.
Source entropy
$H(V) = 1$ bit for a uniform binary source.
Channel capacity
$C = 1 - h_b(\varepsilon)$ bits/use, which is strictly less than 1 for $\varepsilon \in (0, 1/2)$.
Comparison
Since $H(V) = 1 > C = 1 - h_b(\varepsilon)$, reliable lossless transmission is not possible at $\kappa = 1$. We would need $\kappa \ge 1/(1 - h_b(\varepsilon))$ channel uses per source symbol.
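As a quick numerical check (a sketch; the crossover probability $\varepsilon = 0.1$ below is illustrative, not a value given in the exercise):

```python
import math

def h_b(p: float) -> float:
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def min_bandwidth_ratio(eps: float) -> float:
    """Minimum channel uses per source symbol for a uniform binary
    source (H(V) = 1 bit) over a BSC(eps): kappa >= 1 / C."""
    capacity = 1.0 - h_b(eps)  # BSC capacity in bits per use
    return 1.0 / capacity

# Example: eps = 0.1 gives C ~ 0.531 bits/use, so kappa_min ~ 1.88.
print(round(min_bandwidth_ratio(0.1), 2))  # → 1.88
```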
ex-ch19-03
Easy. Two independent sources with entropies $H(U_1)$ bits and $H(U_2)$ bits are transmitted over a MAC with capacity region $\mathcal{C}$. Is separate source and channel coding sufficient?
For independent sources, Slepian–Wolf rates equal the marginal entropies.
Check if $(H(U_1), H(U_2))$ lies inside the MAC capacity region.
Slepian–Wolf rates
Since the sources are independent, the Slepian–Wolf region requires $R_1 \ge H(U_1)$ and $R_2 \ge H(U_2)$, with $R_1 + R_2 \ge H(U_1) + H(U_2) = H(U_1, U_2)$.
MAC region check
The point $(H(U_1), H(U_2))$ must satisfy $H(U_1) \le I(X_1; Y \mid X_2)$, $H(U_2) \le I(X_2; Y \mid X_1)$, and $H(U_1) + H(U_2) \le I(X_1, X_2; Y)$ for some admissible input distribution. When these hold, $(H(U_1), H(U_2))$ is inside the MAC capacity region, and separate coding is sufficient.
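The membership check is mechanical; a minimal sketch with hypothetical values (the exercise's actual entropies and mutual informations are not reproduced here):

```python
def separation_sufficient(h1, h2, i1, i2, i_sum):
    """True iff (h1, h2) lies in the MAC capacity region
    {R1 <= I(X1;Y|X2), R2 <= I(X2;Y|X1), R1+R2 <= I(X1,X2;Y)}."""
    return h1 <= i1 and h2 <= i2 and h1 + h2 <= i_sum

# Hypothetical values: H(U1)=0.5, H(U2)=0.7 bits against a MAC region
# with I(X1;Y|X2) = I(X2;Y|X1) = 1.0 and I(X1,X2;Y) = 1.5 bits.
print(separation_sufficient(0.5, 0.7, 1.0, 1.0, 1.5))  # → True
```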
ex-ch19-04
Easy. For a Gaussian source over a Gaussian channel with signal-to-noise ratio $\text{SNR}$ (linear, not dB) and bandwidth ratio $\kappa = 1$, compute the minimum achievable distortion under (a) optimal coding, (b) uncoded transmission.
The rate–distortion function is $R(D) = \frac{1}{2}\log_2(\sigma^2/D)$.
Channel capacity is $C = \frac{1}{2}\log_2(1 + \text{SNR})$.
At $\kappa = 1$ for Gaussians, both methods achieve the same distortion.
Optimal (separate) coding
Setting $R(D) = C$:
$D_{\text{opt}} = \sigma^2\,2^{-2C} = \frac{\sigma^2}{1 + \text{SNR}}.$
Uncoded transmission
Transmitting $X = \sqrt{P/\sigma^2}\,V$ and applying MMSE estimation at the receiver gives
$D_{\text{unc}} = \frac{\sigma^2}{1 + \text{SNR}}.$
Comparison
Both achieve $D = \sigma^2/(1 + \text{SNR})$. At $\kappa = 1$ for a Gaussian source over a Gaussian channel, uncoded transmission is optimal.
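A sketch verifying the coincidence numerically (the SNR value 10 is illustrative, since the exercise's value is not shown):

```python
def d_separate(sigma2: float, snr: float, kappa: float = 1.0) -> float:
    """Distortion under separation: R(D) = kappa * C gives
    D = sigma^2 * (1 + SNR)^(-kappa)."""
    return sigma2 * (1.0 + snr) ** (-kappa)

def d_uncoded(sigma2: float, snr: float) -> float:
    """MMSE of uncoded linear transmission at kappa = 1."""
    return sigma2 / (1.0 + snr)

# At kappa = 1 the two coincide exactly; here at an illustrative SNR of 10.
print(d_separate(1.0, 10.0), d_uncoded(1.0, 10.0))
```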
ex-ch19-05
Easy. Explain in one paragraph why Shannon's separation theorem does not extend to arbitrary multi-terminal networks. Give one specific example.
Think about what information is lost when you separate source coding from channel coding.
The key insight involves source correlation and encoder cooperation.
Explanation
Shannon's separation theorem relies on the fact that in the point-to-point setting, the source and channel are connected by a single rate constraint, and optimizing each side independently achieves the global optimum. In multi-terminal networks, the source correlation provides a form of implicit cooperation between distributed encoders that separate coding discards. When each encoder compresses its source independently (even using distributed compression like Slepian–Wolf), the compressed bitstreams lose the fine-grained correlation structure that a joint scheme could exploit through the channel.
Example: Two correlated binary sources over a binary adder MAC. The Slepian–Wolf sum rate exceeds the MAC sum capacity, so separation fails. But transmitting raw source bits directly through the MAC can succeed because the receiver effectively observes a function of the sources that, combined with knowledge of their correlation, suffices for reconstruction.
ex-ch19-06
Medium. Prove the converse of the point-to-point source–channel separation theorem: if a sequence of joint source–channel codes (mapping $k$ source symbols to $n = \kappa k$ channel uses) achieves $P_e^{(k)} = \Pr\{\hat V^k \ne V^k\} \to 0$ for a DMS over a DMC with capacity $C$, then $H(V) \le \kappa C$.
Start with Fano's inequality to bound $H(V^k \mid \hat V^k)$.
Use the chain rule and the data processing inequality for the Markov chain $V^k \to X^n \to Y^n \to \hat V^k$.
Bound $I(X^n; Y^n) \le nC$ using the memoryless channel property.
Apply Fano's inequality
Since $P_e^{(k)} \to 0$, Fano's inequality gives $H(V^k \mid \hat V^k) \le 1 + k\,P_e^{(k)}\log|\mathcal{V}| = k\varepsilon_k$, where $\varepsilon_k \to 0$.
Chain of inequalities
$kH(V) = H(V^k) = I(V^k; \hat V^k) + H(V^k \mid \hat V^k) \le I(V^k; \hat V^k) + k\varepsilon_k.$
Data processing
Since the encoder maps $V^k$ to $X^n$ deterministically, the Markov chain $V^k \to X^n \to Y^n \to \hat V^k$ holds. By data processing: $I(V^k; \hat V^k) \le I(X^n; Y^n)$.
Single-letter bound
For the memoryless channel: $I(X^n; Y^n) \le \sum_{i=1}^{n} I(X_i; Y_i) \le nC$, where the first inequality uses the memoryless property and the second uses the definition of capacity.
Conclusion
Combining: $kH(V) \le nC + k\varepsilon_k$. Dividing by $k$ and taking $k \to \infty$ (with $n = \kappa k$): $H(V) \le \kappa C$.
ex-ch19-07
Medium. Consider two correlated binary sources $(U_1, U_2)$ with joint distribution $p(0,0) = p(1,1) = (1-p)/2$ and $p(0,1) = p(1,0) = p/2$ for $p \in (0, 1/2)$.
(a) Compute $H(U_1)$, $H(U_2 \mid U_1)$, and $H(U_1, U_2)$.
(b) These sources are transmitted over a Gaussian MAC with a given SNR (in dB). Determine whether separation is sufficient.
Both marginals are $\mathrm{Bern}(1/2)$, so $H(U_1) = H(U_2) = 1$ bit.
The conditional entropy is $H(U_2 \mid U_1) = h_b(p)$.
The Gaussian MAC sum capacity with equal powers is $C_{\text{sum}} = \frac{1}{2}\log_2(1 + 2\,\text{SNR})$.
Part (a): Entropies
The marginals are uniform: $H(U_1) = H(U_2) = 1$ bit.
$U_2$ given $U_1$ is a BSC($p$), so $H(U_2 \mid U_1) = h_b(p)$.
$H(U_1, U_2) = H(U_1) + H(U_2 \mid U_1) = 1 + h_b(p)$.
Part (b): Separation check
The Slepian–Wolf sum rate is $H(U_1, U_2) = 1 + h_b(p)$.
The Gaussian MAC sum capacity at the given SNR:
$C_{\text{sum}} = \tfrac{1}{2}\log_2(1 + 2\,\text{SNR}).$
Separation requires $1 + h_b(p) \le C_{\text{sum}}$, i.e., $h_b(p) \le C_{\text{sum}} - 1$, which gives a threshold $p^*$ with $h_b(p^*) = C_{\text{sum}} - 1$ (solved numerically). For $p > p^*$ (taking $p \le 1/2$), the Slepian–Wolf sum rate exceeds the MAC sum capacity, and the sufficient condition for separation fails.
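The threshold $p^*$ has no closed form but is easy to find numerically. A sketch assuming the solution's quantities ($1 + h_b(p)$ versus $\frac{1}{2}\log_2(1 + 2\,\text{SNR})$) and an illustrative SNR of 5 dB, since the exercise's value is not shown:

```python
import math

def h_b(p: float) -> float:
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mac_sum_capacity(snr: float) -> float:
    """Gaussian MAC sum capacity with equal powers: 0.5 * log2(1 + 2*SNR)."""
    return 0.5 * math.log2(1.0 + 2.0 * snr)

def threshold_p(snr: float, tol: float = 1e-10) -> float:
    """Largest p in [0, 0.5] with 1 + h_b(p) <= C_sum, found by bisection
    (h_b is increasing on (0, 0.5))."""
    target = mac_sum_capacity(snr) - 1.0
    if target >= 1.0:
        return 0.5  # separation works for every p
    if target <= 0.0:
        return 0.0  # separation never works
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h_b(mid) <= target:
            lo = mid
        else:
            hi = mid
    return lo

# Illustrative SNR of 5 dB (linear value 10^0.5 ~ 3.16):
p_star = threshold_p(10 ** (5 / 10))
print(round(p_star, 3))  # → 0.09
```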
ex-ch19-08
Medium. A Gaussian source is transmitted over a Gaussian channel with signal-to-noise ratio $\text{SNR}$ (linear) at bandwidth ratio $\kappa = 2$.
(a) Compute the optimal distortion under separate coding.
(b) Compute the distortion under uncoded repetition coding: (same symbol transmitted twice).
(c) Which is better?
For separate coding: $D_{\text{opt}} = \sigma^2(1 + \text{SNR})^{-2}$.
For uncoded repetition, the receiver sees two noisy copies. Use MRC.
Part (a): Separate coding
$D_{\text{opt}} = \sigma^2\,2^{-2\kappa C} = \frac{\sigma^2}{(1 + \text{SNR})^2}.$
Part (b): Uncoded repetition
Each channel use transmits $X = \sqrt{P/\sigma^2}\,V$ (power $P$ each, total power $2P$ over 2 uses). The receiver uses MRC on the two observations $Y_1 = X + Z_1$ and $Y_2 = X + Z_2$.
The effective SNR after MRC is $2\,\text{SNR}$. The MMSE distortion is
$D_{\text{rep}} = \frac{\sigma^2}{1 + 2\,\text{SNR}}.$
Part (c): Comparison
$\frac{D_{\text{rep}}}{D_{\text{opt}}} = \frac{(1 + \text{SNR})^2}{1 + 2\,\text{SNR}} > 1$ for every $\text{SNR} > 0$, and the gap widens as the SNR grows, so separate coding achieves strictly lower distortion. The uncoded repetition scheme wastes the extra bandwidth by simply repeating the same sample, while separate coding uses the extra channel uses to transmit additional coded bits.
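A sketch tabulating the ratio $D_{\text{rep}}/D_{\text{opt}} = (1+\text{SNR})^2/(1+2\,\text{SNR})$ at a few illustrative SNRs:

```python
def d_opt(snr: float, sigma2: float = 1.0) -> float:
    """Separate coding at kappa = 2: D = sigma^2 / (1 + SNR)^2."""
    return sigma2 / (1.0 + snr) ** 2

def d_rep(snr: float, sigma2: float = 1.0) -> float:
    """Uncoded repetition over 2 uses with MRC: effective SNR doubles."""
    return sigma2 / (1.0 + 2.0 * snr)

# Ratio grows with SNR: ~1.33 at SNR=1, ~5.76 at SNR=10, ~50.75 at SNR=100.
for snr in (1.0, 10.0, 100.0):
    print(snr, round(d_rep(snr) / d_opt(snr), 2))
```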
ex-ch19-09
Medium. Prove that for independent sources over any MAC, separate source and channel coding is optimal (i.e., joint coding provides no advantage).
When sources are independent, each encoder's message is also independent.
Use the converse argument: bound $H(U_i)$ using Fano's inequality.
The independence means $H(U_1, U_2) = H(U_1) + H(U_2)$, so there is no correlation to exploit.
Setup
Let the encoders map $U_i^k \mapsto X_i^n$ for $i = 1, 2$, and let the decoder produce $(\hat U_1^k, \hat U_2^k)$ from $Y^n$ with $P_e^{(k)} \to 0$.
Rate bounds
By Fano's inequality: $H(U_i^k \mid \hat U_i^k) \le k\varepsilon_k$ for each $i$, with $\varepsilon_k \to 0$.
For the sum rate:
$k\big(H(U_1) + H(U_2)\big) = H(U_1^k, U_2^k) \le I(X_1^n, X_2^n; Y^n) + 2k\varepsilon_k \le n\,C_{\text{sum}} + 2k\varepsilon_k,$
where $C_{\text{sum}} = \max_{p(x_1)p(x_2)} I(X_1, X_2; Y)$ is the MAC sum capacity.
Individual bounds
Similarly, for each individual rate: $k\,H(U_i) \le I(X_i^n; Y^n \mid X_j^n) + k\varepsilon_k \le n \max I(X_i; Y \mid X_j) + k\varepsilon_k$ for $j \ne i$.
These individual and sum bounds say exactly that the Slepian–Wolf rates for independent sources (which equal the marginal entropies) must fit inside the MAC capacity region. Since the sources are independent, there is no correlation that joint coding could exploit beyond what separate Slepian–Wolf compression and MAC coding achieve.
ex-ch19-10
Medium. Explain the concept of hybrid digital–analog coding and describe a scenario where it outperforms both pure digital (separate) and pure analog (uncoded) transmission.
Hybrid coding allocates part of the channel resource to uncoded transmission and part to coded transmission.
Consider broadcasting a Gaussian source to receivers with different channel qualities.
Hybrid coding concept
Hybrid digital–analog coding splits the source into two components: a digital part (compressed and channel-coded) and an analog part (transmitted uncoded via a linear mapping). The channel input is $X = \alpha\,S(V) + \beta\,X_d$, where $S(\cdot)$ is the analog (linear) mapping, $X_d$ is the channel codeword carrying the digital part, and $\alpha, \beta$ control the power split.
Scenario where hybrid wins
Broadcasting a Gaussian source to two receivers:
- Receiver 1 has low SNR (weak channel)
- Receiver 2 has high SNR (strong channel)
Pure digital (separate): Must choose a single operating point. If designed for receiver 2 (high rate), receiver 1 cannot decode. If designed for receiver 1 (low rate), receiver 2's potential is wasted.
Pure analog (uncoded): each receiver applies MMSE estimation to the received signal, achieving $D_i = \sigma^2/(1 + \text{SNR}_i)$. This gracefully adapts to each receiver's SNR, but it is in general suboptimal for either.
Hybrid coding: Transmit a base digital layer decodable by receiver 1 (achieving close to the rate-distortion bound at ), plus an analog refinement that receiver 2 can exploit to achieve much lower distortion. The analog component provides graceful degradation for intermediate receivers. Hybrid achieves near-optimal distortion for both receivers simultaneously — something neither pure approach can do.
ex-ch19-11
Medium. Consider the lossy version of the separation theorem. A Gaussian source is transmitted over an AWGN channel with $\text{SNR} = 20$ dB at bandwidth ratio $\kappa = 0.5$ (two source symbols per channel use).
(a) What is the minimum achievable distortion under separation?
(b) If we could increase $\kappa$ to 1 (one channel use per source symbol), by how many dB does the distortion improve?
$R(D) = \frac{1}{2}\log_2(\sigma^2/D)$ for Gaussian sources.
Set $R(D) = \kappa C$ and solve for $D$.
Part (a): $\kappa = 0.5$
$C = \frac{1}{2}\log_2(1 + \text{SNR}) = \frac{1}{2}\log_2(101) \approx 3.33$ bits/use.
Setting $R(D) = \kappa C$:
$D_{0.5} = \sigma^2\,2^{-2\kappa C} = \sigma^2(1 + \text{SNR})^{-0.5} = \frac{\sigma^2}{\sqrt{101}} \approx 0.0995\,\sigma^2.$
Part (b): $\kappa = 1$
$D_1 = \sigma^2(1 + \text{SNR})^{-1} = \sigma^2/101$, so $10\log_{10}(D_{0.5}/D_1) = 10\log_{10}(\sqrt{101}) = 10\log_{10}(10.05) \approx 10$ dB. Doubling the bandwidth ratio from 0.5 to 1 reduces distortion by about 10 dB at 20 dB SNR.
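A sketch of the computation in part (b):

```python
import math

def distortion(snr_db: float, kappa: float, sigma2: float = 1.0) -> float:
    """Separation-optimal distortion D = sigma^2 * (1 + SNR)^(-kappa)."""
    snr = 10.0 ** (snr_db / 10.0)
    return sigma2 * (1.0 + snr) ** (-kappa)

d_half = distortion(20.0, 0.5)  # sigma^2 / sqrt(101)
d_one = distortion(20.0, 1.0)   # sigma^2 / 101
gain_db = 10.0 * math.log10(d_half / d_one)
print(round(gain_db, 2))  # → 10.02
```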
ex-ch19-12
Hard. Prove that for a degraded broadcast channel $X \to Y_1 \to Y_2$ with degraded side information $U \to S_2 \to S_1$, the minimum achievable distortion pair $(D_1, D_2)$ is characterized by the existence of rates $(R_1, R_2)$ such that:
- $R_i \ge R_{U \mid S_i}(D_i)$ for $i = 1, 2$
- $(R_1, R_2)$ is in the degraded BC capacity region
(Hint: achievability uses successive refinement + superposition coding.)
Start by showing that the degraded side information structure makes successive refinement optimal.
Map the base layer of successive refinement to the cloud center of superposition coding.
For the converse, use Fano's inequality at each decoder and the degradedness to chain the bounds.
Achievability
Source coding: Since the side information is degraded ($U \to S_2 \to S_1$ is a Markov chain), successive refinement (Chapter 6) is optimal. Encode $U$ into a base description plus a refinement that enables the finer reconstruction.
Channel coding
Channel coding: Use superposition coding on the degraded BC. The cloud center carries the base description, decodable by both receivers (on a degraded BC, receiver 1 can decode anything receiver 2 can). The satellite carries the refinement, decodable only by receiver 1, which has the stronger channel $Y_1$.
Note the pairing: the degraded BC means $Y_1$ is stronger than $Y_2$, while the degraded side information means $S_1$ is weaker than $S_2$. Receiver 2 uses its good side information to reconstruct at low distortion from the base layer alone; receiver 1, with its strong channel, decodes both layers to compensate for its weaker side information.
Converse
By Fano's inequality at each receiver, any scheme achieving $(D_1, D_2)$ must deliver rate at least $R_{U \mid S_i}(D_i)$ to receiver $i$. The degradedness of the BC ensures these rate requirements chain consistently across the two decoders, modulo the SI structure. Combined with the BC capacity region converse, this shows that the rates must lie in the degraded BC capacity region.
ex-ch19-13
Hard. Show that for a Gaussian source $V \sim \mathcal{N}(0, \sigma^2)$ over a Gaussian channel $Y = X + Z$ with $Z \sim \mathcal{N}(0, N)$ and power constraint $P$, at bandwidth ratio $\kappa = 1$, uncoded linear transmission achieves the optimal MSE distortion $D_{\text{opt}} = \sigma^2 N/(P + N)$.
Set $X = \alpha V$ with $\alpha = \sqrt{P/\sigma^2}$ (power constraint $\mathbb{E}[X^2] = P$).
Compute the MMSE of $V$ given $Y$.
Compare with $D_{\text{opt}}$ from the separation theorem.
Uncoded scheme
Set $X = \alpha V$ where $\alpha = \sqrt{P/\sigma^2}$ to satisfy the power constraint $\mathbb{E}[X^2] = P$.
The receiver observes $Y = \alpha V + Z$.
MMSE estimation
Since $V$ and $Z$ are jointly Gaussian and independent, the MMSE estimate is the linear MMSE estimate:
$\hat V = \frac{\alpha\sigma^2}{\alpha^2\sigma^2 + N}\,Y = \frac{\sqrt{P\sigma^2}}{P + N}\,Y.$
The MSE is
$D = \sigma^2 - \frac{(\alpha\sigma^2)^2}{\alpha^2\sigma^2 + N} = \sigma^2 - \frac{P\sigma^2}{P + N} = \frac{\sigma^2 N}{P + N}.$
Comparison with separation optimum
Under separation at $\kappa = 1$: setting $R(D) = C = \frac{1}{2}\log_2(1 + P/N)$ gives
$D_{\text{opt}} = \sigma^2\,2^{-2C} = \frac{\sigma^2}{1 + P/N} = \frac{\sigma^2 N}{P + N}.$
We have $D = D_{\text{opt}}$. Uncoded linear transmission achieves the information-theoretic minimum distortion for the Gaussian source over the Gaussian channel at $\kappa = 1$.
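A Monte Carlo sketch of the scheme (the parameter values $\sigma^2 = 4$, $P = 2$, $N = 1$ are illustrative):

```python
import math
import random

def simulate_uncoded(sigma2=4.0, P=2.0, N=1.0, trials=200_000, seed=0):
    """Empirical MSE of the linear MMSE estimate for X = alpha * V,
    Y = X + Z, with alpha = sqrt(P / sigma2)."""
    rng = random.Random(seed)
    alpha = math.sqrt(P / sigma2)
    gain = alpha * sigma2 / (alpha ** 2 * sigma2 + N)  # MMSE coefficient
    se = 0.0
    for _ in range(trials):
        v = rng.gauss(0.0, math.sqrt(sigma2))
        y = alpha * v + rng.gauss(0.0, math.sqrt(N))
        se += (gain * y - v) ** 2
    return se / trials

d_theory = 4.0 * 1.0 / (2.0 + 1.0)  # sigma^2 * N / (P + N) = 4/3
print(round(d_theory, 3), round(simulate_uncoded(), 3))
```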
ex-ch19-14
Hard. Two sources $(U_1, U_2)$ are jointly Gaussian with zero mean, unit variance, and correlation coefficient $\rho \in [0, 1)$. They are transmitted over a Gaussian MAC $Y = X_1 + X_2 + Z$ with equal power constraints $P$ and $Z \sim \mathcal{N}(0, N)$.
Derive the MSE distortion achieved by uncoded transmission for each source using the MMSE decoder. Show that the distortion decreases as increases (the MMSE decoder exploits correlation).
The observation is $Y = \sqrt{P}(U_1 + U_2) + Z$. Use the joint Gaussian MMSE formula.
The covariance matrix of $(U_1, U_2, Y)$ determines the MMSE.
Setup
With $X_i = \sqrt{P}\,U_i$, the received signal is $Y = \sqrt{P}(U_1 + U_2) + Z$. The relevant second moments are $\operatorname{Var}(Y) = 2P(1+\rho) + N$ and $\operatorname{Cov}(U_i, Y) = \sqrt{P}(1+\rho)$.
MMSE estimation
The MMSE estimate of $U_i$ given $Y$ is
$\hat U_i = \frac{\operatorname{Cov}(U_i, Y)}{\operatorname{Var}(Y)}\,Y = \frac{\sqrt{P}(1+\rho)}{2P(1+\rho) + N}\,Y.$
The MMSE distortion for each source is:
$D(\rho) = 1 - \frac{\operatorname{Cov}(U_i, Y)^2}{\operatorname{Var}(Y)} = 1 - \frac{P(1+\rho)^2}{2P(1+\rho) + N}.$
Effect of correlation
Taking the derivative with respect to $\rho$:
$\frac{\partial D}{\partial \rho} = -\frac{2P^2(1+\rho)^2 + 2PN(1+\rho)}{\big(2P(1+\rho) + N\big)^2} < 0$
for all $\rho \ge 0$ and $P, N > 0$. Therefore, the distortion is strictly decreasing in $\rho$: more correlation means lower distortion.
Intuitively, when $\rho$ is high, $U_1 \approx U_2$, so the MAC effectively provides a boosted observation of each source. The MMSE decoder uses the known correlation to separate the superimposed signals.
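A sketch checking the monotonicity claim on a grid ($P = N = 1$ are illustrative values):

```python
def distortion(rho: float, P: float = 1.0, N: float = 1.0) -> float:
    """Uncoded MMSE distortion per source over the Gaussian MAC:
    D = 1 - P * (1 + rho)^2 / (2 * P * (1 + rho) + N)."""
    t = 1.0 + rho
    return 1.0 - P * t * t / (2.0 * P * t + N)

ds = [distortion(r / 10.0) for r in range(11)]  # rho = 0.0, 0.1, ..., 1.0
print(all(ds[i] > ds[i + 1] for i in range(10)))  # → True (strictly decreasing)
```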
ex-ch19-15
Hard. Consider the Cover–El Gamal–Salehi sufficient condition for transmitting correlated sources $(U_1, U_2)$ over a MAC $p(y \mid x_1, x_2)$:
The source pair is transmissible if there exists a joint distribution $p(u_1, u_2)\,p(s)\,p(x_1 \mid u_1, s)\,p(x_2 \mid u_2, s)$ such that
$H(U_1 \mid U_2) < I(X_1; Y \mid X_2, U_2, S)$, $H(U_2 \mid U_1) < I(X_2; Y \mid X_1, U_1, S)$, $H(U_1, U_2 \mid S) < I(X_1, X_2; Y \mid S)$, and $H(U_1, U_2) < I(X_1, X_2; Y)$.
Show that when $S = \emptyset$ and the inputs are generated independently of the sources (no auxiliary random variables), this reduces to the simple condition that the Slepian–Wolf region fits inside the MAC capacity region.
With $S = \emptyset$ and inputs independent of the sources, the conditioning on $U_j$ and $S$ disappears.
The remaining conditions match the MAC capacity region constraints.
Simplification
Setting $S = \emptyset$ and $p(x_i \mid u_i) = p(x_i)$, the inputs are independent of the sources, so $I(X_1; Y \mid X_2, U_2) = I(X_1; Y \mid X_2)$ and the conditions become:
$H(U_1 \mid U_2) < I(X_1; Y \mid X_2),$
$H(U_2 \mid U_1) < I(X_2; Y \mid X_1),$
$H(U_1, U_2) < I(X_1, X_2; Y).$
Identification with SW + MAC
These inequalities say exactly that the Slepian–Wolf constraints fit inside the MAC capacity region for the input distribution $p(x_1)p(x_2)$; with $S = \emptyset$, the conditional and unconditioned sum-rate conditions coincide, so no extra 'mixed' constraint remains.
Since the MAC capacity region is the set of $(R_1, R_2)$ satisfying $R_1 \le I(X_1; Y \mid X_2)$, $R_2 \le I(X_2; Y \mid X_1)$, $R_1 + R_2 \le I(X_1, X_2; Y)$, and the Slepian–Wolf region requires $R_1 \ge H(U_1 \mid U_2)$, $R_2 \ge H(U_2 \mid U_1)$, $R_1 + R_2 \ge H(U_1, U_2)$, the conditions reduce to the Slepian–Wolf region fitting inside the MAC capacity region.
ex-ch19-16
Hard. A Gaussian source is transmitted over a Gaussian channel at bandwidth ratio $\kappa = 3$. Show that the distortion ratio $D_{\text{unc}}/D_{\text{opt}}$ grows exponentially with SNR (in dB). Specifically, show that at high SNR, $D_{\text{unc}}/D_{\text{opt}} \approx \text{SNR}^{\kappa - 1}/\kappa$.
$D_{\text{opt}} = \sigma^2(1 + \text{SNR})^{-\kappa}$ and $D_{\text{unc}} = \sigma^2/(1 + \kappa\,\text{SNR})$ for the best uncoded scheme.
At high SNR, approximate $1 + \text{SNR} \approx \text{SNR}$.
Optimal distortion
At $\kappa = 3$: $D_{\text{opt}} = \sigma^2(1 + \text{SNR})^{-3}$.
Best uncoded distortion
The best uncoded scheme at $\kappa = 3$ transmits the source 3 times (or uses an optimal linear mapping). With MRC over 3 copies, the effective SNR is $3\,\text{SNR}$, giving
$D_{\text{unc}} = \frac{\sigma^2}{1 + 3\,\text{SNR}}.$
Ratio at high SNR
$\frac{D_{\text{unc}}}{D_{\text{opt}}} = \frac{(1 + \text{SNR})^3}{1 + 3\,\text{SNR}} \approx \frac{\text{SNR}^2}{3} = \frac{\text{SNR}^{\kappa - 1}}{\kappa}$ at high SNR. At $\text{SNR} = 20$ dB ($\text{SNR} = 100$): $100^2/3 \approx 3333$, or about 35 dB, a massive penalty for not coding.
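A sketch evaluating the exact ratio in dB; the roughly 2 dB-per-dB slope at high SNR reflects the $\text{SNR}^{\kappa-1}$ growth:

```python
import math

def ratio_db(snr_db: float, kappa: int = 3) -> float:
    """Exact 10*log10(D_unc / D_opt) for kappa-fold repetition (MRC)
    versus separation: (1 + SNR)^kappa / (1 + kappa * SNR)."""
    snr = 10.0 ** (snr_db / 10.0)
    ratio = (1.0 + snr) ** kappa / (1.0 + kappa * snr)
    return 10.0 * math.log10(ratio)

for snr_db in (10, 20, 30):
    print(snr_db, round(ratio_db(snr_db), 1))  # ~16.3, ~35.3, ~55.2 dB
```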
ex-ch19-17
Challenge. (Research-flavored) Consider a sensor network where $n$ sensors observe correlated Gaussian sources $\mathbf{U} = (U_1, \dots, U_n)^{\mathsf T}$ with covariance matrix $\Sigma$ and transmit over a Gaussian MAC $Y = \sum_i h_i X_i + Z$ with channel vector $\mathbf{h}$ and noise $Z \sim \mathcal{N}(0, N)$.
(a) Derive the MMSE distortion for estimating $U_i$ from the MAC output $Y$ when each sensor uses uncoded transmission $X_i = a_i U_i$.
(b) Show that this distortion depends on the full covariance matrix $\Sigma$, not just the marginal variances — confirming that the MAC decoder inherently exploits source correlation.
(c) For $n$ sensors with equicorrelation ($\Sigma_{ij} = \rho$ for all $i \ne j$), unit variances, and equal channels ($h_i = 1$, $a_i = \sqrt{P}$), express the distortion in closed form and plot it as a function of $\rho$.
The joint distribution is Gaussian. Use the standard Gaussian MMSE formula.
The cross-covariance vector depends on the $i$-th row of $\Sigma$.
For equicorrelation, the covariance matrix has a simple eigendecomposition.
Part (a): MMSE formula
Let $\mathbf{a} = (a_1, \dots, a_n)^{\mathsf T}$ and $\mathbf{g} = \mathbf{h} \odot \mathbf{a}$ (element-wise product). Then $Y = \mathbf{g}^{\mathsf T}\mathbf{U} + Z$, where $\operatorname{Var}(Y) = \mathbf{g}^{\mathsf T}\Sigma\,\mathbf{g} + N$.
The MMSE estimate of $U_i$ given $Y$ is:
$\hat U_i = \frac{(\Sigma\,\mathbf{g})_i}{\mathbf{g}^{\mathsf T}\Sigma\,\mathbf{g} + N}\,Y.$
The MMSE distortion is:
$D_i = \Sigma_{ii} - \frac{(\Sigma\,\mathbf{g})_i^2}{\mathbf{g}^{\mathsf T}\Sigma\,\mathbf{g} + N}.$
Part (b): Correlation dependence
The numerator of the subtracted term is $(\Sigma\,\mathbf{g})_i^2 = \big(\sum_j \Sigma_{ij}\,g_j\big)^2$. This depends on the off-diagonal entries of $\Sigma$ (the cross-correlations), not just the diagonal entries (variances). The denominator similarly depends on the full matrix through $\mathbf{g}^{\mathsf T}\Sigma\,\mathbf{g}$.
The MMSE decoder uses the known correlation structure to "separate" the superimposed signals — a capability that separate source and channel coding would need to replicate through explicit distributed compression.
Part (c): Equicorrelation case
With $\Sigma_{ii} = 1$, $\Sigma_{ij} = \rho$ ($i \ne j$), $h_i = 1$, $a_i = \sqrt{P}$: $\mathbf{g} = \sqrt{P}\,\mathbf{1}$ and $\Sigma\,\mathbf{1} = \big(1 + (n-1)\rho\big)\mathbf{1}$.
Hence $(\Sigma\,\mathbf{g})_i = \sqrt{P}\big(1 + (n-1)\rho\big)$ and $\mathbf{g}^{\mathsf T}\Sigma\,\mathbf{g} = nP\big(1 + (n-1)\rho\big)$.
$D(\rho) = 1 - \frac{P\big(1 + (n-1)\rho\big)^2}{nP\big(1 + (n-1)\rho\big) + N}.$
As $\rho \to 1$: $D \to \frac{N}{n^2 P + N}$, which is much smaller than the $\rho = 0$ value $D = 1 - \frac{P}{nP + N}$.
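A sketch of the closed form in part (c), tabulating points for the requested plot ($n = 10$, $P = N = 1$ are illustrative values):

```python
def equicorrelated_distortion(rho: float, n: int = 10,
                              P: float = 1.0, N: float = 1.0) -> float:
    """D(rho) = 1 - P * s^2 / (n * P * s + N) with s = 1 + (n - 1) * rho,
    for unit-variance equicorrelated sources and equal channels."""
    s = 1.0 + (n - 1) * rho
    return 1.0 - P * s * s / (n * P * s + N)

# Tabulate D(rho); these points can be fed to any plotting tool.
for rho in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(rho, round(equicorrelated_distortion(rho), 4))
```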
ex-ch19-18
Challenge. (Open-ended) The separation theorem fails for certain multi-terminal settings, but modern standards (5G NR, Wi-Fi 7) still use separation. Write a critical analysis (1–2 pages) of when this practical choice is justified and when it may not be. Consider:
(a) The finite-blocklength penalty of separation
(b) The role of source correlation in IoT/mMTC scenarios
(c) The emergence of deep joint source–channel coding (DeepJSCC)
(d) The tradeoff between theoretical optimality and implementation complexity
For (a), cite the dispersion results of Polyanskiy–Poor–Verdú (2010).
For (c), consider the work of Bourtsoulatze, Kurka, and Gündüz on DeepJSCC.
Think about what 'optimality' means in a system with imperfect CSI, bursty traffic, and heterogeneous devices.
Analysis framework
When separation is justified:
- Point-to-point links with moderate-to-long blocklengths, where the finite-blocklength penalty of separation is negligible.
- Independent sources: no correlation to exploit.
- Well-characterized channels: the channel code can be optimized for the known channel statistics.
- Modularity benefits: separate design allows independent evolution of source and channel codecs (e.g., upgrading from H.264 to H.265 without changing the LDPC code).
When separation may fail:
- URLLC at short blocklengths, where the finite-blocklength penalty can reach 1–3 dB. DeepJSCC has shown gains in this regime.
- Massive IoT with correlated sources: thousands of sensors observing correlated physical fields. Separate compression discards the correlation that joint coding could exploit through the MAC.
- Channel mismatch: when the actual channel differs from the design channel, digital schemes suffer cliff effects while analog/hybrid schemes degrade gracefully.
- Semantic communication: when the receiver only needs a function of the source (not full reconstruction), joint design can dramatically reduce the required rate.
The DeepJSCC frontier: Recent work on deep learning-based joint source–channel coding (Bourtsoulatze et al., 2019) has demonstrated practical JSCC schemes that outperform separate coding for image transmission at short blocklengths and under channel mismatch. These results suggest that for specific source-channel pairs, learned joint codes can approach the theoretical gains promised by information theory.
Conclusion: Separation is not a theoretical limitation but an engineering tradeoff. For the dominant use cases in current standards, it is the right choice. For emerging scenarios (URLLC, mMTC, semantic communication), joint design may offer meaningful gains that justify the additional complexity.