Ferkans — Interactive Telecom Tutor

The Eigenvalue Density of Random Covariance Matrices

We now answer the central question from Section 21.1: what is the limiting spectral distribution of $\frac{1}{m}\mathbf{H}\mathbf{H}^H$ when $\mathbf{H} \in \mathbb{C}^{n \times m}$ has i.i.d. $\mathcal{CN}(0,1)$ entries? The answer, the Marchenko-Pastur law, is one of the most important results in random matrix theory. It gives the limiting density of the eigenvalues in closed form, and this density is what determines the asymptotic capacity of a Rayleigh fading MIMO channel.

Theorem: The Marchenko-Pastur Law

Let $\mathbf{H} \in \mathbb{C}^{n \times m}$ have i.i.d. $\mathcal{CN}(0,1)$ entries. As $n, m \to \infty$ with $n/m \to \beta \in (0, \infty)$, the empirical spectral distribution of $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^H \in \mathbb{C}^{n \times n}$ converges weakly, almost surely, to the Marchenko-Pastur distribution $F_\beta$, whose continuous part has density $f_{\mathrm{MP}}(\lambda; \beta) = \frac{1}{2\pi\beta\lambda}\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}\, \mathbf{1}_{\{\lambda_- \leq \lambda \leq \lambda_+\}},$ where $\lambda_{\pm} = (1 \pm \sqrt{\beta})^2$.

If $\beta > 1$, there is additionally a point mass of weight $1 - 1/\beta$ at $\lambda = 0$. The reason is a rank count: $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^H \in \mathbb{C}^{n \times n}$ has rank at most $m$, so for $n > m$ it has at least $n - m$ zero eigenvalues, a fraction $(n - m)/n \to 1 - 1/\beta$ of the spectrum.

More precisely, for $\beta > 1$ : $F_\beta(\lambda) = \left(1 - \frac{1}{\beta}\right)\mathbf{1}_{\{\lambda \geq 0\}} + \int_0^{\lambda} f_{\mathrm{MP}}(t;\beta)\, dt.$

The density has a characteristic "bulk" shape supported on $[\lambda_-, \lambda_+]$ . When $\beta = 1$ (square matrix), the support starts at $\lambda_- = 0$ and extends to $\lambda_+ = 4$ . As $\beta \to 0$ (many more columns than rows), the distribution concentrates around $\lambda = 1$ — the eigenvalues become nearly deterministic. The width of the support, $\lambda_+ - \lambda_- = 4\sqrt{\beta}$ , measures the "spread" of eigenvalues and directly affects capacity.
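A quick numerical check can make the theorem concrete. The following is a minimal NumPy sketch, not part of the original text. It assumes the convention $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^H$ with $\mathbf{H} \in \mathbb{C}^{n \times m}$; the helper names `mp_density` and `sample_eigenvalues`, the matrix size, and the bin count are illustrative choices.

```python
import numpy as np

def mp_density(lam, beta):
    """Continuous part of the Marchenko-Pastur density f_MP(lam; beta)."""
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    lo, hi = (1 - np.sqrt(beta)) ** 2, (1 + np.sqrt(beta)) ** 2
    out = np.zeros_like(lam)
    inside = (lam > lo) & (lam < hi)          # density is zero outside the support
    x = lam[inside]
    out[inside] = np.sqrt((hi - x) * (x - lo)) / (2 * np.pi * beta * x)
    return out

def sample_eigenvalues(n, m, rng):
    """Eigenvalues of W = (1/m) H H^H for H in C^{n x m} with i.i.d. CN(0,1) entries."""
    H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    W = H @ H.conj().T / m                    # n x n Hermitian matrix
    return np.linalg.eigvalsh(W)              # real eigenvalues, ascending

rng = np.random.default_rng(0)
n, m = 400, 1600                              # aspect ratio beta = 1/4 (assumed example)
eig = sample_eigenvalues(n, m, rng)

# Compare a normalized histogram of the eigenvalues with f_MP at the bin centers.
hist, edges = np.histogram(eig, bins=20, range=(0.25, 2.25), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = np.max(np.abs(hist - mp_density(centers, n / m)))

print(f"eigenvalue range: [{eig.min():.3f}, {eig.max():.3f}]")
print(f"largest histogram deviation from f_MP: {max_gap:.3f}")
```

Increasing `n` and `m` at a fixed ratio should shrink the deviation, which is the convergence the theorem describes.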

Proof

Moment method outline

The classical proof uses the moment method: show that for every $k \geq 1$, $\frac{1}{n}\text{tr}(\mathbf{W}^k) = \int \lambda^k\, dF^{\mathbf{W}}(\lambda) \xrightarrow{a.s.} \int \lambda^k\, dF_\beta(\lambda).$ Here the trace is normalized by the dimension $n$ of $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^H \in \mathbb{C}^{n \times n}$. The $k$-th moment of the ESD is a sum over products of matrix entries, and its expectation can be computed by a combinatorial argument (counting non-crossing pair partitions).

First moment

$\mathbb{E}\left[\frac{1}{n}\text{tr}(\mathbf{W})\right] = \frac{1}{n}\text{tr}\left(\frac{1}{m}\mathbb{E}[\mathbf{H}\mathbf{H}^H]\right) = \frac{1}{n}\text{tr}\left(\mathbf{I}_n\right) = 1.$ This matches $\int \lambda\, dF_\beta(\lambda) = 1$. The point mass at zero (present when $\beta > 1$) contributes nothing to the first moment, so $\int \lambda\, f_{\mathrm{MP}}(\lambda;\beta)\, d\lambda = 1$ for any $\beta > 0$.

Second moment and variance

The second moment of the MP distribution is $\int \lambda^2\, dF_\beta(\lambda) = 1 + \beta$, so its variance is $\beta$. Matching this against $\frac{1}{n}\text{tr}(\mathbf{W}^2)$ requires counting pairs $(i_1,j_1),(i_2,j_2)$ with specific index-matching constraints, which is the non-crossing partition structure.
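For Gaussian entries the expected second moment can be worked out directly, which shows where the $1 + \beta$ comes from. The sketch below assumes $\mathbf{W} = \frac{1}{m}\mathbf{H}\mathbf{H}^H \in \mathbb{C}^{n \times n}$ and writes $h_{ik}$ for the entries of $\mathbf{H}$ (notation introduced here for illustration). It uses $\mathbb{E}|h_{ik}|^2 = 1$ and $\mathbb{E}|h_{ik}|^4 = 2$ for $\mathcal{CN}(0,1)$ entries, together with $\text{tr}(\mathbf{W}^2) = \sum_{i,j} |W_{ij}|^2$ for Hermitian $\mathbf{W}$.

```latex
\begin{align*}
\mathbb{E}\left[\tfrac{1}{n}\operatorname{tr}(\mathbf{W}^2)\right]
 &= \frac{1}{n m^2} \sum_{i,j=1}^{n}
    \mathbb{E}\Bigl|\sum_{k=1}^{m} h_{ik}\,\overline{h_{jk}}\Bigr|^2 \\
 &= \frac{1}{n m^2}\Bigl[\,\underbrace{n\,(m^2 + m)}_{i = j}
    \;+\; \underbrace{n(n-1)\,m}_{i \neq j}\,\Bigr] \\
 &= 1 + \frac{n}{m} \;\longrightarrow\; 1 + \beta .
\end{align*}
```

The $m^2$ part of the diagonal terms produces the limit $1$, the off-diagonal terms produce $\beta$, and the remaining contributions vanish as $m \to \infty$.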

Convergence

Since the moments of the ESD converge to those of $F_\beta$ , and the MP distribution is determined by its moments (it has compact support), the method of moments implies weak convergence $F^{\mathbf{W}} \to F_\beta$ almost surely. The almost sure convergence requires a variance bound (Borel-Cantelli argument).

Example: Marchenko-Pastur Density for $\beta = 1$ (Square Matrix)

Write out the Marchenko-Pastur density for $\beta = 1$ and compute its first two moments.

Solution

Support and density

For $\beta = 1$ : $\lambda_- = (1 - 1)^2 = 0$ , $\lambda_+ = (1 + 1)^2 = 4$ . $f_{\mathrm{MP}}(\lambda; 1) = \frac{1}{2\pi\lambda}\sqrt{(4 - \lambda)\lambda}\, \mathbf{1}_{\{0 \leq \lambda \leq 4\}} = \frac{1}{2\pi}\sqrt{\frac{4 - \lambda}{\lambda}}\, \mathbf{1}_{\{0 \leq \lambda \leq 4\}}.$

First moment

$\int_0^4 \lambda \cdot f_{\mathrm{MP}}(\lambda;1)\, d\lambda = \int_0^4 \frac{1}{2\pi}\sqrt{\lambda(4 - \lambda)}\, d\lambda = 1$ (by the substitution $\lambda = 2 + 2\cos\theta$ ).

Second moment

$\int_0^4 \lambda^2 f_{\mathrm{MP}}(\lambda;1)\, d\lambda = 1 + \beta = 2$ .
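The value $2$ can also be checked by direct integration, reusing the substitution from the first moment. As a sketch: with $\lambda = 2 + 2\cos\theta$ one has $\sqrt{\lambda(4 - \lambda)} = 2\sin\theta$ and $d\lambda = -2\sin\theta\, d\theta$, with $\theta$ running over $[0, \pi]$.

```latex
\begin{align*}
\int_0^4 \lambda^2 f_{\mathrm{MP}}(\lambda;1)\,d\lambda
 &= \frac{1}{2\pi}\int_0^4 \lambda\,\sqrt{\lambda(4-\lambda)}\,d\lambda \\
 &= \frac{1}{2\pi}\int_0^{\pi} (2 + 2\cos\theta)\,(2\sin\theta)\,(2\sin\theta)\,d\theta \\
 &= \frac{4}{\pi}\int_0^{\pi} \sin^2\theta\,d\theta
    + \frac{4}{\pi}\int_0^{\pi} \cos\theta\,\sin^2\theta\,d\theta \\
 &= \frac{4}{\pi}\cdot\frac{\pi}{2} + 0 \;=\; 2 .
\end{align*}
```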

Example: Moments of the Marchenko-Pastur Distribution

Show that the $k$-th moment of the Marchenko-Pastur distribution is given by the closed form $m_k = \sum_{j=0}^{k-1} \frac{1}{j+1}\binom{k}{j}\binom{k-1}{j} \beta^j$, whose coefficients are the Narayana numbers. Compute $m_1, m_2, m_3$ for general $\beta$.

Solution

Known results

The moments are given by $m_k = \int \lambda^k\, dF_\beta(\lambda) = \sum_{j=0}^{k-1} \frac{1}{j+1}\binom{k}{j}\binom{k-1}{j} \beta^j = \sum_{j=0}^{k-1} \frac{1}{k}\binom{k}{j}\binom{k}{j+1} \beta^j.$ (The coefficients are the Narayana numbers; at $\beta = 1$ they sum to the Catalan numbers.)

First three moments

$m_1 = 1$ , $m_2 = 1 + \beta$ , $m_3 = 1 + 3\beta + \beta^2$ . For $\beta = 1$ : $m_1 = 1$ , $m_2 = 2$ , $m_3 = 5$ (Catalan numbers!).
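The closed form is easy to evaluate and to cross-check numerically. The sketch below is illustrative only: `mp_moment` and `mp_moment_numeric` are hypothetical helper names. The numerical version integrates $\lambda^k f_{\mathrm{MP}}$ with the substitution $\lambda = 1 + \beta + 2\sqrt{\beta}\cos\theta$, under which $\lambda^k f_{\mathrm{MP}}(\lambda;\beta)\,d\lambda$ becomes $\lambda^{k-1}\,\frac{2}{\pi}\sin^2\theta\,d\theta$ on $[0, \pi]$.

```python
from math import comb, cos, pi, sin, sqrt

def mp_moment(k, beta):
    """k-th Marchenko-Pastur moment from the Narayana closed form."""
    # comb(k, j) * comb(k - 1, j) is always divisible by (j + 1),
    # so integer division keeps the Narayana coefficients exact.
    return sum(comb(k, j) * comb(k - 1, j) // (j + 1) * beta ** j for j in range(k))

def mp_moment_numeric(k, beta, steps=20000):
    """Same moment via a midpoint rule in theta (k >= 1)."""
    h = pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        lam = 1 + beta + 2 * sqrt(beta) * cos(theta)
        total += lam ** (k - 1) * (2 / pi) * sin(theta) ** 2 * h
    return total

for k in (1, 2, 3):
    print(k, mp_moment(k, 0.5), round(mp_moment_numeric(k, 0.5), 6))
```

Passing an integer `beta` keeps the arithmetic exact, which is convenient for checking the Catalan case $\beta = 1$.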

Marchenko-Pastur Density for Varying $\beta$

Visualize the Marchenko-Pastur density $f_{\mathrm{MP}}(\lambda; \beta)$ and see how the support $[\lambda_-, \lambda_+]$ changes with the aspect ratio $\beta = n/m$ .

Parameter: $\beta = n/m$ (shown value: $0.5$)

Definition: Support and Condition Number of the Marchenko-Pastur Distribution

The Marchenko-Pastur distribution with parameter $\beta \in (0,1]$ is supported on $[\lambda_-, \lambda_+]$, where $\lambda_- = (1 - \sqrt{\beta})^2, \quad \lambda_+ = (1 + \sqrt{\beta})^2.$ For $\beta < 1$, the ratio $\lambda_+/\lambda_- = \left(\frac{1 + \sqrt{\beta}}{1 - \sqrt{\beta}}\right)^2$ is the asymptotic condition number of the Wishart matrix. It increases with $\beta$ and diverges as $\beta \to 1$, where $\lambda_- = 0$ (square matrices are ill-conditioned). It tends to $1$ as $\beta \to 0$ (many more columns than rows concentrates the eigenvalues near 1).
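As a small illustration, the support edges and the asymptotic condition number can be tabulated for a few aspect ratios. This is a sketch with hypothetical helper names (`mp_support`, `mp_condition_number`); returning infinity at $\beta = 1$ is a convention chosen here to reflect $\lambda_- = 0$.

```python
from math import inf, sqrt

def mp_support(beta):
    """Support edges (lambda_minus, lambda_plus) of the MP bulk."""
    return (1 - sqrt(beta)) ** 2, (1 + sqrt(beta)) ** 2

def mp_condition_number(beta):
    """Asymptotic condition number lambda_plus / lambda_minus for beta in (0, 1]."""
    lo, hi = mp_support(beta)
    return inf if lo == 0 else hi / lo

for beta in (0.01, 0.25, 0.5, 0.9, 1.0):
    lo, hi = mp_support(beta)
    print(f"beta = {beta:4.2f}: support = [{lo:.4f}, {hi:.4f}], "
          f"condition number = {mp_condition_number(beta):.2f}")
```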

Theorem: Asymptotic MIMO Capacity via Marchenko-Pastur

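For an i.i.d. Rayleigh fading MIMO channel $\mathbf{H} \in \mathbb{C}^{n_r \times n_t}$ with $n_r$ receive and $n_t$ transmit antennas, $n_r/n_t \to \beta$, and equal power allocation at SNR $\rho$, the per-receive-antenna ergodic capacity converges: $\frac{C}{n_r} \to \int \log_2(1 + \rho\lambda)\, dF_\beta(\lambda) = \int_{\lambda_-}^{\lambda_+} \log_2(1 + \rho\lambda)\, f_{\mathrm{MP}}(\lambda;\beta)\, d\lambda,$ where $F_\beta$ is the limiting spectral distribution of $\frac{1}{n_t}\mathbf{H}\mathbf{H}^H$. The second equality holds for every $\beta$ because a point mass at $\lambda = 0$ contributes $\log_2(1) = 0$. This integral can be evaluated in closed form using the Stieltjes transform (Section 21.3).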

Each eigenvalue contributes $\log_2(1 + \rho\lambda)$ to the capacity. As the matrix grows, the eigenvalue histogram hardens to the MP density, so the sum over eigenvalues converges to an integral. The capacity per antenna is a deterministic functional of the MP distribution — no randomness remains in the large-system limit.

Proof

Convergence of the functional

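Since $F^{\mathbf{W}} \to F_\beta$ weakly, almost surely, and $\lambda \mapsto \log_2(1 + \rho\lambda)$ is continuous and bounded on every compact interval, the integral $\int \log_2(1 + \rho\lambda)\, dF^{\mathbf{W}} \to \int \log_2(1 + \rho\lambda)\, dF_\beta$ almost surely by the portmanteau theorem, provided the eigenvalues of $\mathbf{W}$ eventually stay in a fixed compact set. They do: the largest eigenvalue of $\mathbf{W}$ converges almost surely to $\lambda_+$, so for all large $n$ the spectrum lies in $[0, \lambda_+ + \varepsilon]$. Passing from this almost sure limit to the ergodic (expected) capacity additionally uses uniform integrability, which follows from the bound $\log_2(1 + \rho\lambda) \leq \rho\lambda/\ln 2$.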
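The limit can be compared with a finite-size simulation. The sketch below is an illustration under stated assumptions, not the closed form from Section 21.3: it takes $\beta = n_r/n_t$ and $\mathbf{W} = \frac{1}{n_t}\mathbf{H}\mathbf{H}^H$, evaluates the integral with a midpoint rule after the substitution $\lambda = 1 + \beta + 2\sqrt{\beta}\cos\theta$, and uses hypothetical function names.

```python
import numpy as np

def mp_capacity_per_antenna(rho, beta, steps=4000):
    """Numerically integrate log2(1 + rho*lam) against the MP density."""
    theta = (np.arange(steps) + 0.5) * np.pi / steps       # midpoints in (0, pi)
    lam = 1 + beta + 2 * np.sqrt(beta) * np.cos(theta)
    # f_MP(lam) d lam becomes 2 sin^2(theta) / (pi * lam) d theta.
    weights = 2 * np.sin(theta) ** 2 / (np.pi * lam) * (np.pi / steps)
    return float(np.sum(np.log2(1 + rho * lam) * weights))

def simulated_capacity_per_antenna(rho, n_r, n_t, trials, rng):
    """Monte Carlo average of (1/n_r) * sum_i log2(1 + rho * lam_i)."""
    total = 0.0
    for _ in range(trials):
        H = (rng.standard_normal((n_r, n_t))
             + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
        eig = np.linalg.eigvalsh(H @ H.conj().T / n_t)
        eig = np.clip(eig, 0.0, None)                       # guard against tiny negatives
        total += np.mean(np.log2(1 + rho * eig))
    return total / trials

rng = np.random.default_rng(0)
rho = 10.0                                                  # linear SNR (assumed example)
print("asymptotic :", mp_capacity_per_antenna(rho, 16 / 64))
print("simulated  :", simulated_capacity_per_antenna(rho, 16, 64, 200, rng))
```

Even at modest sizes the two numbers are typically close, which is why the asymptotic formula is useful for mean-capacity estimates.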

Quick Check

For $\beta = 1/4$ (i.e., 4 times more columns than rows), what is the support $[\lambda_-, \lambda_+]$ of the Marchenko-Pastur distribution?

$[1/4, 9/4]$

$[0, 4]$

$[1/16, 9/16]$

$[0, 9/4]$

Correct answer:

$[1/4, 9/4]$

With $\sqrt{\beta} = 1/2$: $\lambda_- = (1 - 1/2)^2 = 1/4$ and $\lambda_+ = (1 + 1/2)^2 = 9/4$.

Common Mistake: Convention for $\beta$ : Rows/Columns or Columns/Rows?

Mistake:

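Different references define $\beta$ differently. Some use $\beta = n/m$ (rows over columns), others $\beta = m/n$. Using the wrong convention replaces $\beta$ by $1/\beta$ in every formula, which changes the support edges $\lambda_\pm$ and whether a point mass appears at the origin.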

Correction:

In this book (following Couillet-Debbah), $\beta = n/m$ where $\mathbf{H} \in \mathbb{C}^{n \times m}$, and the matrix of interest is $\frac{1}{m}\mathbf{H}\mathbf{H}^H \in \mathbb{C}^{n \times n}$. When $\beta < 1$, this matrix almost surely has no zero eigenvalues. When $\beta > 1$, it has $n - m$ zero eigenvalues (mass at the origin). Always check which convention a reference uses before applying formulas.
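The rank argument behind the point mass is easy to confirm numerically. The following sketch (hypothetical helper `zero_eigen_counts`, tolerance chosen for illustration) counts numerically zero eigenvalues of both Gram matrices built from the same $\mathbf{H} \in \mathbb{C}^{n \times m}$.

```python
import numpy as np

def zero_eigen_counts(n, m, rng, tol=1e-10):
    """Count (near-)zero eigenvalues of (1/m) H H^H and (1/m) H^H H."""
    H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    eig_outer = np.linalg.eigvalsh(H @ H.conj().T / m)   # n x n, rank <= min(n, m)
    eig_inner = np.linalg.eigvalsh(H.conj().T @ H / m)   # m x m, rank <= min(n, m)
    return int(np.sum(eig_outer < tol)), int(np.sum(eig_inner < tol))

rng = np.random.default_rng(0)
print("n=20, m=50 (beta < 1):", zero_eigen_counts(20, 50, rng))
print("n=50, m=20 (beta > 1):", zero_eigen_counts(50, 20, rng))
```

The two matrices share their nonzero eigenvalues, so the only difference between the conventions is the number of zeros, which is exactly what the point mass at the origin records.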

Common Mistake: Forgetting the $1/m$ Normalization

Mistake:

Computing the eigenvalues of $\mathbf{H}\mathbf{H}^H$ (without the $1/m$ factor) and expecting them to follow the standard Marchenko-Pastur density.

Correction:

Without the $1/m$ normalization, the eigenvalues of $\mathbf{H}\mathbf{H}^H$ grow linearly with $m$. The MP law describes the eigenvalues of $\frac{1}{m}\mathbf{H}\mathbf{H}^H$, which have $O(1)$ support. Alternatively, one can rescale the law: the nonzero eigenvalues of $\mathbf{H}\mathbf{H}^H$ fill approximately $[m\lambda_-, m\lambda_+]$, but the standard form uses the normalized version.
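A short sketch makes the scaling visible. It draws one $\mathbf{H} \in \mathbb{C}^{n \times m}$ and reports the extreme eigenvalues of the Gram matrix with and without the $1/m$ factor; the helper name `extreme_eigenvalues` and the sizes are illustrative assumptions.

```python
import numpy as np

def extreme_eigenvalues(n, m, rng, normalize=True):
    """Smallest and largest eigenvalue of H H^H, optionally scaled by 1/m."""
    H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    G = H @ H.conj().T
    if normalize:
        G = G / m
    eig = np.linalg.eigvalsh(G)                # ascending order
    return float(eig[0]), float(eig[-1])

n, m = 100, 400                                # beta = 1/4, so [lambda_-, lambda_+] = [0.25, 2.25]
print("normalized  :", extreme_eigenvalues(n, m, np.random.default_rng(0), True))
print("unnormalized:", extreme_eigenvalues(n, m, np.random.default_rng(0), False))
```

Both calls use the same seed, so the second pair is the first pair multiplied by $m$.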

Historical Note: Marchenko and Pastur's 1967 Paper

1967

Vladimir Marchenko and Leonid Pastur published their landmark result in 1967, working at the Institute for Low Temperature Physics in Kharkov (now Kharkiv), Ukraine. Their paper "Distribution of eigenvalues for some sets of random matrices" established the law that now bears their name. The result was initially a contribution to mathematical physics, describing the spectral properties of large covariance matrices. It would take over three decades before Telatar (1999) and others recognized its central importance for MIMO capacity analysis.

⚠️Engineering Note

Finite-Size Corrections for System Design

The Marchenko-Pastur law is exact only in the limit $n, m \to \infty$ . For practical MIMO systems with $n_t = 64$ antennas and $K = 16$ users ( $\beta = 1/4$ ), the ESD closely matches the MP density, but edge eigenvalues exhibit Tracy-Widom fluctuations of order $n^{-2/3}$ . These fluctuations affect the tails of the capacity distribution (outage rates) even when the mean capacity is well approximated by the asymptotic formula. For outage analysis with $\min(n_t, n_r) < 16$ , use finite-dimensional expressions or Monte Carlo.

Practical Constraints

• Tracy-Widom fluctuations at the edge: $O(n^{-2/3})$
• Central eigenvalues converge faster: $O(n^{-1})$
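The edge behaviour can be probed with a small Monte Carlo experiment. The sketch below is only indicative: it measures the average distance between the largest eigenvalue of $\frac{1}{m}\mathbf{H}\mathbf{H}^H$ and $\lambda_+$ at fixed $\beta = 1/4$, with a hypothetical helper `mean_edge_gap` and trial counts chosen for speed rather than accuracy.

```python
import numpy as np

def mean_edge_gap(n, m, trials, rng):
    """Average |lambda_max - lambda_+| over independent draws of (1/m) H H^H."""
    lam_plus = (1 + np.sqrt(n / m)) ** 2
    gaps = []
    for _ in range(trials):
        H = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
        lam_max = np.linalg.eigvalsh(H @ H.conj().T / m)[-1]   # largest eigenvalue
        gaps.append(abs(lam_max - lam_plus))
    return float(np.mean(gaps))

rng = np.random.default_rng(0)
for n in (16, 64, 256):
    gap = mean_edge_gap(n, 4 * n, trials=20, rng=rng)
    print(f"n = {n:4d}: mean |lambda_max - lambda_+| = {gap:.4f}")
```

Under an $n^{-2/3}$ scaling, each fourfold increase in $n$ should shrink the gap by roughly $4^{2/3} \approx 2.5$; with so few trials the printed values only follow this trend approximately.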

Marchenko-Pastur Distribution

The limiting spectral distribution of the sample covariance matrix $\frac{1}{m}\mathbf{H}\mathbf{H}^H$ when $\mathbf{H} \in \mathbb{C}^{n \times m}$ has i.i.d. entries with zero mean and unit variance, as $n, m \to \infty$ with $n/m \to \beta$. It has density $f_{\mathrm{MP}}(\lambda;\beta) = \frac{1}{2\pi\beta\lambda}\sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)}$ on $[\lambda_-, \lambda_+]$ with $\lambda_\pm = (1 \pm \sqrt{\beta})^2$, plus a point mass of weight $1 - 1/\beta$ at $\lambda = 0$ when $\beta > 1$.

Wishart Matrix

A random matrix of the form $\mathbf{W} = \mathbf{H}\mathbf{H}^H$, where $\mathbf{H} \in \mathbb{C}^{n \times m}$ has i.i.d. Gaussian entries. Named after John Wishart, who studied its distribution in 1928 for multivariate statistics.

Related: Wishart-Type Matrix

Key Takeaway

The Marchenko-Pastur law gives the limiting eigenvalue density of sample covariance matrices with i.i.d. entries. Its support $[\lambda_-, \lambda_+] = [(1-\sqrt{\beta})^2, (1+\sqrt{\beta})^2]$ determines the eigenvalue spread, which in turn governs the capacity of i.i.d. Rayleigh MIMO channels. The result is universal — it depends only on the aspect ratio $\beta$ , not on the specific distribution of the entries (as long as they have zero mean and unit variance).
