Ferkans — Interactive Telecom Tutor

The Linear Detector That Scales

The linear MMSE detector is the Swiss army knife of OTFS detection: simple, $O(MN \log(MN))$ complexity, handles any $P$ -path channel, and — crucially — is diagonalized by the 2D DFT thanks to the block-circulant structure of the DD channel matrix. This makes MMSE practically free to compute, and it is the baseline against which more sophisticated schemes (MP, LCD) are compared.

The point is that MMSE is optimal among linear detectors (it minimizes MSE). When the MP gain over MMSE is small (which happens for low-to-moderate path counts and QPSK/16-QAM), MMSE is the detector of choice for its simplicity. When the gap is large (high-order QAM, $P > 10$ ), MP is invoked.

Definition: DD-Domain LMMSE Detector

Given $\mathbf{y}_{DD} = \mathbf{H}_{DD}\,\mathbf{x}_{DD} + \mathbf{w}_{DD}$ with $\mathbf{x}_{DD} \sim \mathcal{CN}(0, \sigma_x^2 \mathbf{I})$ and $\mathbf{w}_{DD} \sim \mathcal{CN}(0, \sigma^2\mathbf{I})$ , the DD-domain LMMSE detector is $\hat{\mathbf{x}}_{\text{LMMSE}} \;=\; \left(\mathbf{H}_{DD}^{H}\,\mathbf{H}_{DD} + \frac{\sigma^2}{\sigma_x^2}\,\mathbf{I}\right)^{-1}\,\mathbf{H}_{DD}^{H}\,\mathbf{y}_{DD}.$ The final step is quantization to the nearest QAM symbol: $\hat{x}_i = Q(\hat{\mathbf{x}}_{\text{LMMSE}, i})$ for $i = 1, \ldots, MN$ .

Under the Gaussian input assumption (QAM symbols are not Gaussian), this linear estimator is also the overall MMSE estimator. For QAM input it remains the best linear estimator in the MSE sense, but it is no longer MSE-optimal among all estimators.
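As a reference point, the definition can be evaluated directly with a dense linear solve. The sketch below is illustrative only: the function names `lmmse_direct` and `quantize`, the random 8x8 matrix standing in for $\mathbf{H}_{DD}$, and the unit-energy QPSK alphabet are all assumptions, not part of the original text.

```python
import numpy as np

def lmmse_direct(H, y, sigma2, sigma_x2):
    """Solve (H^H H + (sigma2/sigma_x2) I) x = H^H y without forming the inverse."""
    n = H.shape[1]
    A = H.conj().T @ H + (sigma2 / sigma_x2) * np.eye(n)
    return np.linalg.solve(A, H.conj().T @ y)

def quantize(x_soft, constellation):
    """Map every soft estimate to the nearest constellation point."""
    dist = np.abs(x_soft[:, None] - constellation[None, :])
    return constellation[np.argmin(dist, axis=1)]

# Toy setup (assumed): a well-conditioned random matrix in place of H_DD.
rng = np.random.default_rng(0)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
H = np.eye(8) + 0.05 * (rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8)))
x = rng.choice(qpsk, size=8)

sigma2, sigma_x2 = 1e-4, 1.0
w = np.sqrt(sigma2 / 2) * (rng.standard_normal(8) + 1j * rng.standard_normal(8))
y = H @ x + w

x_soft = lmmse_direct(H, y, sigma2, sigma_x2)   # soft LMMSE estimate
x_hat = quantize(x_soft, qpsk)                  # hard decisions
```

The dense solve costs $O((MN)^3)$ in general, which is why it only serves as a correctness reference for small frames.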

Theorem: MMSE via the 2D DFT

The DD channel matrix $\mathbf{H}_{DD}$ is diagonalized by the 2D DFT $\mathbf{F} = \mathbf{F}_N \otimes \mathbf{F}_M$ : $\mathbf{H}_{DD} = \mathbf{F}^H \boldsymbol{\Lambda} \mathbf{F}$ with $\boldsymbol{\Lambda}$ a diagonal matrix of eigenvalues. The MMSE solution can be written as $\hat{\mathbf{x}}_{\text{LMMSE}} \;=\; \mathbf{F}^H \left(\frac{\boldsymbol{\Lambda}^H}{|\boldsymbol{\Lambda}|^2 + \sigma^2/\sigma_x^2}\right) \mathbf{F}\,\mathbf{y}_{DD},$ where the multiplication and division are element-wise. The total complexity is $O(MN\log(MN))$ : two 2D FFTs plus $MN$ element-wise operations.

The block-circulant structure of $\mathbf{H}_{DD}$ makes the 2D DFT its eigenbasis. In this eigenbasis, MMSE reduces to per-element Wiener filtering — a trivial operation. The $O(MN\log(MN))$ complexity comes entirely from the two FFTs at the beginning and end of the pipeline.

Proof

Diagonalize

$\mathbf{H}_{DD} = \mathbf{F}^H \boldsymbol{\Lambda} \mathbf{F}$ . Substituting into the MMSE formula: $\hat{\mathbf{x}} = (\mathbf{F}^H \boldsymbol{\Lambda}^H \boldsymbol{\Lambda} \mathbf{F} + (\sigma^2/\sigma_x^2)\mathbf{I})^{-1} \mathbf{F}^H \boldsymbol{\Lambda}^H \mathbf{F}\,\mathbf{y}$ .

Commute with the DFT

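Because $\mathbf{F}$ is unitary, $\mathbf{I} = \mathbf{F}^H \mathbf{F}$ , so the matrix to be inverted equals $\mathbf{F}^H (\boldsymbol{\Lambda}^H\boldsymbol{\Lambda} + (\sigma^2/\sigma_x^2)\mathbf{I}) \mathbf{F}$ and its inverse is $\mathbf{F}^H (\boldsymbol{\Lambda}^H\boldsymbol{\Lambda} + (\sigma^2/\sigma_x^2)\mathbf{I})^{-1} \mathbf{F}$ . Multiplying by $\mathbf{F}^H \boldsymbol{\Lambda}^H \mathbf{F}\,\mathbf{y}$ and using $\mathbf{F}\mathbf{F}^H = \mathbf{I}$ , the expression becomes $\mathbf{F}^H (\boldsymbol{\Lambda}^H\boldsymbol{\Lambda} + (\sigma^2/\sigma_x^2)\mathbf{I})^{-1} \boldsymbol{\Lambda}^H \mathbf{F}\,\mathbf{y}$ . The middle factor is a product of diagonal matrices, hence the element-wise form.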

Complexity

$\mathbf{F}\mathbf{y}$ : one 2D FFT, $O(MN\log(MN))$ . Per-element operation: $O(MN)$ . $\mathbf{F}^H\cdot$ : one 2D iFFT, $O(MN\log(MN))$ . Total: $O(MN\log(MN))$ . $\blacksquare$
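The proof can be checked numerically on a small frame. The sketch below assumes the DD channel acts as a 2D circular convolution with a hypothetical three-tap response `h` (tap positions and gains are made up for illustration). It builds the explicit $MN \times MN$ matrix with row-major vectorization, confirms that $\mathbf{F}\mathbf{H}_{DD}\mathbf{F}^H$ is diagonal with entries `fft2(h)`, and compares the dense LMMSE solve with the FFT form.

```python
import numpy as np

N, M = 4, 6                       # Doppler bins, delay bins (kept small on purpose)
rng = np.random.default_rng(1)

# Assumed DD channel response: three taps at hypothetical (Doppler, delay) indices.
h = np.zeros((N, M), dtype=complex)
h[0, 0], h[1, 2], h[3, 1] = 1.0, 0.4 + 0.2j, -0.3j

def apply_channel(x_grid):
    """2D circular convolution of x_grid with h, written with explicit shifts."""
    y = np.zeros_like(x_grid)
    for k, l in zip(*np.nonzero(h)):
        y += h[k, l] * np.roll(x_grid, shift=(k, l), axis=(0, 1))
    return y

# Explicit channel matrix: column k is the response to the k-th unit vector.
H = np.zeros((N * M, N * M), dtype=complex)
for k in range(N * M):
    e = np.zeros(N * M, dtype=complex)
    e[k] = 1.0
    H[:, k] = apply_channel(e.reshape(N, M)).ravel()

# Unitary 2D DFT matrix F = F_N kron F_M and the eigenvalues Lambda = fft2(h).
dft = lambda n: np.fft.fft(np.eye(n)) / np.sqrt(n)
F = np.kron(dft(N), dft(M))
lam = np.fft.fft2(h)
diag_err = np.max(np.abs(F @ H @ F.conj().T - np.diag(lam.ravel())))

# Compare the dense LMMSE solve with the FFT-based form.
sigma2, sigma_x2 = 0.1, 1.0
rho = sigma2 / sigma_x2
x = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
w = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M)))
y = apply_channel(x) + w

x_direct = np.linalg.solve(H.conj().T @ H + rho * np.eye(N * M), H.conj().T @ y.ravel())
x_fft = np.fft.ifft2(np.conj(lam) * np.fft.fft2(y) / (np.abs(lam) ** 2 + rho))
max_diff = np.max(np.abs(x_fft.ravel() - x_direct))
```

Both `diag_err` and `max_diff` should sit at floating-point rounding level. `np.fft.fft2` is unnormalized, but the scaling cancels between the forward and inverse transforms, so the Wiener weights are unchanged.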

Key Takeaway

MMSE in OTFS is essentially free. The block-circulant structure of the DD channel matrix reduces MMSE to two 2D FFTs plus an element-wise Wiener filter, for a total of $O(MN\log(MN))$ . On a modern GPU, this is sub-millisecond for $MN = 10^4$ . The detector is a three-stage pipeline (2D FFT, element-wise MMSE weighting, inverse 2D FFT), identical in structure to a per-subcarrier equalizer in OFDM.

DD-Domain MMSE Detection via 2D FFT

Complexity: $O(MN\log(MN))$

Input: Received $\mathbf{y}_{DD}$ (length $MN$ ), channel eigenvalues $\boldsymbol{\Lambda}$ (length $MN$ , from channel estimate), noise variance $\sigma^2$ , symbol power $\sigma_x^2$

Output: Detected symbols $\hat{\mathbf{x}}$

1. Compute $\tilde{\mathbf{y}} = \mathbf{F}\,\mathbf{y}_{DD}$ . (2D FFT, $O(MN\log(MN))$ )

2. Compute element-wise Wiener: $\tilde{\mathbf{x}}_i = \boldsymbol{\Lambda}_i^* \cdot \tilde{\mathbf{y}}_i / (|\boldsymbol{\Lambda}_i|^2 + \sigma^2/\sigma_x^2)$ . ( $O(MN)$ )

3. Compute $\hat{\mathbf{x}}_{\text{LMMSE}} = \mathbf{F}^H\,\tilde{\mathbf{x}}$ . (2D IFFT, $O(MN\log(MN))$ )

4. Quantize: $\hat{x}_i = \arg\min_{s \in \mathcal{X}} |\hat{x}_{\text{LMMSE}, i} - s|$ .

5. Return $\hat{\mathbf{x}}$ .

In production code, the 2D DFT is computed via np.fft.fft2 (or equivalent). The element-wise multiply uses precomputed channel eigenvalues from the estimator. Quantization is a per-cell minimum-distance search over $|\mathcal{X}|$ symbols, costing $O(MN \cdot |\mathcal{X}|)$ operations: for example, about $1.6 \times 10^5$ distance evaluations for $MN = 10^4$ and 16-QAM.
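The five algorithm steps map almost line for line onto NumPy. The following is a minimal sketch, not a production implementation: the function name `lmmse_detect_fft`, the three-tap test channel, and the choice of unit-energy QPSK are assumptions made for illustration, and the channel is modeled as an ideal 2D circular convolution.

```python
import numpy as np

def lmmse_detect_fft(y_dd, lam, sigma2, sigma_x2, constellation):
    """FFT-domain LMMSE detection on an N x M delay-Doppler grid.

    y_dd: received grid, lam: channel eigenvalues on the same grid.
    Returns (hard decisions, soft LMMSE estimate).
    """
    y_tilde = np.fft.fft2(y_dd)                                   # step 1: 2D FFT
    x_tilde = np.conj(lam) * y_tilde / (np.abs(lam) ** 2 + sigma2 / sigma_x2)  # step 2
    x_soft = np.fft.ifft2(x_tilde)                                # step 3: 2D IFFT
    dist = np.abs(x_soft[..., None] - constellation)              # step 4: distances
    return constellation[np.argmin(dist, axis=-1)], x_soft        # step 5

# Assumed test scenario: N = 16 Doppler bins, M = 32 delay bins, QPSK symbols.
N, M = 16, 32
rng = np.random.default_rng(7)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

h = np.zeros((N, M), dtype=complex)          # hypothetical three-tap DD response
h[0, 0], h[1, 2], h[2, 1] = 1.0, 0.3, 0.2j
lam = np.fft.fft2(h)                          # eigenvalues of the circulant channel

x = rng.choice(qpsk, size=(N, M))
sigma2, sigma_x2 = 1e-3, 1.0
w = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M)))
y_dd = np.fft.ifft2(lam * np.fft.fft2(x)) + w   # circular-convolution channel plus noise

x_hat, x_soft = lmmse_detect_fft(y_dd, lam, sigma2, sigma_x2, qpsk)
symbol_errors = int(np.sum(x_hat != x))
```

Broadcasting against the constellation keeps the quantizer vectorized at the cost of an $MN \times |\mathcal{X}|$ temporary array; for large alphabets a per-axis slicer would avoid it.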

Example: Per-Symbol SNR After MMSE

Consider an OTFS frame whose MMSE detector output has post-detection per-symbol SINR $\mathrm{SINR}_i = |\lambda_i|^2 \sigma_x^2 / (\sigma^2 + \text{residual ISI})$ , where $\lambda_i$ is the channel eigenvalue at cell $i$ . For a channel with $P = 4$ paths and path powers $(|h_1|^2, |h_2|^2, |h_3|^2, |h_4|^2) = (0.5, 0.3, 0.1, 0.1)$ , compute the MMSE SINR at a representative DD cell.

Solution

Eigenvalue magnitude

At DD cell $(n, m)$ , the eigenvalue is $\lambda_{n, m} = \sum_i h_i\,e^{j\phi_{i, n, m}}$ . This is a sum of $P = 4$ complex numbers with magnitudes $|h_i|$ and random phases.

RMS magnitude

Assuming the phases are independent and uniformly distributed, the cross terms average to zero, so $\mathbb{E}|\lambda_{n, m}|^2 = \sum_i |h_i|^2 = 0.5 + 0.3 + 0.1 + 0.1 = 1$ (total channel power normalized to 1).

Per-cell SINR

At $\sigma_x^2 = 1, \sigma^2 = 10^{-1}$ (10 dB SNR), $\mathrm{SINR}_{n, m} = |\lambda_{n, m}|^2/(0.1 + \text{ISI})$ . Mean $\mathbb{E}[\mathrm{SINR}] \approx 10$ (10 dB — no loss at moderate SNR). In deep-fade cells ( $|\lambda| \ll 1$ ), SINR can drop by tens of dB, which is where MP detection outperforms MMSE.
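The numbers in this example can be reproduced by taking the 2D FFT of a four-tap DD response. In the sketch below the tap positions and the random phases are hypothetical, the grid size ($N = 16$, $M = 32$) is borrowed from the interactive demo, and the residual ISI term is neglected. By Parseval's relation the average of $|\lambda_{n,m}|^2$ over all cells equals the total path power, so the mean SINR comes out at 10 dB regardless of where the taps sit.

```python
import numpy as np

N, M = 16, 32                                   # Doppler bins, delay bins
path_power = np.array([0.5, 0.3, 0.1, 0.1])     # |h_i|^2 from the example
rng = np.random.default_rng(2)

# Hypothetical (Doppler, delay) tap positions and uniformly random phases.
taps = [(0, 0), (1, 3), (15, 5), (2, 9)]
phases = rng.uniform(0.0, 2.0 * np.pi, size=len(taps))

h = np.zeros((N, M), dtype=complex)
for (k, l), p, ph in zip(taps, path_power, phases):
    h[k, l] = np.sqrt(p) * np.exp(1j * ph)

lam = np.fft.fft2(h)                            # one eigenvalue per DD cell

sigma_x2, sigma2 = 1.0, 0.1                     # 10 dB SNR
sinr = np.abs(lam) ** 2 * sigma_x2 / sigma2     # per-cell SINR, ISI neglected

mean_gain = np.mean(np.abs(lam) ** 2)           # equals sum of path powers
mean_sinr_db = 10 * np.log10(np.mean(sinr))
worst_sinr_db = 10 * np.log10(np.min(sinr))

print(f"mean |lambda|^2 = {mean_gain:.3f}")
print(f"mean SINR       = {mean_sinr_db:.2f} dB")
print(f"worst-cell SINR = {worst_sinr_db:.2f} dB")
```

The worst-cell value depends on the chosen tap positions and phases, which is exactly the cell-to-cell spread the example describes.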

Per-Cell SINR Distribution After MMSE

Plot the distribution (CDF) of post-MMSE SINR across DD cells for a $P$ -path channel at varying SNR. The distribution has a heavy left tail: some cells experience deep fades where MMSE cannot recover the symbol. Overlay the ML per-cell SINR (asymptotic) to see the MMSE-vs-ML gap.

Parameters

Frame SNR (dB): 15; paths $P$ : 4; delay bins $M$ : 32; Doppler bins $N$ : 16
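For readers without access to the interactive plot, the empirical CDF can be approximated offline. This is a rough sketch under stated assumptions: path gains are complex Gaussian and normalized to unit total power, taps fall on distinct integer DD cells chosen at random, the per-cell SINR is taken as $\mathrm{SNR}\cdot|\lambda_{n,m}|^2$ with residual ISI neglected, and the ML overlay is omitted. The helper names `per_cell_sinr_db` and `empirical_cdf` are illustrative.

```python
import numpy as np

def per_cell_sinr_db(snr_db, P, M, N, rng):
    """Per-cell SINR in dB for one random P-path channel realization."""
    idx = rng.choice(N * M, size=P, replace=False)        # distinct tap cells
    g = rng.standard_normal(P) + 1j * rng.standard_normal(P)
    g /= np.linalg.norm(g)                                # unit total path power
    h = np.zeros(N * M, dtype=complex)
    h[idx] = g
    lam = np.fft.fft2(h.reshape(N, M))                    # eigenvalue per cell
    return snr_db + 10 * np.log10(np.abs(lam).ravel() ** 2)

def empirical_cdf(values):
    """Return sorted values and the matching CDF levels in (0, 1]."""
    v = np.sort(values)
    return v, np.arange(1, v.size + 1) / v.size

rng = np.random.default_rng(5)
snr_db, P, M, N = 15.0, 4, 32, 16                         # demo parameters
sinr_db = per_cell_sinr_db(snr_db, P, M, N, rng)
levels, cdf = empirical_cdf(sinr_db)

# Fraction of cells more than 10 dB below the frame SNR (the left tail).
deep_fade_fraction = float(np.mean(sinr_db < snr_db - 10.0))
print(f"cells more than 10 dB below frame SNR: {deep_fade_fraction:.1%}")
```

The pair `(levels, cdf)` can be handed to any plotting library as a step plot; averaging over many channel realizations smooths the curve.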

Theorem: Uncoded MMSE BER for QPSK in OTFS

For uncoded QPSK over a $P$ -path DD channel, the MMSE uncoded BER satisfies $\mathrm{BER}_{\text{MMSE}} \;\approx\; \mathbb{E}\!\left[Q\!\left(\sqrt{\frac{2|\lambda_{n,m}|^2 \sigma_x^2}{\sigma^2 + \epsilon}}\right)\right],$ where $\epsilon$ is a residual ISI term from imperfect inversion, the expectation is over channel realizations, and $\lambda_{n, m}$ is the eigenvalue of $\mathbf{H}_{DD}$ at cell $(n, m)$ , i.e. the 2D DFT of the DD channel response evaluated at that cell. In the high-SNR regime, $\mathrm{BER}_{\text{MMSE}} \sim 1/\mathrm{SNR}$ , i.e. diversity order 1, the same as OFDM.

Linear MMSE does not exploit the DD diversity of the channel. Each DD cell's BER is dominated by the worst realization of $\lambda_{n, m}$ — if it is small (deep fade), the cell is unrecoverable regardless of the other paths. The OTFS BER advantage over OFDM comes from nonlinear detectors (MP, LCD) or from channel coding that averages across cells.

This is a negative result: MMSE alone does not deliver the full OTFS promise. But MMSE as an initialization for iterative decoding (Section 5) does achieve the full diversity.

Proof

Per-cell MMSE

At cell $(n, m)$ , post-MMSE signal has SINR $\mathrm{SINR}_{n, m} = |\lambda_{n, m}|^2/(\sigma^2/\sigma_x^2)$ (neglecting residual ISI for simplicity).

QPSK BER

Per-cell BER: $Q(\sqrt{2\,\mathrm{SINR}_{n, m}})$ . Frame BER: average over cells.

Asymptotic

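For Rayleigh-distributed $|\lambda_{n, m}|$ (which holds when $\lambda_{n, m}$ is a sum of complex Gaussian path gains), $|\lambda_{n, m}|^2$ is exponentially distributed, so the probability of a deep fade shrinks only linearly with the fade depth. Averaging $Q(\sqrt{2\,\mathrm{SINR}_{n, m}})$ over this distribution gives $\mathrm{BER} \sim 1/\mathrm{SNR}$ , i.e. diversity 1. The spread of $|\lambda_{n, m}|^2$ across cells does not help a per-cell detector. $\blacksquare$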
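The asymptotic step can be illustrated numerically. The sketch below assumes $|\lambda_{n,m}|^2$ is exponentially distributed with unit mean and ignores the residual ISI term. Under that assumption the average of $Q(\sqrt{2\,\mathrm{SNR}\,|\lambda|^2})$ has the standard Rayleigh-fading closed form $\tfrac{1}{2}\bigl(1 - \sqrt{\mathrm{SNR}/(1+\mathrm{SNR})}\bigr)$, which behaves like $1/(4\,\mathrm{SNR})$ at high SNR. The function names are illustrative.

```python
import math
import numpy as np

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber_rayleigh_closed_form(snr):
    """E[Q(sqrt(2*snr*g))] for g ~ Exp(1): the standard Rayleigh-fading result."""
    return 0.5 * (1.0 - math.sqrt(snr / (1.0 + snr)))

def ber_rayleigh_monte_carlo(snr, n_cells=200_000, seed=3):
    """Average the per-cell BER over randomly drawn exponential cell gains."""
    rng = np.random.default_rng(seed)
    gain = rng.exponential(1.0, size=n_cells)            # samples of |lambda|^2
    return float(np.mean([q_func(math.sqrt(2.0 * snr * g)) for g in gain]))

# Closed form next to the high-SNR asymptote 1/(4*SNR).
for snr_db in (10, 20, 30):
    snr = 10 ** (snr_db / 10)
    print(snr_db, ber_rayleigh_closed_form(snr), 1.0 / (4.0 * snr))

# A 10 dB increase in SNR lowers the BER by roughly a factor of 10 (slope 1).
slope_ratio = ber_rayleigh_closed_form(1000.0) / ber_rayleigh_closed_form(100.0)
```

A detector with diversity order $P$ would instead show a BER drop of roughly $10^{-P}$ per 10 dB, which is the gap the nonlinear detectors aim to close.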


Key Takeaway

MMSE has diversity 1, not $P$ . Linear detection cannot exploit the DD-domain multipath diversity. Each DD cell experiences an effective single-tap channel $\lambda_{n, m}$ , and deep fades on this single tap cause uncorrectable errors. The full diversity promised by Chapter 9 is accessible only through nonlinear detection (MP) or channel coding applied across cells.

🔧Engineering Note

When MMSE Suffices

Despite having only diversity 1, MMSE is the correct choice when:

• Channel coding provides diversity: LDPC or Turbo codes with a sufficiently low code rate average across DD cells and deliver the full diversity. MMSE as the inner detector is adequate.
• High SNR operation: at $\mathrm{SNR} > 20$ dB, both MMSE and MP have negligible BER ( $< 10^{-6}$ ). The complexity difference dominates the choice, favoring MMSE.
• URLLC applications: latency constraints prohibit iterative detection. MMSE's $O(MN\log(MN))$ single-pass complexity is decisive for $< 1$ ms latency.

MP (Section 3) is preferred when:

• Uncoded operation: MP's diversity advantage shows up in uncoded BER at moderate SNR.
• Low SNR ( $< 10$ dB): MP gains 3–5 dB over MMSE.
• Rich multipath (large $P$ ): more paths give MP more diversity to exploit, while linear MMSE benefits less.

Practical Constraints

• MMSE + outer code: most deployments
• MP: low SNR, uncoded, research/dense multipath scenarios
• Sphere decoding: academic / small frame sizes only
