Linear MMSE in the DD Domain

The Linear Detector That Scales

The linear MMSE detector is the Swiss army knife of OTFS detection: simple, $O(MN \log(MN))$ complexity, handles any $P$-path channel, and, crucially, is diagonalized by the 2D DFT thanks to the block-circulant structure of the DD channel matrix. This makes MMSE practically free to compute, and it is the baseline against which more sophisticated schemes (MP, LCD) are compared.

The point is that MMSE is optimal among linear detectors (it minimizes MSE). When the MP gain over MMSE is small (which happens for low-to-moderate path counts and QPSK/16-QAM), MMSE is the detector of choice for its simplicity. When the gap is large (high-order QAM, $P > 10$), MP is invoked.

Definition:

DD-Domain LMMSE Detector

Given $\mathbf{y}_{DD} = \mathbf{H}_{DD}\,\mathbf{x}_{DD} + \mathbf{w}_{DD}$ with $\mathbf{x}_{DD} \sim \mathcal{CN}(0, \sigma_x^2 \mathbf{I})$ and $\mathbf{w}_{DD} \sim \mathcal{CN}(0, \sigma^2\mathbf{I})$, the DD-domain LMMSE detector is

$$\hat{\mathbf{x}}_{\text{LMMSE}} \;=\; \left(\mathbf{H}_{DD}^{H}\,\mathbf{H}_{DD} + \frac{\sigma^2}{\sigma_x^2}\,\mathbf{I}\right)^{-1}\,\mathbf{H}_{DD}^{H}\,\mathbf{y}_{DD}.$$

The final step is quantization to the nearest QAM symbol: $\hat{x}_i = Q(\hat{x}_{\text{LMMSE}, i})$ for $i = 1, \ldots, MN$.

Under the Gaussian input assumption (which QAM symbols do not satisfy exactly), this linear estimator is also the overall MMSE estimator; for QAM input, it remains the best linear estimator and is a good approximation in the high-SNR regime.

Theorem: MMSE via the 2D DFT

The DD channel matrix $\mathbf{H}_{DD}$ is diagonalized by the 2D DFT $\mathbf{F} = \mathbf{F}_N \otimes \mathbf{F}_M$:

$$\mathbf{H}_{DD} = \mathbf{F}^H \boldsymbol{\Lambda} \mathbf{F},$$

with $\boldsymbol{\Lambda}$ a diagonal matrix of eigenvalues. The MMSE solution can be written as

$$\hat{\mathbf{x}}_{\text{LMMSE}} \;=\; \mathbf{F}^H \left(\frac{\boldsymbol{\Lambda}^H}{|\boldsymbol{\Lambda}|^2 + \sigma^2/\sigma_x^2}\right) \mathbf{F}\,\mathbf{y}_{DD},$$

where the multiplication and division are element-wise. The total complexity is $O(MN\log(MN))$: two 2D FFTs plus $MN$ element-wise operations.

The block-circulant structure of $\mathbf{H}_{DD}$ makes the 2D DFT its eigenbasis. In this eigenbasis, MMSE reduces to per-element Wiener filtering, a trivial operation. The $O(MN\log(MN))$ complexity comes entirely from the two FFTs at the beginning and end of the pipeline.
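The equivalence between the matrix-inverse form and the 2D-FFT form can be checked numerically. The sketch below is an illustration under stated assumptions, not a reference implementation: it assumes an idealized DD channel that acts as a pure 2D circular convolution, a hypothetical 3-path channel (the `paths` list), QPSK symbols, and a row-major vectorization in which cell $(n, m)$ maps to index $nM + m$, so that $\mathbf{F}_N \otimes \mathbf{F}_M$ corresponds to `np.fft.fft2` on an $N \times M$ array.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 8                      # Doppler bins, delay bins (small so the dense solve is cheap)
sigma2, sigma_x2 = 0.1, 1.0      # noise variance, symbol power (illustrative values)
c = sigma2 / sigma_x2            # MMSE regularization term

# Hypothetical 3-path channel: (Doppler index, delay index, complex gain).
paths = [(0, 0, 0.8 + 0.1j), (1, 2, 0.4 - 0.3j), (3, 5, -0.2 + 0.2j)]
g = np.zeros((N, M), dtype=complex)          # 2D channel kernel on the DD grid
for k_i, l_i, h_i in paths:
    g[k_i, l_i] = h_i

# Dense block-circulant H: column (k, l) is the kernel circularly shifted by (k, l).
H = np.empty((N * M, N * M), dtype=complex)
for k in range(N):
    for l in range(M):
        H[:, k * M + l] = np.roll(g, shift=(k, l), axis=(0, 1)).ravel()

# Random QPSK frame and noisy observation.
x = (rng.choice([-1, 1], size=N * M) + 1j * rng.choice([-1, 1], size=N * M)) / np.sqrt(2)
w = np.sqrt(sigma2 / 2) * (rng.standard_normal(N * M) + 1j * rng.standard_normal(N * M))
y = H @ x + w

# Direct LMMSE: (H^H H + c I)^{-1} H^H y, cubic cost in MN.
x_direct = np.linalg.solve(H.conj().T @ H + c * np.eye(N * M), H.conj().T @ y)

# FFT-based LMMSE: eigenvalues are the (unnormalized) 2D DFT of the kernel.
lam = np.fft.fft2(g)
Y = np.fft.fft2(y.reshape(N, M))
x_fft = np.fft.ifft2(np.conj(lam) * Y / (np.abs(lam) ** 2 + c)).ravel()

max_diff = np.max(np.abs(x_direct - x_fft))
print("max |direct - FFT| =", max_diff)
```

Under these assumptions the two estimates should agree to floating-point precision. The normalization factors of the unitary DFT cancel between `fft2` and `ifft2`, which is why the unnormalized NumPy transforms can be used directly.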

Key Takeaway

MMSE in OTFS is essentially free. The block-circulant structure of the DD channel matrix reduces MMSE to two 2D FFTs plus an element-wise Wiener filter, for a total of $O(MN\log(MN))$. On a modern GPU, this is sub-millisecond for $MN = 10^4$. The detector is a "2D FFT, element-wise MMSE, inverse 2D FFT" pipeline, identical in structure to a per-subcarrier equalizer in OFDM.

DD-Domain MMSE Detection via 2D FFT

Complexity: $O(MN\log(MN))$
Input: Received $\mathbf{y}_{DD}$ (length $MN$), channel eigenvalues $\boldsymbol{\Lambda}$ (length $MN$, from channel estimate), noise variance $\sigma^2$, symbol power $\sigma_x^2$
Output: Detected symbols $\hat{\mathbf{x}}$
1. Compute $\tilde{\mathbf{y}} = \mathbf{F}\,\mathbf{y}_{DD}$. (2D FFT, $O(MN\log(MN))$)
2. Compute the element-wise Wiener filter: $\tilde{x}_i = \Lambda_i^* \cdot \tilde{y}_i / (|\Lambda_i|^2 + \sigma^2/\sigma_x^2)$. ($O(MN)$)
3. Compute $\hat{\mathbf{x}}_{\text{LMMSE}} = \mathbf{F}^H\,\tilde{\mathbf{x}}$. (2D IFFT, $O(MN\log(MN))$)
4. Quantize: $\hat{x}_i = \arg\min_{s \in \mathcal{X}} |\hat{x}_{\text{LMMSE}, i} - s|$.
5. Return $\hat{\mathbf{x}}$.

In production code, the 2D DFT is computed via np.fft.fft2 (or equivalent). The element-wise multiply uses precomputed channel eigenvalues from the estimator. Quantization is a per-cell minimum-distance search over $|\mathcal{X}|$ symbols, costing $O(MN \cdot |\mathcal{X}|)$, typically $\sim 10^5$ operations.
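The algorithm steps can be sketched in a few lines of NumPy. This is a minimal illustration rather than production code: the function name `mmse_detect_dd`, the $N \times M$ array layout, the QPSK alphabet, and the two-tap test channel are all assumptions made for the example. The eigenvalues `lam` are assumed to be supplied as the unnormalized 2D DFT of the DD channel kernel, which matches the scaling of `np.fft.fft2`.

```python
import numpy as np

def mmse_detect_dd(y_dd, lam, sigma2, sigma_x2, constellation):
    """FFT-based LMMSE detection on an (N, M) delay-Doppler grid.

    y_dd          : received DD-domain samples, shape (N, M)
    lam           : channel eigenvalues, shape (N, M) (unnormalized 2D DFT of the kernel)
    constellation : 1D array of candidate symbols
    Returns (hard decisions, soft LMMSE estimates), both of shape (N, M).
    """
    y_t = np.fft.fft2(y_dd)                                          # step 1: 2D FFT
    x_t = np.conj(lam) * y_t / (np.abs(lam) ** 2 + sigma2 / sigma_x2)  # step 2: Wiener filter
    x_soft = np.fft.ifft2(x_t)                                       # step 3: 2D IFFT
    # Step 4: per-cell minimum-distance search over the constellation.
    idx = np.argmin(np.abs(x_soft[..., None] - constellation), axis=-1)
    return constellation[idx], x_soft

# Usage on a hypothetical, mildly dispersive two-tap channel at high SNR.
rng = np.random.default_rng(0)
N, M = 16, 32
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
x = qpsk[rng.integers(0, 4, size=(N, M))]

g = np.zeros((N, M), dtype=complex)
g[0, 0], g[1, 2] = 1.0, 0.3                  # dominant tap plus one weaker echo
lam = np.fft.fft2(g)

sigma2, sigma_x2 = 1e-4, 1.0
w = np.sqrt(sigma2 / 2) * (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M)))
y = np.fft.ifft2(lam * np.fft.fft2(x)) + w   # 2D circular convolution plus noise

x_hat, x_soft = mmse_detect_dd(y, lam, sigma2, sigma_x2, qpsk)
num_errors = int(np.sum(x_hat != x))
print("symbol errors:", num_errors)
```

The quantizer broadcasts each soft estimate against the whole alphabet, which is the $O(MN \cdot |\mathcal{X}|)$ search described above. For large constellations, a slicer that exploits the square-QAM grid would avoid the extra array dimension.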

Example: Per-Symbol SNR After MMSE

Consider an OTFS frame whose MMSE detector output has post-detection per-symbol SINR $\mathrm{SINR}_i = |\lambda_i|^2 \sigma_x^2 / (\sigma^2 + \text{residual ISI})$. For a channel with $P = 4$ paths and path powers $|h_i|^2 \in \{0.5, 0.3, 0.1, 0.1\}$, compute the MMSE SINR at a representative DD cell.
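One way to approach this example numerically is sketched below. Several quantities are not given in the text and are assumed purely for illustration: the grid size ($N = 16$, $M = 32$), an SNR of 15 dB, the path locations in `path_cells`, zero phase on every path, and a negligible residual ISI term. With those assumptions, $\lambda_i$ is the unnormalized 2D DFT of the channel kernel, and Parseval's relation gives an average $|\lambda_i|^2$ equal to the total path power $0.5 + 0.3 + 0.1 + 0.1 = 1$.

```python
import numpy as np

# Illustrative assumptions (not specified in the example statement).
N, M = 16, 32                                    # Doppler bins, delay bins
snr_db = 15.0                                    # sigma_x^2 / sigma^2 in dB
sigma_x2 = 1.0
sigma2 = sigma_x2 / 10 ** (snr_db / 10)

path_powers = np.array([0.5, 0.3, 0.1, 0.1])     # |h_i|^2 from the example
path_cells = [(0, 0), (1, 3), (2, 7), (15, 12)]  # hypothetical (Doppler, delay) indices

# Channel kernel on the DD grid; zero phase per path is an assumption.
g = np.zeros((N, M), dtype=complex)
for (k_i, l_i), p_i in zip(path_cells, path_powers):
    g[k_i, l_i] = np.sqrt(p_i)

lam = np.fft.fft2(g)                             # per-cell eigenvalues lambda_i

# SINR formula from the example with the residual-ISI term set to zero.
sinr_db = 10 * np.log10(np.abs(lam) ** 2 * sigma_x2 / sigma2)

mean_gain = np.mean(np.abs(lam) ** 2)            # Parseval: equals sum of path powers
print("mean |lambda|^2 :", mean_gain)
print("SINR at cell (0, 0) [dB]:", sinr_db[0, 0])
print("min / max SINR [dB]:", sinr_db.min(), sinr_db.max())
```

A "representative" cell therefore sits near the nominal SNR on average, while individual cells can lie well above or below it depending on how the four paths add in phase at that cell.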

Per-Cell SINR Distribution After MMSE

Plot the distribution (CDF) of post-MMSE SINR across DD cells for a $P$-path channel at varying SNR. The distribution has a heavy left tail: some cells experience deep fades where MMSE cannot recover the symbol. Overlay the ML per-cell SINR (asymptotic) to see the MMSE-vs-ML gap.
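A rough way to reproduce the shape of this distribution is sketched below. It is an illustration under assumptions that the text does not state: i.i.d. Rayleigh path gains with equal average power $1/P$, fixed hypothetical path locations, a $16 \times 32$ grid, 15 dB SNR, and the per-cell SINR taken as $|\lambda_{n,m}|^2 \sigma_x^2/\sigma^2$ with residual ISI neglected. The ML overlay is not included.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, P = 16, 32, 4                  # grid size and path count (illustrative)
snr_db = 15.0
snr = 10 ** (snr_db / 10)            # sigma_x^2 / sigma^2
num_channels = 2000                  # independent channel realizations

# Hypothetical fixed path locations; gains are i.i.d. CN(0, 1/P) (Rayleigh assumption).
path_cells = [(0, 0), (1, 3), (2, 7), (15, 12)]
h = (rng.standard_normal((num_channels, P))
     + 1j * rng.standard_normal((num_channels, P))) / np.sqrt(2 * P)

g = np.zeros((num_channels, N, M), dtype=complex)
for i, (k_i, l_i) in enumerate(path_cells):
    g[:, k_i, l_i] = h[:, i]

lam = np.fft.fft2(g)                 # 2D FFT over the last two axes, one per realization
sinr_db = 10 * np.log10(np.abs(lam) ** 2 * snr).ravel()

# Empirical CDF: sorted samples on the x-axis, cumulative fraction on the y-axis.
sinr_sorted = np.sort(sinr_db)
cdf = np.arange(1, sinr_sorted.size + 1) / sinr_sorted.size

frac_below_0db = np.mean(sinr_db < 0.0)
print("fraction of cells below 0 dB:", frac_below_0db)
print("1% / 50% quantiles [dB]:", np.quantile(sinr_db, [0.01, 0.5]))
```

Plotting `cdf` against `sinr_sorted` on a logarithmic y-axis makes the left tail visible. Under the Rayleigh assumption each $|\lambda_{n,m}|^2$ is exponentially distributed, so the tail falls off only linearly in the SINR threshold, which is the deep-fade behaviour described above.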


Theorem: Uncoded MMSE BER for QPSK in OTFS

For uncoded QPSK over a $P$-path DD channel, the MMSE uncoded BER satisfies

$$\mathrm{BER}_{\text{MMSE}} \;\approx\; \mathbb{E}\!\left[Q\!\left(\sqrt{\frac{2|\lambda_{n,m}|^2 \sigma_x^2}{\sigma^2 + \epsilon}}\right)\right],$$

where $Q(\cdot)$ is the Gaussian Q-function (not the quantizer), $\epsilon$ is a residual ISI term from imperfect inversion, the expectation is over channel realizations, and $\lambda_{n, m}$ is the 2D DFT of the channel matrix at cell $(n, m)$. In the high-SNR regime, $\mathrm{BER}_{\text{MMSE}} \sim 1/\mathrm{SNR}$: diversity order 1, same as OFDM.

Linear MMSE does not exploit the DD diversity of the channel. Each DD cell's BER is dominated by the worst realization of $\lambda_{n, m}$: if it is small (deep fade), the cell is unrecoverable regardless of the other paths. The OTFS BER advantage over OFDM comes from nonlinear detectors (MP, LCD) or from channel coding that averages across cells.
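The $1/\mathrm{SNR}$ slope can be illustrated by evaluating the BER expression numerically. The sketch below assumes $\lambda_{n,m} \sim \mathcal{CN}(0, 1)$ (Rayleigh fading of the effective single tap) and $\epsilon = 0$; neither is stated in the theorem, and the function names are hypothetical. Under the Rayleigh assumption the expectation has the standard closed form $\tfrac{1}{2}\bigl(1 - \sqrt{\gamma/(1+\gamma)}\bigr)$ with $\gamma = \sigma_x^2/\sigma^2$, which serves as a cross-check for the Monte Carlo average.

```python
import math
import numpy as np

def q_func(x):
    """Gaussian Q-function, Q(x) = 0.5 * erfc(x / sqrt(2)), applied element-wise."""
    return 0.5 * np.vectorize(math.erfc)(x / math.sqrt(2.0))

def ber_formula_mc(snr_db, num_draws=200_000, seed=0):
    """Monte Carlo average of Q(sqrt(2 |lambda|^2 SNR)) with lambda ~ CN(0, 1), eps = 0."""
    rng = np.random.default_rng(seed)
    snr = 10 ** (snr_db / 10)
    lam = (rng.standard_normal(num_draws) + 1j * rng.standard_normal(num_draws)) / np.sqrt(2)
    return float(np.mean(q_func(np.sqrt(2 * np.abs(lam) ** 2 * snr))))

def ber_rayleigh_closed_form(snr_db):
    """Closed form of the same expectation under the Rayleigh assumption."""
    snr = 10 ** (snr_db / 10)
    return 0.5 * (1 - math.sqrt(snr / (1 + snr)))

ber_20 = ber_rayleigh_closed_form(20.0)
ber_30 = ber_rayleigh_closed_form(30.0)
slope_ratio = ber_20 / ber_30        # close to 10 for diversity order 1
mc_20 = ber_formula_mc(20.0)

print("closed form at 20 dB:", ber_20)
print("Monte Carlo at 20 dB:", mc_20)
print("BER(20 dB) / BER(30 dB):", slope_ratio)
```

A tenfold increase in SNR reduces the BER by a factor close to ten, which is the signature of diversity order 1. A detector achieving diversity order $P$ would instead show a reduction on the order of $10^P$ per decade of SNR at high SNR.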

This is a negative result: MMSE alone does not deliver the full OTFS promise. But MMSE as an initialization for iterative decoding (Section 5) does achieve the full diversity.

Key Takeaway

MMSE has diversity 1, not $P$. Linear detection cannot exploit the DD-domain multipath diversity. Each DD cell experiences an effective single-tap channel $\lambda_{n, m}$, and deep fades on this single tap cause uncorrectable errors. The full diversity promised by Chapter 9 is accessible only through nonlinear detection (MP) or channel coding applied across cells.

Engineering Note

When MMSE Suffices

Despite having only diversity 1, MMSE is the correct choice when:

  • Channel coding provides diversity: LDPC or Turbo codes with sufficient code rate average across DD cells and deliver the full diversity. MMSE as inner detector is adequate.
  • High SNR operation: at $\mathrm{SNR} > 20$ dB, both MMSE and MP have negligible BER ($< 10^{-6}$). The complexity difference dominates the choice, favoring MMSE.
  • URLLC applications: latency constraints prohibit iterative detection. MMSE's $O(MN\log(MN))$ single-pass complexity is decisive for $< 1$ ms latency.

MP (Section 3) is preferred when:

  • Uncoded operation: MP's diversity advantage shows up in uncoded BER at moderate SNR.
  • Low SNR ($< 10$ dB): MP gains 3–5 dB over MMSE.
  • Rich multipath (large $P$): more paths give MP more diversity to exploit, while linear MMSE becomes less effective.

Practical Constraints

  • MMSE + outer code: most deployments
  • MP: low SNR, uncoded, research/dense multipath scenarios
  • Sphere decoding: academic / small frame sizes only