Manifold Optimization on the Complex Torus

Respect the Geometry

SDR lifts the problem to a convex space; element-wise BCD cycles through coordinates. Manifold optimization takes a different view: it respects the native geometry of the feasible set. The unit-modulus torus is a smooth manifold $\mathcal{M} = (S^1)^N$, and standard tools from Riemannian geometry (gradients, retractions, geodesics) yield convergent, efficient algorithms tailored to this space. Manifold methods scale far better than SDR and reach better solutions than pure element-wise updates. For large $N$ ($\geq 256$), manifold optimization is the algorithm of choice.

Definition: The Complex Unit-Modulus Manifold

The complex unit-modulus manifold is

$$\mathcal{M} = \{\boldsymbol{\phi} \in \mathbb{C}^N : |\phi_n| = 1\ \forall n\}.$$

$\mathcal{M}$ is the $N$-fold Cartesian product of complex unit circles: $\mathcal{M} = S^1 \times S^1 \times \cdots \times S^1$. Each circle is a one-dimensional smooth manifold, so $\mathcal{M}$ is an $N$-dimensional smooth manifold.

The tangent space at a point $\boldsymbol{\phi} \in \mathcal{M}$ is

$$T_{\boldsymbol{\phi}} \mathcal{M} = \{\boldsymbol{\xi} \in \mathbb{C}^N : \Re(\phi_n^* \xi_n) = 0\ \forall n\}.$$

Each tangent vector is "perpendicular" to $\boldsymbol{\phi}$ in a pointwise sense: it advances along the circle at each coordinate without leaving the torus.

The condition $\Re(\phi_n^* \xi_n) = 0$ is the linearization of $|\phi_n|^2 = 1$: differentiating $\phi_n^* \phi_n = 1$ gives $\phi_n^* \xi_n + \xi_n^* \phi_n = 2\Re(\phi_n^* \xi_n) = 0$. The tangent space is thus the per-coordinate orthogonal complement of $\boldsymbol{\phi}$.


Definition: Riemannian Gradient and Retraction

Given a smooth $f: \mathbb{C}^N \to \mathbb{R}$ and its Euclidean gradient $\nabla f(\boldsymbol{\phi}) \in \mathbb{C}^N$, the Riemannian gradient at $\boldsymbol{\phi} \in \mathcal{M}$ is the projection of $\nabla f$ onto the tangent space:

$$\operatorname{grad}_{\mathcal{M}} f(\boldsymbol{\phi}) = P_{T_{\boldsymbol{\phi}}\mathcal{M}}(\nabla f) = \nabla f - \Re(\boldsymbol{\phi}^* \odot \nabla f) \odot \boldsymbol{\phi},$$

where $\odot$ is element-wise multiplication.

A retraction $R_{\boldsymbol{\phi}}(\boldsymbol{\xi})$ maps a tangent vector $\boldsymbol{\xi} \in T_{\boldsymbol{\phi}}\mathcal{M}$ back to a point on $\mathcal{M}$. A simple choice is per-coordinate normalization:

$$R_{\boldsymbol{\phi}}(\boldsymbol{\xi}) = \frac{\boldsymbol{\phi} + \boldsymbol{\xi}}{|\boldsymbol{\phi} + \boldsymbol{\xi}|}\quad \text{(element-wise)}.$$

More sophisticated retractions use the exponential map (exact geodesic), but the simple normalization is typically sufficient.
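As a quick sanity check, the projection and retraction are one-liners in NumPy (a sketch; the point `phi` and gradient `g` below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8

# A point on the torus and an arbitrary Euclidean gradient.
phi = np.exp(1j * rng.uniform(0, 2 * np.pi, N))       # |phi_n| = 1
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Tangent-space projection: xi = g - Re(phi^* . g) * phi (element-wise).
xi = g - np.real(np.conj(phi) * g) * phi
assert np.allclose(np.real(np.conj(phi) * xi), 0)     # tangency condition

# Retraction by per-coordinate normalization.
phi_new = (phi + xi) / np.abs(phi + xi)
assert np.allclose(np.abs(phi_new), 1)                # back on the torus
```

The two assertions verify exactly the defining properties above: the projected vector satisfies $\Re(\phi_n^* \xi_n) = 0$ at every coordinate, and the retracted point has unit modulus.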


Riemannian Gradient Ascent on $\mathcal{M}$

Complexity: $O(T \cdot N^2)$, with $T \sim 50$–$200$ iterations, each $O(N^2)$ for the gradient of the quadratic.
Input: initial point $\boldsymbol{\phi}^{(0)} \in \mathcal{M}$, initial step size $\alpha$, tolerance $\epsilon$.
Output: stationary point $\boldsymbol{\phi}^\star$.
1. For $t = 0, 1, 2, \ldots$:
2. \quad Compute the Euclidean gradient $\mathbf{g} = \nabla f(\boldsymbol{\phi}^{(t)})$.
3. \quad Project onto the tangent space: $\boldsymbol{\eta} = \mathbf{g} - \Re((\boldsymbol{\phi}^{(t)})^* \odot \mathbf{g}) \odot \boldsymbol{\phi}^{(t)}$.
4. \quad If $\|\boldsymbol{\eta}\| < \epsilon$: break.
5. \quad Backtrack: find $\alpha_t$ such that $f(R_{\boldsymbol{\phi}^{(t)}}(\alpha_t \boldsymbol{\eta})) > f(\boldsymbol{\phi}^{(t)})$ (ascent step).
6. \quad Retract: $\boldsymbol{\phi}^{(t+1)} = R_{\boldsymbol{\phi}^{(t)}}(\alpha_t \boldsymbol{\eta})$, i.e., element-wise normalize $\boldsymbol{\phi}^{(t)} + \alpha_t \boldsymbol{\eta}$ to unit modulus.
7. Return $\boldsymbol{\phi}^{(t)}$.

The per-iteration cost is dominated by the gradient $\nabla f$ of the quadratic, which is $O(N^2)$ for dense $\mathbf{A}$. Compared to SDR's $O(N^{6.5})$, the manifold method is roughly $N^{4.5}/T$ times faster, a decisive advantage at $N \geq 128$. Conjugate-gradient or L-BFGS variants on the manifold are used in practice for faster convergence (see the Manopt toolbox).
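The full ascent loop is short. The sketch below maximizes $f(\boldsymbol{\phi}) = \boldsymbol{\phi}^H \mathbf{A} \boldsymbol{\phi}$ for a Hermitian $\mathbf{A}$, taking the Euclidean gradient as $2\mathbf{A}\boldsymbol{\phi}$ (a convention chosen here) and using Armijo backtracking with an assumed constant $c = 10^{-4}$:

```python
import numpy as np

def riemannian_gradient_ascent(A, phi0, alpha0=1.0, tol=1e-8, max_iter=500):
    """Maximize f(phi) = phi^H A phi over the unit-modulus torus.

    A is Hermitian; the Euclidean gradient of f is taken as 2 A phi.
    Sketch with Armijo backtracking (c = 1e-4), not a tuned implementation.
    """
    phi = phi0.copy()
    f = lambda p: np.real(p.conj() @ A @ p)
    for _ in range(max_iter):
        g = 2 * A @ phi                                   # Euclidean gradient
        eta = g - np.real(np.conj(phi) * g) * phi         # tangent projection
        if np.linalg.norm(eta) < tol:
            break
        alpha = alpha0
        while alpha > 1e-12:
            cand = phi + alpha * eta
            cand /= np.abs(cand)                          # retraction
            # Armijo: accept alpha once sufficient increase is achieved.
            if f(cand) >= f(phi) + 1e-4 * alpha * np.linalg.norm(eta) ** 2:
                break
            alpha /= 2
        phi = cand
    return phi

# Smoke test on a random Hermitian PSD objective.
rng = np.random.default_rng(1)
N = 16
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = B.conj().T @ B                                        # Hermitian PSD
phi0 = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
phi = riemannian_gradient_ascent(A, phi0)
assert np.allclose(np.abs(phi), 1)
assert np.real(phi.conj() @ A @ phi) >= np.real(phi0.conj() @ A @ phi0)
```

For sparse or structured $\mathbf{A}$, replacing the dense matrix-vector product drops the per-iteration cost below $O(N^2)$.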

Theorem: Convergence of Riemannian Gradient Ascent

Assume $f$ is twice continuously differentiable on $\mathbb{C}^N$; the manifold $\mathcal{M}$ is compact (trivially so, as the complex torus). Riemannian gradient ascent with Armijo backtracking satisfies:

  1. $f(\boldsymbol{\phi}^{(t+1)}) > f(\boldsymbol{\phi}^{(t)})$ unless $\|\operatorname{grad}_{\mathcal{M}} f(\boldsymbol{\phi}^{(t)})\| = 0$.
  2. The sequence $\{\boldsymbol{\phi}^{(t)}\}$ has a convergent subsequence whose limit is a stationary point of $f$ on $\mathcal{M}$ (i.e., $\operatorname{grad}_{\mathcal{M}} f = 0$).
  3. Near a non-degenerate stationary point, convergence is linear, with rate governed by the Riemannian Hessian.

For the RIS QCQP, the method typically converges to a local maximum within 50–200 iterations.

The Riemannian gradient points in the direction of steepest ascent on the manifold, and Armijo backtracking ensures monotone improvement. Under smoothness, the sequence converges to a stationary point, the Riemannian analog of a zero gradient.

Riemannian Newton and Trust Region

Riemannian gradient ascent converges linearly; Riemannian Newton converges quadratically, roughly squaring the optimality gap at each iteration. Applying the Riemannian Hessian (the Euclidean Hessian plus a curvature correction from $\mathcal{M}$) to a vector costs $O(N^2)$, the same order as the gradient, so Newton-type methods buy faster outer convergence at modest extra cost per iteration. Practical implementations use the trust-region Newton method via the Manopt toolbox, which gives the fastest reliable convergence for the RIS QCQP.

Riemannian Gradient Ascent on $S^1 \times S^1$

For a 2-torus ($N = 2$) example, animate the trajectory of Riemannian gradient ascent. Each step computes a tangent vector orthogonal to the current $\boldsymbol{\phi}$, moves along that direction, and retracts back onto the torus. The trajectory spirals toward a local maximum of the objective.
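The trajectory itself (without the animation) takes only a few lines; the rank-1 objective, starting angles, and fixed step size below are illustrative choices:

```python
import numpy as np

# Trace a short ascent trajectory on the 2-torus for f(phi) = |c^H phi|^2.
c = np.array([1.0 + 0.5j, 0.8 - 0.3j])
f = lambda p: np.abs(np.vdot(c, p)) ** 2      # np.vdot(c, p) = c^H p
phi = np.exp(1j * np.array([2.5, -1.0]))      # initial angles
phi0 = phi.copy()
angles = [np.angle(phi)]                      # record trajectory for plotting
for _ in range(50):
    g = 2 * np.vdot(c, phi) * c               # Euclidean gradient 2(c^H phi) c
    eta = g - np.real(np.conj(phi) * g) * phi # tangent projection
    step = phi + 0.1 * eta
    phi = step / np.abs(step)                 # retraction to the torus
    angles.append(np.angle(phi))

assert f(phi) > f(phi0)   # the trajectory climbs toward a local maximum
```

Plotting `angles` in the $(\theta_1, \theta_2)$ plane reproduces the spiral described above.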

Example: Riemannian Gradient for the Rank-1 QCQP

For $f(\boldsymbol{\phi}) = |\mathbf{c}^H \boldsymbol{\phi}|^2$, compute the Euclidean and Riemannian gradients at $\boldsymbol{\phi} = \mathbf{1}$ (the all-ones vector).
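Sketch of a solution, assuming the convention $\nabla f(\boldsymbol{\phi}) = 2(\mathbf{c}^H \boldsymbol{\phi})\,\mathbf{c}$ for $f = \boldsymbol{\phi}^H \mathbf{c}\mathbf{c}^H \boldsymbol{\phi}$: at $\boldsymbol{\phi} = \mathbf{1}$ the Euclidean gradient is $2(\sum_n c_n^*)\,\mathbf{c}$, and since $\phi_n = 1$ the tangent projection strips the element-wise real part, leaving $\operatorname{grad}_{\mathcal{M}} f = j\,\Im(\nabla f)$. A numerical check with an arbitrary $\mathbf{c}$:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
c = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # illustrative c
phi = np.ones(N, dtype=complex)                            # phi = 1

# Euclidean gradient: grad_E = 2 c c^H phi = 2 (c^H phi) c,
# which at phi = 1 reduces to 2 (sum_n c_n^*) c.
g = 2 * np.vdot(c, phi) * c
assert np.allclose(g, 2 * np.sum(np.conj(c)) * c)

# Finite-difference check of the convention: Df[xi] = Re(g^H xi).
f = lambda p: np.abs(np.vdot(c, p)) ** 2
xi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
t = 1e-6
fd = (f(phi + t * xi) - f(phi - t * xi)) / (2 * t)
assert np.isclose(fd, np.real(np.vdot(g, xi)), rtol=1e-4, atol=1e-6)

# Riemannian gradient: at phi = 1 the projection strips the real part.
eta = g - np.real(np.conj(phi) * g) * phi
assert np.allclose(eta, 1j * np.imag(g))
```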

Manifold Gradient Ascent vs. SDR Convergence

Track the objective of Riemannian gradient ascent (with Armijo backtracking) over iterations for a fixed problem instance, and compare with the SDR upper bound and the element-wise BCD trajectory. The manifold method typically closes 80–90% of the gap to SDR within 50 iterations; element-wise BCD plateaus earlier, at a lower objective.


Common Mistake: Step Size Matters

Mistake:

"Just use a fixed step size Ξ±=0.1\alpha = 0.1 and iterate."

Correction:

Without backtracking, a fixed step size can cause non-monotone iterates and even divergence (overshooting). Always use Armijo backtracking: start from $\alpha$ and halve it until $f(R_{\boldsymbol{\phi}}(\alpha \boldsymbol{\eta})) \geq f(\boldsymbol{\phi}) + c\,\alpha \|\boldsymbol{\eta}\|^2$ for some $c \in (0, 1)$. This guarantees monotone progress at the cost of a few extra function evaluations per iteration. Modern implementations (Manopt) handle this automatically.
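A minimal backtracking helper for the ascent setting might look like this (the constants and the rank-1 test objective are illustrative):

```python
import numpy as np

def armijo_step(f, phi, eta, alpha0=1.0, c=1e-4, alpha_min=1e-12):
    """Halve alpha until the Armijo sufficient-increase condition holds:
    f(R_phi(alpha * eta)) >= f(phi) + c * alpha * ||eta||^2.
    """
    base = f(phi)
    alpha = alpha0
    while alpha > alpha_min:
        cand = phi + alpha * eta
        cand /= np.abs(cand)                  # retraction to the torus
        if f(cand) >= base + c * alpha * np.linalg.norm(eta) ** 2:
            return alpha, cand
        alpha /= 2
    return alpha, phi                         # no acceptable step found

# Usage on f(phi) = |c^H phi|^2, with eta the Riemannian gradient.
rng = np.random.default_rng(3)
N = 8
cvec = rng.standard_normal(N) + 1j * rng.standard_normal(N)
f = lambda p: np.abs(np.vdot(cvec, p)) ** 2
phi = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
g = 2 * np.vdot(cvec, phi) * cvec             # assumed Euclidean gradient
eta = g - np.real(np.conj(phi) * g) * phi     # tangent projection
alpha, phi_new = armijo_step(f, phi, eta)
assert f(phi_new) >= f(phi)                   # monotone progress
```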

🔧 Engineering Note

Manopt: Let Someone Else Do the Work

The Manopt toolbox (MATLAB) and its Python/Julia ports implement Riemannian trust-region Newton out of the box for the complex unit-modulus manifold (among many others). Using Manopt:

  1. Define the objective ff and its Euclidean gradient.
  2. Select the manifold (complexcirclefactory or obliquecomplexfactory).
  3. Call trustregions(problem).

The toolbox handles tangent-space projection, retraction, Hessian approximation, and backtracking internally. For research and production alike, Manopt is the recommended black-box; avoid re-implementing Riemannian optimization unless you have a very specific reason.

Practical Constraints
  • Manopt at $N = 256$: typically 5–10 s per solve, $\sim 100$ iterations.
  • Pymanopt and the Julia port expose a near-identical API, making it easy to move across languages.
  • Warm-starting across AO iterations reduces per-solve time by 3–5×.