Exercises
ex-ch03-01
(Easy) Show that the intersection of two halfspaces $H_1 = \{\mathbf{x} : \mathbf{a}_1^\top\mathbf{x} \le b_1\}$ and $H_2 = \{\mathbf{x} : \mathbf{a}_2^\top\mathbf{x} \le b_2\}$ is convex.
Each halfspace is convex. What do you know about intersections of convex sets?
Apply intersection theorem
Each halfspace is convex (verified in Section 3.1). By the theorem Intersection of Convex Sets Is Convex, the intersection of convex sets is convex, so $H_1 \cap H_2$ is convex.
ex-ch03-02
(Easy) Determine whether $f(x) = |x|$ is convex on $\mathbb{R}$. (Note that $f$ is not differentiable at $x = 0$.)
Use the definition directly: check $f(\theta x + (1-\theta)y) \le \theta f(x) + (1-\theta)f(y)$ for all $x, y$ and $\theta \in [0, 1]$.
Triangle inequality
$|\theta x + (1-\theta)y| \le |\theta x| + |(1-\theta)y| = \theta|x| + (1-\theta)|y|$ by the triangle inequality. This is exactly the definition of convexity, so $f$ is convex despite being non-differentiable at $0$.
ex-ch03-03
(Easy) Show that $f(\mathbf{x}) = \max_{i=1,\dots,n} x_i$ is convex.
Express as a pointwise supremum of linear (hence convex) functions.
Pointwise supremum
$f(\mathbf{x}) = \max_i \mathbf{e}_i^\top\mathbf{x}$, where $\mathbf{e}_i$ is the $i$-th standard basis vector. Each $\mathbf{e}_i^\top\mathbf{x}$ is linear, hence convex. The pointwise maximum of convex functions is convex (Theorem Operations That Preserve Convexity, rule 2).
ex-ch03-04
(Easy) Write the dual of the given LP.
Convert the LP to standard form first.
Standard form
Rewrite the objective and constraints in standard form.
Dual
Form the dual from the standard form, then simplify. Solving both problems shows that the optimal dual and primal values coincide: strong duality holds.
ex-ch03-05
(Easy) Verify the KKT conditions for the given problem: minimise $f(\mathbf{x})$ subject to the given inequality constraint. What are $\mathbf{x}^\star$ and $\lambda^\star$?
Rewrite the constraint in the standard form $g(\mathbf{x}) \le 0$.
KKT conditions
Stationarity: $\nabla f(\mathbf{x}^\star) + \lambda^\star \nabla g(\mathbf{x}^\star) = \mathbf{0}$. Primal feasibility: $g(\mathbf{x}^\star) \le 0$. Dual feasibility: $\lambda^\star \ge 0$. Complementary slackness: $\lambda^\star\, g(\mathbf{x}^\star) = 0$.
The constraint is active at the optimum (since the unconstrained minimum violates it), so $g(\mathbf{x}^\star) = 0$ and $\lambda^\star > 0$; solving the stationarity equation on the constraint boundary yields $\mathbf{x}^\star$ and $\lambda^\star$.
ex-ch03-06
(Easy) Compute one step of gradient descent on the given quadratic $f$, starting from the given point $\mathbf{x}_0$ with the given step size $\eta$.
The gradient descent update is $\mathbf{x}_1 = \mathbf{x}_0 - \eta\,\nabla f(\mathbf{x}_0)$.
Compute $\nabla f$ and evaluate it at $\mathbf{x}_0$.
Gradient computation
Evaluate $\nabla f(\mathbf{x}_0)$ and apply $\mathbf{x}_1 = \mathbf{x}_0 - \eta\,\nabla f(\mathbf{x}_0)$. In this case, one step reaches the exact minimiser because $f$ is quadratic and the step size is matched to its curvature ($\eta = 1/L$).
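As a quick numerical check of this update (the exercise's own $f$, $\mathbf{x}_0$, and step size are not reproduced above, so the sketch below uses a hypothetical quadratic whose step size is matched to its curvature):

```python
import numpy as np

# Hypothetical data: f(x) = 0.5 x^T Q x - b^T x with Q = 2I, so L = 2 and eta = 1/L.
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
b = np.array([2.0, -4.0])
x0 = np.zeros(2)
eta = 0.5                        # = 1/L

grad = Q @ x0 - b                # gradient of f at x0
x1 = x0 - eta * grad             # one gradient-descent step

x_star = np.linalg.solve(Q, b)   # exact minimiser: Q x = b
print(x1, x_star)                # identical here because Q = L*I and eta = 1/L
```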
ex-ch03-07
(Medium) Show that $f(\mathbf{x}) = \log\sum_{i=1}^{n} e^{x_i}$ (the log-sum-exp function) is convex.
Compute the Hessian and show it is PSD.
Alternatively, express $f$ as a pointwise supremum of linear functions plus a correction.
Hessian approach
Let $z_i = e^{x_i}$ and $S = \sum_i z_i$. The gradient is $\nabla f(\mathbf{x}) = \mathbf{p}$, the softmax vector with $p_i = z_i / S$. The Hessian is $\nabla^2 f(\mathbf{x}) = \mathrm{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^\top$.
For any $\mathbf{v}$: $\mathbf{v}^\top \nabla^2 f(\mathbf{x})\,\mathbf{v} = \sum_i p_i v_i^2 - \big(\sum_i p_i v_i\big)^2$.
By Cauchy–Schwarz applied to the probability distribution $\mathbf{p}$: $\big(\sum_i p_i v_i\big)^2 \le \big(\sum_i p_i\big)\big(\sum_i p_i v_i^2\big) = \sum_i p_i v_i^2$. Hence $\mathbf{v}^\top \nabla^2 f(\mathbf{x})\,\mathbf{v} \ge 0$ and $f$ is convex.
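A numerical sanity check of the Hessian formula $\nabla^2 f = \mathrm{diag}(\mathbf{p}) - \mathbf{p}\mathbf{p}^\top$ (a minimal sketch; the test point is random, not taken from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=5)               # arbitrary test point

p = np.exp(x - x.max())              # softmax probabilities (shifted for stability)
p /= p.sum()
H = np.diag(p) - np.outer(p, p)      # Hessian of log-sum-exp at x

eigvals = np.linalg.eigvalsh(H)
print(eigvals.min() >= -1e-12)       # all eigenvalues non-negative: H is PSD
```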
ex-ch03-08
(Medium) Derive the dual of the QP: $\min_{\mathbf{x}} \tfrac{1}{2}\mathbf{x}^\top\mathbf{P}\mathbf{x} + \mathbf{q}^\top\mathbf{x}$ subject to $\mathbf{A}\mathbf{x} = \mathbf{b}$, where $\mathbf{P} \succ 0$.
Form the Lagrangian with a multiplier $\boldsymbol{\nu}$ for the equality constraint.
Minimise over $\mathbf{x}$ by setting the gradient to zero.
Lagrangian
$L(\mathbf{x}, \boldsymbol{\nu}) = \tfrac{1}{2}\mathbf{x}^\top\mathbf{P}\mathbf{x} + \mathbf{q}^\top\mathbf{x} + \boldsymbol{\nu}^\top(\mathbf{A}\mathbf{x} - \mathbf{b})$.
Minimise over $\mathbf{x}$
$\nabla_{\mathbf{x}} L = \mathbf{P}\mathbf{x} + \mathbf{q} + \mathbf{A}^\top\boldsymbol{\nu} = \mathbf{0}$, so $\mathbf{x}^\star(\boldsymbol{\nu}) = -\mathbf{P}^{-1}(\mathbf{q} + \mathbf{A}^\top\boldsymbol{\nu})$.
Dual function
Substituting back: $g(\boldsymbol{\nu}) = -\tfrac{1}{2}(\mathbf{q} + \mathbf{A}^\top\boldsymbol{\nu})^\top\mathbf{P}^{-1}(\mathbf{q} + \mathbf{A}^\top\boldsymbol{\nu}) - \mathbf{b}^\top\boldsymbol{\nu}$.
The dual problem: $\max_{\boldsymbol{\nu}}\ g(\boldsymbol{\nu})$ (unconstrained in $\boldsymbol{\nu}$). This is a concave QP in $\boldsymbol{\nu}$.
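A minimal numerical sketch of the derivation, using randomly generated $\mathbf{P} \succ 0$, $\mathbf{q}$, $\mathbf{A}$, $\mathbf{b}$ (hypothetical data, not from the exercise): it maximises $g(\boldsymbol{\nu})$ in closed form, recovers the primal minimiser, and checks that the primal and dual optimal values coincide.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 2
M = rng.normal(size=(n, n))
P = M @ M.T + n * np.eye(n)          # hypothetical P > 0
q = rng.normal(size=n)
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

Pinv = np.linalg.inv(P)
# Maximise g(nu): its gradient -A P^{-1}(q + A^T nu) - b = 0 is a linear system.
nu = np.linalg.solve(A @ Pinv @ A.T, -(b + A @ Pinv @ q))
x = -Pinv @ (q + A.T @ nu)           # primal minimiser recovered from nu

primal = 0.5 * x @ P @ x + q @ x
u = q + A.T @ nu
dual = -0.5 * u @ Pinv @ u - b @ nu
print(np.allclose(A @ x, b), np.isclose(primal, dual))   # feasible; strong duality
```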
ex-ch03-09
(Medium) Water-filling over three sub-channels: given noise floors $N_1, N_2, N_3$ and a total power budget $P$, find the optimal power allocation and the total rate.
Start by assuming all channels are active, then check for negative allocations.
Try all 3 active
With all three channels active, the water level is $\mu = \frac{P + N_1 + N_2 + N_3}{3}$ and $p_i = \mu - N_i$. The allocation $p_3$ comes out negative (channel 3's noise floor exceeds the water level). Remove channel 3.
2 active channels
With channels 1 and 2 active: $\mu = \frac{P + N_1 + N_2}{2}$, giving $p_1 = \mu - N_1$, $p_2 = \mu - N_2$, and $p_3 = 0$. All allocations are non-negative.
Total rate
Total rate: $R = \log_2\!\big(1 + \tfrac{p_1}{N_1}\big) + \log_2\!\big(1 + \tfrac{p_2}{N_2}\big)$ bits/s/Hz.
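A sketch of the active-set procedure used above; the noise floors and power budget below are hypothetical stand-ins, since the exercise's own numbers are not reproduced here.

```python
import numpy as np

def water_fill(noise, P):
    """Maximise sum(log2(1 + p_i/N_i)) subject to sum(p) = P, p >= 0."""
    idx = np.argsort(noise)                           # quietest channels first
    active = len(noise)
    while active > 0:
        mu = (P + noise[idx[:active]].sum()) / active # candidate water level
        if mu > noise[idx[active - 1]]:               # all active allocations positive?
            break
        active -= 1                                   # drop the noisiest active channel
    return np.maximum(mu - noise, 0.0)

noise = np.array([0.1, 0.5, 2.0])                     # hypothetical noise floors
P = 1.0                                               # hypothetical power budget
p = water_fill(noise, P)
rate = np.log2(1 + p / noise).sum()
print(p, rate)                                        # the noisiest channel gets zero power
```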
ex-ch03-10
(Medium) Show that gradient descent with step size $\eta > 2/L$ diverges on the quadratic $f(x) = \tfrac{L}{2}x^2$.
Write the update rule and find the condition for $|x_k| \to \infty$.
Update rule
$x_{k+1} = x_k - \eta f'(x_k) = (1 - \eta L)x_k$, so $x_k = (1 - \eta L)^k x_0$.
Divergence condition
$|x_k| = |1 - \eta L|^k\,|x_0|$. For $|x_k| \to \infty$, we need $|1 - \eta L| > 1$. This holds iff $\eta L > 2$, i.e., $\eta > 2/L$.
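A two-line experiment illustrating the threshold (the curvature $L$ below is a hypothetical value):

```python
import numpy as np

L = 4.0                            # hypothetical curvature of f(x) = (L/2) x^2
grad = lambda x: L * x

def run(eta, x0=1.0, steps=20):
    x = x0
    for _ in range(steps):
        x = x - eta * grad(x)      # x_{k+1} = (1 - eta*L) x_k
    return x

print(abs(run(0.4)))   # eta < 2/L = 0.5: |1 - eta*L| = 0.6 < 1, iterates shrink
print(abs(run(0.6)))   # eta > 2/L: |1 - eta*L| = 1.4 > 1, iterates blow up
```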
ex-ch03-11
(Medium) Project the given vector $\mathbf{v} \in \mathbb{R}^3$ onto the probability simplex $\Delta = \{\mathbf{x} : x_i \ge 0,\ \sum_i x_i = 1\}$.
Sort the components and apply the simplex projection algorithm.
Sort
Sort the components in decreasing order: $u_1 \ge u_2 \ge u_3$.
Find $\rho$
For $j = 1$ and $j = 2$, the check $u_j > \frac{1}{j}\big(\sum_{i \le j} u_i - 1\big)$ passes; it fails at $j = 3$. So $\rho = 2$ and $\tau = \tfrac{1}{2}(u_1 + u_2 - 1)$.
Project
$x_i = \max(v_i - \tau, 0)$; the smallest component projects to $0$. Check: the entries are non-negative and sum to $1$.
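The sort-based projection in code; the input vector below is a hypothetical example, not the one from the exercise.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]                        # sort in decreasing order
    css = np.cumsum(u)
    j = np.arange(1, len(v) + 1)
    rho = np.max(j[u - (css - 1) / j > 0])      # last index passing the check
    tau = (css[rho - 1] - 1) / rho              # threshold
    return np.maximum(v - tau, 0.0)

v = np.array([0.6, 1.2, -0.3])                  # hypothetical 3-vector
x = project_simplex(v)
print(x, x.sum())                               # non-negative entries summing to 1
```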
ex-ch03-12
(Medium) Show that $f(\mathbf{X}) = -\log\det\mathbf{X}$ is convex on the cone of positive definite matrices $\mathbb{S}^n_{++}$.
Restrict to a line and show the resulting scalar function is convex.
Restrict to a line
Let $g(t) = f(\mathbf{X} + t\mathbf{V})$, where $\mathbf{X} \succ 0$ and $\mathbf{V}$ is symmetric. Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of $\mathbf{X}^{-1/2}\mathbf{V}\mathbf{X}^{-1/2}$.
Second derivative
$g(t) = -\log\det(\mathbf{X} + t\mathbf{V}) = -\log\det\mathbf{X} - \sum_i \log(1 + t\lambda_i)$, so $g''(t) = \sum_i \frac{\lambda_i^2}{(1 + t\lambda_i)^2} \ge 0$. Hence $g$ is convex in $t$, and since this holds for all lines, $f$ is convex.
ex-ch03-13
(Medium) The Rosenbrock function is $f(x, y) = (1 - x)^2 + 100\,(y - x^2)^2$. Is it convex? Find its global minimum.
Compute the Hessian at the critical point.
Global minimum
Setting $\nabla f = \mathbf{0}$: from $\partial f/\partial y = 200(y - x^2) = 0$ we get $y = x^2$, and then $\partial f/\partial x = -2(1 - x) = 0$ gives $x = 1$. The unique critical point is $(1, 1)$ with $f(1, 1) = 0$.
Convexity check
The Hessian is $\nabla^2 f(x, y) = \begin{pmatrix} 2 - 400(y - x^2) + 800x^2 & -400x \\ -400x & 200 \end{pmatrix}$. At $(1, 1)$ it is $\begin{pmatrix} 802 & -400 \\ -400 & 200 \end{pmatrix}$, with eigenvalues approximately $1001.6$ and $0.4$, so it is positive definite there.
However, at general points the Hessian can be indefinite: at $(0, 1)$, for example, the top-left entry is $2 - 400 = -398 < 0$.
So $f$ is not convex globally, but it has a unique global minimum at $(1, 1)$.
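A quick eigenvalue check of the Hessian at the two points discussed (assuming the standard Rosenbrock constants used above):

```python
import numpy as np

def rosenbrock_hessian(x, y):
    """Hessian of f(x, y) = (1 - x)^2 + 100 (y - x^2)^2."""
    return np.array([[2 - 400 * (y - x**2) + 800 * x**2, -400 * x],
                     [-400 * x, 200.0]])

for pt in [(1.0, 1.0), (0.0, 1.0)]:
    eig = np.linalg.eigvalsh(rosenbrock_hessian(*pt))
    print(pt, eig)   # (1,1): both positive; (0,1): one negative, so indefinite
```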
ex-ch03-14
(Medium) Show that the MMSE estimator $\hat{\mathbf{x}} = (\mathbf{H}^\top\mathbf{H} + \sigma^2\mathbf{I})^{-1}\mathbf{H}^\top\mathbf{y}$ solves the convex QP: $\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 + \sigma^2\|\mathbf{x}\|^2$.
Expand the objective, take the gradient, and set it to zero.
Expand
$\|\mathbf{y} - \mathbf{H}\mathbf{x}\|^2 + \sigma^2\|\mathbf{x}\|^2 = \mathbf{x}^\top(\mathbf{H}^\top\mathbf{H} + \sigma^2\mathbf{I})\mathbf{x} - 2\mathbf{y}^\top\mathbf{H}\mathbf{x} + \|\mathbf{y}\|^2$.
Gradient
The gradient is $2(\mathbf{H}^\top\mathbf{H} + \sigma^2\mathbf{I})\mathbf{x} - 2\mathbf{H}^\top\mathbf{y}$. Setting it to zero: $\hat{\mathbf{x}} = (\mathbf{H}^\top\mathbf{H} + \sigma^2\mathbf{I})^{-1}\mathbf{H}^\top\mathbf{y}$.
Convexity
The Hessian $2(\mathbf{H}^\top\mathbf{H} + \sigma^2\mathbf{I}) \succ 0$ for $\sigma^2 > 0$, confirming strict convexity.
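A numerical cross-check with hypothetical $\mathbf{H}$, $\mathbf{y}$, and $\sigma^2$: the closed-form MMSE solution coincides with the minimiser of the QP, here solved as an augmented least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, sigma2 = 8, 4, 0.5            # hypothetical dimensions and noise variance
H = rng.normal(size=(m, n))
y = rng.normal(size=m)

# Closed-form MMSE / ridge solution
x_mmse = np.linalg.solve(H.T @ H + sigma2 * np.eye(n), H.T @ y)

# Same QP written as min ||H_aug x - y_aug||^2 and solved by least squares
H_aug = np.vstack([H, np.sqrt(sigma2) * np.eye(n)])
y_aug = np.concatenate([y, np.zeros(n)])
x_ls, *_ = np.linalg.lstsq(H_aug, y_aug, rcond=None)

print(np.allclose(x_mmse, x_ls))    # both minimise ||y - Hx||^2 + sigma2 ||x||^2
```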
ex-ch03-15
(Hard) (Dual decomposition for multi-cell power control.) Consider $K$ single-antenna links sharing a band. Link $k$ transmits with power $p_k$ and achieves rate $R_k(\mathbf{p}) = \log_2\!\Big(1 + \frac{g_{kk} p_k}{\sigma^2 + \sum_{j \ne k} g_{kj} p_j}\Big)$. The network wants to maximise $\sum_k \log R_k(\mathbf{p})$ (proportional fair) subject to $0 \le p_k \le P_k^{\max}$ for each link.
- Explain why this is non-convex.
- Formulate the high-SINR approximation (treat interference as noise, fix interference) as a convex problem.
- Write the Lagrangian dual and interpret the dual variables.
In (1), note that $R_k$ is not concave jointly in $\mathbf{p} = (p_1, \dots, p_K)$ because of the interference coupling.
In (2), fix all $p_j$ for $j \ne k$ and solve for $p_k$; this gives a set of decoupled water-filling problems.
Non-convexity
$R_k$ is concave in $p_k$ alone (for fixed interference), but the interference term $\sum_{j \ne k} g_{kj} p_j$ in the denominator makes $R_k$ non-concave in the joint variable $\mathbf{p}$. Hence the problem is non-convex.
Fixed-interference approximation
Fix the interference at its current value: $I_k = \sigma^2 + \sum_{j \ne k} g_{kj}\bar{p}_j$. Then $R_k = \log_2(1 + g_{kk} p_k / I_k)$ is concave in $p_k$. Each link solves an independent water-filling problem.
Dual interpretation
Introducing a multiplier $\lambda_k$ for each power constraint gives the shadow price $\lambda_k$, the marginal utility of additional power for link $k$. An iterative algorithm alternates between updating powers (primal) and updating interference levels (akin to dual variables).
ex-ch03-16
(Hard) (SDP relaxation for MIMO detection.) The ML detection problem is $\min_{\mathbf{s} \in \{\pm 1\}^n} \|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2$.
- Show this is equivalent to maximising $2\mathbf{c}^\top\mathbf{s} - \mathbf{s}^\top\mathbf{Q}\mathbf{s}$ over $\mathbf{s} \in \{\pm 1\}^n$ for appropriate $\mathbf{Q}$ and $\mathbf{c}$.
- Write the SDP relaxation.
- Explain randomised rounding.
Expand $\|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2$ and identify the terms depending on $\mathbf{s}$.
Reformulation
$\|\mathbf{y} - \mathbf{H}\mathbf{s}\|^2 = \mathbf{s}^\top\mathbf{H}^\top\mathbf{H}\mathbf{s} - 2\mathbf{y}^\top\mathbf{H}\mathbf{s} + \|\mathbf{y}\|^2$. Since $\|\mathbf{y}\|^2$ is constant for a given $\mathbf{y}$, minimising is equivalent to maximising $2\mathbf{c}^\top\mathbf{s} - \mathbf{s}^\top\mathbf{Q}\mathbf{s}$. Set $\mathbf{Q} = \mathbf{H}^\top\mathbf{H}$, $\mathbf{c} = \mathbf{H}^\top\mathbf{y}$.
SDP relaxation
Lift: set $\tilde{\mathbf{s}} = [\mathbf{s}^\top\ 1]^\top$ and $\mathbf{X} = \tilde{\mathbf{s}}\tilde{\mathbf{s}}^\top$, so the objective becomes $\mathrm{tr}(\mathbf{L}\mathbf{X})$ for a symmetric $\mathbf{L}$ assembled from $\mathbf{Q}$ and $\mathbf{c}$, with $\mathrm{diag}(\mathbf{X}) = \mathbf{1}$ and $\mathrm{rank}(\mathbf{X}) = 1$. Relax by dropping the rank constraint:
$\min_{\mathbf{X}}\ \mathrm{tr}(\mathbf{L}\mathbf{X})$ s.t. $\mathrm{diag}(\mathbf{X}) = \mathbf{1}$, $\mathbf{X} = \mathbf{X}^\top$, $\mathbf{X} \succeq 0$.
Randomised rounding
Sample $\boldsymbol{\xi} \sim \mathcal{N}(\mathbf{0}, \mathbf{X}^\star)$, then round: $\hat{\mathbf{s}} = \mathrm{sign}(\boldsymbol{\xi})$. Repeat multiple times and keep the best candidate (smallest residual $\|\mathbf{y} - \mathbf{H}\hat{\mathbf{s}}\|^2$).
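A sketch of the rounding step only, assuming a relaxation solution $\mathbf{X}^\star$ is already available from an SDP solver (the SDP itself is not solved here, and the helper name below is ours):

```python
import numpy as np

def randomized_rounding(X, H, y, trials=100, seed=0):
    """Round an SDP solution X (covariance of the lifted +/-1 vector) to a bit vector."""
    rng = np.random.default_rng(seed)
    n = X.shape[0] - 1                        # last coordinate is the homogenising +1
    C = np.linalg.cholesky(X + 1e-9 * np.eye(n + 1))
    best, best_val = None, np.inf
    for _ in range(trials):
        xi = C @ rng.normal(size=n + 1)       # sample xi ~ N(0, X)
        s = np.sign(xi[:n]) * np.sign(xi[n])  # de-homogenise: align signs with last entry
        val = np.linalg.norm(y - H @ s) ** 2
        if val < best_val:
            best, best_val = s, val
    return best, best_val
# X would come from solving the relaxation above with an SDP solver; omitted here.
```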
ex-ch03-17
(Hard) (Convergence of projected gradient descent.) Let $f$ be $L$-smooth and convex, and let $\mathcal{C}$ be a closed convex set. Show that projected gradient descent with $\eta = 1/L$ satisfies
$f(\mathbf{x}_k) - f(\mathbf{x}^\star) \le \dfrac{L\|\mathbf{x}_0 - \mathbf{x}^\star\|^2}{2k}$,
where $\mathbf{x}^\star$ is the constrained minimum.
Use the non-expansiveness of projection: $\|\Pi_{\mathcal{C}}(\mathbf{x}) - \Pi_{\mathcal{C}}(\mathbf{y})\| \le \|\mathbf{x} - \mathbf{y}\|$.
Use the descent lemma: $f(\mathbf{y}) \le f(\mathbf{x}) + \nabla f(\mathbf{x})^\top(\mathbf{y} - \mathbf{x}) + \tfrac{L}{2}\|\mathbf{y} - \mathbf{x}\|^2$.
Descent property
By $L$-smoothness and the update rule $\mathbf{x}_{k+1} = \Pi_{\mathcal{C}}\!\big(\mathbf{x}_k - \tfrac{1}{L}\nabla f(\mathbf{x}_k)\big)$: $f(\mathbf{x}_{k+1}) \le f(\mathbf{x}_k) - \tfrac{L}{2}\|\mathbf{x}_{k+1} - \mathbf{x}_k\|^2$. (This uses the gradient Lipschitz condition applied at the projected point.)
Distance contraction
By the projection optimality condition and first-order convexity: $f(\mathbf{x}_{k+1}) - f(\mathbf{x}^\star) \le \tfrac{L}{2}\big(\|\mathbf{x}_k - \mathbf{x}^\star\|^2 - \|\mathbf{x}_{k+1} - \mathbf{x}^\star\|^2\big)$. Telescoping over $k$ steps gives the bound.
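A minimal sketch on a hypothetical box-constrained least-squares problem; it only illustrates the qualitative $O(1/k)$ behaviour of the gap, not the exact constant in the bound.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10
A = rng.normal(size=(n, n))
b = rng.normal(size=n)
L = np.linalg.norm(A, 2) ** 2                 # smoothness constant of 0.5||Ax - b||^2
proj = lambda x: np.clip(x, 0.0, 1.0)         # projection onto the box [0, 1]^n

x = np.zeros(n)
vals = []
for k in range(2000):
    grad = A.T @ (A @ x - b)
    x = proj(x - grad / L)                    # projected gradient step with eta = 1/L
    vals.append(0.5 * np.linalg.norm(A @ x - b) ** 2)

f_star = vals[-1]                             # proxy for the constrained optimum
print(vals[10] - f_star, vals[100] - f_star)  # the gap shrinks roughly like 1/k
```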
ex-ch03-18
(Hard) (Greedy antenna selection is near-optimal.) For a MIMO system with $N$ antennas and $M$ RF chains, the capacity with selected antenna set $S$ is $f(S) = \log\det\!\big(\mathbf{I} + \rho\sum_{i \in S}\mathbf{h}_i\mathbf{h}_i^\dagger\big)$.
- Show that $f$ is monotone: $f(S) \le f(T)$ for $S \subseteq T$.
- Show submodularity by proving the marginal gain from adding antenna $i$ decreases as the set grows.
For (1), use the fact that adding columns to $\mathbf{H}_S$ (the matrix whose columns are the selected $\mathbf{h}_i$) can only increase $\mathbf{H}_S\mathbf{H}_S^\dagger$ in the PSD order.
For (2), use the matrix determinant lemma for rank-one updates.
Monotonicity
Let $\mathbf{A}_S = \mathbf{I} + \rho\sum_{i \in S}\mathbf{h}_i\mathbf{h}_i^\dagger$. Then $\mathbf{A}_S \preceq \mathbf{A}_T$ for $S \subseteq T$ (adding PSD terms). Since $\log\det$ is monotone with respect to the PSD order, $f(S) \le f(T)$.
Submodularity
The marginal gain from adding antenna $i$ to set $S$ is $\Delta_i(S) = f(S \cup \{i\}) - f(S)$.
By the matrix determinant lemma: $\Delta_i(S) = \log\!\big(1 + \rho\,\mathbf{h}_i^\dagger\mathbf{A}_S^{-1}\mathbf{h}_i\big)$.
For $S \subseteq T$: $\mathbf{A}_S^{-1} \succeq \mathbf{A}_T^{-1}$, so $\Delta_i(S) \ge \Delta_i(T)$, which is exactly submodularity.
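Monotonicity plus submodularity is what makes the greedy rule near-optimal (it attains at least a $(1 - 1/e)$ fraction of the best $M$-subset's capacity). A sketch of the greedy selection, with a hypothetical channel matrix:

```python
import numpy as np

def greedy_select(H, M, rho=1.0):
    """Greedily pick M columns of H to maximise log det(I + rho * sum h_i h_i^H)."""
    Nr, N = H.shape
    S, A = [], np.eye(Nr, dtype=complex)      # A = I + rho * sum over selected antennas
    for _ in range(M):
        Ainv = np.linalg.inv(A)
        gains = [np.log(1 + rho * np.real(H[:, i].conj() @ Ainv @ H[:, i]))
                 if i not in S else -np.inf for i in range(N)]
        best = int(np.argmax(gains))          # antenna with the largest marginal gain
        S.append(best)
        A = A + rho * np.outer(H[:, best], H[:, best].conj())
    return S, np.linalg.slogdet(A)[1]         # selected set and achieved log det

rng = np.random.default_rng(4)                # hypothetical 4 x 8 Rayleigh channel
H = (rng.normal(size=(4, 8)) + 1j * rng.normal(size=(4, 8))) / np.sqrt(2)
print(greedy_select(H, M=3))
```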
ex-ch03-19
(Challenge) (Dual decomposition for NUM.) Consider flows $s = 1, \dots, S$ sharing links $l = 1, \dots, L$ in a network. Flow $s$ uses the links in the set $L(s)$ and achieves rate $x_s$. Link $l$ has capacity $c_l$. The network utility maximisation (NUM) problem is:
$\max_{\mathbf{x} \ge \mathbf{0}}\ \sum_s U_s(x_s)$ subject to $\sum_{s:\, l \in L(s)} x_s \le c_l$ for every link $l$.
Assume each $U_s$ is strictly concave and increasing (e.g., $U_s(x_s) = \log x_s$ for proportional fairness).
- Form the Lagrangian dual.
- Show the dual decomposes into independent sub-problems.
- Propose a distributed algorithm using gradient ascent on the dual.
- Interpret the dual variables as link prices.
Introduce a multiplier $\lambda_l \ge 0$ for each capacity constraint and write the Lagrangian. Group terms by flow index $s$.
Define the "path price" $q_s = \sum_{l \in L(s)} \lambda_l$. Each flow sees only its own price.
For the distributed algorithm, update prices via subgradient ascent: $\lambda_l \leftarrow \big[\lambda_l + \alpha\big(\sum_{s:\, l \in L(s)} x_s - c_l\big)\big]_+$.
Lagrangian
$L(\mathbf{x}, \boldsymbol{\lambda}) = \sum_s U_s(x_s) + \sum_l \lambda_l\Big(c_l - \sum_{s:\, l \in L(s)} x_s\Big)$.
Decomposition
$L(\mathbf{x}, \boldsymbol{\lambda}) = \sum_s\big[U_s(x_s) - q_s x_s\big] + \sum_l \lambda_l c_l$. For fixed $\boldsymbol{\lambda}$, each flow independently solves $\max_{x_s \ge 0}\ U_s(x_s) - q_s x_s$, where $q_s = \sum_{l \in L(s)} \lambda_l$ is the "path price."
Distributed algorithm
- Source update: $x_s \leftarrow \arg\max_{x_s \ge 0}\ \big[U_s(x_s) - q_s x_s\big]$ (for $U_s = \log$, this gives $x_s = 1/q_s$).
- Link update: $\lambda_l \leftarrow \big[\lambda_l + \alpha\big(\sum_{s:\, l \in L(s)} x_s - c_l\big)\big]_+$. This is dual subgradient ascent. Each source needs only its path price; each link needs only its aggregate load (a minimal sketch of the loop follows below).
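A minimal sketch of this primal-dual loop for $U_s = \log$ on a hypothetical two-link, three-flow topology (the routing matrix and capacities below are made up for illustration):

```python
import numpy as np

# R[l, s] = 1 if flow s crosses link l; c holds the link capacities.
R = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])             # 2 links, 3 flows
c = np.array([1.0, 2.0])
lam = np.ones(2)                            # link prices (dual variables)
alpha = 0.05                                # subgradient step size

for _ in range(3000):
    q = R.T @ lam                           # path price seen by each flow
    x = 1.0 / q                             # source update: argmax_x log(x) - q*x
    lam = np.maximum(lam + alpha * (R @ x - c), 1e-6)   # price update, kept positive

print(x, R @ x)                             # rates; link loads approach the capacities
```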
Interpretation
is the "congestion price" of link . TCP Vegas and similar protocols implement this decomposition implicitly: the round-trip time acts as the price signal.
ex-ch03-20
(Challenge) (Nesterov acceleration.) Prove that Nesterov's accelerated gradient method achieves an $O(1/k^2)$ convergence rate for $L$-smooth convex functions, matching the theoretical lower bound for first-order methods.
The algorithm maintains an iterate sequence $\{\mathbf{x}_k\}$ and a momentum sequence $\{\mathbf{v}_k\}$ with $\mathbf{v}_0 = \mathbf{x}_0$. For $k \ge 1$: take a gradient step of length $1/L$ at an extrapolated point $\mathbf{y}_k$ (a convex combination of $\mathbf{x}_{k-1}$ and $\mathbf{v}_{k-1}$), then update the momentum point $\mathbf{v}_k$.
Define a potential function $V_k = k(k-1)\big(f(\mathbf{x}_k) - f^\star\big) + 2L\|\mathbf{v}_k - \mathbf{x}^\star\|^2$ for the momentum sequence $\{\mathbf{v}_k\}$.
Show that $V_k$ is non-increasing.
Lyapunov argument (sketch)
Define the potential $V_k = k(k-1)\big(f(\mathbf{x}_k) - f^\star\big) + 2L\|\mathbf{v}_k - \mathbf{x}^\star\|^2$, where $f^\star = f(\mathbf{x}^\star)$ and $\{\mathbf{v}_k\}$ is the momentum sequence of the method.
Using the $L$-smoothness descent lemma and the first-order convexity condition, one shows $V_{k+1} \le V_k$, hence $V_k \le V_1$. Since $V_k \ge k(k-1)\big(f(\mathbf{x}_k) - f^\star\big)$: $f(\mathbf{x}_k) - f^\star \le \frac{V_1}{k(k-1)} = O(1/k^2)$, where $V_1$ is a constant determined by $L$ and the distance from the starting point to $\mathbf{x}^\star$.
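A sketch comparing plain gradient descent with one standard variant of Nesterov's method (momentum coefficient $(k-1)/(k+2)$, an assumed choice since the exercise's exact parameters are not reproduced above) on a hypothetical ill-conditioned quadratic:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
Q = np.diag(np.linspace(0.001, 1.0, n))     # eigenvalues in [0.001, 1], so L = 1
L = 1.0
grad = lambda x: Q @ x
f = lambda x: 0.5 * x @ Q @ x               # minimum value is 0

x0 = rng.normal(size=n)

# Plain gradient descent with step 1/L
x = x0.copy()
for _ in range(200):
    x = x - grad(x) / L

# Nesterov's accelerated gradient (momentum form)
x_prev, x_acc = x0.copy(), x0.copy()
for k in range(1, 201):
    y = x_acc + (k - 1) / (k + 2) * (x_acc - x_prev)   # extrapolated point
    x_prev, x_acc = x_acc, y - grad(y) / L             # gradient step at y

print(f(x), f(x_acc))   # the accelerated run reaches a much smaller objective value
```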