2206.04160
Alternating Mirror Descent for Constrained Min-Max Games
Andre Wibisono, Molei Tao, Georgios Piliouras
correcthigh confidence
- Category
- math.DS
- Journal tier
- Strong Field
- Processed
- Sep 28, 2025, 12:56 AM
- arXiv Links
- Abstract ↗PDF ↗
Audit review
The paper proves RK ≤ 2M/η + (4/3) α_max^3 M^4 η^2 K (and hence dg(p̄K, q̄K) = O(K^{-2/3})) by linking alternating mirror descent (AMD) regret to a modified energy Hη and bounding its growth under third-order smoothness (Theorems 3.2, 4.4, 4.5) . The candidate solution derives the same O(K^{1/3})/O(K^{-2/3}) rates via a direct one-step mirror-descent inequality for each player, isolating per-iteration bias terms that equal a trapezoid-rule error for φ* and ψ*. Using the AMD optimality conditions (∇φ(pk+1) = ∇φ(pk) − ηAqk; ∇ψ(qk+1) = ∇ψ(qk) + ηA^T pk+1) and the standard commutator/trapezoid bound (Cf ≤ (M/12)||x′−x||^3 under third-derivative control) , the model obtains a sharper constant (1/6 instead of 4/3) but the same asymptotic rate. The identity linking total regret to the duality gap plus the boundary term matches Lemma 3.1 in the paper . Thus, both arguments are sound; the model’s proof is a direct, commutator-based alternative that yields a better constant under the same assumptions.
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions \textbf{Journal Tier:} strong field \textbf{Justification:} The paper presents a clean, conceptually appealing analysis of AMD via a modified-energy lens, establishing an O(K\^{-2/3}) average duality gap under third-order smoothness and bounded domains. The approach is rigorous and unifies continuous-time intuition with discrete-time regret guarantees. Minor clarifications about constants, norms, and interiority would further strengthen the presentation.