Alternating Mirror Descent for Constrained Min-Max Games

Andre Wibisono, Molei Tao, Georgios Piliouras

correcthigh confidence

Category: math.DS
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:56 AM
arXiv Links: Abstract ↗PDF ↗

Audit review

The paper proves RK ≤ 2M/η + (4/3) α_max^3 M^4 η^2 K (and hence dg(p̄K, q̄K) = O(K^{-2/3})) by linking alternating mirror descent (AMD) regret to a modified energy Hη and bounding its growth under third-order smoothness (Theorems 3.2, 4.4, 4.5) . The candidate solution derives the same O(K^{1/3})/O(K^{-2/3}) rates via a direct one-step mirror-descent inequality for each player, isolating per-iteration bias terms that equal a trapezoid-rule error for φ* and ψ*. Using the AMD optimality conditions (∇φ(pk+1) = ∇φ(pk) − ηAqk; ∇ψ(qk+1) = ∇ψ(qk) + ηA^T pk+1) and the standard commutator/trapezoid bound (Cf ≤ (M/12)||x′−x||^3 under third-derivative control) , the model obtains a sharper constant (1/6 instead of 4/3) but the same asymptotic rate. The identity linking total regret to the duality gap plus the boundary term matches Lemma 3.1 in the paper . Thus, both arguments are sound; the model’s proof is a direct, commutator-based alternative that yields a better constant under the same assumptions.

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The paper presents a clean, conceptually appealing analysis of AMD via a modified-energy lens, establishing an O(K\^{-2/3}) average duality gap under third-order smoothness and bounded domains. The approach is rigorous and unifies continuous-time intuition with discrete-time regret guarantees. Minor clarifications about constants, norms, and interiority would further strengthen the presentation.