2209.11920

Tradeoffs between convergence rate and noise amplification for momentum-based accelerated optimization algorithms

Hesameddin Mohammadi, Meisam Razaviyayn, Mihailo R. Jovanović

correct (medium confidence)

Category: math.DS
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:56 AM

Audit review

The uploaded paper (arXiv:2209.11920) proves explicit lower bounds for any stabilizing two‑step momentum method on Q^L_m: J_min/(1−ρ) ≥ σ^2(κ^2/64 + (n−1)(√κ+1)/2) and J_max/(1−ρ) ≥ σ^2((n−1)κ^2/64 + (√κ+1)/2), with the analogous noisy‑gradient bounds J_min/(1−ρ) ≥ (σ_a^2/L^2)(κ^2/4 + (n−1)max{(1−ρ)^3κ^2, 1/4}) and J_max/(1−ρ) ≥ (σ_a^2/L^2)((n−1)κ^2/4 + max{(1−ρ)^3κ^2, 1/4}); see Theorem 2 (eq. (12)) in the PDF. The paper’s proof relies on a geometric characterization of stability and ρ‑linear convergence via the triangles Δ and Δ_ρ, the modal decomposition J = ∑_i Ĵ(λ_i), and the distance functions (d, h, l) (Theorem 4), together with the Nesterov/Chebyshev rate barrier 1/(1−ρ) ≥ (√κ+1)/2 (via Proposition 1).

The candidate solution derives the same statements via a different route: AR(2) modalization, a Cauchy–Schwarz H2 lower bound evaluated at r=ρ, and affine‑in‑λ constraints obtained from scaled Jury conditions. Substantively, the two approaches agree on the results and on the modal reduction strategy, but the candidate write‑up has two issues: (i) the stated scaled Jury inequalities have sign errors (for z^2 − ρ a z − ρ^2 b to be Schur, the correct second‑order Jury conditions are 1 + ρ^2 b ≥ 0, 1 − ρ a − ρ^2 b ≥ 0, 1 + ρ a − ρ^2 b ≥ 0, aligning with the paper’s Δ_ρ triangle); (ii) the step that upper‑bounds |1 − aρ − bρ^2| at an endpoint using a global variation bound needs a clearer justification.

Neither issue affects the final bounds, because the needed “slope” constraints and midpoint bound follow cleanly from the Δ_ρ geometry used in the paper (e.g., d(λ)=αλ with d∈[(1−ρ)^2,(1+ρ)^2], yielding α|γ|(L−m) ≤ 2ρ^2 and α|1+γ|(L−m) ≤ 4ρ), and the rate barrier 1/(1−ρ) ≥ (√κ+1)/2 is established in the paper as well. Taken together, the paper’s argument is complete and correct, and the candidate solution reaches the same conclusions with minor fixable gaps.
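The sign pattern of the Jury conditions in item (i) can be sanity-checked numerically. The sketch below is illustrative only and is not taken from the paper or from the candidate solution: the function names `scaled_jury`, `max_root_modulus`, and `count_mismatches` are hypothetical, the sampling ranges are arbitrary, and the inequalities are taken as strict (open unit disk) where the review writes ≥. It compares the three conditions against the directly computed roots of z^2 − ρ a z − ρ^2 b on random samples.

```python
import cmath
import random


def scaled_jury(a, b, rho):
    """Strict form of the three conditions quoted in item (i) for
    z^2 - rho*a*z - rho^2*b (hypothetical helper, for illustration)."""
    return (
        1 + rho**2 * b > 0
        and 1 - rho * a - rho**2 * b > 0
        and 1 + rho * a - rho**2 * b > 0
    )


def max_root_modulus(a, b, rho):
    """Largest root modulus of z^2 - rho*a*z - rho^2*b via the quadratic formula."""
    p = rho * a          # z^2 - p*z - q = 0
    q = rho**2 * b
    disc = cmath.sqrt(p * p + 4 * q)
    return max(abs((p + disc) / 2), abs((p - disc) / 2))


def count_mismatches(trials=10000, seed=0, tol=1e-6):
    """Count samples where the Jury test and the root test disagree.

    Samples whose largest root modulus lies within `tol` of 1 are skipped,
    since floating-point error makes the comparison unreliable there.
    """
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(trials):
        a = rng.uniform(-3.0, 3.0)    # arbitrary illustrative ranges
        b = rng.uniform(-3.0, 3.0)
        rho = rng.uniform(0.1, 1.0)
        r = max_root_modulus(a, b, rho)
        if abs(r - 1.0) < tol:
            continue
        if scaled_jury(a, b, rho) != (r < 1.0):
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    print("mismatches:", count_mismatches())
```

Flipping the sign of any one of the three inequalities in `scaled_jury` should make the mismatch count nonzero, which gives a quick way to detect a sign error of the kind the review describes.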

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The manuscript offers a compelling, unified geometric framework that clarifies the interplay between convergence rate and noise amplification for two-step momentum methods, and it proves sharp lower bounds that match constructive upper bounds. The results are technically sound and of clear interest to the optimization and control communities. Expanding a few proof details would further improve accessibility.