2403.08470
CONVERGENCE OF ADAM FOR LIPSCHITZ OBJECTIVE FUNCTIONS
Juan Ferrera, Javier Gómez Gil
correct, medium confidence
- Category
- Not specified
- Journal tier
- Specialist/Solid
- Processed
- Sep 28, 2025, 12:56 AM
Audit review
The paper proves local exponential convergence for Adam with the minimal-norm Clarke subgradient under explicit, coupled parameter constraints and, via a short "reach the basin then restart" phase, a global-to-local result. The model's proof departs materially from these conditions: it (i) assumes, without justification under the paper's hypotheses, an M-Lipschitz selection w ↦ ζ_w, (ii) drops the crucial coupling β1 = 1 − δ√ε/(2αμ^2) and the small-β1 requirement, instead claiming convergence for arbitrary β1, β2 ∈ (0,1), and (iii) replaces the paper's control of the nonautonomous term with an informal O(α) tracking argument. These gaps invalidate the model's proof under the paper's assumptions, whereas the paper's own results and proofs are mutually consistent.
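As a rough illustration of the setting discussed above, the following Python sketch runs a textbook Adam iteration driven by a minimal-norm subgradient, and evaluates the coupling β1 = 1 − δ√ε/(2αμ^2) as quoted in the review. Several things here are assumptions rather than the paper's method: the function names (`coupled_beta1`, `min_norm_subgrad_l1`, `adam`) are hypothetical, the update uses the standard bias-corrected form with ε added outside the square root (the paper's exact variant may differ), the test objective f(w) = ‖w‖₁ is chosen only because its minimal-norm Clarke subgradient is simply the coordinate-wise sign, and the numeric values of δ, ε, α, μ in the demo are placeholders with no claim to satisfy the paper's hypotheses.

```python
import math


def coupled_beta1(delta, eps, alpha, mu):
    """Evaluate beta1 = 1 - delta*sqrt(eps) / (2*alpha*mu**2), as quoted in the review.

    Assumption: 'mu^2' in the quoted formula is read as mu squared.
    """
    beta1 = 1.0 - delta * math.sqrt(eps) / (2.0 * alpha * mu ** 2)
    if not 0.0 < beta1 < 1.0:
        raise ValueError("coupling yields beta1 outside (0, 1): %r" % beta1)
    return beta1


def min_norm_subgrad_l1(w):
    """Minimal-norm Clarke subgradient of f(w) = sum_i |w_i|.

    Per coordinate the subdifferential is {sign(w_i)} if w_i != 0 and [-1, 1]
    if w_i == 0, so the minimal-norm element is sign(w_i), with 0 at zero.
    """
    return [float((x > 0) - (x < 0)) for x in w]


def adam(subgrad, w0, alpha, beta1, beta2, eps, steps):
    """Textbook bias-corrected Adam (an assumption, not the paper's exact scheme)."""
    w = [float(x) for x in w0]
    m = [0.0] * len(w)  # first-moment estimate
    v = [0.0] * len(w)  # second-moment estimate
    for t in range(1, steps + 1):
        g = subgrad(w)
        for i, gi in enumerate(g):
            m[i] = beta1 * m[i] + (1.0 - beta1) * gi
            v[i] = beta2 * v[i] + (1.0 - beta2) * gi * gi
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            w[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return w


if __name__ == "__main__":
    # Placeholder constants, chosen only so the formula lands inside (0, 1).
    b1 = coupled_beta1(delta=1.0, eps=1e-8, alpha=1e-3, mu=1.0)
    w_final = adam(min_norm_subgrad_l1, [1.0, -2.0], alpha=0.01,
                   beta1=b1, beta2=0.999, eps=1e-8, steps=500)
    print("beta1 =", b1, " f(w_final) =", sum(abs(x) for x in w_final))
```

The point of the sketch is only to make the objects in the review concrete: β1 is computed from the other parameters rather than chosen freely, which is exactly the coupling the model's proof is criticized for dropping.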
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} specialist/solid

\textbf{Justification:} The manuscript rigorously establishes local exponential convergence of Adam with Clarke subgradients under explicit parameter couplings and extends to a global-to-local result via a short preparatory phase. The approach, built on a clean decomposition into an autonomous contraction and a decaying nonautonomous term, is technically sound and informative. Some assumptions (the quadratic upper model) could be more broadly illustrated, and the presentation of constants streamlined, but overall the work is correct and contributes meaningfully to the theory of adaptive methods in non-smooth settings.