2205.14173
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
Lingkai Kong, Yuqing Wang, Molei Tao
wrong, medium confidence
- Category: math.DS
- Journal tier: Strong Field
- Processed: Sep 28, 2025, 12:56 AM
Audit review
Projecting the constrained Euler–Lagrange equation onto the normal space with (I−XX^T) gives the coefficient −2a in front of (I−XX^T)QQ^T X, whereas the paper’s Theorem 1 states −(3a/2). On the Stiefel tangent bundle X^T Q is skew-symmetric (X^T Q = −Q^T X), so (I−XX^T)QX^TQ = −(I−XX^T)QQ^T X; with this identity the normal component fixes the factor at 2, not 3/2. The rest of the model’s derivation (structure preservation and Lyapunov dissipation) is consistent and aligns with the paper’s qualitative results, but the paper’s main ODE (5) has a systematic coefficient error.
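The tangent-bundle identity used in this argument can be checked numerically. The sketch below is illustrative only: the helper names (`random_stiefel_point`, `random_tangent_vector`, `identity_residual`) and the dimensions are assumptions, not taken from the paper. It verifies only that (I−XX^T)QX^TQ = −(I−XX^T)QQ^T X when X^T Q + Q^T X = 0. It does not decide between the coefficients 2a and 3a/2, which depend on the paper's Lagrangian and are not reproduced here.

```python
import numpy as np

def random_stiefel_point(n, m, rng):
    """Return an n-by-m matrix with orthonormal columns (X^T X = I)."""
    X, _ = np.linalg.qr(rng.standard_normal((n, m)))
    return X

def random_tangent_vector(X, rng):
    """Return Q with X^T Q + Q^T X = 0, i.e. a tangent vector at X."""
    n, m = X.shape
    A = rng.standard_normal((m, m))
    skew = A - A.T                      # skew-symmetric m-by-m block
    B = rng.standard_normal((n, m))
    # X @ skew spans the part inside col(X); the projector keeps the rest.
    return X @ skew + (np.eye(n) - X @ X.T) @ B

def identity_residual(X, Q):
    """Largest absolute entry of (I - XX^T) Q X^T Q + (I - XX^T) Q Q^T X."""
    P = np.eye(X.shape[0]) - X @ X.T
    return np.max(np.abs(P @ Q @ X.T @ Q + P @ Q @ Q.T @ X))

rng = np.random.default_rng(0)          # fixed seed, arbitrary choice
X = random_stiefel_point(7, 3, rng)     # n = 7, m = 3 are arbitrary
Q = random_tangent_vector(X, rng)

tangency = np.max(np.abs(X.T @ Q + Q.T @ X))
residual = identity_residual(X, Q)
print("tangency defect:", tangency)
print("identity residual:", residual)
```

Both printed values should be at the level of floating-point round-off. For a matrix Q that does not satisfy the tangency condition, the residual is generally not small.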
Referee report (LaTeX)
\textbf{Recommendation:} major revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:} The manuscript proposes a compelling variational framework and structure-preserving integrators on Stiefel manifolds, with potentially broad impact. However, the core continuous-time ODE in Theorem 1 has an incorrect coefficient in the normal term. Since the ODE is the basis for the equivalent decomposed system and informs the design of discretizations, this requires correction and careful propagation through the text and algorithms. With this fixed, the paper could be a strong contribution.