2205.14173

Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport

Lingkai Kong, Yuqing Wang, Molei Tao

wrong (medium confidence)
Category
math.DS
Journal tier
Strong Field
Processed
Sep 28, 2025, 12:56 AM

Audit review

Projecting the constrained Euler–Lagrange equation onto the normal space with (I−XX^T) gives the coefficient −2a in front of (I−XX^T)QQ^T X, whereas the paper’s Theorem 1 states −(3a/2). On the Stiefel tangent bundle, X^T Q is skew-symmetric (X^T Q = −Q^T X), so (I−XX^T)QX^TQ = −(I−XX^T)QQ^T X, and the normal component therefore fixes the factor at 2, not 3/2. The rest of the model’s derivation (structure preservation and Lyapunov dissipation) is consistent and aligns with the paper’s qualitative results, but the paper’s main ODE (5) carries this coefficient error throughout.
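The tangent-bundle identity used in this argument can be checked numerically. The sketch below is illustrative only: the helper names `random_stiefel_tangent` and `identity_residual` are hypothetical, the dimensions are arbitrary, and it verifies only the identity (I−XX^T)QX^TQ = −(I−XX^T)QQ^T X, not the coefficient −2a itself, which depends on the full Euler–Lagrange derivation.

```python
import numpy as np

def random_stiefel_tangent(n, m, seed=0):
    """Return X with X^T X = I_m and a tangent vector Q at X (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Reduced QR of a random n-by-m matrix gives orthonormal columns.
    X, _ = np.linalg.qr(rng.standard_normal((n, m)))
    # Tangent vectors have the form Q = X A + (I - X X^T) K with A skew-symmetric.
    A = rng.standard_normal((m, m))
    A = A - A.T
    K = rng.standard_normal((n, m))
    Q = X @ A + (np.eye(n) - X @ X.T) @ K
    return X, Q

def identity_residual(X, Q):
    """Max-abs difference between (I-XX^T) Q X^T Q and -(I-XX^T) Q Q^T X."""
    P = np.eye(X.shape[0]) - X @ X.T
    lhs = P @ Q @ X.T @ Q
    rhs = -P @ Q @ Q.T @ X
    return np.max(np.abs(lhs - rhs))

X, Q = random_stiefel_tangent(6, 3)
# Tangency condition: X^T Q + Q^T X should vanish up to round-off.
tangency_error = np.max(np.abs(X.T @ Q + Q.T @ X))
residual = identity_residual(X, Q)
print(tangency_error, residual)
```

Both printed values should be at the level of floating-point round-off, since the identity follows directly from X^T Q = −Q^T X.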

Referee report (LaTeX)

\textbf{Recommendation:} major revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The manuscript proposes a compelling variational framework and structure-preserving integrators on Stiefel manifolds, with potentially broad impact. However, the core continuous-time ODE in Theorem 1 has an incorrect coefficient in the normal term. Since the ODE is the basis for the equivalent decomposed system and informs the design of discretizations, this requires correction and careful propagation through the text and algorithms. With this fixed, the paper could be a strong contribution.