An entropy formula for the Deep Linear Network

Govind Menon, Tianmin Yu

correct, medium confidence

Category: Not specified
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:57 AM

Audit review

The paper proves an explicit entropy formula S(X) in four steps: it identifies O_X as an O_d^{N−1}-orbit, pulls back the ambient Frobenius metric to the orbit, shows that the pulled-back metric decomposes into (i,j)-sectors with identical (N−1)×(N−1) tridiagonal blocks, and evaluates the block determinant via Chebyshev/Jacobi matrix diagonalization. This yields vol(O_X)=c_d^{N−1}√(van(Σ^2)/van(Σ_N^2)) and S(X)=(N−1)log c_d + 1/2 log(van(Σ^2)/van(Σ_N^2)) (Theorem 4). The candidate solution follows the same geometric setup and block structure, computes the same tridiagonal determinants via a recurrence, and reaches the same volume and entropy formulas under the same genericity assumption (distinct singular values). The two arguments are the same in substance; they differ only in how the tridiagonal determinant is evaluated (recurrence vs. explicit diagonalization).
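The two ways of evaluating a tridiagonal determinant mentioned above can be compared numerically. The sketch below is only an illustration under a simplifying assumption: it uses a symmetric tridiagonal Toeplitz matrix with constant diagonal a and constant off-diagonal b, which is not claimed to be the paper's actual block. For that matrix the three-term recurrence D_k = a·D_{k−1} − b²·D_{k−2} and the product of the closed-form eigenvalues a + 2b·cos(kπ/(n+1)) give the same determinant. The function names are hypothetical.

```python
import math


def tridiag_det_recurrence(a: float, b: float, n: int) -> float:
    """Determinant of the n x n symmetric tridiagonal Toeplitz matrix
    (diagonal a, off-diagonals b) via D_k = a*D_{k-1} - b^2*D_{k-2}."""
    if n == 0:
        return 1.0  # empty determinant, D_0 = 1
    d_prev, d_curr = 1.0, a  # D_0 and D_1
    for _ in range(2, n + 1):
        d_prev, d_curr = d_curr, a * d_curr - b * b * d_prev
    return d_curr


def tridiag_det_eigen(a: float, b: float, n: int) -> float:
    """Same determinant as the product of the eigenvalues
    a + 2*b*cos(k*pi/(n+1)), k = 1..n (Chebyshev-type diagonalization)."""
    return math.prod(
        a + 2.0 * b * math.cos(k * math.pi / (n + 1)) for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Illustrative values only: a = 2, b = -1 is the discrete Laplacian,
    # chosen because its determinant is easy to check by hand.
    for n in (1, 2, 4, 7):
        print(n, tridiag_det_recurrence(2.0, -1.0, n), tridiag_det_eigen(2.0, -1.0, n))
```

With a = 2 and b = −1 the recurrence gives D_1 = 2, D_2 = 3, D_3 = 4, D_4 = 5, so both functions should agree with n + 1 up to floating-point error. The recurrence needs only O(n) arithmetic and no trigonometric functions, while the eigenvalue product exposes the spectrum explicitly.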

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

The work offers a precise and elegant geometric derivation of an explicit entropy formula for balanced deep linear networks. It leverages group orbits and Riemannian geometry to yield a determinantal expression with clear ties to random matrix theory. The technical core (an orthonormal basis and a block Jacobi analysis) is solid. Minor expository enhancements, such as motivating examples and an appendix on the determinant recurrence, would further increase accessibility.