2402.15958

On the Dynamics of Three-Layer Neural Networks: Initial Condensation

Zheng-An Chen, Tao Luo

incomplete (medium confidence)
Category
Not specified
Journal tier
Specialist/Solid
Processed
Sep 28, 2025, 12:56 AM

Audit review

The paper’s condensation theorem (Theorem 3) is essentially correct under the intended regime but its statement omits an explicit dependence on the earlier blow‑up assumption (Assumption 1), which the proof later uses (via Corollary 1) to ensure at least one coordinate diverges. The model’s proof captures the right invariants and obtains clean monotonicity/convexity, and it correctly establishes condensation when T* = ∞. However, in the finite‑time case the model argues that ||b_i|| must blow up because ||ḃ_i|| → ∞ as ||a|| → ∞, which is not justified: an unbounded, even monotone‑increasing derivative near a finite endpoint can still be integrable. The paper remedies this with detailed angle control and an explicit integral lower bound, while the model does not. Hence both are incomplete: the paper in hypothesis bookkeeping, the model in a key step of the finite‑time blow‑up case.
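The integrability point above can be made concrete with a minimal counterexample (my own illustration, not taken from the paper or the model's proof). Even if the derivative of a coordinate grows without bound as a finite time T* is approached, the coordinate itself can remain bounded:

```latex
% Suppose the derivative diverges near the finite endpoint T*:
%   \dot{b}(t) = (T^* - t)^{-1/2} \to \infty \quad \text{as } t \to T^{*-}.
% Yet b stays bounded, since the derivative is integrable on [0, T^*):
%   b(t) = b(0) + \int_0^t (T^* - s)^{-1/2}\, ds
%        = b(0) + 2\sqrt{T^*} - 2\sqrt{T^* - t}
%        \le b(0) + 2\sqrt{T^*}.
```

So the inference "||ḃ_i|| → ∞ near T* implies ||b_i|| → ∞" fails in general; some additional quantitative control, such as the paper's angle estimates and explicit integral lower bound, is needed to rule out an integrable singularity.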

Referee report (LaTeX)

\textbf{Recommendation:} minor revisions

\textbf{Journal Tier:} specialist/solid

\textbf{Justification:}

The manuscript offers a clear, technically interesting analysis of a three-variable ODE model arising from three-layer neural networks, establishing finite-time blow-up under a generic initialization condition and proving a condensation phenomenon under a final-stage condition. The novelty lies in the angular analysis and integral estimates that bridge sign/angle constraints to quantitative growth. A minor revision is needed to explicitly state the reliance of the condensation theorem on the previously assumed blow-up condition, which is used implicitly in the proofs. Clarifying this will improve correctness and readability.