Learning Time Delay Systems with Neural Ordinary Differential Equations

Xunbi A. Ji, Gábor Orosz

incomplete (medium confidence)

Category: math.DS
Journal tier: Strong Field
Processed: Sep 28, 2025, 12:56 AM

Audit review

The paper introduces a clear NODE-to-NDDE construction based on history discretization and trainable delay taps, and supports it with empirical results on the Mackey–Glass equation, including learned bifurcations and delay clustering under over-parameterization. It provides no formal theorems or proofs, only methodology and experiments. Its key components (the discretization X(t) with DM, the interpolation matrix P, the TDNN right-hand side, the loss, and the delay gradient) are explicit and internally consistent. The candidate model solution supplies plausible theoretical underpinnings: (i) a compact-set universal-approximation statement for the lifted discrete-delay vector field, (ii) finite-horizon solution closeness together with bifurcation persistence under C^1-small perturbations, and (iii) identifiability-driven delay clustering under over-parameterization. However, these arguments rely on substantial additional assumptions (e.g., C^1-uniform closeness of the learned nonlinearity over a parameter-state neighborhood for τ∈[0,2]; quantitative delay identifiability or persistent excitation) that are not derived from the paper’s setup or training protocol. Empirically, the paper’s bifurcation and clustering claims match the reported results (Hopf near 0.24; period doublings near 0.61 and 0.84; clustering of learned delays near 0 and 1), but the candidate solution’s formal claims remain conditional. Therefore, both the paper and the candidate solution are incomplete: the paper lacks proofs, and the candidate solution provides reasonable theoretical scaffolding but leaves key hypotheses unverified.
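
The history discretization and interpolation matrix named in the review can be illustrated with a minimal sketch. It assumes a uniform history grid, linear interpolation, and forward-Euler stepping, any of which may differ from the paper's actual scheme. The names interp_matrix, mackey_glass_rhs, and simulate, and the parameter values beta, gamma, and n, are hypothetical choices for illustration and are not taken from the paper. No network is trained here: the known Mackey–Glass right-hand side stands in for the TDNN.

```python
import numpy as np


def interp_matrix(delays, dt, M):
    """Linear-interpolation weights P such that P @ X approximates x(t - tau_i),
    where X = [x(t), x(t - dt), ..., x(t - M*dt)] is the discretized history.
    (Assumption: uniform grid and linear interpolation.)"""
    delays = np.asarray(delays, dtype=float)
    if np.any(delays < 0) or np.any(delays > M * dt + 1e-12):
        raise ValueError("each delay must lie in [0, M*dt]")
    P = np.zeros((delays.size, M + 1))
    for i, tau in enumerate(delays):
        s = tau / dt                      # delay measured in grid steps
        j = min(int(np.floor(s)), M - 1)  # left grid node of the bracketing cell
        w = s - j                         # fractional offset within the cell
        P[i, j] = 1.0 - w
        P[i, j + 1] = w
    return P


def mackey_glass_rhs(x_now, x_delayed, beta=2.0, gamma=1.0, n=10):
    """Mackey-Glass vector field; beta, gamma, n are illustrative values only."""
    return beta * x_delayed / (1.0 + x_delayed**n) - gamma * x_now


def simulate(tau, dt=0.01, M=200, steps=2000, x0=0.5):
    """Forward-Euler simulation using the discretized history as a shift register."""
    P = interp_matrix([tau], dt, M)
    X = np.full(M + 1, x0)                # constant initial history
    out = np.empty(steps)
    for k in range(steps):
        x_tau = (P @ X)[0]                # interpolated delayed state x(t - tau)
        x_new = X[0] + dt * mackey_glass_rhs(X[0], x_tau)
        X = np.roll(X, 1)                 # shift the history one step into the past
        X[0] = x_new                      # newest sample goes to the front
        out[k] = x_new
    return out
```

Calling simulate(1.0) returns a trajectory for a single delay of 1.0. Because each row of P depends on its delay only through the weights 1 - w and w, the interpolated state is piecewise linear in the delay. This is one simple way a delay gradient can be obtained, though the paper's own gradient derivation may take a different form.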

Referee report (LaTeX)

\textbf{Recommendation:} major revisions

\textbf{Journal Tier:} strong field

\textbf{Justification:}

A well-motivated and practically impactful NODE/NDDE framework with trainable delays is demonstrated convincingly on Mackey–Glass, including qualitative bifurcation reproduction and robust behavior under over-parameterization. However, the paper lacks theoretical guarantees (e.g., universal approximation, error bounds, bifurcation persistence, identifiability). With clearer positioning as an empirical study, and with added theoretical framing (even conditional) and expanded diagnostics, the contribution would be substantially strengthened.