Multitask Learning with Stochastic Interpolants

Hugo Negrel, Florentin Coeurdoux, Michael S Albergo, Eric Vanden-Eijnden

wrong · medium confidence · Counterexample detected

Category: math.DS
Journal tier: Specialist/Solid
Processed: Sep 28, 2025, 12:57 AM

Audit review

The paper’s diffusion claim (Proposition 2.6) hinges on Lemma 2.4, which states $\eta_0(\alpha,\beta,x) = -\alpha\, s_{\alpha,\beta}(x)$ and hence $s_{\alpha,\beta}(x) = -\alpha^{-1}\eta_0(\alpha,\beta,x)$ when $x_0 \sim N(0,\mathrm{Id})$. This omits a transpose: the correct Gaussian integration-by-parts identity is $s_{\alpha,\beta}(x) = -\alpha^{-T}\Sigma_0^{-1}\eta_0(\alpha,\beta,x)$ (with $\Sigma_0$ the covariance of $x_0$; for $\Sigma_0 = I$ this reduces to $s = -\alpha^{-T}\eta_0$). Using the paper’s incorrect $s = -\alpha^{-1}\eta_0$ leads to the SDE drift $(\dot\alpha_t - \varepsilon_t\alpha_t^{-1})\eta_0$, which does not preserve the intended marginals for non-symmetric $\alpha_t$. A concrete 2×2 counterexample shows the mismatch. By contrast, replacing $\alpha^{-1}$ by $\alpha^{-T}\Sigma_0^{-1}$ in the drift yields an SDE whose time marginals match the ODE, as the model proves via an Itô–Fokker–Planck cancellation using the corrected score identity. The paper’s ODE proposition (Proposition 2.5) is fine; the error is confined to the score identity and the diffusion Proposition 2.6. See Lemma 2.4 and its proof (eq. (26)) and Proposition 2.6 (eq. (8)) in the paper’s text and appendix for the exact statements the model corrects.
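
The role of the transpose can be checked by hand in a fully Gaussian toy case. The sketch below is not the counterexample referred to above. It assumes x = α x0 + β x1 with independent x0, x1 ~ N(0, I), so that x is Gaussian with covariance C = α αᵀ + β βᵀ, the score is s(x) = −C⁻¹x, and the conditional mean is η0(x) = E[x0 | x] = αᵀC⁻¹x. The matrices `alpha` and `beta`, the point `x`, and all helper names are hypothetical choices made for this illustration.

```python
# Toy check of the score identity for 2x2 matrices (pure Python, no dependencies).
# Assumption for this sketch: x0, x1 ~ N(0, I) independent, x = alpha x0 + beta x1.

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    """Transpose of a 2x2 matrix."""
    return [[A[j][i] for j in range(2)] for i in range(2)]

def inverse(A):
    """Inverse of an invertible 2x2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def matvec(A, v):
    """Product of a 2x2 matrix with a length-2 vector."""
    return [A[i][0] * v[0] + A[i][1] * v[1] for i in range(2)]

# Hypothetical values: alpha is invertible but deliberately non-symmetric.
alpha = [[1.0, 1.0], [0.0, 1.0]]
beta = [[1.0, 0.0], [0.0, 1.0]]
x = [1.0, 2.0]

# Covariance of x: C = alpha alpha^T + beta beta^T.
AAt = matmul(alpha, transpose(alpha))
BBt = matmul(beta, transpose(beta))
C = [[AAt[i][j] + BBt[i][j] for j in range(2)] for i in range(2)]
C_inv = inverse(C)

# Closed-form Gaussian quantities.
score = [-v for v in matvec(C_inv, x)]               # s(x) = -C^{-1} x
eta0 = matvec(matmul(transpose(alpha), C_inv), x)    # E[x0 | x] = alpha^T C^{-1} x

# Candidate identities: with the transpose, and without it.
s_with_transpose = [-v for v in matvec(inverse(transpose(alpha)), eta0)]
s_without_transpose = [-v for v in matvec(inverse(alpha), eta0)]

print("score              :", score)
print("-alpha^{-T} eta0   :", s_with_transpose)
print("-alpha^{-1} eta0   :", s_without_transpose)
```

Under these assumptions the transposed form reproduces the score up to rounding, while the untransposed form does not. The two forms coincide whenever `alpha` is symmetric, which is consistent with the review's statement that the problem appears only for non-symmetric α.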

Referee report (LaTeX)

\textbf{Recommendation:} major revisions

\textbf{Journal Tier:} specialist/solid

\textbf{Justification:}

The operator-based interpolant framework is a valuable generalization with practical implications for multitask generation. The probability-flow ODE is correctly derived. However, the main diffusion equivalence contains a critical error in the score identity (missing transpose and, in general, $\Sigma_0^{-1}$), leading to an incorrect SDE in non-symmetric cases. This undermines a central theoretical claim but is straightforward to fix. The paper will be strong after correcting the theory, updating algorithms, and clarifying assumptions.