2406.13971
Complex fractal trainability boundary can arise from trivial non-convexity
Yizhou Liu
Uncertain · medium confidence
- Category: Not specified
- Journal tier: Specialist/Solid
- Processed: Sep 28, 2025, 12:56 AM
- arXiv Links: Abstract ↗ · PDF ↗
Audit review
The paper empirically demonstrates renormalization invariance for the additive and multiplicative cosine-perturbed quadratics, deriving ε̃ = ε/b^2, λ̃ = λ/b for the additive case and ε̃ = ε, λ̃ = λ/b for the multiplicative case. It concludes that α depends only on θ+ = ε/λ^2 and θ× = ε, respectively, with a sharp empirical transition to non-zero α near θ+ = 1/(2π^2), the point at which f+ becomes non-convex (see the sketches below). These steps are supported by the paper's Methods and figures but are not presented as rigorous theorems; the paper explicitly frames its results as numerical/heuristic and discusses limitations and assumptions (e.g., the approximate independence of α from initial conditions). The candidate solution gives exact scaling conjugacies and the convexity threshold, and derives standard L-smooth descent guarantees and an explicit escape criterion, but (correctly) stops short of a general proof that α = 0 throughout the entire convex additive regime or that α > 0 always holds in the multiplicative case. Since the paper does not offer a complete proof of these global fractality claims either, and indeed acknowledges the methodological limits of its box-dimension estimation and classification heuristics, the fully rigorous statements appear to remain open as of the cutoff.
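A minimal sketch of the scaling conjugacy and convexity threshold described above. The exact normalization of the paper's additive model is not quoted here; f(x) = x^2 + ε·cos(2πx/λ) is assumed because it reproduces both the stated rescaling ε̃ = ε/b^2, λ̃ = λ/b and the stated non-convexity onset at θ+ = 1/(2π^2). All names and parameter values are illustrative.

```python
import numpy as np

# Assumed additive model (not quoted from the paper):
# f(x) = x**2 + eps * cos(2*pi*x/lam), chosen so that the convexity
# threshold lands at eps/lam**2 = 1/(2*pi**2) as the audit states.
def grad_add(x, eps, lam):
    return 2 * x - eps * (2 * np.pi / lam) * np.sin(2 * np.pi * x / lam)

def gd_trajectory(x0, eps, lam, lr, steps=50):
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - lr * grad_add(xs[-1], eps, lam))
    return np.array(xs)

# Scaling conjugacy: under x -> x/b, eps -> eps/b**2, lam -> lam/b the
# gradient scales as g -> g/b, so GD trajectories coincide up to 1/b.
b, eps, lam, lr, x0 = 3.0, 0.02, 0.5, 0.1, 1.0
traj = gd_trajectory(x0, eps, lam, lr)
traj_scaled = gd_trajectory(x0 / b, eps / b**2, lam / b, lr)
print("conjugacy residual:", np.max(np.abs(traj / b - traj_scaled)))  # ~1e-16

# Convexity threshold: f''(x) = 2 - eps*(2*pi/lam)**2 * cos(2*pi*x/lam),
# so min f'' crosses zero exactly when eps/lam**2 = 1/(2*pi**2).
print("theta_plus =", eps / lam**2, " threshold =", 1 / (2 * np.pi**2))
```

The "standard L-smooth descent guarantees" credited to the candidate solution presumably refer to the usual descent lemma; a generic statement under the same assumed model (not quoted from the paper) is:

```latex
% Descent lemma for an L-smooth objective f (standard result): one gradient
% step with learning rate \eta satisfies
\[
  f\bigl(x - \eta \nabla f(x)\bigr)
    \;\le\; f(x) - \eta\left(1 - \frac{L\eta}{2}\right)\|\nabla f(x)\|^{2},
\]
% so f decreases monotonically whenever 0 < \eta < 2/L. For the assumed
% additive model, |f''| \le 2 + 4\pi^{2}\varepsilon/\lambda^{2} supplies
\[
  L \;=\; 2 + \frac{4\pi^{2}\varepsilon}{\lambda^{2}} .
\]
```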
Referee report (LaTeX)
\textbf{Recommendation:} minor revisions\\
\textbf{Journal Tier:} specialist/solid\\
\textbf{Justification:} The manuscript demonstrates, in a minimal and well-controlled setting, how fractal trainability boundaries emerge and how their measured box dimensions correlate with a renormalization-invariant roughness parameter. The numerical evidence is extensive and limitations are acknowledged. While the results are primarily empirical (with heuristic renormalization arguments), they open a promising path for theory and for interpreting instability in optimization. Minor clarifications on assumptions and reporting would materially improve the paper without altering its conclusions.
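Both the audit and the referee report lean on box-dimension estimates of the trainability boundary. A minimal box-counting sketch over a binary converged/diverged scan of learning rates, using the same assumed additive model as above; the function names, parameter values, and scan range are illustrative, not the paper's procedure.

```python
import numpy as np

def diverges(lr, eps=0.08, lam=0.3, x0=1.0, steps=200, cap=1e6):
    """True if GD on the assumed additive model blows up at this learning rate."""
    x = x0
    for _ in range(steps):
        x -= lr * (2 * x - eps * (2 * np.pi / lam) * np.sin(2 * np.pi * x / lam))
        if abs(x) > cap:
            return True
    return False

def box_dimension(labels):
    """Box-counting dimension of the set of converged/diverged transitions.

    Counts occupied boxes of halving sizes over the label-flip indices and
    fits the slope of log N(s) against log(1/s); ~0 means a few isolated
    transitions, values approaching 1 indicate a rough (fractal-like) boundary.
    """
    pts = np.flatnonzero(np.diff(labels))  # indices where the label flips
    if len(pts) == 0:
        return float("nan")  # no boundary found in the scanned range
    sizes, counts = [], []
    s = len(labels) // 4
    while s >= 1:
        sizes.append(s)
        counts.append(len(np.unique(pts // s)))  # occupied boxes of size s
        s //= 2
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

lrs = np.linspace(0.01, 1.5, 4096)
labels = np.array([diverges(lr) for lr in lrs], dtype=int)
print("estimated boundary box dimension:", box_dimension(labels))
```

As the audit notes, such box-dimension estimates are heuristic: the fitted slope depends on the scan resolution, the range of box sizes, and the divergence criterion, which is one reason the global fractality claims remain open.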