Abstract:
Adam is one of the most widely used optimizers in modern machine learning, but its convergence theory remains incomplete, even in relatively simple convex settings. A central difficulty is that Adam tightly couples momentum and adaptive preconditioning, which hides the dissipative structure needed for a Lyapunov analysis.
In this talk, I will describe a Lyapunov-guided approach to Adam-type methods based on variable and operator splitting. In the deterministic full-batch setting, this leads to Adam-HNAG, a reformulation of Adam that combines adaptive diagonal preconditioning with a Hessian-driven correction and admits an exponentially decaying Lyapunov function. Its discrete variants have convergence guarantees for smooth convex optimization and show strong behavior on ill-conditioned problems.
I will then discuss Adam-SHANG, a stochastic extension that preserves the same structural splitting while replacing the unavailable full-gradient admissible stepsize by a computable trace-ratio rule. The resulting method converges in expectation for stochastic smooth convex optimization without imposing global monotonicity on the second-moment sequence. Numerical experiments on stochastic convex problems and deep learning tasks illustrate the predicted decay, the role of the trace-ratio stepsize, and competitive performance relative to Adam and AdamW.
Bio:
Long Chen is currently a Professor of Mathematics at the University of California, Irvine (UCI). He graduated from Nanjing University in 1997, earned a master’s degree from Peking University in 2000, and completed his Ph.D. at Pennsylvania State University in 2005 under the supervision of Professor Jinchao Xu. From 2005 to 2007, he worked as a postdoctoral fellow at the University of California, San Diego, and the University of Maryland, College Park. He joined UCI in 2007, received tenure in 2011, and was promoted to full professor in 2015.
Professor Chen’s research focuses on the numerical solution of partial differential equations, as well as broader topics in machine learning and computational mathematics. He developed the iFEM finite element software package, which has greatly facilitated the teaching and research of finite element methods. Professor Chen has published over 80 academic papers in internationally recognized journals and serves on the editorial boards of several SCI journals. His work has been consistently supported by the National Science Foundation. Additionally, he runs a WeChat account, *CAMtips*, where he shares insights on learning and research in computational and applied mathematics.
For more details, please visit Professor Chen's homepage: [https://www.math.uci.edu/~chenlong/](https://www.math.uci.edu/~chenlong/).
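Long Chen is a Professor in the Department of Mathematics at the University of California, Irvine (UCI). He graduated from Nanjing University in 1997, received a master's degree from Peking University in 2000, and completed his Ph.D. in 2005 at Pennsylvania State University under Professor Jinchao Xu. From 2005 to 2007, he held postdoctoral positions at the University of California, San Diego, and the University of Maryland, College Park. He has worked at UCI since 2007, received tenure in 2011, and was promoted to full professor in 2015.
Professor Chen's research centers on the numerical solution of partial differential equations, especially the design and analysis of finite element methods. He developed the iFEM finite element software package, which has greatly facilitated teaching and research on finite element methods. He has published more than 80 papers in internationally recognized journals and serves on the editorial boards of several SCI journals. He also founded the WeChat public account 《CAM 传习录》 (CAMtips), where he shares his experience of learning and research in computational and applied mathematics.
For more information, please visit Professor Chen's homepage: https://www.math.uci.edu/~chenlong/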
