Upcoming AMCS Events
AMCS Colloquium
Monday, February 10, 2025 – 11:00am
Yaoyu Zhang, Institute of Natural Sciences and the School of Mathematical Sciences, Shanghai Jiao Tong University
Abstract: Condensation (also known as quantization, weight clustering, or alignment) is a widely observed phenomenon in which neurons in the same layer tend to align with one another during the nonlinear training of deep neural networks (DNNs). It is a key characteristic of the feature learning process of neural networks. However, because of the strongly nonlinear nature of this phenomenon, establishing a theoretical understanding of it remains challenging. In this talk, I will present our systematic efforts over recent years to tackle this challenge. First, I will present results on identifying the dynamical regimes of condensation in the infinite-width limit, where small initialization is crucial. Then, I will discuss the mechanism of condensation at the initial stage of training and the global loss landscape structure underlying condensation at later stages, highlighting the prevalence of condensed critical points and global minimizers. Finally, I will present results on the quantification of condensation and its generalization advantage, including a novel estimate of sample complexity in the best possible scenario. These results underscore the effectiveness of the phenomenological approach to understanding DNNs, paving the way for a deeper understanding of deep learning in the near future.
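To make the condensation phenomenon concrete, below is a minimal illustrative sketch (not taken from the talk, and not the speaker's method): a small two-layer ReLU network is trained with deliberately small initialization on a toy 1-D regression task, and the mean absolute cosine similarity between the input-weight vectors of hidden neurons is tracked. If condensation occurs, that similarity should rise toward 1 as neurons align. All names, hyperparameters, and the toy target here are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression data (illustrative assumption)
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)   # (n, 1)
y = np.sin(np.pi * X).ravel()                    # (n,)

# Two-layer ReLU net: f(x) = sum_k a_k * relu(w_k * x + b_k)
m = 20          # hidden width
scale = 1e-2    # small initialization, the regime where condensation is expected
w = scale * rng.standard_normal(m)   # input weights
b = scale * rng.standard_normal(m)   # biases
a = scale * rng.standard_normal(m)   # output weights

lr, steps = 0.05, 20000
n = X.shape[0]

def mean_abs_cosine(w, b):
    """Mean |cosine similarity| between the (w_k, b_k) directions of all neuron pairs."""
    V = np.stack([w, b], axis=1)                                    # (m, 2)
    V = V / (np.linalg.norm(V, axis=1, keepdims=True) + 1e-12)
    C = np.abs(V @ V.T)
    iu = np.triu_indices(m, k=1)
    return C[iu].mean()

for t in range(steps):
    pre = X * w + b                     # (n, m) pre-activations
    h = np.maximum(pre, 0.0)            # ReLU
    out = h @ a                         # (n,) network output
    err = out - y

    # Gradients of 0.5 * mean squared error (full-batch gradient descent)
    grad_a = h.T @ err / n
    dh = np.outer(err, a) * (pre > 0)
    grad_w = (dh * X).sum(axis=0) / n
    grad_b = dh.sum(axis=0) / n

    a -= lr * grad_a
    w -= lr * grad_w
    b -= lr * grad_b

    if t % 5000 == 0:
        print(f"step {t:6d}  loss {0.5 * (err ** 2).mean():.5f}  "
              f"mean |cos| between neurons {mean_abs_cosine(w, b):.3f}")
```

With a larger initialization scale, the same script would be expected to show weaker alignment, which is one simple way to probe the role of small initialization discussed in the talk.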
Short Bio: Yaoyu Zhang obtained his Ph.D. in Mathematics from Shanghai Jiao Tong University in 2016. Between 2016 and 2020, he conducted postdoctoral research at New York University Abu Dhabi, the Courant Institute, and the Institute for Advanced Study in Princeton. His research centers on the theoretical underpinnings of deep learning, with particular emphasis on nonlinear training dynamics and the analysis of the global loss landscape.