deep-learning-theory

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Phenomenology of Double Descent in Finite-Width Neural Networks

Analytic Insights into Structure and Rank of Neural Network Hessian Maps