Speaker
András György
(DeepMind)
Description
Using representations learned by large pretrained models, also called foundation models, on new tasks with little data has been successful across a wide range of machine learning problems. In particular, recent results in the literature show that representations learned by a single classifier trained over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. In this talk, I will provide a theoretical explanation for this behavior based on the recently observed phenomenon that the features learned by overparameterized classification networks exhibit an interesting clustering property, called neural collapse.
Based on joint work with Tomer Galanti and Marcus Hutter.
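Neural collapse is commonly quantified by comparing within-class to between-class feature variability and by checking that the centered class means are nearly equinorm and equiangular. The sketch below (not from the talk; the data and function names are illustrative placeholders) shows one way such statistics are typically computed from penultimate-layer features.

```python
# Minimal sketch of neural-collapse statistics, assuming `features` holds
# penultimate-layer activations of a trained classifier. The random data at
# the bottom is a placeholder; with genuinely collapsed features the
# within/between ratio (NC1) approaches 0 and the cosine spread shrinks.
import numpy as np

def neural_collapse_stats(features, labels):
    """features: (n_samples, d) array; labels: (n_samples,) integer array."""
    classes = np.unique(labels)
    global_mean = features.mean(axis=0)
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered_means = class_means - global_mean

    # NC1: within-class variability relative to between-class variability.
    within = np.mean([
        np.mean(np.sum((features[labels == c] - class_means[i]) ** 2, axis=1))
        for i, c in enumerate(classes)
    ])
    between = np.mean(np.sum(centered_means ** 2, axis=1))
    nc1 = within / between

    # NC2 proxy: spread of pairwise cosines between centered class means
    # (for a simplex ETF all pairwise cosines equal -1/(K-1)).
    normed = centered_means / np.linalg.norm(centered_means, axis=1, keepdims=True)
    cosines = normed @ normed.T
    off_diag = cosines[~np.eye(len(classes), dtype=bool)]
    return nc1, off_diag.std()

# Placeholder data: class means plus small noise, mimicking collapsed features.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
means = rng.normal(size=(10, 64))
features = means[labels] + 0.1 * rng.normal(size=(1000, 64))
print(neural_collapse_stats(features, labels))
```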
Primary author
András György
(DeepMind)