This talk is part of the NLP Seminar Series.

Interpreting Training

Naomi Saphra, Harvard University
Date: 11:00am - 12:00pm, May 2nd, 2024
Venue: Room 287, Gates Computer Science Building

Abstract

For years, both learning theory and the empirical science of deep learning used multi-pass training on small image classification datasets as a primary testbed and source of inspiration. As a result, our understanding of models and training has largely taken the form of smooth, simple, and continuous laws. Recently, the machine learning community has begun considering textual data and other settings that test discrete reasoning. Observations of training in these environments have revealed discontinuous training dynamics, challenged assumptions about the economy of representations, and highlighted the often-neglected role of the random seed. This talk will focus on understanding the nuances of training in a wider range of settings. I will begin by discussing substantial phase transitions during masked language model pretraining, and how we can combine them with perspectives from evolutionary biology to repair the epistemology of mechanistic approaches to interpretability. Then, I will present recent results illuminating training through the historically neglected impact of random seeds. The first of these findings is that, for text classification, in contrast with previous results in image classification, different fine-tuning seeds can lead to different loss surface basins that provide different generalization heuristics. Finally, I will discuss an unsupervised approach to discovering and visualizing random variation in training and its influence on the rate of convergence and spontaneous generalization. Overall, these results can support a complex and nuanced new science of deep learning.

Bio

Naomi Saphra is a research fellow at the Kempner Institute at Harvard University. She is interested in language model training dynamics: how models learn to encode linguistic patterns or other structure, how generalization develops, and how we can introduce useful inductive biases into the training process. She has a particular interest in applying models from evolutionary biology to understand neural networks. Recently, Dr. Saphra has become interested in fish. Previously, she earned a PhD from the University of Edinburgh on Training Dynamics of Neural Language Models; worked at NYU, Google and Facebook; attended Johns Hopkins and Carnegie Mellon University; and won multiple awards for being the best disabled person. Outside of research, she plays roller derby under the name Gaussian Retribution, performs standup comedy, and shepherds other disabled programmers into the world of code dictation.