This talk is part of the NLP Seminar Series.

Interpreting Transformer LMs: Finding Functions and Facts, and a Fabric

David Bau, Northeastern University
Date: June 6, 2024, 11:00am - 12:00pm
Venue: Room 287, Gates Computer Science Building

Abstract

Deep networks are built on the intuition that generalization arises from layers of computation that can be reused across situations. In this talk we discuss recent work on interpreting and understanding the explicit structure of learned computations within large deep network models. We examine where factual knowledge is localized within transformer LMs, and discuss how these insights can be used to edit the behavior of LLMs and multimodal diffusion models. We then turn to recent findings on the structure of the computations underlying in-context learning, and the insights they yield about how functions are represented and composed within LLMs. Finally, time permitting, we discuss the technical challenges of doing interpretability research in a world where the most powerful models are available only via API, and we describe a National Deep Inference Fabric that will offer an open API standard enabling transparent scientific research on large-scale AI.

Bio

David Bau is an assistant professor at the Khoury College of Computer Sciences at Northeastern University. He is a pioneer in deep network interpretability and model editing methods for large-scale AI systems such as large language models and image-synthesis diffusion models. He leads an effort to create a National Deep Inference Fabric.