This talk is part of the NLP Seminar Series.

Beyond Monolithic Language Models

Weijia Shi, University of Washington
Date: Oct 17, 2024, 11:00am - 12:00pm PT
Venue: Room 287, Gates Computer Science Building

Abstract

The current practice of building language models focuses on training huge monolithic models on large-scale data. While scaling up models has been the primary focus of model development, scale alone does not address several critical issues: LMs often hallucinate, are difficult to update with new knowledge, and pose copyright and privacy risks. In this talk, I will explore factors important for building models beyond scale. First, I will discuss augmented LMs, models that access external data or tools during inference to improve reliability. I will then turn to modular models that maintain data provenance to support takedown requests and unlearning.

Bio

Weijia Shi is a Ph.D. student at the University of Washington. Her research focuses on LM pretraining and retrieval-augmented models. She also studies multimodal reasoning and investigates copyright and privacy risks associated with LMs. She won an Outstanding Paper Award at ACL 2024 and was recognized as a Machine Learning Rising Star in 2023.