The current practice of building language models focuses on training huge monolithic models on large-scale data. While scaling has been the primary focus of model development, it alone does not address several critical issues: LMs often hallucinate, are difficult to update with new knowledge, and pose copyright and privacy risks. In this talk, I will explore factors important for building models beyond scale. I will first discuss augmented LMs—models that access external data or tools during inference to improve reliability. Next, I will explore modular models that maintain data provenance to support takedown requests and unlearning.
Weijia Shi is a Ph.D. student at the University of Washington. Her research focuses on LM pretraining and retrieval-augmented models. She also studies multimodal reasoning and investigates copyright and privacy risks associated with LMs. She won an outstanding paper award at ACL 2024 and was recognized as a Machine Learning Rising Star in 2023.