This talk is part of the NLP Seminar Series.

The Challenges of Relying on Language Models Trained on Non-Public Data

Daphne Ippolito, Carnegie Mellon University
Date: 11:00am – 12:00pm, Feb 1st, 2024
Venue: Room 287, Gates Computer Science Building

Abstract

Many of today's widely used language models are trained on non-public datasets. Often, model trainers release only limited information about the training data curation process, for example, the proportions of different source domains (news, code, Wikipedia, etc.) or whether quality filters were applied. This overall lack of transparency into training data is problematic because, as we show in recent research, the decisions made during data curation can have a significant impact on model capabilities. The lack of transparency can also be circumvented: we show that, through prompt-based data extraction attacks, popular language models can be made to output hundreds of thousands of tokens of memorized sequences from their training data. Memorized content can be extracted even when inference-time filtering methods are employed to prevent this. Our findings have ramifications both for NLP researchers and for real-world language model use cases, such as creative writing assistants, where users see novelty and the avoidance of plagiarism as crucial attributes.

Bio

Daphne Ippolito recently started as an assistant professor in the Language Technologies Institute at CMU. Her research interests include the adversarial robustness of natural language generation systems, the impact of training data selection on language model performance, and strategies for making language models more useful for creative writing applications. Daphne is also a senior research scientist at Google DeepMind, and before that, she studied at the University of Pennsylvania and the University of Toronto.