This talk is part of the NLP Seminar Series.

Dissociating Language and Thought in Large Language Models

Anna Ivanova, MIT
Date: November 16th, 2023, 11:00am - 12:00pm
Venue: Room 287, Gates Computer Science Building

Abstract

Today's large language models (LLMs) routinely generate coherent, grammatical, and seemingly meaningful paragraphs of text. This achievement has led to speculation that LLMs have become "thinking machines", capable of performing tasks that require reasoning and/or world knowledge. In this talk, I will introduce a distinction between formal competence (knowledge of linguistic rules and patterns) and functional competence (understanding and using language in the world). This distinction is grounded in human neuroscience, which shows that formal and functional competence recruit different cognitive mechanisms. I will show that the word-in-context prediction objective has allowed LLMs to essentially master formal linguistic competence; however, pretrained LLMs still lag behind in many aspects of functional linguistic competence, prompting engineers to adopt specialized fine-tuning techniques and/or couple an LLM with external modules. I will illustrate the formal-functional distinction using the domains of English grammar and arithmetic, respectively. I will then turn to generalized world knowledge, a domain where this distinction is much less clear-cut, and discuss our efforts to leverage both cognitive science and NLP to develop systematic ways to probe generalized world knowledge in text-based LLMs. Overall, the formal/functional competence framework clarifies the discourse around LLMs, helps develop targeted evaluations of their capabilities, and suggests ways to develop better models of real-life language use.

Bio

Anna (Anya) Ivanova is a Postdoctoral Associate at the MIT Quest for Intelligence and an incoming Assistant Professor in the School of Psychology at Georgia Tech (starting January 2024). She holds a PhD from MIT's Department of Brain and Cognitive Sciences, where she studied the neural mechanisms underlying language processing in humans. Today, Anya is examining the language-thought relationship not only in human brains but also in large language models, using her cognitive science training to identify similarities and differences between people and machines.