For the first time in history, we have a system other than the human brain that can generate fluent language: large language models (LLMs). Do the human brain and LLMs converge on shared representations and computations, and if so, what can LLMs tell us about the nature of human linguistic representations? In this talk, I will first characterize the human language network, a set of frontal and temporal brain regions that causally support language processing. I will then show that the alignment between the human language network and LLMs is strong enough that LLMs can non-invasively modulate language responses in the human brain. Moreover, this alignment can be attributed to small sets of interpretable LLM-based features, providing insight into the main axes of brain activity when humans comprehend sentences. Finally, I will turn to the question of how such linguistic representations can emerge from the messy acoustic signals that humans actually receive. I will introduce AuriStream, a self-supervised textless NLP model that learns from continuous speech and shows that linguistic structure can emerge without prespecified text tokens, given the right temporal predictive learning objective. Together, these results position LLMs as tools for characterizing the representational principles in the human language network, and AuriStream as a step toward understanding which inductive biases can give rise to human-like language representations from raw speech.
Greta Tuckute is a Research Fellow at the Kempner Institute for Natural and Artificial Intelligence at Harvard University. She received her PhD from MIT's Department of Brain and Cognitive Sciences in 2025.