This talk is part of the NLP Seminar Series.

Art or Artifice? Large Language Models and the False Promise of Creativity

Tuhin Chakrabarty, Columbia University
Date: 11:00am - 12:00pm, October 12th, 2023
Venue: Room 287, Gates Computer Science Building; Zoom (link hidden)

Abstract

Researchers have argued that large language models (LLMs) exhibit high-quality writing capabilities, from blogs to stories. However, objectively evaluating the creativity of a piece of writing is challenging. Inspired by the Torrance Test of Creative Thinking (TTCT), which measures creativity as a process, we use the Consensual Assessment Technique and propose the Torrance Test of Creative Writing (TTCW) to evaluate creativity as a product. TTCW consists of 14 binary tests organized into the original dimensions of Fluency, Flexibility, Originality, and Elaboration. We recruit 10 creative writing experts and implement a human assessment of 48 stories written either by professional authors or LLMs using TTCW. Our analysis shows that LLM-generated stories pass 3-10X fewer TTCW tests than stories written by professionals. In addition, we explore the use of LLMs as assessors to automate the TTCW evaluation, revealing that none of the LLMs positively correlate with the expert assessments.

Bio

I am a final-year PhD candidate in Computer Science at Columbia University. Within the department, I am part of the Natural Language Processing group, where I am advised by Smaranda Muresan. My thesis is also advised by Kathleen McKeown, Yejin Choi, Violet Peng, and Lydia Chilton. My research is supported by the Columbia Center of Artificial Intelligence & Technology (CAIT) & Amazon Science Ph.D. Fellowship. During 2021-2022, I was a Computational Journalism fellow at NYTimes R&D. My research interests are broadly in Natural Language Processing and Machine Learning, with a special focus on Human-Centered Methods for Creative Understanding, Generation, and Evaluation. My overarching research question centers on how we can make large language models better for creative tasks.