This talk is part of the NLP Seminar Series.

Must NLP be extractive?

Steven Bird, Charles Darwin University
Date: 11:00am - 12:00pm, May 9th 2024
Venue: Room 287, Gates Computer Science Building

Abstract

Symbolic and sub-symbolic NLP are founded on Western epistemologies of language, in particular, on language-as-lexicogrammatical-code and on language-as-data. However, outside the world's ~500 institutional languages, there are a further 6,500 languages with primary orality. Here, many NLP/AI researchers and large technology companies seek to support the 'next thousand languages' and deliver the standard suite of technologies centred on written language, such as speech-to-text and machine translation, and to extract substantial quantities of primary linguistic data in the process. I observe that such practices typically do not meet the requirements for prior informed consent or for the self-determination of Indigenous peoples, and that an ethical approach needs to take seriously local purposes linked to cultural survival, and local epistemologies including language-as-situated-and-embodied-communication. In this talk, I report on a five-year study conducted in an Australian Aboriginal community, and how this gave rise to non-extractive designs for language technologies and an agency-enhancing language technology design pattern. This proposal represents a return to an older understanding of AI, not replicating but augmenting human information processing capabilities.

Bio

Over the past 3+ decades, Steven Bird has been working with minoritised people groups, and developing ways to keep oral languages and cultures strong, including fieldwork in Africa, Melanesia, Amazonia, and Australia. He has held academic appointments at Edinburgh, UPenn, Berkeley, and Melbourne. Since 2017, Steven has been a research professor at Charles Darwin University, where he directs the Top End Language Lab, http://language-lab.cdu.edu.au. He pursues other language-related projects at http://aikuma.org.