There is a usability gap between manipulation-capable robots and helpful in-home digital agents. Dialog-enabled smart assistants have recently seen widespread adoption, but these cannot move or manipulate objects. By contrast, manipulation-capable and mobile robots are still largely deployed in industrial settings and do not interact with human users. Language-enabled robots can bridge this gap: natural language interfaces help robots and non-experts collaborate to achieve their goals. Navigation in unexplored environments to high-level targets like "Go to the room with a plant" can be facilitated by enabling agents to ask questions and react to human clarifications on the fly. Further, high-level instructions like "Put a plate of toast on the table" require inferring many steps, from finding a knife to operating a toaster. Low-level instructions can serve to clarify these individual steps. Through two new datasets and accompanying models, we study human-human dialog for cooperative navigation, and high- and low-level language instructions for cooking, cleaning, and tidying in interactive home environments. These datasets are a first step towards collaborative, dialog-enabled robots helpful in human spaces.
Jesse is starting as an Assistant Professor at the University of Southern California in fall 2021, and is currently hanging out at Amazon Alexa AI for a year. Recently, he was a postdoctoral researcher working with Luke Zettlemoyer at the University of Washington. His research focuses on language grounding and natural language processing applications for robotics (RoboNLP). Key to this work is using dialog with humans to facilitate both robot task execution and learning to enable lifelong improvement of robots’ language understanding capabilities. He has encouraged work in RoboNLP through workshop organization at NLP, robotics, and vision conference venues.