This talk is part of the NLP Seminar Series.

Sequence to Sequence Learning: Fast Training and Inference with Gated Convolutions

Michael Auli, Facebook AI Research
joint work with Sergey Edunov, Myle Ott, Jonas Gehring, Angela Fan, Denis Yarats, David Grangier, Yann N. Dauphin, Marc'Aurelio Ranzato
Date: Mar 08, 2018
Time: 11:00am - 12:00pm
Venue: Room 219, Gates Computer Science Building

Abstract

Neural architectures for machine translation and language modeling are currently a very active area of research. I will first describe several architectural changes to the original work of Bahdanau et al. (2014): we replace standard non-linearities with our novel gated linear units, we replace recurrent units with convolutions, and we introduce multi-hop attention. The second part of the talk deals with ways to train models at the sequence level in order to avoid exposure bias. The final part of the talk analyses common issues such as performance degradation with larger beams and the under-estimation of rare words. I will relate some of these challenges to uncertainty in the data and discuss how uncertainty affects search, as well as to what extent the model distribution matches the data distribution.
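As a concrete illustration of the gated linear units mentioned above, here is a minimal PyTorch sketch of a gated convolution in the spirit of Dauphin et al.: a single convolution produces both candidate values and gates, and the layer outputs (X*W + b) ⊗ sigmoid(X*V + c). The class name GatedConv1d and its hyperparameters are illustrative, not taken from the talk or from the fairseq implementation, which adds further details such as weight normalization and causal padding in the decoder.

    import torch
    import torch.nn as nn

    class GatedConv1d(nn.Module):
        """Illustrative gated convolution: out = A * sigmoid(B), where A and B
        are the two halves of one convolution's output channels."""
        def __init__(self, in_channels, out_channels, kernel_size):
            super().__init__()
            # One convolution computes both the candidate values and the gates.
            self.conv = nn.Conv1d(in_channels, 2 * out_channels, kernel_size,
                                  padding=kernel_size // 2)

        def forward(self, x):  # x: (batch, channels, time)
            a, b = self.conv(x).chunk(2, dim=1)
            return a * torch.sigmoid(b)  # same as torch.nn.functional.glu

    # Usage: length-preserving for odd kernel sizes.
    x = torch.randn(8, 256, 20)
    y = GatedConv1d(256, 256, kernel_size=3)(x)  # shape (8, 256, 20)

Keeping the value path A linear, rather than squashing it through a tanh as in LSTM-style gating, is what makes the unit a gated *linear* unit and eases gradient flow through deep convolutional stacks.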

Bio

Michael Auli is a research scientist at Facebook AI Research in Menlo Park. He received his PhD from the University of Edinburgh, where he worked on CCG parsing advised by Adam Lopez and Philipp Koehn. He did his postdoc at Microsoft Research, where he worked on neural machine translation and neural dialogue models. He received an Outstanding Paper award at EMNLP 2011. Currently, Michael works on machine learning and its application to natural language processing; he is particularly interested in text generation. http://michaelauli.github.io