Linguistic Problems for Statistical Machine Translation

Philipp Koehn, University of Edinburgh

Abstract

Statistical methods have taken over the field of machine translation in recent years, at least in the area of academic research. While automatically learning how to translate is quick and easy (we recently built systems for 462 different language pairs), we find that translation performance differs widely across language pairs. In particular, divergent syntactic structure and rich target-side morphology pose serious problems for current approaches. In this talk, we review some of our work on translation between German and English, where both of these problems are present.