Better Training in MT
The weights of a translation system’s decoding model are usually set using Minimum Error Rate Training (MERT), a procedure that optimizes the system’s performance on an automated measure of translation quality. In our lab, we have developed improved algorithms for performing MERT (Cer et al. 2008). We have also studied the consequences of training to different automated translation evaluation metrics. Surprisingly, we found that training to different popular evaluation metrics based on word sequence matching, such as BLEU, TER, and METEOR, did not seem to have a reliable impact on human preferences for the resulting translations (Cer et al. 2010). However, preliminary results suggest that training to our textual-entailment-based evaluation metric, which performs a deep semantic analysis of the translations being evaluated, may in fact produce better translation performance (Pado et al. 2009). We are continuing to investigate the feasibility and effectiveness of training to evaluation metrics that perform a deeper semantic and syntactic analysis of the translations being evaluated.
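To make the idea of tuning decoder weights against an automated metric concrete, here is a minimal sketch. It is not the algorithm from Cer et al. (2008), and it does not use the exact line search of standard MERT. It substitutes plain random search over weight vectors, rescoring fixed n-best lists, and uses a toy error measure in place of BLEU, TER, or METEOR. All names (`sentence_error`, `decode`, `corpus_error`, `tune_weights`) and the example data are hypothetical.

```python
import random

def sentence_error(hyp, ref):
    """Toy error measure (an assumption, standing in for a real metric):
    the fraction of reference tokens that do not appear in the hypothesis."""
    hyp_tokens = set(hyp.split())
    ref_tokens = ref.split()
    missing = sum(1 for tok in ref_tokens if tok not in hyp_tokens)
    return missing / len(ref_tokens)

def decode(nbest, weights):
    """Return the n-best candidate with the highest weighted feature score."""
    return max(
        nbest,
        key=lambda cand: sum(w * f for w, f in zip(weights, cand["features"])),
    )

def corpus_error(nbest_lists, refs, weights):
    """Average toy error of the top-scoring candidates under `weights`."""
    total = sum(
        sentence_error(decode(nbest, weights)["text"], ref)
        for nbest, ref in zip(nbest_lists, refs)
    )
    return total / len(refs)

def tune_weights(nbest_lists, refs, dim, trials=200, seed=0):
    """Random search over weight vectors; a simplified stand-in for MERT."""
    rng = random.Random(seed)
    best_w = [1.0] * dim
    best_err = corpus_error(nbest_lists, refs, best_w)
    for _ in range(trials):
        cand_w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        err = corpus_error(nbest_lists, refs, cand_w)
        if err < best_err:
            best_w, best_err = cand_w, err
    return best_w, best_err

# Hypothetical tuning set: one sentence, two candidates, two features each.
nbest_lists = [[
    {"text": "the cat sat", "features": [1.0, 0.0]},
    {"text": "cat the sit", "features": [0.0, 2.0]},
]]
refs = ["the cat sat"]

baseline_err = corpus_error(nbest_lists, refs, [1.0, 1.0])
tuned_w, tuned_err = tune_weights(nbest_lists, refs, dim=2)
```

With uniform weights the second feature dominates and the worse candidate is selected; the search looks for weights under which the candidate closer to the reference scores highest. Swapping `sentence_error` for a different metric changes what the tuned weights favor, which is the effect the metric comparison above studies.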
Chinese MT
Our work also focuses on improving Chinese-to-English translation
using deep source-side linguistic analysis. In our Chinese-English
system, we train a classifier to categorize each occurrence of 的
(DE) according to its syntactic and semantic context. We use this
classifier to preprocess MT data by explicitly labeling 的
constructions, as well as reordering phrases. Our Chinese-English
system also uses typed dependencies identified in the source sentence
to improve a lexicalized phrase reordering model. Finally, we have
improved the segmentation consistency of our Chinese
word segmenter, a characteristic that is often desirable in MT.
These three components all show significant gains in translation
performance, and are described in (Chang et al., 2009a),
(Chang et al., 2009b), and (Chang et al., 2008), respectively.
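The 的 labeling step can be pictured as a token-level rewrite of the source sentence before it reaches the MT system. The sketch below is only an illustration under stated assumptions: the label names `POSS` and `MOD`, the rule-based `toy_de_classifier`, and the function `label_de` are all hypothetical, and they stand in for the trained classifier and label set described in the cited paper. Phrase reordering is not shown.

```python
from typing import Callable, List

DE = "的"

def toy_de_classifier(tokens: List[str], i: int) -> str:
    """Hypothetical stand-in for a trained classifier.

    Labels 的 as possessive when it follows a pronoun, otherwise as a
    generic modifier marker. A real system would use richer syntactic
    and semantic context."""
    pronouns = {"我", "你", "他", "她", "我们"}
    prev = tokens[i - 1] if i > 0 else ""
    return "POSS" if prev in pronouns else "MOD"

def label_de(tokens: List[str],
             classify: Callable[[List[str], int], str]) -> List[str]:
    """Replace each 的 token with a labeled variant such as 的_POSS."""
    labeled = []
    for i, tok in enumerate(tokens):
        if tok == DE:
            labeled.append(f"{DE}_{classify(tokens, i)}")
        else:
            labeled.append(tok)
    return labeled

# Example on a pre-segmented sentence ("my book").
labeled_sentence = label_de(["我", "的", "书"], toy_de_classifier)
```

Because the labels are attached to the token itself, downstream components such as word alignment and phrase extraction can treat each class of 的 as a distinct word without any change to their own code.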
Arabic MT
Although Arabic-to-English translation quality has improved significantly in recent years, pervasive problems remain. One of them is the reordering of verb-initial clauses, especially matrix clauses, during translation. We have recently developed a high-precision Arabic subject detector that can be integrated into phrase-based translation pipelines (Green et al., 2009). A characteristic feature of our work is the decision to influence decoding directly instead of reordering the Arabic input prior to translation. We have also created a state-of-the-art Arabic parser that can be used for a variety of MT tasks.
People
NIST Evaluations
Our group has participated in two NIST Open MT
evaluations. We submitted one Chinese-English system in 2008,
which ranked 8th out of 20 institutions, and one
Arabic-English system in 2009, which ranked 2nd out of 13
institutions.
Descriptions of our NIST systems: