More details on Phrasal discriminative features

From NLPWiki

Latest revision as of 16:14, 16 January 2014

Various discriminative features can be specified under the [additional-featurizers] section in Phrasal *.ini files.

Discriminative Phrase Table

edu.stanford.nlp.mt.decoder.feat.sparse.DiscriminativePhraseTable(arg1,arg2,arg3)

 "arg1" --  true/false, whether to use lexicalized features or not (default=true)
 "arg2" --  true/false, whether to use class-based features or not (default=false)
 "arg3" --  int, threshold on feature counts (default=-1, no thresholding)
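
As a rough sketch of how this might look in an *.ini file, the fragment below enables the featurizer with the defaults listed above written out explicitly. The one-entry-per-line layout under the section header and the "#" comment syntax are assumptions about the *.ini format, not something stated on this page.

```
[additional-featurizers]
# Assumed layout: one featurizer per line. Values are the documented defaults:
# lexicalized features on, class-based features off, no count thresholding.
edu.stanford.nlp.mt.decoder.feat.sparse.DiscriminativePhraseTable(true,false,-1)
```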

Discriminative Alignments

edu.stanford.nlp.mt.decoder.feat.sparse.DiscriminativeAlignments(arg1,arg2,arg3)

 "arg1" --  true/false, whether to add source deletions or not (default=false)
 "arg2" --  true/false, whether to add target insertions or not (default=false)
 "arg3" --  true/false, whether to use class-based features or not (default=false)
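
A hedged example of a non-default configuration: the fragment below turns on both source deletions and target insertions while leaving class-based features off. As before, the one-entry-per-line layout and the "#" comment syntax are assumptions about the *.ini format.

```
[additional-featurizers]
# Assumed layout: one featurizer per line.
# arg1=true (source deletions), arg2=true (target insertions), arg3=false (no class-based features)
edu.stanford.nlp.mt.decoder.feat.sparse.DiscriminativeAlignments(true,true,false)
```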

Lexical Reordering

edu.stanford.nlp.mt.decoder.feat.base.LexicalReorderingFeaturizer(args), where each argument can take one of the following values:

 "conditionOnConstellations"
 "classes"
 "useClasses" --  use word classes
 "countFeatureIndexN" -- N is an int, threshold on feature counts (N<0, no thresholding)
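
A minimal sketch of one possible entry, passing only the "useClasses" option. This assumes the option names are passed literally as arguments, in the same parenthesized form used by the featurizers above; the section layout and "#" comment syntax are likewise assumptions.

```
[additional-featurizers]
# Assumed form: option names passed literally as arguments.
edu.stanford.nlp.mt.decoder.feat.base.LexicalReorderingFeaturizer(useClasses)
```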