edu.stanford.nlp.trees.international.pennchinese
Class ChineseEscaper

java.lang.Object
  extended by edu.stanford.nlp.trees.international.pennchinese.ChineseEscaper
All Implemented Interfaces:
Function<List<HasWord>,List<HasWord>>, Serializable

public class ChineseEscaper
extends Object
implements Function<List<HasWord>,List<HasWord>>

An Escaper that normalizes Chinese text to match the Penn Chinese Treebank (CTB). It currently converts "ASCII" characters into the full-width range used in the treebank.

Notes: Smart quotes appear in CTB and are left unchanged. Various hyphen types from the U+2000 range probably occur as well; certainly, Roger lists them in LanguagePack.
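
The normalization described above can be illustrated with a minimal, self-contained sketch. It is not the ChineseEscaper source: the class name FullWidthSketch and the method toFullWidth are hypothetical, and the sketch assumes that every printable ASCII character from U+0021 to U+007E is shifted to its counterpart in the Unicode full-width block (U+FF01 to U+FF5E), while the space character and all non-ASCII characters are left alone. The real class may treat some characters differently.

```java
// Illustrative sketch only; FullWidthSketch and toFullWidth are hypothetical names,
// not part of the Stanford NLP API.
final class FullWidthSketch {

    // Distance between ASCII '!' (U+0021) and FULLWIDTH EXCLAMATION MARK (U+FF01).
    private static final int OFFSET = 0xFF01 - 0x21;

    // Maps printable ASCII (U+0021..U+007E) to the full-width block.
    // Assumption: space and every other character pass through unchanged.
    static String toFullWidth(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x21 && c <= 0x7E) {
                sb.append((char) (c + OFFSET));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the full-width form of the argument-free sample string.
        System.out.println(toFullWidth("CTB 5.1"));
    }
}
```

Because the two ranges are laid out in the same order, a single constant offset is enough and no lookup table is needed.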

Author:
Christopher Manning
See Also:
Serialized Form

Constructor Summary
ChineseEscaper()
           
 
Method Summary
 List<HasWord> apply(List<HasWord> arg)
          Note: At present this clobbers the input list items, overwriting them in place.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChineseEscaper

public ChineseEscaper()
Method Detail

apply

public List<HasWord> apply(List<HasWord> arg)
Note: At present this clobbers the input list items: the items passed in are overwritten in place, so the caller's original words are lost. This should be fixed.

Specified by:
apply in interface Function<List<HasWord>,List<HasWord>>
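
Until the in-place behavior noted above is fixed, a caller who still needs the original words can copy the items before handing them over. The sketch below shows that pattern in isolation, under stated assumptions: Token is a hypothetical stand-in for a mutable item with word() and setWord(String) accessors, and escapeInPlace is a hypothetical transform that mimics a function overwriting its input (upper-casing is used only as a placeholder for the real normalization). None of these names belong to the Stanford NLP API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; every name in this class is hypothetical.
final class DefensiveCopySketch {

    // Stand-in for a mutable token exposing word() and setWord(String).
    static final class Token {
        private String word;
        Token(String word) { this.word = word; }
        String word() { return word; }
        void setWord(String word) { this.word = word; }
    }

    // Mimics a transform that overwrites each item and returns the same list.
    // Upper-casing is a placeholder for the actual normalization.
    static List<Token> escapeInPlace(List<Token> tokens) {
        for (Token t : tokens) {
            t.setWord(t.word().toUpperCase(Locale.ROOT));
        }
        return tokens;
    }

    // Builds a new list holding a fresh Token for every input item.
    static List<Token> deepCopy(List<Token> tokens) {
        List<Token> copy = new ArrayList<>(tokens.size());
        for (Token t : tokens) {
            copy.add(new Token(t.word()));
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Token> original = new ArrayList<>();
        original.add(new Token("abc"));

        // Transform a copy so the original items keep their words.
        List<Token> escaped = escapeInPlace(deepCopy(original));
        System.out.println(original.get(0).word() + " / " + escaped.get(0).word());
    }
}
```

Copying the list alone (for example with new ArrayList<>(original)) would not help, because both lists would still share the same item objects; each item has to be duplicated.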


Stanford NLP Group