edu.stanford.nlp.trees.international.pennchinese
Class ChineseEscaper
java.lang.Object
edu.stanford.nlp.trees.international.pennchinese.ChineseEscaper
- All Implemented Interfaces:
- Function<List<HasWord>,List<HasWord>>, Serializable
public class ChineseEscaper
- extends Object
- implements Function<List<HasWord>,List<HasWord>>
An Escaper that normalizes Chinese text to match Treebank conventions.
Currently it maps "ASCII" characters to their full-width equivalents,
the range used inside the Penn Chinese Treebank (CTB).
Notes: Smart quotes appear in CTB and are left unchanged.
Various hyphen types from the U+2000 range probably occur too; certainly,
Roger lists them in LanguagePack.
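The ASCII-to-full-width mapping described above can be illustrated with a minimal sketch. It is not the ChineseEscaper source. It assumes the standard Unicode layout, in which printable ASCII (U+0021 to U+007E) sits exactly 0xFEE0 below the full-width forms (U+FF01 to U+FF5E). The class name FullWidthSketch and the method toFullWidth are hypothetical, and whether ChineseEscaper treats spaces or any other characters specially is not stated here.

```java
public class FullWidthSketch {
    // Distance between printable ASCII (U+0021..U+007E) and the
    // Unicode full-width forms (U+FF01..U+FF5E).
    private static final int FULL_WIDTH_OFFSET = 0xFEE0;

    /**
     * Hypothetical helper: shifts printable ASCII characters into the
     * full-width range and leaves every other character untouched.
     */
    public static String toFullWidth(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch >= 0x21 && ch <= 0x7E) {
                sb.append((char) (ch + FULL_WIDTH_OFFSET));
            } else {
                // Chinese text, smart quotes, spaces, etc. pass through as-is.
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the full-width rendering of an ASCII token.
        System.out.println(toFullWidth("CTB-5.1"));
    }
}
```

Because only the range 0x21 to 0x7E is shifted, smart quotes such as U+201C fall outside it and come back unchanged, which is consistent with the note above.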
- Author:
- Christopher Manning
- See Also:
- Serialized Form
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ChineseEscaper
public ChineseEscaper()
apply
public List<HasWord> apply(List<HasWord> arg)
- Note: At present this method clobbers the items of the input list, modifying them in place instead of working on copies.
This should be fixed.
- Specified by:
apply
in interface Function<List<HasWord>,List<HasWord>>
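Since apply clobbers the items of its input list, a caller who still needs the original words can copy the items before escaping. The sketch below shows that pattern with stand-in types only. Token is a hypothetical substitute for a mutable HasWord implementation, escapeInPlace is a hypothetical substitute for an in-place escaper, and the upper-casing step is a placeholder rather than the real normalization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ClobberSketch {
    /** Hypothetical stand-in for a mutable token type such as a HasWord implementation. */
    public static final class Token {
        private String word;
        public Token(String word) { this.word = word; }
        public String word() { return word; }
        public void setWord(String word) { this.word = word; }
    }

    /** Stand-in for an in-place escaper: overwrites each item and returns the same list. */
    public static List<Token> escapeInPlace(List<Token> tokens) {
        for (Token t : tokens) {
            t.setWord(t.word().toUpperCase(Locale.ROOT)); // placeholder transformation
        }
        return tokens;
    }

    /** Copies every item first, so the caller's originals survive the in-place call. */
    public static List<Token> escapeCopy(List<Token> tokens) {
        List<Token> copy = new ArrayList<>(tokens.size());
        for (Token t : tokens) {
            copy.add(new Token(t.word()));
        }
        return escapeInPlace(copy);
    }

    public static void main(String[] args) {
        List<Token> input = new ArrayList<>();
        input.add(new Token("abc"));

        List<Token> escaped = escapeCopy(input);
        // The original item is intact; only the copy was rewritten.
        System.out.println(input.get(0).word() + " / " + escaped.get(0).word());
    }
}
```

Copying the list alone would not be enough, because both lists would still share the same mutable items. The copy has to be made item by item.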
Stanford NLP Group