edu.stanford.nlp.trees.international.pennchinese
Class ChineseEscaper
java.lang.Object
edu.stanford.nlp.trees.international.pennchinese.ChineseEscaper
- All Implemented Interfaces:
- Function<List<HasWord>,List<HasWord>>
public class ChineseEscaper
- extends Object
- implements Function<List<HasWord>,List<HasWord>>
An Escaper that normalizes Chinese text to match the Penn Chinese Treebank (CTB).
Currently it maps "ASCII" characters into the full-width
range used inside the CTB.
Notes: Smart quotes appear in the CTB and are left unchanged.
Various hyphen types from the U+2000 range probably occur there too;
Roger lists them in LanguagePack.
- Author:
- Christopher Manning
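The ASCII to full-width mapping described above can be illustrated with a minimal sketch. This is not the ChineseEscaper source: the class name FullWidthSketch and the method toFullWidth are hypothetical, and the sketch assumes the standard Unicode layout in which printable ASCII (U+0021 to U+007E) sits at a constant offset from the Fullwidth Forms block (U+FF01 to U+FF5E). How the real class treats spaces is not stated here, so the sketch leaves them alone.

```java
// Sketch only: not the ChineseEscaper source. Assumes the standard Unicode
// offset between printable ASCII and the Fullwidth Forms block.
class FullWidthSketch {

    // U+0021..U+007E map onto U+FF01..U+FF5E by a constant offset.
    private static final int FULL_WIDTH_OFFSET = 0xFF01 - 0x0021;

    /** Returns a copy of s with printable ASCII replaced by full-width forms. */
    static String toFullWidth(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x21 && c <= 0x7E) {
                sb.append((char) (c + FULL_WIDTH_OFFSET));
            } else {
                // Spaces, CJK text and smart quotes such as U+201C are kept as-is
                // (leaving the space alone is an assumption of this sketch).
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toFullWidth("CTB 5.1"));
    }
}
```

Because the input String is immutable, this sketch always returns a new value and never alters its argument.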
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ChineseEscaper
public ChineseEscaper()
apply
public List<HasWord> apply(List<HasWord> arg)
- Note: At present this method overwrites the items of the input list in place.
This should be fixed.
- Specified by:
apply
in interface Function<List<HasWord>,List<HasWord>>
- Parameters:
arg
- The function's argument: the list of words to normalize
- Returns:
- The function's evaluated value: the normalized list of words
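Since apply overwrites the items of its input list, a caller who still needs the original words may want to copy the items first. The following is a minimal sketch of that workaround under stated assumptions: Token and escape are hypothetical stand-ins for a HasWord implementation and for ChineseEscaper.apply, written only to mimic the documented in-place overwrite, and whether the real method returns the same list object is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only. Token and escape() are hypothetical stand-ins for HasWord and
// ChineseEscaper.apply(); they mimic the documented in-place overwrite.
class ClobberSketch {

    /** Hypothetical mutable token, standing in for a HasWord implementation. */
    static final class Token {
        private String word;
        Token(String word) { this.word = word; }
        String word() { return word; }
        void setWord(String word) { this.word = word; }
    }

    /** Stand-in escaper: overwrites each item's word, as the note describes. */
    static List<Token> escape(List<Token> arg) {
        for (Token t : arg) {
            // Only parentheses are mapped here, to keep the stand-in short.
            t.setWord(t.word().replace('(', '\uFF08').replace(')', '\uFF09'));
        }
        return arg; // returning the same list is an assumption of this sketch
    }

    /** Copies every item so the caller's originals survive the call. */
    static List<Token> deepCopy(List<Token> tokens) {
        List<Token> copy = new ArrayList<>(tokens.size());
        for (Token t : tokens) {
            copy.add(new Token(t.word()));
        }
        return copy;
    }

    public static void main(String[] args) {
        List<Token> original = new ArrayList<>();
        original.add(new Token("(a)"));
        List<Token> escaped = escape(deepCopy(original));
        System.out.println(original.get(0).word() + " -> " + escaped.get(0).word());
    }
}
```

Copying the list alone would not be enough, because the copied list would still share the same item objects; each item has to be duplicated.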
Stanford NLP Group