Class Summary |
AbstractListProcessor<IN,OUT,L,F> |
Class AbstractListProcessor |
AbstractTokenizer<T> |
An abstract tokenizer. |
Americanize |
Takes a HasWord or String and returns an Americanized version of it. |
CoreLabelTokenFactory |
Constructs CoreLabels from Strings, each with a corresponding BEGIN and END position. |
Morphology |
Morphology computes the base form of English words, by removing just
inflections (not derivational morphology). |
PTBTokenizer<T extends HasWord> |
Tokenizer implementation that conforms to the Penn Treebank tokenization
conventions. |
PTBTokenizer.PTBTokenizerFactory<T extends HasWord> |
|
StripTagsProcessor<L,F> |
A Processor whose process method deletes all
SGML/XML/HTML tags (tokens starting with < and ending
with >). |
TokenizerAdapter |
This class adapts between a java.io.StreamTokenizer
and an edu.stanford.nlp.process.Tokenizer. |
WordShapeClassifier |
Provides static methods which
map any String to another String indicative of its "word shape" -- e.g.,
whether capitalized, numeric, etc. |
WordTokenFactory |
Constructs a Word from a String. |
WordToSentenceProcessor<IN,L,F> |
Transforms a Document of Words into a Document of Sentences by grouping the
Words. |
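
Two of the descriptions above (the "word shape" mapping and the tag-stripping processor) are easier to picture with a small example. The following is a minimal, standard-library-only sketch of both ideas. It does not call the edu.stanford.nlp.process classes, and the class name ProcessSketch, the method names wordShape and stripTags, and the X/x/d shape symbols are all assumptions chosen for illustration. The real WordShapeClassifier and StripTagsProcessor may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the Stanford NLP implementation.
public class ProcessSketch {

    /**
     * Maps a string to a simple "word shape": uppercase letters become 'X',
     * lowercase letters become 'x', digits become 'd', and every other
     * character is kept unchanged. (Assumed scheme, for illustration.)
     */
    public static String wordShape(String word) {
        StringBuilder shape = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (Character.isUpperCase(c)) {
                shape.append('X');
            } else if (Character.isLowerCase(c)) {
                shape.append('x');
            } else if (Character.isDigit(c)) {
                shape.append('d');
            } else {
                shape.append(c);
            }
        }
        return shape.toString();
    }

    /**
     * Returns a new list without the tokens that look like SGML/XML/HTML
     * tags, i.e. tokens starting with '<' and ending with '>'.
     */
    public static List<String> stripTags(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            boolean isTag = token.startsWith("<") && token.endsWith(">");
            if (!isTag) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints the shape of a mixed-case, hyphenated, numeric token.
        System.out.println(wordShape("Stanford-2010"));
        // Prints the token list with the tag tokens removed.
        System.out.println(stripTags(Arrays.asList("<p>", "Hello", "world", "</p>")));
    }
}
```

The sketch works on already-tokenized input for stripTags, matching the description of tags as tokens. Splitting raw text into tokens is the job of a tokenizer such as PTBTokenizer and is outside this example.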