edu.stanford.nlp.objectbank
Interface TokenizerFactory<T>
- Type Parameters:
- T - The type of the tokens returned by the Tokenizer
- All Superinterfaces:
- IteratorFromReaderFactory<T>
- All Known Implementing Classes:
- PTBTokenizer.PTBTokenizerFactory, TreeTokenizerFactory, WhitespaceTokenizer.WhitespaceTokenizerFactory
public interface TokenizerFactory<T>
- extends IteratorFromReaderFactory<T>
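A TokenizerFactory converts a java.io.Reader into a Tokenizer (or an
Iterator) over the objects represented by the text in that Reader.
The interface is mainly a convenience: getTokenizer returns a Tokenizer
directly, so callers do not have to downcast the Iterator obtained through
the IteratorFromReaderFactory superinterface.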
Note: A class implementing TokenizerFactory should also provide a static method:
public static TokenizerFactory<? extends HasWord> newTokenizerFactory();
Certain JavaNLP code expects this method when it creates a
TokenizerFactory by reflection.
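The reflective lookup described in the note can be sketched roughly as follows. This is an illustration only, not the JavaNLP loading code: the Tokenizer and TokenizerFactory declarations are reduced stand-ins so the example compiles without the library, and DemoTokenizerFactory, ReflectionDemo and load are hypothetical names. The demo factory yields String tokens for brevity, whereas the documented signature returns TokenizerFactory<? extends HasWord>.

```java
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.Iterator;
import java.util.Scanner;

// Reduced stand-ins (assumption): in real code these come from the library.
interface Tokenizer<T> extends Iterator<T> {}

interface TokenizerFactory<T> {
    Tokenizer<T> getTokenizer(Reader r);
}

// Hypothetical factory that splits its input on whitespace.
class DemoTokenizerFactory implements TokenizerFactory<String> {

    // The static method the note asks for. String is used instead of a
    // HasWord subtype only to keep the sketch small.
    public static TokenizerFactory<String> newTokenizerFactory() {
        return new DemoTokenizerFactory();
    }

    @Override
    public Tokenizer<String> getTokenizer(Reader r) {
        Scanner scanner = new Scanner(r);
        return new Tokenizer<String>() {
            @Override public boolean hasNext() { return scanner.hasNext(); }
            @Override public String next() { return scanner.next(); }
        };
    }
}

class ReflectionDemo {
    // Finds the static newTokenizerFactory() method by name and invokes it.
    @SuppressWarnings("unchecked")
    static TokenizerFactory<String> load(String className)
            throws ReflectiveOperationException {
        Method m = Class.forName(className).getMethod("newTokenizerFactory");
        return (TokenizerFactory<String>) m.invoke(null); // null receiver: static method
    }

    public static void main(String[] args) throws Exception {
        TokenizerFactory<String> factory =
                load(DemoTokenizerFactory.class.getName());
        Tokenizer<String> tok =
                factory.getTokenizer(new StringReader("a small example"));
        while (tok.hasNext()) {
            System.out.println(tok.next());
        }
    }
}
```

Because the method is located by name at run time, the compiler cannot check that an implementing class actually declares it. A class that omits it fails only when the reflective lookup runs, with a NoSuchMethodException.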
- Author:
- Christopher Manning
getTokenizer
Tokenizer<T> getTokenizer(Reader r)
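A minimal usage sketch for getTokenizer, under stated assumptions: the three interfaces below are cut-down stand-ins declared only so the example runs without the library (the real Tokenizer declares more methods than the tokenize() kept here), and SimpleWhitespaceFactory, GetTokenizerDemo and tokens are hypothetical names. It contrasts calling getTokenizer directly with downcasting the result of getIterator.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;

// Reduced stand-ins (assumption) for the library interfaces.
interface IteratorFromReaderFactory<T> {
    Iterator<T> getIterator(Reader r);
}

interface Tokenizer<T> extends Iterator<T> {
    // Drains the remaining tokens into a list.
    default List<T> tokenize() {
        List<T> out = new ArrayList<>();
        while (hasNext()) {
            out.add(next());
        }
        return out;
    }
}

interface TokenizerFactory<T> extends IteratorFromReaderFactory<T> {
    Tokenizer<T> getTokenizer(Reader r);
}

// Hypothetical implementation that splits on whitespace.
class SimpleWhitespaceFactory implements TokenizerFactory<String> {
    @Override
    public Tokenizer<String> getTokenizer(Reader r) {
        Scanner scanner = new Scanner(r);
        return new Tokenizer<String>() {
            @Override public boolean hasNext() { return scanner.hasNext(); }
            @Override public String next() { return scanner.next(); }
        };
    }

    // The superinterface method simply hands back the same tokenizer.
    @Override
    public Iterator<String> getIterator(Reader r) {
        return getTokenizer(r);
    }
}

class GetTokenizerDemo {
    // Typed access: no cast is needed to reach Tokenizer-specific methods.
    static List<String> tokens(TokenizerFactory<String> factory, String text) {
        return factory.getTokenizer(new StringReader(text)).tokenize();
    }

    public static void main(String[] args) {
        TokenizerFactory<String> factory = new SimpleWhitespaceFactory();

        // Going through getIterator requires a downcast to get a Tokenizer.
        Iterator<String> it = factory.getIterator(new StringReader("one two"));
        Tokenizer<String> viaCast = (Tokenizer<String>) it;
        System.out.println(viaCast.tokenize());

        System.out.println(tokens(factory, "The quick brown fox"));
    }
}
```

The downcast in main succeeds only because this particular factory returns a Tokenizer from getIterator. getTokenizer makes that guarantee part of the signature instead.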
Stanford NLP Group