- Type Parameters:
- T - The type of the tokens returned by the Tokenizer
- All Superinterfaces:
- IteratorFromReaderFactory<T>, java.io.Serializable
- All Known Implementing Classes:
- ArabicTokenizer.ArabicTokenizerFactory, FrenchTokenizer.FrenchTokenizerFactory, PTBTokenizer.PTBTokenizerFactory, SpanishTokenizer.SpanishTokenizerFactory, TreeTokenizerFactory, WhitespaceTokenizer.WhitespaceTokenizerFactory
public interface TokenizerFactory<T>
extends IteratorFromReaderFactory<T>
A TokenizerFactory is a factory that can build a Tokenizer (an extension of Iterator)
from a java.io.Reader.
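The factory pattern described above can be sketched with stand-in types, so the example runs without the JavaNLP jar. This is a minimal illustration, not the library's implementation: SketchTokenizerFactory and WhitespaceSketchFactory are hypothetical names, the tokenizer is reduced to a plain Iterator<String>, and java.util.Scanner is swapped in for a real tokenizer.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;

public class TokenizerFactorySketch {

    /** Hypothetical stand-in for the factory contract: a Reader goes in, an Iterator of tokens comes out. */
    public interface SketchTokenizerFactory<T> {
        Iterator<T> getTokenizer(Reader r);
    }

    /** Illustration only: splits on whitespace by delegating to java.util.Scanner. */
    public static class WhitespaceSketchFactory implements SketchTokenizerFactory<String> {
        @Override
        public Iterator<String> getTokenizer(Reader r) {
            // Scanner accepts any Readable, implements Iterator<String>,
            // and uses whitespace as its default delimiter.
            return new Scanner(r);
        }
    }

    /** Drains the tokenizer built by the factory into a list. */
    public static List<String> tokenize(SketchTokenizerFactory<String> factory, String text) {
        List<String> tokens = new ArrayList<>();
        Iterator<String> it = factory.getTokenizer(new StringReader(text));
        while (it.hasNext()) {
            tokens.add(it.next());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [The, quick, brown, fox]
        System.out.println(tokenize(new WhitespaceSketchFactory(), "The quick  brown\nfox"));
    }
}
```

Because the factory takes a Reader rather than a String, the same factory instance can be reused for files, sockets, or in-memory text.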
IMPORTANT NOTE:
Each class implementing TokenizerFactory should also provide two static methods:
public static TokenizerFactory<? extends HasWord> newTokenizerFactory();
public static TokenizerFactory<Word> newWordTokenizerFactory(String options);
These are expected by certain JavaNLP code (e.g., LexicalizedParser),
which creates a TokenizerFactory by reflection.
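A reflective lookup of these two static methods might look roughly like the sketch below. It is an assumption about the general technique, not a copy of what LexicalizedParser does: DemoFactory is a hypothetical stand-in class with simplified return types, create is a hypothetical helper, and the option string is made up for illustration.

```java
import java.lang.reflect.Method;

public class ReflectiveFactorySketch {

    /** Hypothetical stand-in for a class that implements TokenizerFactory. */
    public static class DemoFactory {
        private final String options;

        private DemoFactory(String options) {
            this.options = options;
        }

        public String options() {
            return options;
        }

        // The two static entry points named in the note (return types simplified here).
        public static DemoFactory newTokenizerFactory() {
            return new DemoFactory("");
        }

        public static DemoFactory newWordTokenizerFactory(String options) {
            return new DemoFactory(options);
        }
    }

    /**
     * Hypothetical helper: loads the class by name and calls one of the
     * static factory methods, chosen by whether options were supplied.
     */
    public static Object create(String className, String options) {
        try {
            Class<?> clazz = Class.forName(className);
            if (options == null) {
                Method m = clazz.getMethod("newTokenizerFactory");
                return m.invoke(null); // null receiver because the method is static
            }
            Method m = clazz.getMethod("newWordTokenizerFactory", String.class);
            return m.invoke(null, options);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create factory from " + className, e);
        }
    }

    public static void main(String[] args) {
        String name = DemoFactory.class.getName();
        DemoFactory plain = (DemoFactory) create(name, null);
        DemoFactory configured = (DemoFactory) create(name, "exampleOption=true");
        System.out.println(plain.options().isEmpty());
        System.out.println(configured.options());
    }
}
```

A class that omits either method fails at the getMethod call with NoSuchMethodException, which is why the note asks every implementation to supply both.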
- Author:
- Christopher Manning