edu.stanford.nlp.process
Class PTBTokenizer.PTBTokenizerFactory<T>
java.lang.Object
edu.stanford.nlp.process.PTBTokenizer.PTBTokenizerFactory<T>
- All Implemented Interfaces:
- IteratorFromReaderFactory<T>, TokenizerFactory<T>
- Enclosing class:
- PTBTokenizer<T>
public static class PTBTokenizer.PTBTokenizerFactory<T>
- extends Object
- implements TokenizerFactory<T>
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
tokenizeCRs
protected boolean tokenizeCRs
invertible
protected boolean invertible
suppressEscaping
protected boolean suppressEscaping
factory
protected LexedTokenFactory<T> factory
PTBTokenizer.PTBTokenizerFactory
public PTBTokenizer.PTBTokenizerFactory(boolean tokenizeCRs,
LexedTokenFactory<T> factory)
newPTBTokenizerFactory
public static PTBTokenizer.PTBTokenizerFactory<Word> newPTBTokenizerFactory()
- Constructs a new PTBTokenizerFactory that treats carriage returns as normal whitespace.
newPTBTokenizerFactory
public static PTBTokenizer.PTBTokenizerFactory<Word> newPTBTokenizerFactory(boolean tokenizeCRs)
- Constructs a new PTBTokenizerFactory whose tokenizers optionally return carriage returns as their own tokens. When tokenizeCRs is true, each CR comes back as a Word whose text is the value of PTBLexer.cr.
newPTBTokenizerFactory
public static PTBTokenizer.PTBTokenizerFactory<FeatureLabel> newPTBTokenizerFactory(boolean tokenizeCRs,
boolean invertible)
newPTBTokenizerFactory
public static PTBTokenizer.PTBTokenizerFactory<Word> newPTBTokenizerFactory(boolean tokenizeCRs,
boolean invertible,
boolean suppressEscaping)
getIterator
public Iterator<T> getIterator(Reader r)
- Specified by: getIterator in interface IteratorFromReaderFactory<T>
getTokenizer
public Tokenizer<T> getTokenizer(Reader r)
- Specified by: getTokenizer in interface TokenizerFactory<T>
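The factory contract listed above (a tokenizeCRs flag plus getTokenizer(Reader) and getIterator(Reader)) can be illustrated with a small self-contained sketch. This is not the Stanford implementation: FactorySketch, CR_TOKEN, and the tokenize helper are hypothetical names, tokens are plain Strings rather than Word objects, splitting is on whitespace only, and the input is read eagerly. The placeholder "*CR*" merely stands in for whatever PTBLexer.cr holds, and having getIterator delegate to getTokenizer is an assumption made for this sketch.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in, NOT the Stanford class: mirrors only the shape of the two factory methods. */
public class FactorySketch {

    /** Placeholder text for a line-break token; the real value is PTBLexer.cr (assumed here). */
    static final String CR_TOKEN = "*CR*";

    private final boolean tokenizeCRs;

    FactorySketch(boolean tokenizeCRs) {
        this.tokenizeCRs = tokenizeCRs;
    }

    /** Splits the reader's content on whitespace; '\n' becomes CR_TOKEN when tokenizeCRs is set. */
    public Iterator<String> getTokenizer(Reader r) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = r.read()) != -1) {
                char ch = (char) c;
                if (Character.isWhitespace(ch)) {
                    flush(current, tokens);
                    if (tokenizeCRs && ch == '\n') {
                        tokens.add(CR_TOKEN); // line break surfaced as its own token
                    }
                } else {
                    current.append(ch);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flush(current, tokens);
        return tokens.iterator();
    }

    /** In this sketch the iterator view simply delegates to getTokenizer (an assumption). */
    public Iterator<String> getIterator(Reader r) {
        return getTokenizer(r);
    }

    /** Moves any pending characters into the token list and clears the buffer. */
    private static void flush(StringBuilder sb, List<String> out) {
        if (sb.length() > 0) {
            out.add(sb.toString());
            sb.setLength(0);
        }
    }

    /** Convenience helper: builds a factory, tokenizes the text, and collects the tokens. */
    static List<String> tokenize(boolean tokenizeCRs, String text) {
        List<String> out = new ArrayList<>();
        new FactorySketch(tokenizeCRs).getIterator(new StringReader(text)).forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(false, "one two\nthree")); // [one, two, three]
        System.out.println(tokenize(true, "one two\nthree"));  // [one, two, *CR*, three]
    }
}
```

With the flag off, the line break is treated like any other whitespace; with it on, the break appears as a token of its own, which is the distinction the newPTBTokenizerFactory(boolean tokenizeCRs) overload describes.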
Stanford NLP Group