java.lang.Object
edu.stanford.nlp.process.PTBTokenizer.PTBTokenizerFactory<T>
Type Parameters:
T - The class of the returned tokens
public static class PTBTokenizer.PTBTokenizerFactory<T extends HasWord>
This class provides a factory that vends PTBTokenizer instances, each wrapping a provided Reader. See the PTBTokenizer documentation for details of the parameters and options.
See Also: PTBTokenizer
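The factory idea described above can be illustrated without the library itself. The following is a minimal sketch using hypothetical stand-in types (`ReaderFactorySketch`, `SketchTokenizerFactory`, and plain whitespace splitting are all assumptions for illustration, not part of edu.stanford.nlp). It only shows the shape of the pattern: a factory is configured once and then produces a fresh token iterator for each Reader it is given.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in types; none of these names belong to the real library.
class ReaderFactorySketch {

    /** Stand-in for a tokenizer factory: a single method that wraps a Reader. */
    interface SketchTokenizerFactory<T> {
        Iterator<T> getIterator(Reader r);
    }

    /** Builds a factory configured once (here with a lowercase flag) and reusable for many Readers. */
    static SketchTokenizerFactory<String> newWhitespaceFactory(boolean lowercase) {
        return r -> {
            List<String> tokens = new ArrayList<>();
            try (BufferedReader in = new BufferedReader(r)) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Split on runs of whitespace; skip the empty piece a blank line produces.
                    for (String piece : line.trim().split("\\s+")) {
                        if (!piece.isEmpty()) {
                            tokens.add(lowercase ? piece.toLowerCase(Locale.ROOT) : piece);
                        }
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return tokens.iterator();
        };
    }

    /** Demo helper: wraps the text in a StringReader and drains the iterator into a list. */
    static List<String> tokenize(SketchTokenizerFactory<String> factory, String text) {
        List<String> out = new ArrayList<>();
        Iterator<String> it = factory.getIterator(new StringReader(text));
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // One factory, two independent inputs.
        SketchTokenizerFactory<String> factory = newWhitespaceFactory(false);
        System.out.println(tokenize(factory, "Hello world"));
        System.out.println(tokenize(factory, "A second\ninput"));
    }
}
```

The real factory differs in what it produces (PTBTokenizer instances yielding HasWord tokens) and in how it is configured (an options String or a LexedTokenFactory), but the calling pattern of "build the factory once, hand it a Reader per document" is the same idea.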
Field Summary | |
---|---|
protected LexedTokenFactory<T> | factory |
protected java.lang.String | options |
Method Summary | ||
---|---|---|
java.util.Iterator<T> | getIterator(java.io.Reader r) | Returns a tokenizer wrapping the given Reader. |
Tokenizer<T> | getTokenizer(java.io.Reader r) | Returns a tokenizer wrapping the given Reader. |
static PTBTokenizer.PTBTokenizerFactory<CoreLabel> | newCoreLabelTokenizerFactory(java.lang.String options) | Constructs a new PTBTokenizerFactory whose tokenizers return CoreLabel objects and use the options passed in. |
static PTBTokenizer.PTBTokenizerFactory<Word> | newPTBTokenizerFactory(boolean tokenizeNLs) | Constructs a new PTBTokenizerFactory whose tokenizers optionally return newlines as their own tokens. |
static PTBTokenizer.PTBTokenizerFactory<CoreLabel> | newPTBTokenizerFactory(boolean tokenizeNLs, boolean invertible) | |
static <T extends HasWord> PTBTokenizer.PTBTokenizerFactory<T> | newPTBTokenizerFactory(LexedTokenFactory<T> tokenFactory, java.lang.String options) | Constructs a new PTBTokenizerFactory whose tokenizers use the LexedTokenFactory and options passed in. |
static TokenizerFactory<Word> | newTokenizerFactory() | Constructs a new TokenizerFactory that returns Word objects and treats newlines as normal whitespace. |
static PTBTokenizer.PTBTokenizerFactory<Word> | newWordTokenizerFactory(java.lang.String options) | Constructs a new PTBTokenizerFactory whose tokenizers return Word objects and use the options passed in. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected LexedTokenFactory<T extends HasWord> factory
protected java.lang.String options
Method Detail |
---|
public static TokenizerFactory<Word> newTokenizerFactory()
public static PTBTokenizer.PTBTokenizerFactory<Word> newPTBTokenizerFactory(boolean tokenizeNLs)
Parameters:
tokenizeNLs - If true, newlines come back as Words whose text is the value of PTBLexer.NEWLINE_TOKEN.
public static PTBTokenizer.PTBTokenizerFactory<Word> newWordTokenizerFactory(java.lang.String options)
Parameters:
options - A String of options
public static PTBTokenizer.PTBTokenizerFactory<CoreLabel> newCoreLabelTokenizerFactory(java.lang.String options)
Parameters:
options - A String of options
public static <T extends HasWord> PTBTokenizer.PTBTokenizerFactory<T> newPTBTokenizerFactory(LexedTokenFactory<T> tokenFactory, java.lang.String options)
Parameters:
tokenFactory - The LexedTokenFactory
options - A String of options
public static PTBTokenizer.PTBTokenizerFactory<CoreLabel> newPTBTokenizerFactory(boolean tokenizeNLs, boolean invertible)
public java.util.Iterator<T> getIterator(java.io.Reader r)
Specified by:
getIterator in interface IteratorFromReaderFactory<T extends HasWord>
public Tokenizer<T> getTokenizer(java.io.Reader r)
Specified by:
getTokenizer in interface TokenizerFactory<T extends HasWord>
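The effect of the `tokenizeNLs` parameter documented above (newlines either vanish as whitespace or come back as a dedicated token) can be sketched independently of the library. Everything in this example is a hypothetical stand-in: the class `NewlineTokenSketch`, the whitespace-based splitting, and the placeholder text `<NEWLINE>` are assumptions for illustration. In the real library the newline token's text is whatever PTBLexer.NEWLINE_TOKEN holds, and tokens are Word objects rather than Strings.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; not the library's tokenizer.
class NewlineTokenSketch {

    // Assumed placeholder text; the real value is defined by PTBLexer.NEWLINE_TOKEN.
    static final String NEWLINE_PLACEHOLDER = "<NEWLINE>";

    /** Splits on whitespace; when tokenizeNLs is true, each '\n' also yields a placeholder token. */
    static List<String> tokenize(Reader r, boolean tokenizeNLs) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        try {
            int c;
            while ((c = r.read()) != -1) {
                if (Character.isWhitespace(c)) {
                    flush(current, tokens);
                    if (c == '\n' && tokenizeNLs) {
                        tokens.add(NEWLINE_PLACEHOLDER);
                    }
                } else {
                    current.append((char) c);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        flush(current, tokens);
        return tokens;
    }

    /** Emits the pending token, if any, and resets the buffer. */
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }

    public static void main(String[] args) {
        String text = "one two\nthree";
        System.out.println(tokenize(new StringReader(text), false));
        System.out.println(tokenize(new StringReader(text), true));
    }
}
```

Keeping newlines as tokens is mainly useful when a downstream step, such as sentence splitting on line boundaries, needs to know where the original line breaks were.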