java.lang.Object
  edu.stanford.nlp.process.PTBTokenizer.PTBTokenizerFactory<T>

Type Parameters: T - The class of the returned tokens

public static class PTBTokenizer.PTBTokenizerFactory<T extends HasWord>

This class provides a factory which will vend instances of PTBTokenizer
which wrap a provided Reader. See the documentation for
PTBTokenizer for details of the parameters and options.
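
The following is a minimal usage sketch, not taken from the library's own documentation. It assumes the Stanford NLP jar is on the classpath, that `CoreLabel` lives in `edu.stanford.nlp.ling`, and that the option string is a comma-separated list of `key=value` pairs such as `invertible=true` (the exact format is defined in the PTBTokenizer documentation). The class name `FactoryUsageSketch` and the helper `tokenize` are hypothetical.

```java
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.PTBTokenizer;
import edu.stanford.nlp.process.Tokenizer;

import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; only the PTBTokenizerFactory calls come from this page.
public class FactoryUsageSketch {

    // Tokenizes the given text and returns the word form of each token.
    static List<String> tokenize(String text) {
        // Assumption: "invertible=true" is a valid option string for this version.
        PTBTokenizer.PTBTokenizerFactory<CoreLabel> factory =
            PTBTokenizer.PTBTokenizerFactory.newCoreLabelTokenizerFactory("invertible=true");

        // The factory vends a tokenizer that wraps the provided Reader.
        Tokenizer<CoreLabel> tokenizer = factory.getTokenizer(new StringReader(text));

        List<String> words = new ArrayList<>();
        while (tokenizer.hasNext()) {
            words.add(tokenizer.next().word());
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world!"));
    }
}
```

Because the factory holds the token type and options, the same instance can be reused to wrap any number of Readers.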
See Also: PTBTokenizer

| Field Summary | |
|---|---|
| protected LexedTokenFactory<T> | factory |
| protected java.lang.String | options |
| Method Summary | | |
|---|---|---|
| java.util.Iterator<T> | getIterator(java.io.Reader r) | Returns a tokenizer wrapping the given Reader. |
| Tokenizer<T> | getTokenizer(java.io.Reader r) | Returns a tokenizer wrapping the given Reader. |
| Tokenizer<T> | getTokenizer(java.io.Reader r, java.lang.String extraOptions) | |
| static PTBTokenizer.PTBTokenizerFactory<CoreLabel> | newCoreLabelTokenizerFactory(java.lang.String options) | Constructs a new PTBTokenizerFactory that returns CoreLabel objects and uses the options passed in. |
| static PTBTokenizer.PTBTokenizerFactory<Word> | newPTBTokenizerFactory(boolean tokenizeNLs) | Constructs a new PTBTokenizerFactory that optionally returns newlines as their own token. |
| static PTBTokenizer.PTBTokenizerFactory<CoreLabel> | newPTBTokenizerFactory(boolean tokenizeNLs, boolean invertible) | |
| static <T extends HasWord> PTBTokenizer.PTBTokenizerFactory<T> | newPTBTokenizerFactory(LexedTokenFactory<T> tokenFactory, java.lang.String options) | Constructs a new PTBTokenizerFactory that uses the LexedTokenFactory and options passed in. |
| static TokenizerFactory<Word> | newTokenizerFactory() | Constructs a new TokenizerFactory that returns Word objects and treats newlines as normal whitespace. |
| static PTBTokenizer.PTBTokenizerFactory<Word> | newWordTokenizerFactory(java.lang.String options) | Constructs a new PTBTokenizerFactory that returns Word objects and uses the options passed in. |
| void | setOptions(java.lang.String options) | |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected LexedTokenFactory<T extends HasWord> factory
protected java.lang.String options
| Method Detail |
|---|
public static TokenizerFactory<Word> newTokenizerFactory()
public static PTBTokenizer.PTBTokenizerFactory<Word> newPTBTokenizerFactory(boolean tokenizeNLs)
tokenizeNLs - If true, newlines come back as Words whose text is
the value of PTBLexer.NEWLINE_TOKEN.
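
To illustrate the effect of `tokenizeNLs`, here is a small sketch that counts the tokens produced with the flag off and on. It assumes the Stanford NLP jar is on the classpath and that `Word` lives in `edu.stanford.nlp.ling`; the class name `NewlineTokenSketch` and the helper `countTokens` are hypothetical.

```java
import edu.stanford.nlp.ling.Word;
import edu.stanford.nlp.process.PTBTokenizer;

import java.io.StringReader;
import java.util.Iterator;

// Hypothetical demo class; only the PTBTokenizerFactory calls come from this page.
public class NewlineTokenSketch {

    // Counts the tokens produced for the text with the given tokenizeNLs setting.
    static int countTokens(String text, boolean tokenizeNLs) {
        PTBTokenizer.PTBTokenizerFactory<Word> factory =
            PTBTokenizer.PTBTokenizerFactory.newPTBTokenizerFactory(tokenizeNLs);

        // getIterator exposes the tokenizer through the plain Iterator interface.
        Iterator<Word> tokens = factory.getIterator(new StringReader(text));

        int count = 0;
        while (tokens.hasNext()) {
            tokens.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "one\ntwo";
        // With tokenizeNLs set, the newline is returned as an extra token.
        System.out.println("tokenizeNLs=false: " + countTokens(text, false));
        System.out.println("tokenizeNLs=true:  " + countTokens(text, true));
    }
}
```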
public static PTBTokenizer.PTBTokenizerFactory<Word> newWordTokenizerFactory(java.lang.String options)
options - A String of options
public static PTBTokenizer.PTBTokenizerFactory<CoreLabel> newCoreLabelTokenizerFactory(java.lang.String options)
options - A String of options
public static <T extends HasWord> PTBTokenizer.PTBTokenizerFactory<T> newPTBTokenizerFactory(LexedTokenFactory<T> tokenFactory, java.lang.String options)
tokenFactory - The LexedTokenFactory
options - A String of options
public static PTBTokenizer.PTBTokenizerFactory<CoreLabel> newPTBTokenizerFactory(boolean tokenizeNLs, boolean invertible)
public java.util.Iterator<T> getIterator(java.io.Reader r)
getIterator in interface IteratorFromReaderFactory<T extends HasWord>
r - Where to read objects from
public Tokenizer<T> getTokenizer(java.io.Reader r)
getTokenizer in interface TokenizerFactory<T extends HasWord>
public Tokenizer<T> getTokenizer(java.io.Reader r, java.lang.String extraOptions)
getTokenizer in interface TokenizerFactory<T extends HasWord>
public void setOptions(java.lang.String options)
setOptions in interface TokenizerFactory<T extends HasWord>