public static class SpanishTokenizer.SpanishTokenizerFactory<T extends HasWord> extends java.lang.Object implements TokenizerFactory<T>
Modifier and Type | Field and Description |
---|---|
protected LexedTokenFactory<T> | factory |
protected java.util.Properties | lexerProperties |
protected boolean | splitCompoundOption |
protected boolean | splitContractionOption |
protected boolean | splitVerbOption |
Modifier and Type | Method and Description |
---|---|
java.util.Iterator<T> | getIterator(java.io.Reader r): Return an iterator over the contents read from r. |
Tokenizer<T> | getTokenizer(java.io.Reader r): Get a tokenizer for this reader. |
Tokenizer<T> | getTokenizer(java.io.Reader r, java.lang.String extraOptions): Get a tokenizer for this reader. |
static TokenizerFactory<CoreLabel> | newCoreLabelTokenizerFactory() |
static <T extends HasWord> SpanishTokenizer.SpanishTokenizerFactory<T> | newSpanishTokenizerFactory(LexedTokenFactory<T> factory, java.lang.String options): Constructs a new SpanishTokenizer that returns T objects and uses the options passed in. |
void | setOptions(java.lang.String options): Set underlying tokenizer options. |
protected final LexedTokenFactory<T extends HasWord> factory
protected java.util.Properties lexerProperties
protected boolean splitCompoundOption
protected boolean splitVerbOption
protected boolean splitContractionOption
public static TokenizerFactory<CoreLabel> newCoreLabelTokenizerFactory()
public static <T extends HasWord> SpanishTokenizer.SpanishTokenizerFactory<T> newSpanishTokenizerFactory(LexedTokenFactory<T> factory, java.lang.String options)
Parameters:
options - a String of options, separated by commas
factory - a factory for the token type that the tokenizer will return
public java.util.Iterator<T> getIterator(java.io.Reader r)
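Description copied from interface: IteratorFromReaderFactory
Return an iterator over the contents read from r.
Specified by: getIterator in interface IteratorFromReaderFactory<T extends HasWord>
Parameters:
r - Where to read objects from
public Tokenizer<T> getTokenizer(java.io.Reader r)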
Description copied from interface: TokenizerFactory
Get a tokenizer for this reader.
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters:
r - A Reader (which is assumed to already be buffered, if appropriate)
public void setOptions(java.lang.String options)
Set underlying tokenizer options.
Specified by: setOptions in interface TokenizerFactory<T extends HasWord>
Parameters:
options - A comma-separated list of options
public Tokenizer<T> getTokenizer(java.io.Reader r, java.lang.String extraOptions)
Description copied from interface: TokenizerFactory
Get a tokenizer for this reader.
Specified by: getTokenizer in interface TokenizerFactory<T extends HasWord>
Parameters:
r - A Reader (which is assumed to already be buffered, if appropriate)
extraOptions - Options for how this tokenizer should behave
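
The page documents setOptions as taking a comma-separated list of options, and the factory exposes three boolean split fields plus a lexerProperties object. The following is a minimal, self-contained sketch of how such a string might be split into those two kinds of settings. It is not the library's implementation: the class OptionParsingSketch, the holder ParsedOptions, the option names splitCompounds, splitVerbs and splitContractions, and the key=value convention are all assumptions made for illustration.

```java
import java.util.Properties;

/** Hypothetical sketch only; this is not the SpanishTokenizerFactory source. */
class OptionParsingSketch {

    /** Mirrors the documented protected fields of the factory. */
    static final class ParsedOptions {
        boolean splitCompoundOption;
        boolean splitVerbOption;
        boolean splitContractionOption;
        final Properties lexerProperties = new Properties();
    }

    /**
     * Splits a comma-separated option string. A bare name is treated as
     * "name=true" (an assumed convention). Names that match one of the
     * assumed split options set a boolean flag; everything else is stored
     * in lexerProperties.
     */
    static ParsedOptions parse(String options) {
        ParsedOptions parsed = new ParsedOptions();
        if (options == null || options.trim().isEmpty()) {
            return parsed; // nothing to configure
        }
        for (String raw : options.split(",")) {
            String option = raw.trim();
            if (option.isEmpty()) {
                continue; // tolerate stray commas
            }
            int eq = option.indexOf('=');
            String key = eq < 0 ? option : option.substring(0, eq).trim();
            String value = eq < 0 ? "true" : option.substring(eq + 1).trim();
            switch (key) {
                case "splitCompounds": // assumed option name
                    parsed.splitCompoundOption = Boolean.parseBoolean(value);
                    break;
                case "splitVerbs": // assumed option name
                    parsed.splitVerbOption = Boolean.parseBoolean(value);
                    break;
                case "splitContractions": // assumed option name
                    parsed.splitContractionOption = Boolean.parseBoolean(value);
                    break;
                default:
                    // Anything unrecognized is kept for the underlying lexer.
                    parsed.lexerProperties.setProperty(key, value);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ParsedOptions p = parse("splitVerbs, splitCompounds=false, someLexerFlag");
        System.out.println("splitVerbOption = " + p.splitVerbOption);
        System.out.println("splitCompoundOption = " + p.splitCompoundOption);
        System.out.println("lexerProperties = " + p.lexerProperties);
    }
}
```

Keeping the split flags separate from the pass-through properties matches the field layout listed above, where the three booleans sit alongside lexerProperties. Consult the library source for the option names it actually recognizes.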