public class TokenizerAnnotator extends Object implements Annotator
Tokenizes the input text and stores the resulting tokens (List<CoreLabel>) under CoreAnnotations.TokensAnnotation.

Modifier and Type | Class and Description |
---|---|
static class | TokenizerAnnotator.TokenizerType: Enum to identify the different TokenizerTypes. |
Nested classes/interfaces inherited from interface Annotator: Annotator.Requirement
Modifier and Type | Field and Description |
---|---|
static String | EOL_PROPERTY |
Fields inherited from interface Annotator:
BINARIZED_TREES_REQUIREMENT, CLEAN_XML_REQUIREMENT, COLUMN_DATA_CLASSIFIER, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, PARSE_TAG_BINARIZED_TREES, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, RELATION_EXTRACTOR_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_COLUMN_DATA_CLASSIFIER, STANFORD_DEPENDENCIES, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_RELATION, STANFORD_SENTIMENT, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT
Constructor and Description |
---|
TokenizerAnnotator() |
TokenizerAnnotator(boolean verbose) |
TokenizerAnnotator(boolean verbose, Properties props) |
TokenizerAnnotator(boolean verbose, Properties props, String options) |
TokenizerAnnotator(boolean verbose, String lang) |
TokenizerAnnotator(boolean verbose, String lang, String options) |
TokenizerAnnotator(boolean verbose, TokenizerAnnotator.TokenizerType lang) |
TokenizerAnnotator(String lang) |
Modifier and Type | Method and Description |
---|---|
void | annotate(Annotation annotation): Splits the text stored under TextAnnotation into CoreLabels and attaches them under TokensAnnotation. |
Tokenizer<CoreLabel> | getTokenizer(Reader r): Returns a thread-safe tokenizer. |
Set<Annotator.Requirement> | requirementsSatisfied(): Returns the set of requirements that this annotator satisfies. |
Set<Annotator.Requirement> | requires(): Returns the set of requirements that must already be satisfied before this annotator can run. |
public static final String EOL_PROPERTY
public TokenizerAnnotator()
public TokenizerAnnotator(boolean verbose)
public TokenizerAnnotator(String lang)
public TokenizerAnnotator(boolean verbose, TokenizerAnnotator.TokenizerType lang)
public TokenizerAnnotator(boolean verbose, String lang)
public TokenizerAnnotator(boolean verbose, Properties props)
public TokenizerAnnotator(boolean verbose, Properties props, String options)
public void annotate(Annotation annotation)
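The data flow described for annotate (read the raw text, split it into tokens, attach the token list back onto the annotation) can be illustrated with a small self-contained model. This is a sketch and not CoreNLP code: `ToyTokenizerSketch`, `ToyToken`, the string keys, and the whitespace-only splitting rule are all hypothetical stand-ins chosen for brevity, and the real annotator's tokenization rules are considerably more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToyTokenizerSketch {

    /** Hypothetical stand-in for a token with character offsets (not CoreLabel). */
    public static final class ToyToken {
        public final String word;
        public final int begin; // inclusive offset into the original text
        public final int end;   // exclusive offset into the original text

        ToyToken(String word, int begin, int end) {
            this.word = word;
            this.begin = begin;
            this.end = end;
        }
    }

    // Hypothetical keys playing the roles of the text and tokens annotations.
    public static final String TEXT_KEY = "text";
    public static final String TOKENS_KEY = "tokens";

    /** Splits on whitespace only; an assumption made to keep the sketch short. */
    public static List<ToyToken> tokenize(String text) {
        List<ToyToken> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (Character.isWhitespace(text.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < text.length() && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            tokens.add(new ToyToken(text.substring(start, i), start, i));
        }
        return tokens;
    }

    /** Reads the text entry and attaches the token list to the same map. */
    public static void annotate(Map<String, Object> annotation) {
        Object raw = annotation.get(TEXT_KEY);
        if (!(raw instanceof String)) {
            throw new IllegalArgumentException("annotation has no text");
        }
        annotation.put(TOKENS_KEY, tokenize((String) raw));
    }

    public static void main(String[] args) {
        Map<String, Object> annotation = new HashMap<>();
        annotation.put(TEXT_KEY, "Hello  world");
        annotate(annotation);
        List<?> tokens = (List<?>) annotation.get(TOKENS_KEY);
        for (Object o : tokens) {
            ToyToken t = (ToyToken) o;
            System.out.println(t.word + " [" + t.begin + ", " + t.end + ")");
        }
    }
}
```

Keeping the offsets alongside each word is what lets later stages map a token back to its position in the original string.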
public Set<Annotator.Requirement> requires()
Specified by: requires in interface Annotator
public Set<Annotator.Requirement> requirementsSatisfied()
Specified by: requirementsSatisfied in interface Annotator
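One way to read the requires / requirementsSatisfied pair is as an ordering check: a stage may run only if everything it requires has been provided by earlier stages. The sketch below models that idea with plain Java collections. It is an illustration under stated assumptions, not the library's implementation: `RequirementCheckSketch`, `Stage`, and the string requirement names are hypothetical, and treating a tokenizer as requiring nothing while satisfying a "tokenize" requirement is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class RequirementCheckSketch {

    /** Hypothetical miniature of the requires / requirementsSatisfied pair. */
    public interface Stage {
        String name();
        Set<String> requires();
        Set<String> requirementsSatisfied();
    }

    /** Convenience factory for an immutable stage description. */
    public static Stage stage(final String name,
                              final Set<String> requires,
                              final Set<String> satisfied) {
        return new Stage() {
            @Override public String name() { return name; }
            @Override public Set<String> requires() { return requires; }
            @Override public Set<String> requirementsSatisfied() { return satisfied; }
        };
    }

    /**
     * Walks the pipeline in order and returns the name of the first stage
     * whose requirements are not covered by earlier stages, if any.
     */
    public static Optional<String> firstUnsatisfied(List<Stage> pipeline) {
        Set<String> available = new HashSet<>();
        for (Stage s : pipeline) {
            if (!available.containsAll(s.requires())) {
                return Optional.of(s.name());
            }
            available.addAll(s.requirementsSatisfied());
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed for illustration: the tokenizer needs nothing upstream.
        Stage tokenize = stage("tokenize",
                Collections.<String>emptySet(),
                Collections.singleton("tokenize"));
        Stage ssplit = stage("ssplit",
                Collections.singleton("tokenize"),
                Collections.singleton("ssplit"));

        System.out.println(firstUnsatisfied(Arrays.asList(tokenize, ssplit)).isPresent());
        System.out.println(firstUnsatisfied(Arrays.asList(ssplit, tokenize)).isPresent());
    }
}
```

Accumulating the satisfied set while iterating keeps the check linear in the number of stages and reports the earliest offending stage rather than a generic failure.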