WordsToSentencesAnnotator (Stanford CoreNLP API)

java.lang.Object
- edu.stanford.nlp.pipeline.WordsToSentencesAnnotator

All Implemented Interfaces:

Annotator
```
public class WordsToSentencesAnnotator
extends Object
implements Annotator
```
This class assumes that there is a List<? extends CoreLabel> under the TokensAnnotation field. It runs that list through WordToSentenceProcessor and puts the resulting List<List<? extends CoreLabel>> under the SentencesAnnotation field.

Author:

Jenny Finkel, Christopher Manning

Nested Class Summary
- Nested classes/interfaces inherited from interface edu.stanford.nlp.pipeline.Annotator
  Annotator.Requirement

Field Summary
- Fields inherited from interface edu.stanford.nlp.pipeline.Annotator
  BINARIZED_TREES_REQUIREMENT, CLEAN_XML_REQUIREMENT, COLUMN_DATA_CLASSIFIER, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, PARSE_TAG_BINARIZED_TREES, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, RELATION_EXTRACTOR_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_COLUMN_DATA_CLASSIFIER, STANFORD_DEPENDENCIES, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_RELATION, STANFORD_SENTIMENT, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT

Constructor Summary

Constructors
Constructor and Description
`WordsToSentencesAnnotator()`
`WordsToSentencesAnnotator(boolean verbose)`
`WordsToSentencesAnnotator(boolean verbose, String boundaryTokenRegex, Set<String> boundaryToDiscard, Set<String> htmlElementsToDiscard, String newlineIsSentenceBreak)`
`WordsToSentencesAnnotator(boolean verbose, String boundaryTokenRegex, Set<String> boundaryToDiscard, Set<String> htmlElementsToDiscard, String newlineIsSentenceBreak, String boundaryMultiTokenRegex, Set<String> tokenRegexesToDiscard)`

Method Summary

Modifier and Type	Method and Description
`void`	`annotate(Annotation annotation)` If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines.
`static WordsToSentencesAnnotator`	`newlineSplitter(boolean verbose, String... nlToken)` Return a WordsToSentencesAnnotator that splits on newlines (only), which are then deleted.
`static WordsToSentencesAnnotator`	`nonSplitter(boolean verbose)` Return a WordsToSentencesAnnotator that never splits the token stream.
`Set<Annotator.Requirement>`	`requirementsSatisfied()` Returns the set of requirements that this annotator can satisfy.
`Set<Annotator.Requirement>`	`requires()` Returns the set of tasks which this annotator requires in order to perform.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator()
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(boolean verbose)
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(boolean verbose,
                                 String boundaryTokenRegex,
                                 Set<String> boundaryToDiscard,
                                 Set<String> htmlElementsToDiscard,
                                 String newlineIsSentenceBreak)
```
  - WordsToSentencesAnnotator
```
public WordsToSentencesAnnotator(boolean verbose,
                                 String boundaryTokenRegex,
                                 Set<String> boundaryToDiscard,
                                 Set<String> htmlElementsToDiscard,
                                 String newlineIsSentenceBreak,
                                 String boundaryMultiTokenRegex,
                                 Set<String> tokenRegexesToDiscard)
```
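
    The following is a minimal plain-Java sketch of boundary-based splitting, to illustrate how the boundaryTokenRegex and boundaryToDiscard arguments might interact. It is not CoreNLP code: the class BoundarySplitSketch and its split method are hypothetical, and the semantics (a token matching the regex ends a sentence and is kept, a token in the discard set ends a sentence and is dropped) are an assumption inferred from the parameter names. The regex used in main is likewise only an example value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

// Illustrative only: a simplified model of boundary-based splitting, not CoreNLP code.
final class BoundarySplitSketch {

    // Assumed semantics: a token matching boundaryTokenRegex ends the current
    // sentence and is kept; a token in boundaryToDiscard ends it and is dropped.
    static List<List<String>> split(List<String> tokens,
                                    String boundaryTokenRegex,
                                    Set<String> boundaryToDiscard) {
        Pattern boundary = Pattern.compile(boundaryTokenRegex);
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            if (boundaryToDiscard.contains(token)) {
                // Discarded boundary: close the sentence, if any, without the token.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
            } else if (boundary.matcher(token).matches()) {
                // Kept boundary: the token is the last token of the sentence.
                current.add(token);
                sentences.add(current);
                current = new ArrayList<>();
            } else {
                current.add(token);
            }
        }
        // Trailing tokens with no closing boundary still form a sentence.
        if (!current.isEmpty()) {
            sentences.add(current);
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("Hello", "world", ".", "\n", "How", "are", "you", "?");
        Set<String> discard = new HashSet<>(Arrays.asList("\n"));
        // Prints two sentences: [Hello, world, .] and [How, are, you, ?]
        System.out.println(split(tokens, "\\.|[!?]+", discard));
    }
}
```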
- Method Detail
  - newlineSplitter
```
public static WordsToSentencesAnnotator newlineSplitter(boolean verbose,
                                                        String... nlToken)
```
    Return a WordsToSentencesAnnotator that splits on newlines (only), which are then deleted. This factory method counts lines by putting in empty token lists for empty lines: it tells the underlying splitter to return empty lists of tokens and then treats those empty lists as empty lines. Empty sentences are not actually included in the annotation, but they are used in numbering the sentences. Only this factory method leads to empty sentences.
    
    Parameters:
    
    verbose - Whether it is verbose.
    
    nlToken - Zero or more newline tokens, each of which might be a \n or a fake newline token returned by the tokenizer.
    
    Returns:
    
    A WordsToSentencesAnnotator.
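
    The line-counting idea described above can be sketched in plain Java as two steps: split on newline tokens while keeping empty lines as empty lists, then number the lines and drop the empty ones. This is a hedged illustration, not the CoreNLP implementation. NewlineSplitSketch, NumberedSentence and both method names are hypothetical, and the 1-based line numbering is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: models newline-only splitting with line counting.
final class NewlineSplitSketch {

    // Hypothetical holder: a non-empty sentence and the line it came from.
    static final class NumberedSentence {
        final int lineNumber;
        final List<String> tokens;

        NumberedSentence(int lineNumber, List<String> tokens) {
            this.lineNumber = lineNumber;
            this.tokens = tokens;
        }

        @Override
        public String toString() {
            return lineNumber + ":" + tokens;
        }
    }

    // Step 1: split on newline tokens only. The newline tokens are deleted,
    // but an empty line is kept as an empty token list.
    static List<List<String>> splitKeepingEmptyLines(List<String> tokens,
                                                     Set<String> nlTokens) {
        List<List<String>> lines = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String token : tokens) {
            if (nlTokens.contains(token)) {
                lines.add(current);
                current = new ArrayList<>();
            } else {
                current.add(token);
            }
        }
        if (!current.isEmpty()) {
            lines.add(current);
        }
        return lines;
    }

    // Step 2: empty lists advance the line counter but are not emitted.
    // Numbering starts at 1 here, which is an assumption of this sketch.
    static List<NumberedSentence> numberAndFilter(List<List<String>> lines) {
        List<NumberedSentence> sentences = new ArrayList<>();
        int lineNumber = 0;
        for (List<String> line : lines) {
            lineNumber++;
            if (!line.isEmpty()) {
                sentences.add(new NumberedSentence(lineNumber, line));
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "b", "\n", "\n", "c", "\n");
        Set<String> nlTokens = new HashSet<>(Arrays.asList("\n"));
        // The blank second line is counted, so "c" is reported on line 3.
        System.out.println(numberAndFilter(splitKeepingEmptyLines(tokens, nlTokens)));
    }
}
```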
  - nonSplitter
```
public static WordsToSentencesAnnotator nonSplitter(boolean verbose)
```
    Return a WordsToSentencesAnnotator that never splits the token stream. You just get one sentence.
    
    Parameters:
    
    verbose - Whether it is verbose.
    
    Returns:
    
    A WordsToSentencesAnnotator.
  - annotate
```
public void annotate(Annotation annotation)
```
    If setCountLineNumbers is set to true, we count line numbers by telling the underlying splitter to return empty lists of tokens and then treating those empty lists as empty lines. We don't actually include empty sentences in the annotation, though.
    
    Specified by:
    
    annotate in interface Annotator
  - requires
```
public Set<Annotator.Requirement> requires()
```
    Description copied from interface: Annotator
    
    Returns the set of tasks which this annotator requires in order to perform. For example, the POS annotator will return "tokenize", "ssplit".
    
    Specified by:
    
    requires in interface Annotator
  - requirementsSatisfied
```
public Set<Annotator.Requirement> requirementsSatisfied()
```
    Description copied from interface: Annotator
    
    Returns the set of requirements that this annotator can satisfy. For example, the POS annotator will return "pos".
    
    Specified by:
    
    requirementsSatisfied in interface Annotator
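
    To illustrate how requires() and requirementsSatisfied() can work together, here is a minimal plain-Java sketch that checks whether each step in an ordered pipeline has its requirements met by earlier steps. It does not use the CoreNLP types: RequirementCheckSketch, Step and firstUnsatisfied are hypothetical names, and requirements are modelled as plain strings. The "pos" entries follow the examples above. The entries for "tokenize" and "ssplit" are assumptions made for the sake of the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Illustrative only: a string-based model of requirement checking.
final class RequirementCheckSketch {

    // Hypothetical stand-in for an annotator: what it needs and what it provides.
    static final class Step {
        final String name;
        final Set<String> requires;
        final Set<String> satisfies;

        Step(String name, Set<String> requires, Set<String> satisfies) {
            this.name = name;
            this.requires = requires;
            this.satisfies = satisfies;
        }
    }

    static Set<String> setOf(String... values) {
        return new HashSet<>(Arrays.asList(values));
    }

    // Returns the name of the first step whose requirements are not provided
    // by the steps before it, or an empty Optional if the order is valid.
    static Optional<String> firstUnsatisfied(List<Step> pipeline) {
        Set<String> available = new HashSet<>();
        for (Step step : pipeline) {
            if (!available.containsAll(step.requires)) {
                return Optional.of(step.name);
            }
            available.addAll(step.satisfies);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Step tokenize = new Step("tokenize", setOf(), setOf("tokenize"));
        Step ssplit = new Step("ssplit", setOf("tokenize"), setOf("ssplit"));
        Step pos = new Step("pos", setOf("tokenize", "ssplit"), setOf("pos"));

        // Valid order: every requirement is provided by an earlier step.
        System.out.println(firstUnsatisfied(Arrays.asList(tokenize, ssplit, pos)));
        // Sentence splitting is missing, so "pos" is reported.
        System.out.println(firstUnsatisfied(Arrays.asList(tokenize, pos)));
    }
}
```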
