StanfordCoreNLP (Stanford CoreNLP API)

java.lang.Object
- edu.stanford.nlp.pipeline.AnnotationPipeline
- - edu.stanford.nlp.pipeline.StanfordCoreNLP

All Implemented Interfaces:

Annotator
```
public class StanfordCoreNLP
extends AnnotationPipeline
```
A pipeline that takes a String and returns various analyzed linguistic forms. The String is first tokenized by a TokenizerAnnotator; sequence-model-style annotators can then add lemmas, POS tags, and named entities. These are returned as a list of CoreLabels. Other analysis components build and store parse trees, dependency graphs, and so on.
This class is designed to apply multiple Annotators to an Annotation. You first build the pipeline by adding Annotators, then pass in the objects you wish to annotate and get back a fully annotated object. From the command line you can, for example, tokenize text with a command like:
```
java edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit -file document.txt
```
Please see the package level javadoc for sample usage and a more complete description.
The main entry point for the API is StanfordCoreNLP.process().
Implementation note: There are other annotation pipelines, but they don't extend this one. Look for classes that implement Annotator and which have "Pipeline" in their name.
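The build-then-annotate idea described above can be sketched without the library. The types below (`MiniAnnotation`, `MiniAnnotator`, `MiniPipeline`) are hypothetical stand-ins invented for illustration; they are not the CoreNLP classes and only mirror the shape of the contract: annotators are added in order, each one modifies the annotation in place, and `process` wraps a raw string before running them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; NOT the CoreNLP classes.
final class MiniAnnotation {
    final String text;
    final Map<String, Object> layers = new LinkedHashMap<>();

    MiniAnnotation(String text) { this.text = text; }
}

interface MiniAnnotator {
    // Adds information to the annotation in place.
    void annotate(MiniAnnotation annotation);
}

final class MiniPipeline implements MiniAnnotator {
    private final List<MiniAnnotator> annotators = new ArrayList<>();

    // Step 1: build the pipeline by adding annotators in the order they should run.
    MiniPipeline add(MiniAnnotator annotator) {
        annotators.add(annotator);
        return this;
    }

    // Step 2: run every annotator, in order, over the same annotation object.
    @Override
    public void annotate(MiniAnnotation annotation) {
        for (MiniAnnotator annotator : annotators) {
            annotator.annotate(annotation);
        }
    }

    // Convenience entry point: wrap raw text, annotate it, return the result.
    MiniAnnotation process(String text) {
        MiniAnnotation annotation = new MiniAnnotation(text);
        annotate(annotation);
        return annotation;
    }

    // A toy two-stage pipeline: whitespace tokens, then a count that depends on them.
    static MiniPipeline whitespaceDemo() {
        return new MiniPipeline()
            .add(a -> a.layers.put("tokens", Arrays.asList(a.text.trim().split("\\s+"))))
            .add(a -> a.layers.put("tokenCount", ((List<?>) a.layers.get("tokens")).size()));
    }
}
```

The second stage reads what the first stage stored, which is why the order in which annotators are added matters.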
Author:

Jenny Finkel, Anna Rafferty, Christopher Manning, Mihai Surdeanu, Steven Bethard

Nested Class Summary
- Nested classes/interfaces inherited from interface edu.stanford.nlp.pipeline.Annotator
  Annotator.Requirement

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`CUSTOM_ANNOTATOR_PREFIX`
`static String`	`DEFAULT_NEWLINE_IS_SENTENCE_BREAK`
`static String`	`DEFAULT_OUTPUT_FORMAT`
`static String`	`NEWLINE_IS_SENTENCE_BREAK_PROPERTY`
`static String`	`NEWLINE_SPLITTER_PROPERTY`
`protected static AnnotatorPool`	`pool` Maintains the shared pool of annotators

Fields inherited from class edu.stanford.nlp.pipeline.AnnotationPipeline
TIME

Fields inherited from interface edu.stanford.nlp.pipeline.Annotator
BINARIZED_TREES_REQUIREMENT, CLEAN_XML_REQUIREMENT, COLUMN_DATA_CLASSIFIER, DETERMINISTIC_COREF_REQUIREMENT, GENDER_REQUIREMENT, GUTIME_REQUIREMENT, HEIDELTIME_REQUIREMENT, LEMMA_REQUIREMENT, NER_REQUIREMENT, NUMBER_REQUIREMENT, PARSE_AND_TAG, PARSE_REQUIREMENT, PARSE_TAG_BINARIZED_TREES, POS_REQUIREMENT, QUANTIFIABLE_ENTITY_NORMALIZATION_REQUIREMENT, RELATION_EXTRACTOR_REQUIREMENT, SSPLIT_REQUIREMENT, STANFORD_CLEAN_XML, STANFORD_COLUMN_DATA_CLASSIFIER, STANFORD_DEPENDENCIES, STANFORD_DETERMINISTIC_COREF, STANFORD_GENDER, STANFORD_LEMMA, STANFORD_NER, STANFORD_PARSE, STANFORD_POS, STANFORD_REGEXNER, STANFORD_RELATION, STANFORD_SENTIMENT, STANFORD_SSPLIT, STANFORD_TOKENIZE, STANFORD_TRUECASE, STEM_REQUIREMENT, SUTIME_REQUIREMENT, TIME_WORDS_REQUIREMENT, TOKENIZE_AND_SSPLIT, TOKENIZE_REQUIREMENT, TOKENIZE_SSPLIT_NER, TOKENIZE_SSPLIT_PARSE, TOKENIZE_SSPLIT_PARSE_NER, TOKENIZE_SSPLIT_POS, TOKENIZE_SSPLIT_POS_LEMMA, TRUECASE_REQUIREMENT

Constructor Summary

Constructors
Constructor and Description
`StanfordCoreNLP()` Constructs a pipeline using the properties file found on the classpath.
`StanfordCoreNLP(Properties props)` Construct a basic pipeline.
`StanfordCoreNLP(Properties props, boolean enforceRequirements)`
`StanfordCoreNLP(String propsFileNamePrefix)` Constructs a pipeline with the properties read from this file, which must be found on the classpath.
`StanfordCoreNLP(String propsFileNamePrefix, boolean enforceRequirements)`

Method Summary

Modifier and Type	Method and Description
`void`	`annotate(Annotation annotation)` Run the pipeline on an input annotation.
`static void`	`clearAnnotatorPool()` Call this if you are no longer using StanfordCoreNLP and want to release the memory associated with the annotators.
`void`	`conllPrint(Annotation annotation, Writer w)` Displays the output of many annotators in CoNLL format.
`protected AnnotatorImplementations`	`getAnnotatorImplementations()` Get the implementation of each relevant annotator in the pipeline.
`double`	`getBeamPrintingOption()`
`TreePrint`	`getConstituentTreePrinter()`
`protected AnnotatorPool`	`getDefaultAnnotatorPool(Properties inputProps, AnnotatorImplementations annotatorImplementation)` Construct the default annotator pool from the passed properties, and overwriting annotations which have changed since the last
`TreePrint`	`getDependencyTreePrinter()`
`String`	`getEncoding()`
`static Annotator`	`getExistingAnnotator(String name)`
`boolean`	`getPrintSingletons()`
`Properties`	`getProperties()` Fetches the Properties object used to construct this Annotator
`static boolean`	`isXMLOutputPresent()`
`void`	`jsonPrint(Annotation annotation, Writer w)` Displays the output of all annotators in JSON format.
`static void`	`main(String[] args)` This can be used just for testing or for command-line text processing.
`void`	`prettyPrint(Annotation annotation, OutputStream os)` Displays the output of all annotators in a format easily readable by people.
`void`	`prettyPrint(Annotation annotation, PrintWriter os)` Displays the output of all annotators in a format easily readable by people.
`protected static void`	`printHelp(PrintStream os, String helpTopic)` Prints the list of properties required to run the pipeline
`Annotation`	`process(String text)` Runs the entire pipeline on the content of the given text passed in.
`void`	`processFiles(Collection<File> files)`
`void`	`processFiles(Collection<File> files, int numThreads)`
`void`	`processFiles(String base, Collection<File> files, int numThreads)`
`void`	`run()`
`String`	`timingInformation()` Return a String that gives detailed human-readable information about how much time was spent by each annotator and by the entire annotation pipeline.
`static boolean`	`usesBinaryTrees(Properties props)` Determines whether the parser annotator should default to producing binary trees.
`void`	`xmlPrint(Annotation annotation, OutputStream os)` Displays the output of all annotators in XML format.
`void`	`xmlPrint(Annotation annotation, Writer w)` Wrapper around xmlPrint(Annotation, OutputStream).

Methods inherited from class edu.stanford.nlp.pipeline.AnnotationPipeline
addAnnotator, annotate, annotate, annotate, annotate, getTotalTime, requirementsSatisfied, requires

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - CUSTOM_ANNOTATOR_PREFIX
```
public static final String CUSTOM_ANNOTATOR_PREFIX
```
    See Also:
    
    Constant Field Values
  - NEWLINE_SPLITTER_PROPERTY
```
public static final String NEWLINE_SPLITTER_PROPERTY
```
    See Also:
    
    Constant Field Values
  - NEWLINE_IS_SENTENCE_BREAK_PROPERTY
```
public static final String NEWLINE_IS_SENTENCE_BREAK_PROPERTY
```
    See Also:
    
    Constant Field Values
  - DEFAULT_NEWLINE_IS_SENTENCE_BREAK
```
public static final String DEFAULT_NEWLINE_IS_SENTENCE_BREAK
```
    See Also:
    
    Constant Field Values
  - DEFAULT_OUTPUT_FORMAT
```
public static final String DEFAULT_OUTPUT_FORMAT
```
  - pool
```
protected static AnnotatorPool pool
```
    Maintains the shared pool of annotators
- Constructor Detail
  - StanfordCoreNLP
```
public StanfordCoreNLP()
```
    Constructs a pipeline using the properties file found on the classpath.
  - StanfordCoreNLP
```
public StanfordCoreNLP(Properties props)
```
    Construct a basic pipeline. The Properties will be used to determine which annotators to create, and a default AnnotatorPool will be used to create the annotators.
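    As a minimal sketch, the Properties for this constructor can be assembled with the standard library alone. The key `annotators` is assumed here to mirror the `-annotators` flag from the command-line example; the class and method names (`PipelineProps`, `forAnnotators`) are hypothetical helpers, and the CoreNLP calls appear only in comments because they need the library on the classpath.

```java
import java.util.Properties;

final class PipelineProps {
    // Builds the Properties counterpart of "-annotators tokenize,ssplit".
    // The key name "annotators" is an assumption taken from the command-line flag.
    static Properties forAnnotators(String... annotators) {
        Properties props = new Properties();
        props.setProperty("annotators", String.join(",", annotators));
        return props;
    }

    public static void main(String[] args) {
        Properties props = forAnnotators("tokenize", "ssplit");
        System.out.println(props.getProperty("annotators"));

        // With CoreNLP available, the object would then be handed to the constructor:
        // StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // Annotation document = pipeline.process("Some text to analyze.");
    }
}
```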
  - StanfordCoreNLP
```
public StanfordCoreNLP(Properties props,
                       boolean enforceRequirements)
```
  - StanfordCoreNLP
```
public StanfordCoreNLP(String propsFileNamePrefix)
```
    Constructs a pipeline with the properties read from this file, which must be found on the classpath.
    
    Parameters:
    
    propsFileNamePrefix -
  - StanfordCoreNLP
```
public StanfordCoreNLP(String propsFileNamePrefix,
                       boolean enforceRequirements)
```
- Method Detail
  - getAnnotatorImplementations
```
protected AnnotatorImplementations getAnnotatorImplementations()
```
    Get the implementation of each relevant annotator in the pipeline. The primary use of this method is to be overridden by subclasses of StanfordCoreNLP so that they can call different annotators that obey the exact same contract as the default annotators.
    
    The canonical use case for this is as an implementation of the Curator server, where the annotators make server calls rather than calling each annotator locally.
    
    Returns:
    
    A class which specifies the actual implementation of each of the annotators called when creating the annotator pool. By default, the canonical annotators provided by AnnotatorImplementations are used.
  - getProperties
```
public Properties getProperties()
```
    Fetches the Properties object used to construct this Annotator
  - getConstituentTreePrinter
```
public TreePrint getConstituentTreePrinter()
```
  - getDependencyTreePrinter
```
public TreePrint getDependencyTreePrinter()
```
  - getBeamPrintingOption
```
public double getBeamPrintingOption()
```
  - getEncoding
```
public String getEncoding()
```
  - getPrintSingletons
```
public boolean getPrintSingletons()
```
  - isXMLOutputPresent
```
public static boolean isXMLOutputPresent()
```
  - clearAnnotatorPool
```
public static void clearAnnotatorPool()
```
    Call this if you are no longer using StanfordCoreNLP and want to release the memory associated with the annotators.
  - getDefaultAnnotatorPool
```
protected AnnotatorPool getDefaultAnnotatorPool(Properties inputProps,
                                                AnnotatorImplementations annotatorImplementation)
```
    Construct the default annotator pool from the passed properties, and overwriting annotations which have changed since the last
    
    Parameters:
    
    inputProps -
    
    annotatorImplementation -
    
    Returns:
  - getExistingAnnotator
```
public static Annotator getExistingAnnotator(String name)
```
  - annotate
```
public void annotate(Annotation annotation)
```
    Description copied from class: AnnotationPipeline
    
    Run the pipeline on an input annotation. The annotation is modified in place.
    
    Specified by:
    
    annotate in interface Annotator
    
    Overrides:
    
    annotate in class AnnotationPipeline
    
    Parameters:
    
    annotation - The input annotation, usually a raw document
  - usesBinaryTrees
```
public static boolean usesBinaryTrees(Properties props)
```
    Determines whether the parser annotator should default to producing binary trees. Currently there is only one condition under which this is true: the sentiment annotator is used.
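    The documented rule can be approximated as a lookup in the annotator list. This is a hypothetical re-creation, not the library's code: it assumes the annotators are listed under the `annotators` key and that the sentiment annotator is named `sentiment`.

```java
import java.util.Arrays;
import java.util.Properties;

final class BinaryTreeCheck {
    // Returns true when the (assumed) "annotators" property lists "sentiment".
    // Entries may be separated by commas and/or whitespace.
    static boolean sentimentRequested(Properties props) {
        String annotators = props.getProperty("annotators", "");
        return Arrays.stream(annotators.split("[,\\s]+"))
                     .anyMatch("sentiment"::equals);
    }
}
```

    Matching whole entries rather than substrings avoids a false positive on a custom annotator whose name merely contains the word.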
  - process
```
public Annotation process(String text)
```
    Runs the entire pipeline on the content of the given text passed in.
    
    Parameters:
    
    text - The text to process
    
    Returns:
    
    An Annotation object containing the output of all annotators
  - prettyPrint
```
public void prettyPrint(Annotation annotation,
                        OutputStream os)
```
    Displays the output of all annotators in a format easily readable by people.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    os - The output stream
  - prettyPrint
```
public void prettyPrint(Annotation annotation,
                        PrintWriter os)
```
    Displays the output of all annotators in a format easily readable by people.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    os - The PrintWriter to send the output to
  - xmlPrint
```
public void xmlPrint(Annotation annotation,
                     Writer w)
              throws IOException
```
    Wrapper around xmlPrint(Annotation, OutputStream). Added for backward compatibility.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    w - The Writer to send the output to
    
    Throws:
    
    IOException
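    One common way to layer a Writer overload on top of an OutputStream method is to buffer the bytes and decode them before forwarding. The sketch below shows that pattern with hypothetical names (`WriterBridge`, `printTo`) and a plain String payload; UTF-8 is an assumption made for the example, and nothing here claims to be how xmlPrint itself is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class WriterBridge {
    // Stand-in for a method that only knows how to write to an OutputStream.
    static void printTo(String payload, OutputStream os) throws IOException {
        os.write(payload.getBytes(StandardCharsets.UTF_8));
    }

    // Writer overload: capture the bytes, decode them, and forward the text.
    static void printTo(String payload, Writer w) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        printTo(payload, buffer); // resolves to the OutputStream overload
        w.write(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
        w.flush();
    }
}
```

    The decoding charset has to match the one used when the bytes were produced; otherwise non-ASCII characters are corrupted on the way through.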
  - jsonPrint
```
public void jsonPrint(Annotation annotation,
                      Writer w)
               throws IOException
```
    Displays the output of all annotators in JSON format.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    w - The Writer to send the output to
    
    Throws:
    
    IOException
  - conllPrint
```
public void conllPrint(Annotation annotation,
                       Writer w)
                throws IOException
```
    Displays the output of many annotators in CoNLL format.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    w - The Writer to send the output to
    
    Throws:
    
    IOException
  - xmlPrint
```
public void xmlPrint(Annotation annotation,
                     OutputStream os)
              throws IOException
```
    Displays the output of all annotators in XML format.
    
    Parameters:
    
    annotation - Contains the output of all annotators
    
    os - The output stream
    
    Throws:
    
    IOException
  - printHelp
```
protected static void printHelp(PrintStream os,
                                String helpTopic)
```
    Prints the list of properties required to run the pipeline
    
    Parameters:
    
    os - PrintStream to print usage to
    
    helpTopic - a topic to print help about (or null for general options)
  - timingInformation
```
public String timingInformation()
```
    Return a String that gives detailed human-readable information about how much time was spent by each annotator and by the entire annotation pipeline. This String includes newline characters but does not end with one, and so it is suitable to be printed out with a println().
    
    Overrides:
    
    timingInformation in class AnnotationPipeline
    
    Returns:
    
    Human readable information on time spent in processing.
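    The "newlines inside, none at the end" contract is easy to honor with a joiner. The formatter below is a hypothetical illustration; the labels, the millisecond unit, and the `TimingReport` name are assumptions and do not reflect the actual output of timingInformation().

```java
import java.util.Map;
import java.util.StringJoiner;

final class TimingReport {
    // Joins one line per annotator plus a total, separated by '\n',
    // so the result never ends with a newline and suits println().
    static String format(Map<String, Long> millisByAnnotator) {
        StringJoiner lines = new StringJoiner("\n");
        long total = 0;
        for (Map.Entry<String, Long> entry : millisByAnnotator.entrySet()) {
            lines.add(entry.getKey() + ": " + entry.getValue() + " ms");
            total += entry.getValue();
        }
        lines.add("TOTAL: " + total + " ms");
        return lines.toString();
    }
}
```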
  - processFiles
```
public void processFiles(String base,
                         Collection<File> files,
                         int numThreads)
                  throws IOException
```
    Throws:
    
    IOException
  - processFiles
```
public void processFiles(Collection<File> files,
                         int numThreads)
                  throws IOException
```
    Throws:
    
    IOException
  - processFiles
```
public void processFiles(Collection<File> files)
                  throws IOException
```
    Throws:
    
    IOException
  - run
```
public void run()
         throws IOException
```
    Throws:
    
    IOException
  - main
```
public static void main(String[] args)
                 throws IOException,
                        ClassNotFoundException
```
    This can be used just for testing or for command-line text processing. This runs the pipeline you specify on the text in the file that you specify and sends some results to stdout. The current code in this main method assumes that each line of the file is to be processed separately as a single sentence.
    Example usage:
    java -mx6g edu.stanford.nlp.pipeline.StanfordCoreNLP properties
    
    Parameters:
    
    args - List of required properties
    
    Throws:
    
    IOException - If IO problem
    
    ClassNotFoundException - If class loading problem
