edu.stanford.nlp.ling (Stanford JavaNLP API)

Interface Summary
Interface	Description
AbstractCoreLabel
AbstractToken	An abstract token.
CoreAnnotation<V>	The base class for any annotation that can be marked on a `CoreMap`, parameterized by the type of the value associated with the annotation.
CoreLabel.GenericAnnotation<T>	Class that all "generic" annotations extend.
Datum<L,F>	Interface for Objects which can be described by their features.
Document<L,F,T>	Represents a text document as a list of Words with a String title.
Featurizable<F>	Interface for Objects that can be described by their features.
HasCategory	Something that implements the `HasCategory` interface knows about categories.
HasContext
HasIndex
HasLemma	Something that implements the `HasLemma` interface knows about lemmas.
HasNER	This token is able to produce NER tags
HasOffset	Something that implements the `HasOffset` interface carries char offset references to an original text String.
HasOriginalText	This token can produce / set original texts
HasTag	Something that implements the `HasTag` interface knows about part-of-speech tags.
HasWord	Something that implements the `HasWord` interface knows about words.
Label	Something that implements the `Label` interface can act as a constituent, node, or word label with linguistic attributes.
Labeled<E>	Interface for Objects that have a label, whose label is an Object.
LabelFactory	A `LabelFactory` object acts as a factory for creating objects of class `Label`, or some descendant class.

Class Summary
Class	Description
AnnotationLookup	Provides a mapping between CoreAnnotation keys, which are classes, and a text String that names them, which is needed for things like text serializations and the Semgrex query language.
BasicDatum<LabelType,FeatureType>	Basic implementation of Datum interface that can be constructed with a Collection of features and one or more labels.
BasicDocument<L>	Basic implementation of Document that should be suitable for most needs.
CategoryWordTag	A `CategoryWordTag` object acts as a complex Label which contains a category, a head word, and a tag.
CategoryWordTagFactory	A `CategoryWordTagFactory` is a factory that makes a `Label` which is a `CategoryWordTag` triplet.
CoreAnnotations	Set of common annotations for `CoreMap`s.
CoreAnnotations.AbbrAnnotation
CoreAnnotations.AbgeneAnnotation
CoreAnnotations.AbstrAnnotation
CoreAnnotations.AfterAnnotation	Annotation for the whitespace characters appearing after this word.
CoreAnnotations.AnswerAnnotation	The standard key for the answer which is a String
CoreAnnotations.AnswerObjectAnnotation
CoreAnnotations.AnswerProbAnnotation	The matching probability for the AnswerAnnotation
CoreAnnotations.AntecedentAnnotation	The CoreMap key identifying the annotation's antecedent.
CoreAnnotations.ArabicCharAnnotation	For Arabic: character level information, segmentation.
CoreAnnotations.ArabicSegAnnotation	For Arabic: the segmentation information from the segmenter.
CoreAnnotations.ArgDescendentAnnotation
CoreAnnotations.ArgumentAnnotation	The standard key for a propbank label which is of type Argument
CoreAnnotations.AuthorAnnotation	Author for the document (really should be a set of authors, but just have single string for simplicity)
CoreAnnotations.BagOfWordsAnnotation
CoreAnnotations.BeAnnotation	annotation stolen from the lex parser
CoreAnnotations.BeforeAnnotation	Annotation for the whitespace characters appearing before this word.
CoreAnnotations.BeginIndexAnnotation	This indexes the beginning of a span of words, e.g., a constituent in a tree.
CoreAnnotations.BestCliquesAnnotation	Used in Task3 Pascal system
CoreAnnotations.BestFullAnnotation
CoreAnnotations.CalendarAnnotation	The CoreMap key identifying the date and time associated with an annotation.
CoreAnnotations.CanonicalEntityMentionIndexAnnotation	Index into the list of entity mentions in a document for canonical entity mention.
CoreAnnotations.CategoryAnnotation
CoreAnnotations.CategoryFunctionalTagAnnotation	The standard key for storing category with functional tags.
CoreAnnotations.CharacterOffsetBeginAnnotation	The CoreMap key identifying the offset of the first char of an annotation.
CoreAnnotations.CharacterOffsetEndAnnotation	The CoreMap key identifying the offset of the last character after the end of an annotation.
CoreAnnotations.CharAnnotation
CoreAnnotations.ChineseCharAnnotation	For Chinese: character level information, segmentation.
CoreAnnotations.ChineseIsSegmentedAnnotation	Not sure exactly what this is, but it is different from ChineseSegAnnotation and seems to indicate if the text is segmented
CoreAnnotations.ChineseOrigSegAnnotation	For Chinese: the segmentation info existing in the original text.
CoreAnnotations.ChineseSegAnnotation	For Chinese: the segmentation information from the segmenter.
CoreAnnotations.ChunkAnnotation
CoreAnnotations.CoarseNamedEntityTagAnnotation	The CoreMap key for getting the coarse named entity tag (i.e.
CoreAnnotations.CoarseTagAnnotation	CoNLL dep parsing - coarser POS tags.
CoreAnnotations.CodepointOffsetBeginAnnotation	Some codepoints count as more than one character.
CoreAnnotations.CodepointOffsetEndAnnotation	Some codepoints count as more than one character.
CoreAnnotations.ColumnDataClassifierAnnotation
CoreAnnotations.CommonWordsAnnotation
CoreAnnotations.CoNLLDepAnnotation	CoNLL dep parsing - the dependency type, such as SBJ or OBJ.
CoreAnnotations.CoNLLDepParentIndexAnnotation	CoNLL dep parsing - the index of the word which is the parent of this word in the dependency tree
CoreAnnotations.CoNLLDepTypeAnnotation	CoNLL dep parsing - the dependency type, such as SBJ or OBJ.
CoreAnnotations.CoNLLPredicateAnnotation	CoNLL SRL/dep parsing - whether the word is a predicate
CoreAnnotations.CoNLLSRLAnnotation	CoNLL SRL/dep parsing - map which, for the current word, specifies its specific role for each predicate
CoreAnnotations.CoNLLUFeats	CoNLL-U dep parsing - List of morphological features
CoreAnnotations.CoNLLUMisc	CoNLL-U dep parsing - Any other annotation
CoreAnnotations.CoNLLUSecondaryDepsAnnotation	CoNLL-U dep parsing - List of secondary dependencies
CoreAnnotations.CoNLLUTokenSpanAnnotation	CoNLL-U dep parsing - span of multiword tokens
CoreAnnotations.ContextsAnnotation
CoreAnnotations.CorefMentionToEntityMentionMappingAnnotation	mapping from coref mentions to corresponding ner derived entity mentions
CoreAnnotations.CostMagnificationAnnotation	Key for relative value of a word - used in RTE
CoreAnnotations.CovertIDAnnotation
CoreAnnotations.D2_LBeginAnnotation
CoreAnnotations.D2_LEndAnnotation
CoreAnnotations.D2_LMiddleAnnotation
CoreAnnotations.DayAnnotation
CoreAnnotations.DependentsAnnotation
CoreAnnotations.DictAnnotation
CoreAnnotations.DistSimAnnotation
CoreAnnotations.DoAnnotation	annotation stolen from the lex parser
CoreAnnotations.DocDateAnnotation
CoreAnnotations.DocIDAnnotation	This refers to the unique identifier for a "document", where document may vary based on your application.
CoreAnnotations.DocSourceTypeAnnotation	Document source type: what kind of place the document came from (newswire, discussion forum, web...).
CoreAnnotations.DocTitleAnnotation	Document title: what the title of the document is.
CoreAnnotations.DocTypeAnnotation	Document type: what kind of document it is (story, multi-part article, listing, email, etc.).
CoreAnnotations.DomainAnnotation	Used in CRFClassifier stuff. PositionAnnotation should possibly be an int; it's present as either an int or string depending on context. CharAnnotation may be "CharacterAnnotation" - not sure.
CoreAnnotations.EmptyIndexAnnotation	Some datasets - for example, the UD Estonian EWT dataset - use "empty" nodes to represent words that were unspoken / unwritten but can be inferred from the structure of the sentence.
CoreAnnotations.EndIndexAnnotation	This indexes the end of a span of words, e.g., a constituent in a tree.
CoreAnnotations.EntityClassAnnotation
CoreAnnotations.EntityMentionIndexAnnotation	index into the list of entity mentions in a document
CoreAnnotations.EntityMentionToCorefMentionMappingAnnotation	Mapping from NER-derived entity mentions to coref mentions.
CoreAnnotations.EntityRuleAnnotation
CoreAnnotations.EntityTypeAnnotation
CoreAnnotations.ExceptionAnnotation	Stores an exception associated with processing this document
CoreAnnotations.FeaturesAnnotation	The standard key for the features which is a Collection
CoreAnnotations.FemaleGazAnnotation
CoreAnnotations.FineGrainedNamedEntityTagAnnotation	The CoreMap key for getting the fine grained named entity tag (i.e.
CoreAnnotations.FirstChildAnnotation	used in binarized trees to specify the first child in the rule for which this node is the parent
CoreAnnotations.ForcedSentenceEndAnnotation	This indicates the sentence should end at this token.
CoreAnnotations.ForcedSentenceUntilEndAnnotation	This indicates that starting at this token, the sentence should not be ended until we see a ForcedSentenceEndAnnotation.
CoreAnnotations.FreqAnnotation
CoreAnnotations.GazAnnotation	Possibly this should be grouped with gazetteer annotation - original key was "gaz".
CoreAnnotations.GazetteerAnnotation	The standard key for the gazetteer information
CoreAnnotations.GenderAnnotation	The CoreMap key identifying an entity mention's potential gender.
CoreAnnotations.GenericTokensAnnotation	The CoreMap key for getting the tokens (can be words, phrases or anything that are of type CoreMap) contained by an annotation.
CoreAnnotations.GeniaAnnotation
CoreAnnotations.GoldAnswerAnnotation	The standard key for gold answer which is a String
CoreAnnotations.GovernorAnnotation
CoreAnnotations.GrandparentAnnotation	specifies the base state of the parent of this node in the parse tree
CoreAnnotations.HaveAnnotation	annotation stolen from the lex parser
CoreAnnotations.HeadWordStringAnnotation	The key for storing a Head word as a string rather than a pointer (as in TreeCoreAnnotations.HeadWordAnnotation)
CoreAnnotations.HeightAnnotation	Used in srl.unsup
CoreAnnotations.IDAnnotation
CoreAnnotations.IDFAnnotation	Inverse document frequency of the word this label represents
CoreAnnotations.INAnnotation
CoreAnnotations.IndexAnnotation	This indexes a token number inside a sentence.
CoreAnnotations.InterpretationAnnotation	The standard key for the semantic interpretation
CoreAnnotations.IsDateRangeAnnotation	it really seems like this should have a different name or else be a boolean
CoreAnnotations.IsFirstWordOfMWTAnnotation	The CoreLabel key identifying whether a token is the first word derived from a multi-word-token.
CoreAnnotations.IsMultiWordTokenAnnotation	The CoreLabel key identifying whether a token is a multi-word-token. This is attached to `CoreLabel`s.
CoreAnnotations.IsNewlineAnnotation	The CoreLabel key identifying whether a token is a newline or not. This is attached to `CoreLabel`s.
CoreAnnotations.IsURLAnnotation	it really seems like this should have a different name or else be a boolean
CoreAnnotations.KBPTriplesAnnotation	An annotation for a sentence tagged with its KBP relation.
CoreAnnotations.LabelAnnotation	Used in wsd.supwsd package
CoreAnnotations.LabelIDAnnotation
CoreAnnotations.LabelWeightAnnotation
CoreAnnotations.LastGazAnnotation
CoreAnnotations.LastTaggedAnnotation
CoreAnnotations.LBeginAnnotation	Used in Gale2007ChineseSegmenter
CoreAnnotations.LeftChildrenNodeAnnotation	used in incremental DAG parser
CoreAnnotations.LeftTermAnnotation	The Standard key for storing the left terminal number relative to the root of the tree of the leftmost terminal dominated by the current node
CoreAnnotations.LemmaAnnotation	The CoreMap key for getting the lemma (morphological stem, lexeme form) of a token.
CoreAnnotations.LEndAnnotation
CoreAnnotations.LengthAnnotation
CoreAnnotations.LineNumberAnnotation	Line number for a sentence in a document delimited by newlines instead of punctuation.
CoreAnnotations.LinkAnnotation
CoreAnnotations.LMiddleAnnotation
CoreAnnotations.LocationAnnotation	Reference location for the document
CoreAnnotations.MaleGazAnnotation
CoreAnnotations.MarkingAnnotation	Another key used for propbank - to signify core arg nodes or predicate nodes
CoreAnnotations.MentionsAnnotation
CoreAnnotations.MentionTokenAnnotation	used in dcoref.
CoreAnnotations.MonthAnnotation	Used in nlp.coref
CoreAnnotations.MorphoCaseAnnotation
CoreAnnotations.MorphoGenAnnotation
CoreAnnotations.MorphoNumAnnotation
CoreAnnotations.MorphoPersAnnotation
CoreAnnotations.MWTTokenMiscAnnotation	CoNLL-U misc features specifically on the MWT part of a token rather than the word
CoreAnnotations.MWTTokenTextAnnotation	Text of the token that was used to create this word during a multi word token split.
CoreAnnotations.NamedEntityTagAnnotation	The CoreMap key for getting the token-level named entity tag (e.g., DATE, PERSON, etc.) This key is typically set on token annotations.
CoreAnnotations.NamedEntityTagProbsAnnotation	Label and probability pair representing the coarse grained label and probability
CoreAnnotations.NeighborsAnnotation
CoreAnnotations.NERIDAnnotation	This is an NER ID annotation (in case the all caps parsing didn't work out for you...)
CoreAnnotations.NormalizedNamedEntityTagAnnotation	The key for the normalized value of numeric named entities.
CoreAnnotations.NotAnnotation	annotation stolen from the lex parser
CoreAnnotations.NumericCompositeObjectAnnotation	Annotation indicating the numeric object associated with an annotation.
CoreAnnotations.NumericCompositeTypeAnnotation	Annotation indicating whether the numeric phrase the token is part of represents a NUMBER or ORDINAL (twenty first => ORDINAL ORDINAL).
CoreAnnotations.NumericCompositeValueAnnotation	Annotation indicating the numeric value of the phrase the token is part of (twenty first => 21 21).
CoreAnnotations.NumericObjectAnnotation
CoreAnnotations.NumericTypeAnnotation
CoreAnnotations.NumericValueAnnotation
CoreAnnotations.NumerizedTokensAnnotation
CoreAnnotations.NumTxtSentencesAnnotation	Used by RTE to track number of text sentences, to determine when hyp sentences begin.
CoreAnnotations.OriginalCharAnnotation	Seems like this could be consolidated with something else...
CoreAnnotations.OriginalTextAnnotation	The exact original surface form of a token.
CoreAnnotations.ParagraphAnnotation	used in dcoref.
CoreAnnotations.ParagraphIndexAnnotation	used in ParagraphAnnotator.
CoreAnnotations.ParagraphsAnnotation	The CoreMap key for getting the paragraphs contained by an annotation.
CoreAnnotations.ParaPositionAnnotation
CoreAnnotations.ParentAnnotation	The standard key for the parent which is a String
CoreAnnotations.PartOfSpeechAnnotation	The CoreMap key for getting the Penn part of speech of a token.
CoreAnnotations.PercentAnnotation	annotation stolen from the lex parser
CoreAnnotations.PhraseWordsAnnotation
CoreAnnotations.PhraseWordsTagAnnotation
CoreAnnotations.PolarityAnnotation
CoreAnnotations.PositionAnnotation
CoreAnnotations.PossibleAnswersAnnotation
CoreAnnotations.PredictedAnswerAnnotation
CoreAnnotations.PresetAnswerAnnotation	The standard key for the answer which is a String
CoreAnnotations.PrevChildAnnotation	used in binarized trees to say the name of the most recent child
CoreAnnotations.PriorAnnotation	Used in propbank.srl
CoreAnnotations.ProtoAnnotation
CoreAnnotations.QuotationIndexAnnotation	Unique identifier within a document for a given quotation.
CoreAnnotations.QuotationsAnnotation	The CoreMap key for getting the quotations contained by an annotation.
CoreAnnotations.QuotedAnnotation	Indicate whether a sentence is quoted
CoreAnnotations.QuotesAnnotation	Store a list of CoreMaps representing quotes
CoreAnnotations.RoleAnnotation	The standard key for the semantic role label of a phrase.
CoreAnnotations.SectionAnnotation	Section of a document
CoreAnnotations.SectionAuthorCharacterOffsetBeginAnnotation	Store the beginning of the author mention for this section
CoreAnnotations.SectionAuthorCharacterOffsetEndAnnotation	Store the end of the author mention for this section
CoreAnnotations.SectionDateAnnotation	Date for a section of a document
CoreAnnotations.SectionEndAnnotation	Indicates that the token ends a section and the label of the section
CoreAnnotations.SectionIDAnnotation	Id for a section of a document
CoreAnnotations.SectionIndexAnnotation	Store an index into a list of sections
CoreAnnotations.SectionsAnnotation	Store a list of sections in the document
CoreAnnotations.SectionStartAnnotation	Indicates that the token starts a new section and the attributes that should go into that section
CoreAnnotations.SectionTagAnnotation	Store the xml tag for the section as a CoreLabel
CoreAnnotations.SemanticHeadTagAnnotation	The standard key for Semantic Head Word POS which is a String
CoreAnnotations.SemanticHeadWordAnnotation	The standard key for Semantic Head Word which is a String
CoreAnnotations.SemanticTagAnnotation
CoreAnnotations.SemanticWordAnnotation
CoreAnnotations.SentenceBeginAnnotation	The index of the sentence that this annotation begins in.
CoreAnnotations.SentenceEndAnnotation	The index of the sentence that this annotation ends in.
CoreAnnotations.SentenceIDAnnotation
CoreAnnotations.SentenceIndexAnnotation	Unique identifier within a document for a given sentence.
CoreAnnotations.SentencePositionAnnotation
CoreAnnotations.SentencesAnnotation	The CoreMap key for getting the sentences contained in an annotation.
CoreAnnotations.ShapeAnnotation	The standard key for the "shape" of a word: a String representing the type of characters in a word, such as "Xx" for a capitalized word.
CoreAnnotations.SpaceBeforeAnnotation	Used in Chinese segmenters for whether there was space before a character.
CoreAnnotations.SpanAnnotation	The standard key for span which is an IntPair
CoreAnnotations.SpeakerAnnotation	used in dcoref.
CoreAnnotations.SpeakerTypeAnnotation	used to store speaker type information for coref
CoreAnnotations.SRLIDAnnotation	The key for semantic role labels (Note: please add to this description if you use this key)
CoreAnnotations.SRLInstancesAnnotation
CoreAnnotations.StackedNamedEntityTagAnnotation	The CoreMap key for getting the token-level named entity tag (e.g., DATE, PERSON, etc.) from a previous NER tagger.
CoreAnnotations.StateAnnotation	The base version of the parser state, like NP or VBZ or ...
CoreAnnotations.StatementTextAnnotation	The CoreMap key identifying the annotation's text, as formatted by the `QuestionToStatementTranslator`.
CoreAnnotations.StemAnnotation	Stem of the word this label represents.
CoreAnnotations.SubcategorizationAnnotation
CoreAnnotations.TagLabelAnnotation	Used in Trees
CoreAnnotations.TextAnnotation	The CoreMap key identifying the annotation's text.
CoreAnnotations.TokenBeginAnnotation	The CoreMap key identifying the first token included in an annotation.
CoreAnnotations.TokenEndAnnotation	The CoreMap key identifying the last token after the end of an annotation.
CoreAnnotations.TokensAnnotation	The CoreMap key for getting the tokens contained by an annotation.
CoreAnnotations.TopicAnnotation	Used for Topic Assignments from LDA or its equivalent models.
CoreAnnotations.TrueCaseAnnotation	The CoreMap key for getting the token-level true case annotation (e.g., INIT_UPPER). This key is typically set on token annotations.
CoreAnnotations.TrueCaseTextAnnotation	The CoreMap key identifying the annotation's true-cased text.
CoreAnnotations.TrueTagAnnotation
CoreAnnotations.UBlockAnnotation
CoreAnnotations.UnaryAnnotation	whether the node is the parent in a unary rule
CoreAnnotations.UnclosedQuotationsAnnotation	The CoreMap key for getting the quotations contained by an annotation.
CoreAnnotations.UnknownAnnotation	Note: this is not a catchall "unknown" annotation but seems to have a specific meaning for sequence classifiers
CoreAnnotations.UseMarkedDiscourseAnnotation	used in dcoref.
CoreAnnotations.UtteranceAnnotation	used in dcoref.
CoreAnnotations.UTypeAnnotation
CoreAnnotations.ValueAnnotation	Contains the "value" - an ill-defined string used widely in MapLabel.
CoreAnnotations.VerbSenseAnnotation	Propbank key for the Verb sense given in the Propbank Annotation, should only be in the verbnode
CoreAnnotations.WebAnnotation
CoreAnnotations.WikipediaEntityAnnotation	An annotation for the Wikipedia page (i.e., canonical name) associated with this token.
CoreAnnotations.WordFormAnnotation
CoreAnnotations.WordnetSynAnnotation
CoreAnnotations.WordPositionAnnotation
CoreAnnotations.WordSenseAnnotation
CoreAnnotations.XmlContextAnnotation	Used in CleanXMLAnnotator.
CoreAnnotations.XmlElementAnnotation	Used in SimpleXMLAnnotator.
CoreAnnotations.YearAnnotation
CoreLabel	A CoreLabel represents a single word with ancillary information attached using CoreAnnotations.
CoreUtilities
DocumentReader<L>	Basic mechanism for reading in Documents from various input sources.
IndexedWord	This class provides a `CoreLabel` that uses its DocIDAnnotation, SentenceIndexAnnotation, and IndexAnnotation to implement Comparable/compareTo, hashCode, and equals.
LabeledWord	A `LabeledWord` object contains a word and its tag.
MultiTokenTag	Represents a tag for a multi token expression. Can be used to annotate individual tokens without having nested annotations.
MultiTokenTag.Tag
RVFDatum<L,F>	A basic implementation of the Datum interface that can be constructed with a Collection of features and one or more labels.
SegmenterCoreAnnotations
SegmenterCoreAnnotations.CharactersAnnotation
SegmenterCoreAnnotations.XMLCharAnnotation
SentenceUtils	SentenceUtils holds a couple utility methods for lists that are sentences.
StringLabel	A `StringLabel` object acts as a Label by containing a single String, which it sets or returns in response to requests.
StringLabelFactory	A `StringLabelFactory` object makes a simple `StringLabel` out of a `String`.
Tag	A `Tag` object acts as a Label by containing a `String` that is a part-of-speech tag.
TaggedWord	A `TaggedWord` object contains a word and its tag.
TaggedWordFactory	A `TaggedWordFactory` acts as a factory for creating objects of class `TaggedWord`.
ValueLabel	A `ValueLabel` object acts as a Label with linguistic attributes.
Word	A `Word` object acts as a Label by containing a String.
WordFactory	A `WordFactory` acts as a factory for creating objects of class `Word`.
WordLemmaTag	A WordLemmaTag corresponds to a pair of a tagged (e.g., for part of speech) word and its lemma.
WordLemmaTagFactory	A `WordLemmaTagFactory` acts as a factory for creating objects of class `WordLemmaTag`.
WordTag	A WordTag corresponds to a tagged (e.g., for part of speech) word and is implemented with String-valued word and tag.
WordTagFactory	A `WordTagFactory` acts as a factory for creating objects of class `WordTag`.

Enum Summary
Enum	Description
CoreAnnotations.SRL_ID
CoreLabel.OutputFormat

Package edu.stanford.nlp.ling Description

This package contains the different data structures used by JavaNLP throughout the years for dealing with linguistic objects in general, of which words are the most generally used. Most data structures in this package are deprecated. The current recommendation is to represent an annotated word as a CoreMap (e.g., an ArrayCoreMap) from the util package.

CoreMap is a basic type-safe data structure that maps keys to corresponding values, where each value's type must be consistent with the key's definition. The CoreAnnotations class in this package contains many common annotations used by different portions of JavaNLP, but you can define new keys locally to a package if they aren't of general applicability. See the CoreMap unit tests for an example usage of CoreMap and of defining a key.
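The class-as-key idea described above can be illustrated with a minimal, self-contained sketch. This is not the JavaNLP implementation: `Key`, `WordKey`, `IndexKey`, and `TinyTypedMap` are hypothetical names invented for this example, and the backing store is a plain `HashMap` rather than the array-based storage that `ArrayCoreMap` suggests. The sketch only shows how a key class can declare its value type so that the compiler checks every `set` and `get`.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedMapSketch {

    /** Hypothetical marker interface: a key class declares its value type V. */
    interface Key<V> {}

    /** Two illustrative, locally defined keys (names invented for this sketch). */
    static final class WordKey implements Key<String> {}
    static final class IndexKey implements Key<Integer> {}

    /** A tiny class-keyed map; the generic signatures tie each key to its value type. */
    static final class TinyTypedMap {
        private final Map<Class<?>, Object> values = new HashMap<>();

        /** Stores a value under the key class and returns the previous value, if any. */
        <V> V set(Class<? extends Key<V>> key, V value) {
            @SuppressWarnings("unchecked")
            V previous = (V) values.put(key, value);
            return previous;
        }

        /** Returns the value stored under the key class, or null if absent. */
        @SuppressWarnings("unchecked")
        <V> V get(Class<? extends Key<V>> key) {
            return (V) values.get(key);
        }
    }

    public static void main(String[] args) {
        TinyTypedMap token = new TinyTypedMap();
        token.set(WordKey.class, "dogs");
        token.set(IndexKey.class, 3);

        // No casts at the call site: the key class determines the static type.
        String word = token.get(WordKey.class);
        int index = token.get(IndexKey.class);
        System.out.println(word + " @ " + index);

        // token.set(IndexKey.class, "three");  // rejected by the compiler
    }
}
```

Defining a new key in this style is just declaring another small class, which is why keys of limited applicability can live locally in the package that uses them.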

The oldest code in JavaNLP uses various types of ValueLabel, and might expect data types from the Has* family (like HasWord, HasTag, et al., denoting presence or absence of that particular annotation). Second generation code made use of the MapLabel family (including AbstractMapLabel, FeatureLabel, and IndexedFeatureLabel), but this code has all been converted across to use CoreLabel. More modern code will use CoreMap as its basic data structure. CoreLabel is a CoreMap that unifies all the families of interfaces into a single view of an underlying (Array)CoreMap.

It is recommended that new code use the ArrayCoreMap class from the util package as the base representation of a word when possible. Any CoreMap can be presented as one of the older data structures (MapLabel, HasWord, etc.) by simply wrapping it in a CoreLabel "view" with CoreLabel.forCoreMap(map).
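The notion of a "view" can be sketched without the library itself. The example below is an illustration under stated assumptions, not the CoreLabel code: `HasWordLike`, `HasTagLike`, and `LabelView` are hypothetical stand-ins, and a `Map<String, Object>` with string keys replaces the class-keyed CoreMap for brevity. What it shows is that the view keeps no state of its own, so legacy-style accessors and the underlying map always agree.

```java
import java.util.HashMap;
import java.util.Map;

public class LabelViewSketch {

    /** Simplified stand-ins for the Has* family (hypothetical, for illustration). */
    interface HasWordLike { String word(); void setWord(String word); }
    interface HasTagLike { String tag(); void setTag(String tag); }

    /** A view: every accessor reads from or writes through to the wrapped map. */
    static final class LabelView implements HasWordLike, HasTagLike {
        private final Map<String, Object> backing;

        LabelView(Map<String, Object> backing) { this.backing = backing; }

        @Override public String word() { return (String) backing.get("word"); }
        @Override public void setWord(String word) { backing.put("word", word); }
        @Override public String tag() { return (String) backing.get("tag"); }
        @Override public void setTag(String tag) { backing.put("tag", tag); }
    }

    /** Stands in for older code that only knows about the word-bearing interface. */
    static String describe(HasWordLike item) {
        return "word=" + item.word();
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        map.put("word", "dogs");

        LabelView view = new LabelView(map);
        System.out.println(describe(view));   // legacy-style consumer sees the word

        view.setTag("NNS");                   // a write through the view...
        System.out.println(map.get("tag"));   // ...is visible in the underlying map
    }
}
```

Because the wrapper only delegates, creating one is cheap, and several views over the same map stay consistent with each other.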

Legacy description: Classes for linguistic concepts which are common to many NLP classes, such as Word, Tag, etc. Also contains classes for building and operating on documents and data collections. Two of the basic interfaces are Document for representing a document as a list of words with meta-data, and DataCollection for representing a collection of documents. The most common document class you will probably use is BasicDocument, which provides support for constructing documents from a variety of input sources.

Author: Sepandar Kamvar (sdkamvar@stanford.edu), Joseph Smarr (jsmarr@stanford.edu), dramage, rafferty