Stanford Deterministic Coreference Resolution System


News

May 7, 2013: Recent improvements to the Stanford Deterministic Coreference Resolution System (Recasens et al. 2013, cited below) won the best short paper award at NAACL 2013.

June 30, 2011: This system was the top-ranked system in the CoNLL-2011 shared task.

About

This system implements the multi-pass sieve coreference resolution (or anaphora resolution) system described in Lee et al. (CoNLL Shared Task 2011) and Raghunathan et al. (EMNLP 2010).

The scores are higher than those in the EMNLP 2010 paper because of additional sieves and better rules (see Lee et al. 2011 for details). Mention detection is included in the package (see Usage for instructions).

The Computational Linguistics paper (Lee et al. 2013) includes more details and additional experimental results.

The papers to cite for this system are as follows:

Marta Recasens, Marie-Catherine de Marneffe, and Christopher Potts.
The Life and Death of Discourse Entities: Identifying Singleton Mentions.
In Proceedings of NAACL 2013.

Heeyoung Lee, Angel Chang, Yves Peirsman, Nathanael Chambers, Mihai Surdeanu, and Dan Jurafsky.
Deterministic coreference resolution based on entity-centric, precision-ranked rules.
Computational Linguistics 39(4), 2013.

Heeyoung Lee, Yves Peirsman, Angel Chang, Nathanael Chambers, Mihai Surdeanu, and Dan Jurafsky.
Stanford's Multi-Pass Sieve Coreference Resolution System at the CoNLL-2011 Shared Task.
In Proceedings of the CoNLL-2011 Shared Task, 2011.

Karthik Raghunathan, Heeyoung Lee, Sudarshan Rangarajan, Nathanael Chambers, Mihai Surdeanu, Dan Jurafsky, and Christopher Manning.
A Multi-Pass Sieve for Coreference Resolution.
In Proceedings of EMNLP 2010, Boston, USA.

Current Evaluation Results


Scores on the CoNLL-2011 Shared Task data set:
-----------------------------------------------------------------------------------------------------------------------------------------
                            MUC               B cubed              CEAF (M)            CEAF (E)            BLANC        | 
                       P     R     F1      P     R     F1      P     R     F1      P     R     F1      P     R     F1   | Avg F1
-----------------------------------------------------------------------------------------------------------------------------------------
conllst2011 dev   |   62.4  59.3  60.8  | 74.2  67.6  70.8  | 59.3  59.3  59.3  | 45.5  48.6  47.0  | 79.1  72.5  75.3  |  59.5  
-----------------------------------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.
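* For the dev row above, that works out to (60.8 + 70.8 + 47.0) / 3 = 59.5.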

Download

The coreference resolution system is integrated into the Stanford suite of NLP tools, StanfordCoreNLP. Please download the entire suite from this page.


Usage

Running coreference resolution on raw text

This software is now fully incorporated into StanfordCoreNLP, so all you have to do is add the dcoref annotator to the "annotators" property. For example, add "dcoref" to the end of the list of text annotators:

annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref
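
Once the annotator is enabled, the pipeline can also be driven programmatically. Here is a minimal sketch (the class name DcorefDemo and the input sentence are illustrative) that runs the pipeline on raw text and prints the resulting coreference chains:

import edu.stanford.nlp.dcoref.CorefChain;
import edu.stanford.nlp.dcoref.CorefCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

import java.util.Map;
import java.util.Properties;

public class DcorefDemo {  // illustrative class name
  public static void main(String[] args) {
    Properties props = new Properties();
    // dcoref requires every annotator that precedes it in this list
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Illustrative input text
    Annotation document = new Annotation("Stanford released the system. It won the shared task.");
    pipeline.annotate(document);

    // Each CorefChain groups the mentions that refer to the same entity
    Map<Integer, CorefChain> chains =
        document.get(CorefCoreAnnotations.CorefChainAnnotation.class);
    for (CorefChain chain : chains.values()) {
      System.out.println(chain);
    }
  }
}
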
The properties you can set for the dcoref system itself are the following:
dcoref.demonym                   // The path to a file containing a list of demonyms
dcoref.animate                   // The lists of animate/inanimate mentions (Ji and Lin, 2009)
dcoref.inanimate
dcoref.male                      // The lists of male/neutral/female mentions (Bergsma and Lin, 2006)
dcoref.neutral                   // Neutral means a mention that is usually referred to by 'it'
dcoref.female
dcoref.plural                    // The lists of plural/singular mentions (Bergsma and Lin, 2006)
dcoref.singular

// The above 8 options do not have to be set; the default models in the StanfordCoreNLP package are used if unspecified.

dcoref.score = false             // Score the output of the system
dcoref.postprocessing = false    // Do post-processing
dcoref.maxdist = -1              // Maximum sentence distance between two mentions for resolution (-1: no constraint on the distance)
dcoref.use.big.gender.number = false // Load a large list of gender and number information
dcoref.replicate.conll = false   // Turn this on to replicate the CoNLL shared task results

// If the above 5 options are omitted, the default values (shown above) are used.

sievePasses                      // Sieve passes - each class is defined in dcoref/sievepasses/
                                 // If omitted, the default sieves will be used (recommended).
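
For example, a properties file that keeps all the default models but overrides a couple of these settings might look like this (the dcoref.maxdist value of 10 is purely illustrative):

annotators = tokenize, ssplit, pos, lemma, ner, parse, dcoref
# illustrative override: only resolve mentions at most 10 sentences apart
dcoref.maxdist = 10
dcoref.postprocessing = true
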
See StanfordCoreNLP for more details.


How to replicate the results in our CoNLL Shared Task 2011 paper

To replicate the results in the paper, run:

java -cp <jars_in_corenlp> -Xmx8g edu.stanford.nlp.dcoref.SieveCoreferenceSystem -props <properties file>
A sample properties file (coref.properties) is included in the dcoref package. The properties file includes the following:
# annotators needed for coreference resolution
annotators = pos, lemma, ner, parse    

# Scoring the output of the system.
# The scores in the log file differ from the output of the CoNLL scorer because they are computed before post-processing.
dcoref.score = true

                                       
# Do post processing
dcoref.postprocessing = true           
# Maximum sentence distance between two mentions for resolution (-1: no constraint on the distance)
dcoref.maxdist = -1                    
# Load a big list of gender and number information
dcoref.use.big.gender.number = true    
dcoref.big.gender.number = edu/stanford/nlp/models/dcoref/gender.data.gz
# Turn this on to replicate the CoNLL shared task results
dcoref.replicate.conll = true          
# Path for the official CoNLL 2011 scorer script. If omitted, no scoring is done.
dcoref.conll.scorer = /PATH/FOR/SCORER
dcoref.conll.scorer = /PATH/FOR/SCORER  

# Path for log file for coref system evaluation 
dcoref.logFile = /PATH/FOR/LOGS

# For scoring on other corpora, one of the following options can be set:
# dcoref.conll2011: path for the directory containing conllst files
# dcoref.ace2004: path for the directory containing ACE2004 files
# dcoref.mucfile: path for the MUC file
dcoref.conll2011 = /PATH/FOR/CORPUS
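
For concreteness, if this properties file is saved as coref.properties and the CoreNLP jars sit in the current directory, the invocation might look like the following (the jar names are illustrative; use the jars shipped with your CoreNLP release, and ';' instead of ':' as the classpath separator on Windows):

java -cp stanford-corenlp.jar:stanford-corenlp-models.jar -Xmx8g edu.stanford.nlp.dcoref.SieveCoreferenceSystem -props coref.properties
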
This system can process the ACE2004, MUC6, and CoNLL Shared Task 2011 corpora in their original formats. Examples from the corpora are given below:

CoNLLst 2011:

nw/wsj/00/wsj_0020          0          0        The         DT (TOP_(S_(NP_*          -          -          -          -          *          *     (ARG0*          *          *          *        (11
nw/wsj/00/wsj_0020          0          1       U.S.        NNP         *)          -          -          -          -      (GPE)          *         *)          *          *          *        11)     
nw/wsj/00/wsj_0020          0          2          ,          ,          *          -          -          -          -          *          *          *          *          *          *          -
nw/wsj/00/wsj_0020          0          3   claiming        VBG   (S_(VP_*      claim         01          2          -          *       (V*) (ARGM-ADV*          *          *          *          -

MUC6:

...
<s> By/IN proposing/VBG <COREF ID="13" TYPE="IDENT" REF="6" MIN="date"> a/DT meeting/NN date/NN</COREF> ,/, <COREF ID="14" TYPE="IDENT" REF="0">
<ORGANIZATION> Eastern/NNP</ORGANIZATION></COREF> moved/VBD one/CD step/NN closer/JJR toward/IN reopening/VBG current/JJ high-cost/JJ contract/NN agreements/NNS with/IN <COREF ID="15" TYPE="IDENT" REF="8" MIN="unions"><COREF ID="16" TYPE="IDENT" REF="14"> its/PRP$</COREF> unions/NNS</COREF> ./. </s>
...
ACE2004:

...
<document DOCID="20001115_AFP_ARB.0212.eng">
<entity ID="20001115_AFP_ARB.0212.eng-E1" TYPE="ORG" SUBTYPE="Educational" CLASS="SPC">
  <entity_mention ID="1-47" TYPE="NAM" LDCTYPE="NAM">
    <extent>
      <charseq START="475" END="506">the Globalization Studies Center</charseq>
    </extent>
    <head>
      <charseq START="479" END="506">Globalization Studies Center</charseq>
    </head>
  </entity_mention>
...

If you have trouble getting this to work, the mailing lists below are the place to ask.

Questions

Questions, feedback, and bug reports/fixes can be sent to our mailing lists.

Mailing Lists

We have three mailing lists for the Stanford Coreference Resolution System, all of which are shared with other JavaNLP tools (with the exception of the parser). Each address is at @lists.stanford.edu:

  1. java-nlp-user This is the best list for asking questions, making announcements, or discussion among JavaNLP users. You have to subscribe to be able to use it. Join the list via this webpage or by emailing java-nlp-user-join@lists.stanford.edu. (Leave the subject and message body empty.) You can also look at the list archives.
  2. java-nlp-announce This list will be used only to announce new versions of Stanford JavaNLP tools. So it will be very low volume (expect 1-3 messages a year). Join the list via this webpage or by emailing java-nlp-announce-join@lists.stanford.edu. (Leave the subject and message body empty.)
  3. java-nlp-support This list goes only to the software maintainers. It's a good address for licensing questions, etc. For general use and support questions, you're better off joining and using java-nlp-user. You cannot join java-nlp-support, but you can mail questions to java-nlp-support@lists.stanford.edu.

Release History

July 9, 2013

Singleton mention detection (Recasens et al. 2013) is integrated. The scores may differ from those reported in the paper due to changes in the parser or NER models.

-----------------------------------------------------------------------------------------------------------------------------------------
                            MUC               B cubed              CEAF (M)            CEAF (E)            BLANC        | 
                       P     R     F1      P     R     F1      P     R     F1      P     R     F1      P     R     F1   | Avg F1
-----------------------------------------------------------------------------------------------------------------------------------------
conllst2011 dev   |   62.4  59.3  60.8  | 74.2  67.6  70.8  | 59.3  59.3  59.3  | 45.5  48.6  47.0  | 79.1  72.5  75.3  |  59.5  
-----------------------------------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.

June 6, 2011

This release is the code used for the CoNLL Shared Task 2011. The scores may differ due to changes in the parser or NER models.

-----------------------------------------------------------------------------------------------------------------------------------------
                   conllst         MUC               B cubed              CEAF (M)            CEAF (E)            BLANC        | 
                    track     P     R     F1      P     R     F1      P     R     F1      P     R     F1      P     R     F1   | Avg F1
-----------------------------------------------------------------------------------------------------------------------------------------
conllst2011 dev   | close |  59.1  57.5  58.3  | 69.2  71.0  70.1  | 58.6  58.6  58.6  | 46.5  48.1  47.3  | 72.2  78.1  74.8  |  58.6  
conllst2011 dev   | open  |  60.1  59.5  59.8  | 69.5  71.9  70.7  | 59.0  59.0  59.0  | 46.5  47.1  46.8  | 73.8  78.6  76.0  |  59.1
conllst2011 test  | close |  57.5  61.8  59.6  | 68.2  68.4  68.3  | 56.4  56.4  56.4  | 47.8  43.4  45.5  | 76.2  70.6  73.0  |  57.8 
conllst2011 test  | open  |  59.3  62.8  61.0  | 69.0  68.9  68.9  | 56.7  56.7  56.7  | 46.8  43.3  45.0  | 76.6  71.9  74.0  |  58.3
-----------------------------------------------------------------------------------------------------------------------------------------
* Automatic mention detection used. Avg F1 = (MUC + B cubed + CEAFE)/3.

----------------------------------------------------------------------------
                      MUC               B cubed             Pairwise
                 P     R     F1      P     R     F1      P     R     F1
----------------------------------------------------------------------------
ACE2004 dev   | 86.0  75.5  80.4  | 89.3  76.5  82.4  | 81.7  55.2  65.9 
ACE2004 test  | 82.7  70.2  75.9  | 88.7  74.5  81.0  | 77.2  44.6  56.6 
ACE2004 nwire | 84.6  75.1  79.6  | 87.3  74.1  80.2  | 79.4  50.1  61.4
MUC6 test     | 90.6  69.1  78.4  | 90.6  63.1  74.4  | 89.7  57.0  69.7
----------------------------------------------------------------------------
* Gold mentions are used. 

August 26, 2010

This release is generally similar to the code used for EMNLP 2010, with one additional sieve: relaxed exact string match.

----------------------------------------------------------------------------
                      MUC               B cubed             Pairwise
                 P     R     F1      P     R     F1      P     R     F1
----------------------------------------------------------------------------
ACE2004 dev   | 84.1  73.9  78.7  | 88.3  74.2  80.7  | 80.0  51.0  62.3
ACE2004 test  | 80.5  72.4  76.2  | 85.4  75.9  80.4  | 68.7  47.9  56.4 
ACE2004 nwire | 83.8  72.8  77.9  | 87.5  72.1  79.0  | 79.3  47.6  59.5
MUC6 test     | 90.3  68.9  78.2  | 90.5  62.3  73.8  | 89.4  55.5  68.5
----------------------------------------------------------------------------