Stanford NLP Named Entity Recognition Results

Corpus		# Word Tokens		# Entities		# Features		Exact Match Score (conlleval)			Technology		Notes
Name	Language	Train	Test	Types	Instances	Φ(X)	λ/f(X,Y)	Prec	Rec	F₁	Classifier	Properties file/flag
CoNLL 2002	Dutch news testa (devset)	218737	37761	4	2616	838524	4192620	78.99%	77.33%	78.15%	pure CMM	-goodCoNLL	1, 3, 5, 7
CoNLL 2002	Dutch news testb	218737	68994	4	3941	838559	4192795	80.48%	78.96%	79.71%	pure CMM	-goodCoNLL	1, 3, 5, 7
CoNLL 2002	Spanish news testa (devset)	273037	52923	4	4352	776511	3882555	78.01%	76.19%	77.09%	pure CMM	-goodCoNLL	1, 3, 5, 7
CoNLL 2002	Spanish news testb	273037	51533	4	3559	776444	3882220	81.24%	81.03%	81.14%	pure CMM	-goodCoNLL	1, 3, 5, 7
CoNLL 2003	English news testa (devset)	219553	51578	4	5942	738378	3691890	91.37%	91.22%	91.29%	pure CMM	-goodCoNLL	1, 5, b
CoNLL 2003	English news testa (devset)	219554	51578	4	5942			92.15%	92.39%	92.27%	postprocessed CMM		1, 2, 4
CoNLL 2003	English news testb	219553	46666	4	5648	738378	3691890	85.65%	85.41%	85.53%	pure CMM	-goodCoNLL	1, 5, b
CoNLL 2003	English news testb	219554	46666	4	5648			86.12%	86.49%	86.31%	postprocessed CMM		1, 2, 4
CoNLL 2003	German news testa (devset)	220189	51645	4	4833	1079044	5395220	77.12%	61.37%	68.35%	pure CMM	-goodCoNLL	1, 3, 5, 6, 7, a
CoNLL 2003	German news testa (devset)	220189	51645	4	4833			75.36%	60.36%	67.03%	postprocessed CMM		1, 2, 3, 4
CoNLL 2003	German news testb	220189	52098	4	3673	1079037	5395185	79.23%	63.65%	70.59%	pure CMM	-goodCoNLL	1, 3, 5, 6, 7, a
CoNLL 2003	German news testb	220189	52098	4	3673			80.38%	65.04%	71.90%	postprocessed CMM		1, 2, 3, 4
CoNLL 2003	English news testa (devset)	219553	51578	4	5942	616918	11532202	91.64%	90.93%	91.28%	CRF (closed task)	conll.crf.chris2009.prop iob2	1, 5, c
CoNLL 2003	English news testa (devset)	219553	51578	4	5942	633786	12285708	93.28%	92.71%	92.99%	CRF (with distsim)	conll.crf.chris2009.prop iob2 distsim	1, 5, c
CoNLL 2003	English news testb	219553	46666	4	5648	633786	12285708	88.21%	87.68%	87.94%	CRF (with distsim)	conll.crf.chris2009.prop iob2 distsim	1, 5, c
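
The F₁ column is the harmonic mean of the Prec and Rec columns. As a quick sanity check on a row, the minimal Python sketch below recomputes F₁ from the rounded percentages. The function name f1_score is illustrative only and is not part of conlleval or any Stanford NER tooling. Because conlleval works from raw entity counts, a value recomputed from rounded percentages may differ in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Dutch news testa row: Prec 78.99%, Rec 77.33%, reported F1 78.15%
dutch_testa_f1 = round(f1_score(78.99, 77.33), 2)
print(dutch_testa_f1)
```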

Other results that should be on this page

BioCreative, JNLPBA, MUC, all3.

Notes

1. Test token counts exclude boundaries (blank lines), although the sequence model does include them.

2. Postprocessing Perl scripts improved the handling of names and datelines. See the Klein et al. 2003 CoNLL paper for the features used.

3. This model has not been separately optimized on a per-dataset or even per-language basis. It simply uses the feature set that was found to be effective for English.

4. Official score of system submitted for listed competition.

5. Score of current in-house version.

6. This model adds current, previous, and next word lemma features to the German model (word lemmas are present in the provided CoNLL data, but we did not use them at the time of the official competition run).

7. Feature counts differ slightly even with the same training data because the unknown word model does a tiny amount of transductive learning: unknown word features include whether a capitalized word has also been seen all lowercase, and the test set data is included in the dictionary for this purpose.

a. Date: 2005/09/14.

b. Date: 2006/08/28.

c. Date: 2009/07/15.
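
The transductive feature described in note 7 can be illustrated with a short sketch. This is a minimal Python illustration, assuming the dictionary is simply the set of word forms seen in the training and test text. The function names and the toy sentences are hypothetical and are not taken from the Stanford NER code.

```python
from typing import Iterable, Set


def build_dictionary(train_tokens: Iterable[str], test_tokens: Iterable[str]) -> Set[str]:
    """Collect every surface form seen in the training or test text.

    Including the unlabeled test tokens is the transductive step:
    only word forms are used, never test labels.
    """
    return set(train_tokens) | set(test_tokens)


def seen_lowercase_feature(word: str, dictionary: Set[str]) -> bool:
    """True if a capitalized word also occurs in all-lowercase form."""
    if not word[:1].isupper():
        return False
    lowered = word.lower()
    return lowered != word and lowered in dictionary


# Toy data (hypothetical): lowercase "bush" occurs only in the test text.
train = ["Bush", "visited", "the", "city"]
test = ["The", "bush", "burned", "in", "Paris"]
dictionary = build_dictionary(train, test)

print(seen_lowercase_feature("Bush", dictionary))   # True
print(seen_lowercase_feature("Paris", dictionary))  # False
```

Since the dictionary depends on which test file is loaded, the set of features that fire can change between testa and testb even when the training data is identical. This is consistent with the small differences in the feature counts reported for those rows.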