The Stanford NLP Group makes several pieces of NLP software available to the public. These are statistical NLP (Natural Language Processing) toolkits for various major computational linguistics problems. All of these software distributions are licensed under the GNU General Public License (GPL). (Note that this is the full GPL, which allows the software to be used for research purposes or in other free software projects, but does not allow it to be incorporated into any type of distributed proprietary software, even in part or in translation. Please contact us if you are interested in obtaining NLP software under a different, commercial license.)

All the software we distribute is written in Java. Recent distributions require Sun JDK 1.5+ (some of the older ones run on JDK 1.4). Distribution packages include components for command-line invocation, jar files, a Java API, and source code.

Supported software distributions

This code is being developed, and we try to answer questions and fix bugs on a best-effort basis.

The Stanford Parser
Java implementations of probabilistic natural language parsers: highly optimized PCFG and dependency parsers, and a lexicalized PCFG parser. See also the Parser FAQ and the online parser demo.
The Stanford POS Tagger
A Java implementation of a maximum-entropy (CMM) part-of-speech (POS) tagger.
The Stanford Named Entity Recognizer
A Java implementation of a Conditional Random Field sequence model, together with well-engineered features for Named Entity Recognition.
Stanford Chinese Word Segmenter
A Java implementation of a CRF-based Chinese word segmenter.
The Stanford Classifier
A Java implementation of a conditional loglinear classifier (a.k.a. a maximum entropy or multiclass logistic regression model).
Tregex and Tsurgeon
A Java implementation of a Tgrep2-style utility for matching patterns in trees, and a tree-transformation utility built on top of this matching language.
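
For readers unfamiliar with the term used in the Stanford Classifier entry above, a conditional loglinear (maximum entropy) model scores each class by a weighted sum of features and normalizes the exponentiated scores into probabilities. The following is only a minimal sketch of that idea in plain Java. It does not use the Stanford Classifier's API, and the class name `LoglinearSketch`, the method `classProbabilities`, and the weights and feature values are all made up for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only: not part of any Stanford distribution.
public class LoglinearSketch {

    /**
     * Computes P(class | features) for every class, where
     * P(c | x) is proportional to exp(weights[c] . x).
     */
    public static double[] classProbabilities(double[][] weights, double[] features) {
        double[] scores = new double[weights.length];
        double max = Double.NEGATIVE_INFINITY;

        // Linear score for each class: dot product of its weight row and the features.
        for (int c = 0; c < weights.length; c++) {
            double s = 0.0;
            for (int j = 0; j < features.length; j++) {
                s += weights[c][j] * features[j];
            }
            scores[c] = s;
            max = Math.max(max, s);
        }

        // Exponentiate (shifted by the max score for numerical stability) and sum.
        double z = 0.0;
        for (int c = 0; c < scores.length; c++) {
            scores[c] = Math.exp(scores[c] - max);
            z += scores[c];
        }

        // Normalize so the probabilities sum to 1.
        for (int c = 0; c < scores.length; c++) {
            scores[c] /= z;
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up weights for three classes over two features.
        double[][] weights = {{1.0, -0.5}, {0.2, 0.3}, {-1.0, 0.8}};
        double[] features = {1.0, 2.0};
        System.out.println(Arrays.toString(classProbabilities(weights, features)));
    }
}
```

In a trained classifier the weights would be estimated from labeled data rather than written by hand; the sketch covers only the prediction step.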

End-of-life distributions

This is software that we distributed at one point but no longer maintain, either because we are unable to or because we feel it is no longer useful to do so. It is still available here in case it is useful, but we won't answer questions about it.

FrameNet Reader software
Support files for reading FrameNet XML files (as they existed in 2002-03, FrameNet version 0.75/1.0) into Java data structures.
Simple manual annotation tool
A simple tool for annotating spans of text with classes, suitable for supervised training of named entity recognition and information extraction models. Works on plain text and HTML documents. Download: stanford-manual-annotation-tool-2004-05-16.tar.gz.