Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

Code for Deeply Moving: Deep Learning for Sentiment Analysis

The original code was written in Matlab. Due to the strong interest in this work, we decided to rewrite the entire algorithm in Java so that it is easier to use, scales better, and does not require a Matlab license.

The current model has been integrated into Stanford CoreNLP since version 3.3.0 and is available from the Stanford CoreNLP home page. The distribution includes the model and the source code, as well as the parser and sentence splitter needed to use the sentiment tool.

You can run this code with our trained model on a text file, or on text read from standard input, using the following commands:

java -cp "*" -mx5g edu.stanford.nlp.sentiment.SentimentPipeline -file foo.txt
java -cp "*" -mx5g edu.stanford.nlp.sentiment.SentimentPipeline -stdin

An evaluation tool is included with the distribution:

java edu.stanford.nlp.sentiment.Evaluate edu/stanford/nlp/models/sentiment/sentiment.ser.gz test.txt
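
As a rough illustration of the kind of numbers such an evaluation produces, the sketch below computes a five-class accuracy and a positive/negative accuracy from gold and predicted labels. It does not reproduce the Evaluate class: the class name AccuracySketch, its method names, and the convention that a neutral prediction on a non-neutral item counts as an error are all assumptions made for this example. Labels are assumed to run from 0 (very negative) to 4 (very positive), with 2 as neutral.

```java
// Hypothetical helper, not part of CoreNLP.
class AccuracySketch {

    /** Five-class accuracy: fraction of items whose predicted label equals the gold label. */
    static double fineGrained(int[] gold, int[] predicted) {
        checkLengths(gold, predicted);
        if (gold.length == 0) {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < gold.length; i++) {
            if (gold[i] == predicted[i]) {
                correct++;
            }
        }
        return (double) correct / gold.length;
    }

    /**
     * Positive/negative accuracy. Items with a neutral gold label (2) are skipped.
     * Assumption for this sketch: a neutral prediction on a remaining item is an error.
     */
    static double binary(int[] gold, int[] predicted) {
        checkLengths(gold, predicted);
        int total = 0;
        int correct = 0;
        for (int i = 0; i < gold.length; i++) {
            if (gold[i] == 2) {
                continue; // neutral gold labels are excluded from the binary task
            }
            total++;
            boolean bothPositive = gold[i] > 2 && predicted[i] > 2;
            boolean bothNegative = gold[i] < 2 && predicted[i] < 2;
            if (bothPositive || bothNegative) {
                correct++;
            }
        }
        return total == 0 ? 0.0 : (double) correct / total;
    }

    private static void checkLengths(int[] gold, int[] predicted) {
        if (gold.length != predicted.length) {
            throw new IllegalArgumentException("gold and predicted must have the same length");
        }
    }

    public static void main(String[] args) {
        int[] gold = {4, 3, 2, 1, 0};
        int[] predicted = {4, 2, 2, 0, 3};
        System.out.println(fineGrained(gold, predicted)); // prints 0.4
        System.out.println(binary(gold, predicted));      // prints 0.5
    }
}
```

In the example, two of five labels match exactly, and two of the four non-neutral items land on the correct side of neutral.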

Models can be retrained on the PTB format dataset with the following command:

java -mx8g edu.stanford.nlp.sentiment.SentimentTraining -numHid 25 -trainPath train.txt -devPath dev.txt -train -model model.ser.gz
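
For readers who want to inspect the training files before retraining, the following is a minimal sketch of reading one tree in the PTB format used by the dataset, where every node carries a sentiment label from 0 (very negative) to 4 (very positive), for example (3 (2 It) (4 (2 's) (4 good))). The class SentimentTreeSketch and its Node type are hypothetical names for this illustration and are not the CoreNLP tree reader. The sketch assumes one tree per line and tokens that contain no spaces or parentheses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for illustration; CoreNLP ships its own tree classes.
class SentimentTreeSketch {

    /** A node holds a sentiment label and either a word (leaf) or child nodes. */
    static final class Node {
        final int label;
        final String word; // null for internal nodes
        final List<Node> children = new ArrayList<>();

        Node(int label, String word) {
            this.label = label;
            this.word = word;
        }
    }

    private final String text;
    private int pos = 0;

    private SentimentTreeSketch(String text) {
        this.text = text;
    }

    /** Parses a single line such as "(3 (2 It) (4 (2 's) (4 good)))". */
    static Node parse(String line) {
        return new SentimentTreeSketch(line.trim()).parseNode();
    }

    private Node parseNode() {
        expect('(');
        int start = pos;
        while (Character.isDigit(peek())) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Missing label at position " + pos);
        }
        int label = Integer.parseInt(text.substring(start, pos));
        skipSpaces();
        Node node;
        if (peek() == '(') {
            // Internal node: read children until the closing parenthesis.
            node = new Node(label, null);
            while (peek() == '(') {
                node.children.add(parseNode());
                skipSpaces();
            }
        } else {
            // Leaf node: the token runs up to the next space or ')'.
            int wordStart = pos;
            while (pos < text.length() && peek() != ')' && peek() != ' ') {
                pos++;
            }
            node = new Node(label, text.substring(wordStart, pos));
        }
        skipSpaces();
        expect(')');
        return node;
    }

    private char peek() {
        return pos < text.length() ? text.charAt(pos) : '\0';
    }

    private void skipSpaces() {
        while (peek() == ' ') {
            pos++;
        }
    }

    private void expect(char c) {
        if (peek() != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at position " + pos);
        }
        pos++;
    }

    /** Collects the leaf words in left-to-right order. */
    static void collectWords(Node node, List<String> out) {
        if (node.word != null) {
            out.add(node.word);
            return;
        }
        for (Node child : node.children) {
            collectWords(child, out);
        }
    }

    public static void main(String[] args) {
        Node root = parse("(3 (2 It) (4 (2 's) (4 good)))");
        List<String> words = new ArrayList<>();
        collectWords(root, words);
        System.out.println(root.label + " " + String.join(" ", words)); // prints: 3 It 's good
    }
}
```

The root label is the sentence-level sentiment, and the labels on inner nodes give the phrase-level annotations.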

Paper: Download pdf

Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Christopher Manning, Andrew Ng and Christopher Potts

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

Conference on Empirical Methods in Natural Language Processing (EMNLP 2013)

Dataset Downloads:

Main zip file with readme (6 MB)
Dataset raw counts (5 MB)
Train, Dev, Test Splits in PTB Tree Format

Code: Download from the CoreNLP home page

Dataset visualization and web design by Jason Chuang. Live demo by Jean Wu, Richard Socher, Rukmani Ravisundaram and Tayyab Tariq. Java code package by John Bauer and Richard Socher.
