Scene Graphs are a graph-based semantic representation of image contents. They encode the objects in an image, their attributes and the relationships between objects. This system takes a single-sentence image description and parses it into a scene graph as described in the paper:
Sebastian Schuster, Ranjay Krishna, Angel Chang, Li Fei-Fei, and Christopher D. Manning. 2015. Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval. In Proceedings of the Fourth Workshop on Vision and Language (VL15).
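To make the representation concrete, the following is a minimal sketch of a scene graph as a data structure: object nodes that carry attributes, plus directed, labeled relationships between objects. The class and field names (SceneGraphSketch, ObjectNode, Relation) are hypothetical and chosen only for illustration; they are not the parser's own classes, and the labels the parser actually produces for this sentence may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; these classes are NOT part of the Scene Graph Parser API.
public class SceneGraphSketch {

    /** An object in the image together with its attributes. */
    public static final class ObjectNode {
        public final String name;
        public final List<String> attributes;

        public ObjectNode(String name, String... attributes) {
            this.name = name;
            this.attributes = new ArrayList<>(Arrays.asList(attributes));
        }
    }

    /** A directed, labeled relationship between two objects. */
    public static final class Relation {
        public final ObjectNode subject;
        public final String predicate;
        public final ObjectNode object;

        public Relation(ObjectNode subject, String predicate, ObjectNode object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    public final List<ObjectNode> objects = new ArrayList<>();
    public final List<Relation> relations = new ArrayList<>();

    /** Hand-builds a graph for "A brown fox chases a white rabbit." */
    public static SceneGraphSketch example() {
        SceneGraphSketch g = new SceneGraphSketch();
        ObjectNode fox = new ObjectNode("fox", "brown");
        ObjectNode rabbit = new ObjectNode("rabbit", "white");
        g.objects.add(fox);
        g.objects.add(rabbit);
        g.relations.add(new Relation(fox, "chase", rabbit));
        return g;
    }

    /** Renders one line per object and one line per relationship. */
    public String describe() {
        StringBuilder sb = new StringBuilder();
        for (ObjectNode o : objects) {
            sb.append("object: ").append(o.name)
              .append(" attributes: ").append(o.attributes).append('\n');
        }
        for (Relation r : relations) {
            sb.append("relation: ").append(r.subject.name).append(" -")
              .append(r.predicate).append("-> ").append(r.object.name).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(example().describe());
    }
}
```

The sketch keeps attributes on the object node and stores relationships as separate edges, which mirrors the three kinds of information a scene graph encodes: objects, attributes, and relationships.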
The system requires Java 1.8+ and Stanford CoreNLP 3.6.0. We recommend running it with at least 2 GB of memory.
The system is licensed under the GNU General Public License (v2 or later). Source is included. The package includes components for command-line invocation and a Java API.
To run the code, you need the CoreNLP jar, the CoreNLP models jar, and the Scene Graph Parser jar on your classpath.
Download the latest CoreNLP distribution [404 MB]
Download the Scene Graph Parser [0.2 MB]
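One way to confirm that the jars are visible before running the parser is a small check like the sketch below. It only tests whether two class names can be resolved: the parser class named in the command-line invocation in this document, and CoreNLP's pipeline class. The helper ClasspathCheck and its method isOnClasspath are hypothetical names introduced here, and the check cannot tell whether the CoreNLP models jar is present, since that jar is not identified by a class name.

```java
// Hypothetical helper; not part of CoreNLP or the Scene Graph Parser.
public class ClasspathCheck {

    // Parser class name as used in this document's command-line invocation.
    static final String PARSER_CLASS = "edu.stanford.nlp.scenegraph.RuleBasedParser";
    // Main pipeline class of Stanford CoreNLP.
    static final String CORENLP_CLASS = "edu.stanford.nlp.pipeline.StanfordCoreNLP";

    /** Returns true if the named class can be located without initializing it. */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {PARSER_CLASS, CORENLP_CLASS}) {
            String status = isOnClasspath(name) ? "found" : "MISSING";
            System.out.println(name + ": " + status);
        }
    }
}
```

Run it with the same classpath you intend to use for the parser. A MISSING line suggests that the corresponding jar is not in the directory or not included in the classpath.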
You can either run the parser programmatically or in interactive mode through the command line.
To parse sentences interactively, put all the jar files from the CoreNLP distribution and the Scene Graph Parser jar into one directory, and then run the following command from that directory:
java -mx2g -cp "*" edu.stanford.nlp.scenegraph.RuleBasedParser
Alternatively, you can run the parser programmatically as follows:
import edu.stanford.nlp.scenegraph.RuleBasedParser;
import edu.stanford.nlp.scenegraph.SceneGraph;

String sentence = "A brown fox chases a white rabbit.";
RuleBasedParser parser = new RuleBasedParser();
SceneGraph sg = parser.parse(sentence);

// Print the scene graph in a readable format
System.out.println(sg.toReadableString());

// Print the scene graph in JSON form
System.out.println(sg.toJSON());
Please email Sebastian Schuster if you have any questions.