SEMPRE: Semantic Parsing with Execution

SEMPRE is a toolkit for training semantic parsers, which map natural language utterances to denotations (answers) via intermediate logical forms.

[Figure: example of querying databases]

[Figure: example of programming via natural language]

SEMPRE has the following functionality:

Code

You can download all the code and documentation for SEMPRE from GitHub. To learn more about the system, walk through our tutorial.

Data

In our EMNLP 2013 paper, we created a new dataset, WebQuestions, which is released under the CC BY 4.0 license. Here are the train and test splits. You can also see the leaderboard, upload your predictions, and evaluate your system in this CodaLab worksheet.

In addition, we preprocessed the Free917 dataset (Cai & Yates, 2013) to work with our system. Here are the train and test splits.

Both datasets are provided in JSON format. WebQuestions contains 3,778 training examples and 2,032 test examples. Free917 contains 641 training examples and 276 test examples.
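As a rough illustration of working with JSON-formatted splits like these, the sketch below loads a split, counts its examples, and lists the field names it finds. It assumes each split is a single JSON array of example objects. The helper names (load_examples, field_names), the sample file, and the "question"/"answer" keys are made up for the illustration and are not part of SEMPRE or of the released data.

```python
import json
import tempfile
from pathlib import Path


def load_examples(path):
    """Load a split, assuming the file holds one JSON array of example objects."""
    with open(path, encoding="utf-8") as f:
        examples = json.load(f)
    if not isinstance(examples, list):
        raise ValueError(f"expected a JSON array in {path}")
    return examples


def field_names(examples):
    """Return the sorted set of keys that appear across all examples."""
    return sorted({key for example in examples for key in example})


if __name__ == "__main__":
    # Made-up stand-in for a real split; the keys here are placeholders,
    # not the actual field names used by WebQuestions or Free917.
    sample = [
        {"question": "placeholder question 1", "answer": "placeholder answer 1"},
        {"question": "placeholder question 2", "answer": "placeholder answer 2"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.json"
        path.write_text(json.dumps(sample), encoding="utf-8")
        examples = load_examples(path)
        print(len(examples), "examples with fields", field_names(examples))
```

Pointing load_examples at a downloaded split and comparing len(examples) against the counts above is a quick sanity check that the file was fetched completely.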

On WebQuestions, each example contains three fields:

On Free917, each example contains two fields:

Papers

SEMPRE was used in the following papers:

Jonathan Berant, Andrew Chou, Roy Frostig, Percy Liang. Semantic Parsing on Freebase from Question-Answer Pairs. Empirical Methods in Natural Language Processing (EMNLP), 2013.
Jonathan Berant, Percy Liang. Semantic Parsing via Paraphrasing. Association for Computational Linguistics (ACL), 2014.
Yushi Wang, Jonathan Berant, Percy Liang. Building a Semantic Parser Overnight. Association for Computational Linguistics (ACL), 2015.
Panupong Pasupat, Percy Liang. Compositional Semantic Parsing on Semi-Structured Tables. Association for Computational Linguistics (ACL), 2015. [Project Page]
Jonathan Berant, Percy Liang. Imitation Learning of Agenda-based Semantic Parsers. Transactions of ACL (TACL), 2015.
Panupong Pasupat, Percy Liang. Inferring Logical Forms From Denotations. Association for Computational Linguistics (ACL), 2016.
Reginald Long, Panupong Pasupat, Percy Liang. Simpler Context-Dependent Logical Forms via Model Projections. Association for Computational Linguistics (ACL), 2016.
Sida Wang, Percy Liang, Christopher Manning. Learning Language Games through Interaction. Association for Computational Linguistics (ACL), 2016.

SEMPRE supports lambda DCS logical forms, the default formalism used for querying Freebase:

Percy Liang. Lambda Dependency-Based Compositional Semantics. arXiv report.