Distant Supervision for Event Extraction

Overview

In a distantly supervised information extraction system, training texts are labeled automatically (and noisily) by leveraging an existing database of known facts. While this approach has typically been applied to the extraction of binary relations, this project explores the use of distant supervision for template-based event extraction.
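
As a rough illustration of the automatic labeling step described above, the following minimal sketch marks tokens in a sentence with a slot name whenever they match a value in a knowledge-base record. The record, slot names, sentences, and the `label_mentions` helper are all hypothetical and chosen only for illustration; they are not taken from the dataset or from the system described in the paper.

```python
from typing import Dict, List

# Hypothetical knowledge-base record for one event (made-up values).
KB_RECORD: Dict[str, str] = {
    "Site": "Mount Example",
    "Operator": "Example Air",
    "Fatalities": "12",
}


def label_mentions(tokens: List[str], record: Dict[str, str]) -> List[str]:
    """Label each token with a slot name if it lies inside a span that
    exactly matches a known slot value; otherwise label it 'NIL'."""
    labels = ["NIL"] * len(tokens)
    for slot, value in record.items():
        value_tokens = value.split()
        n = len(value_tokens)
        # Slide a window over the sentence looking for exact matches.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == value_tokens:
                for j in range(i, i + n):
                    labels[j] = slot
    return labels


# A sentence that really does describe the event.
sentence = "Example Air said 12 people died near Mount Example".split()
labels = label_mentions(sentence, KB_RECORD)

# An unrelated sentence: "12" is still labeled, which is label noise.
noisy_sentence = "Gate 12 was closed".split()
noisy_labels = label_mentions(noisy_sentence, KB_RECORD)
```

The second sentence shows why the labels are noisy: any string that happens to match a known value is labeled, whether or not the sentence expresses that fact.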

Joint Models

This work emphasizes joint extraction models, in which sentence-level and entity-level decisions are made jointly in a unified probabilistic framework. In particular, we explore Search-based Structured Prediction (Searn) and Conditional Random Fields (CRF).

Plane Crashes

Our study was conducted on a plane crash knowledge base derived from Wikipedia infoboxes. Links to the dataset and presentations of this work are given below.

Plane Crash Dataset:    plane_crash_dataset.zip
Slide Deck:     LREC-2014

People

Papers

  • Kevin Reschke, Martin Jankowiak, Mihai Surdeanu, Christopher D. Manning, and Daniel Jurafsky, "Event Extraction Using Distant Supervision". LREC 2014. [pdf] [errata]

Contact Information

For any comments or questions, please e-mail kreschke@cs.stanford.edu.