The
Hypergraph Parsing
Project Page
Overview
Parsing tasks can be seen
as analysis of an associated hypergraph. Reachability corresponds to
parsability and shortest paths correspond to best parses. This view
allows one to adapt graph algorithms for use in parsing. In
particular, Dijkstra's algorithm can be generalized to work on
hypergraphs, and gives rise to a probabilistic chart parser. A-star
search can also be generalized to this setting. The A-star parser
gives a way of reducing the work needed to find a best parse which
does not sacrifice the correctness or time bounds of the parser, and
which gives speed-ups comparable to best-first parsing (which neither
guarantees correctness nor preserves the worst-case cubic bounds).
Additionally, these algorithms work on fully general PCFGs and can be
shown to be correct over a wide range of parsing control strategies
and grammar encodings.
We have used these parsers to investigate properties of
wide-coverage PCFG grammars. With increased coverage generally comes
increased ambiguitiy. For a grammar like the Penn treebank PCFG
grammar, essentially any sequence can derive any category. Whether a
rule can apply tends to depend on the teminal symbols which anchor
that tag. Models based on this observation closely predict parser
load. In particular, they explain the somewhat counterintuitive fact
that the time to parse a sentence grows more rapidly with more
efficient (faster) rule encodings and tree transformations.
Publications
- Dan Klein and Christopher D. Manning, "A* Parsing: Fast Exact Viterbi Parse Selection", Stanford Technical Report, dbpubs, 2002. [ps]
[pdf]
[bib]
- Dan Klein and Christopher D. Manning, "Parsing and Hypergraphs",
Proceedings of the 7th International Workshop on Parsing
Technologies (IWPT-2001), 2001. [ps]
[pdf]
[bib]
- Dan Klein and Christopher D. Manning, "Parsing with Treebank
Grammars: Empirical Bounds, Theoretical Models, and the Structure of
the Penn Treebank", Proceedings of the 39th Annual Meeting of the
ACL, 2001. [ps]
[pdf]
[bib]
- Dan Klein and Christopher D. Manning, "An O(n3)
Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free
Grammars", Stanford Technical Report, 2001. [ps]
[pdf]
[bib]
Contact Information
Comments about the project page? Feel free to email Dan.