|
COURSE
INFORMATION
|
|
Instructor |
Chris Manning, manning@cs.stanford.edu Office: Gates 158. Office hours: Thu 10-12. |
Time |
T/F 8:00-9:45 AM. (I, of course, apologize for the time, which isn't the one I would have chosen either.) |
Location |
Hewlett 101. (That's the SEQ Teaching Center. Not the huge room.) |
Reading |
There is no assigned text. However, if you want some good background to get you started look at: |
Description |
Over the last decade, statistical parsing has transformed our ability to produce automatic, high-accuracy parses of arbitrary human language text. This course aims to teach from the basics up to the state-of-the-art in this domain. It will begin by reviewing the phenomena that motivated statistical approaches to parsing, context-free grammars (CFGs), and probabilistic CFGs. Next it will present basic parsing algorithms, concentrating on generalized CKY and A* parsing algorithms, and discuss treebanks, their design and nature, and the methods of building and evaluating parsers based on them. The course will then turn to the well-known and successful Collins and Charniak generative parsing models of the late 1990s, and discuss issues such as smoothing, head lexicalization, engineering for efficiency, and what kinds of information parsers use and need. Finally, we will turn to discriminative methods of parsing, and discuss both parse re-ranking techniques and the direct construction of discriminative parsers. |
Prerequisites |
Prerequisites: Reasonable familiarity and competence with mathematical notation, probability, algorithmic thinking, and programming. (We're just going to dive into statistical parsing assuming that you already know about empirical computational linguistics, probabilities, and algorithms. If you should be taking LSA 325, I hope you are!) |
Required Work |
If enrolled for credit, you need to complete a project for this class! You should do one of the following: |
SCHEDULE
|
||||
Date
|
HW
|
Lec
|
Topic and Readings |
|
Fri July 6 |
1. |
Overview of Course, Parsing and Statistical Parsing [ppt] [pdf] Motivation for statistical parsing, recursive phrase structure. Attachment decisions and probabilities. The Penn Treebank. Top-down, and bottom-up parsing.
|
||
Tue July 10 |
Assignments announced. |
2. |
PCFGs and the CKY algorithm [ppt] [pdf] PCFGs. Grammar transformations. Recursive parsing and memoization. Dynamic programming for parsing: the CKY algorithm.
|
|
Fri July 13 [Black Friday!] |
3. |
Generalized CKY parsing and unlexicalized parsing
[ppt] [pdf]
Unaries and empties. Parser evaluation. Improving the context-freedom assumptions of grammars: accurate unlexicalized parsing.
|
||
Tue July 17 |
4. |
Search in parsing and lexicalized probabilistic parsing [ppt] [pdf] Beam parsing. Agenda-based (chart) parsing. A* parsing. Lexicalized probabilistic context-free grammars: The Charniak (1997) model.
|
||
Fri July 20 |
5. |
Treebanks and statistical parsing [ppt] [pdf] Lexicalized parsing: Collins (1997/1999). The status of information in treebanks.
|
||
Tue July 24 |
Assignments due |
6. |
Multilingual parsing and dependency parsing [ppt] [pdf]
|
|
Fri July 27 |
7. |
Discriminative parsing [ppt] [pdf] An introduction to discriminative parsing. Features in discriminative parsers, presented using some of Mark Johnson's slides on discriminative reranking.
|