
COURSE INFORMATION


Instructor 
Chris Manning, manning@cs.stanford.edu. Office: Gates 158. Office hours: Thu 10–12.
Time 
T/F 8:00–9:45 AM. (I, of course, apologize for the time, which isn't the one I would have chosen either.)
Location 
Hewlett 101. (That's the SEQ Teaching Center. Not the huge room.) 
Reading 
There is no assigned text. However, if you want some good background to get you started, look at:
Description 
Over the last decade, statistical parsing has transformed our ability to produce automatic, high-accuracy parses of arbitrary human language text. This course aims to teach from the basics up to the state of the art in this domain. It will begin by reviewing the phenomena that motivated statistical approaches to parsing, context-free grammars (CFGs), and probabilistic CFGs. Next it will present basic parsing algorithms, concentrating on generalized CKY and A* parsing algorithms, and discuss treebanks, their design and nature, and the methods of building and evaluating parsers based on them. The course will then turn to the well-known and successful Collins and Charniak generative parsing models of the late 1990s, and discuss issues such as smoothing, head lexicalization, engineering for efficiency, and what kinds of information parsers use and need. Finally, we will turn to discriminative methods of parsing, and discuss both parse reranking techniques and the direct construction of discriminative parsers.
Prerequisites 
Reasonable familiarity and competence with mathematical notation, probability, algorithmic thinking, and programming. (We're just going to dive into statistical parsing assuming that you already know about empirical computational linguistics, probabilities, and algorithms. If you should be taking LSA 325, I hope you are!)
Required Work 
If enrolled for credit, you need to complete a project for this class! You should do one of the following: 
SCHEDULE


Date

HW

Lec

Topic and Readings 

Fri July 6 
1. 
Overview of Course, Parsing and Statistical Parsing [ppt] [pdf] Motivation for statistical parsing, recursive phrase structure. Attachment decisions and probabilities. The Penn Treebank. Top-down and bottom-up parsing.


Tue July 10 
Assignments announced. 
2. 
PCFGs and the CKY algorithm [ppt] [pdf] PCFGs. Grammar transformations. Recursive parsing and memoization. Dynamic programming for parsing: the CKY algorithm.


Fri July 13 [Black Friday!] 
3. 
Generalized CKY parsing and unlexicalized parsing [ppt] [pdf] Unaries and empties. Parser evaluation. Improving the context-freedom assumptions of grammars: accurate unlexicalized parsing.


Tue July 17 
4. 
Search in parsing and lexicalized probabilistic parsing [ppt] [pdf] Beam parsing. Agenda-based (chart) parsing. A* parsing. Lexicalized probabilistic context-free grammars: The Charniak (1997) model.


Fri July 20 
5. 
Treebanks and statistical parsing [ppt] [pdf] Lexicalized parsing: Collins (1997/1999). The status of information in treebanks.


Tue July 24 
Assignments due 
6. 
Multilingual parsing and dependency parsing [ppt] [pdf]


Fri July 27 
7. 
Discriminative parsing [ppt] [pdf] An introduction to discriminative parsing. Features in discriminative parsers, presented using some of Mark Johnson's slides on discriminative reranking.
