Foundations of Statistical Natural Language Processing

Christopher D. Manning and Hinrich Schütze

Published May 1999 by
The MIT Press
Cambridge, Massachusetts.

This book is designed as a thorough introduction to statistical approaches to natural language processing. There is a companion website for the book.

You can order the book at Amazon, Barnes and Noble, or The MIT Press website.

Sample Chapters

These are prefinal versions of two chapters. They largely correspond to the published versions, but lack final copyediting and have different fonts and pagination.

Please send any feedback or comments to the authors.



The chapter headings are given below. A more detailed table of contents is available from MIT Press, and the full contents can also be downloaded as a PostScript file.

Brief Contents

  1. Preliminaries
    1. Introduction
    2. Mathematical Foundations
    3. Linguistic Essentials
    4. Corpus-Based Work
  2. Words
    1. Collocations
    2. Statistical Inference: n-gram Models over Sparse Data
    3. Word Sense Disambiguation
    4. Lexical Acquisition
  3. Grammar
    1. Markov Models
    2. Part-of-Speech Tagging
    3. Probabilistic Context Free Grammars
    4. Probabilistic Parsing
  4. Applications and Techniques
    1. Statistical Alignment and Machine Translation
    2. Clustering
    3. Topics in Information Retrieval
    4. Text Categorization

Christopher Manning and Hinrich Schütze