Web search basics

In this and the following two chapters, we consider web search engines. Sections 19.1-19.4 provide some background and history to help the reader appreciate the forces that conspire to make the Web chaotic, fast-changing and (from the standpoint of information retrieval) very different from the "traditional" collections studied thus far in this book. Sections 19.5-19.6 deal with estimating the number of documents indexed by web search engines and with eliminating duplicate documents in web indexes, respectively. These latter two sections serve as background material for the following two chapters.


