Many complex or technical concepts and many organization and product
names are multiword compounds or phrases. We would like to be
able to pose a query
such as Stanford University by treating it as a phrase so that a
sentence in a document like The inventor Stanford
Ovshinsky never went to university. is
not a match. Most recent search engines support a double quotes
syntax (``stanford university'') for phrase queries , which has
proven to
be very easily understood and successfully used by users. As many as 10%
of web queries are phrase queries, and many more are implicit phrase
queries (such as person names), entered without use of double quotes.
To be able to support such queries, it is no longer sufficient for
postings lists to be simply lists of documents that contain individual terms.
In this section we consider two approaches to supporting
phrase queries and their combination. A search engine should not only
support phrase queries, but implement them efficiently.
A related but distinct concept is term proximity weighting,
where a document is preferred to the extent that the query terms appear
close to each other in the text. This technique is covered in
Section 7.2.2 (page ) in the context of ranked retrieval.