next up previous contents index
Next: Query-term proximity Up: Components of an information Previous: Components of an information   Contents   Index


Tiered indexes

We mentioned in Section 7.1.2 that when using heuristics such as index elimination for inexact top-$K$ retrieval, we may occasionally find ourselves with a set $A$ of contenders that has fewer than $K$ documents. A common solution to this issue is the user of tiered indexes , which may be viewed as a generalization of champion lists . We illustrate this idea in Figure 7.4 , where we represent the documents and terms of Figure 6.9 . In this example we set a tf threshold of 20 for tier 1 and 10 for tier 2, meaning that the tier 1 index only has postings entries with tf values exceeding 20, while the tier 2 index only has postings entries with tf values exceeding 10. In this example we have chosen to order the postings entries within a tier by document ID.

\includegraphics[width=11cm]{tiered.eps} Tiered indexes.If we fail to get $K$ results from tier 1, query processing ``falls back'' to tier 2, and so on. Within each tier, postings are ordered by document ID.


next up previous contents index
Next: Query-term proximity Up: Components of an information Previous: Components of an information   Contents   Index
© 2008 Cambridge University Press
This is an automatically generated page. In case of formatting errors you may want to look at the PDF edition of the book.
2009-04-07