Now, given a query $q$ we create a set $A$ as follows: we take the union of the champion lists for each of the terms comprising $q$. We now restrict cosine computation to only the documents in $A$. A critical parameter in this scheme is the value $r$, which is highly application dependent. Intuitively, $r$ should be large compared with $K$, especially if we use any form of the index elimination described in Section 7.1.2. One issue here is that the value $r$ is set at the time of index construction, whereas $K$ is application dependent and may not be available until the query is received; as a result we may (as in the case of index elimination) find ourselves with a set $A$ that has fewer than $K$ documents. There is no reason to have the same value of $r$ for all terms in the dictionary; it could for instance be set to be higher for rarer terms.
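
The scheme can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the helper names (`build_champion_lists`, `top_k`, `doc_norms_from`), the toy postings and their weights, and the simplification that every query term has weight 1, so that scores are length-normalized dot products rather than full cosines (the ranking is unaffected by the missing query norm). The champion list of a term is taken to be the documents with the highest weights for that term, and the per-term list size is supplied as a function of document frequency so that it can differ across terms.

```python
import heapq
import math
from collections import defaultdict

def build_champion_lists(postings, r_for_df):
    """Index time: keep, per term, the doc ids with the highest weights.

    postings: term -> {doc_id: weight}
    r_for_df: hypothetical callable mapping a term's document frequency
              to the champion list size for that term.
    """
    champions = {}
    for term, docs in postings.items():
        r = r_for_df(len(docs))
        top = heapq.nlargest(r, docs.items(), key=lambda kv: kv[1])
        champions[term] = {doc_id for doc_id, _ in top}
    return champions

def doc_norms_from(postings):
    """Euclidean length of each document vector over the indexed terms."""
    squares = defaultdict(float)
    for docs in postings.values():
        for doc_id, weight in docs.items():
            squares[doc_id] += weight * weight
    return {doc_id: math.sqrt(s) for doc_id, s in squares.items()}

def top_k(query_terms, postings, champions, doc_norms, k):
    """Query time: score only the union of the query terms' champion lists."""
    candidates = set()  # the candidate set built from champion lists
    for term in query_terms:
        candidates |= champions.get(term, set())
    scores = {}
    for doc_id in candidates:
        # Query term weights are assumed to be 1 (a simplification).
        dot = sum(postings[t].get(doc_id, 0.0)
                  for t in query_terms if t in postings)
        scores[doc_id] = dot / doc_norms[doc_id]
    # May return fewer than k results if the candidate set is small.
    return heapq.nlargest(k, scores.items(), key=lambda kv: kv[1])

# Toy index with made-up weights.
postings = {
    "car": {1: 3.0, 2: 1.0, 3: 2.0, 4: 0.5},
    "insurance": {2: 4.0, 3: 1.0, 5: 2.0},
}
doc_norms = doc_norms_from(postings)

# Same list size for every term.
fixed = build_champion_lists(postings, lambda df: 2)
both = top_k(["car", "insurance"], postings, fixed, doc_norms, k=2)
short = top_k(["car"], postings, fixed, doc_norms, k=3)

# Larger lists for rarer terms (threshold chosen arbitrarily).
varied = build_champion_lists(postings, lambda df: 3 if df <= 3 else 2)
```

In this toy run, the single-term query asks for three results but only two documents are in the candidate set, which is the shortfall described above. With the varying list size, the rarer term "insurance" keeps all three of its documents while "car" keeps only two.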