
Tree-structured dependencies between terms

\begin{figure}
% Figure 11.1: tree of term dependencies (picture lost in extraction; caption truncated at the start)
\caption{... dependent on a term $x_k$ if there is an arrow $x_k \rightarrow x_i$.}
\end{figure}

Some of the assumptions of the BIM can be removed. For example, we can remove the assumption that terms are independent. This assumption is very far from true in practice. Term pairs such as Hong and Kong, which are strongly dependent, violate it particularly clearly. But dependencies can also occur in more complex configurations, such as among the set of terms New, York, England, City, Stock, Exchange, and University. van Rijsbergen (1979) proposed a simple, plausible model which allowed a tree structure of term dependencies, as in Figure 11.1. In this model each term can be directly dependent on at most one other term, giving a tree structure of dependencies. When it was invented in the 1970s, estimation problems held back the practical success of this model, but the idea was reinvented as the Tree Augmented Naive Bayes model by Friedman and Goldszmidt (1996), who used it with some success on various machine learning data sets.
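
To make the tree-structured factorization concrete, the following is a minimal sketch rather than the model as van Rijsbergen specified it. It assumes binary term indicators and computes the joint probability of a document as a product of one factor per term, each conditioned on that term's single parent. The function names, the two-term tree, and all probability values are hypothetical and chosen only for illustration; the sketch also leaves out the conditioning on relevance and on the query that the BIM uses.

```python
def tree_joint_probability(x, parent, cpt):
    """P(x) when each term depends directly on at most one other term.

    x      : dict term -> 0/1 (term absent/present in the document)
    parent : dict term -> parent term, or None for a root of the tree
    cpt    : dict term -> {parent_value: P(term = 1 | parent = parent_value)};
             a root uses the single key None
    """
    prob = 1.0
    for term, value in x.items():
        par = parent[term]
        key = None if par is None else x[par]
        p_present = cpt[term][key]
        prob *= p_present if value == 1 else 1.0 - p_present
    return prob


def independent_joint_probability(x, marginal):
    """P(x) under the term independence assumption, for comparison."""
    prob = 1.0
    for term, value in x.items():
        p_present = marginal[term]
        prob *= p_present if value == 1 else 1.0 - p_present
    return prob


# Hypothetical two-term tree with an arrow hong -> kong.
parent = {"hong": None, "kong": "hong"}
cpt = {
    "hong": {None: 0.1},        # P(hong = 1)
    "kong": {1: 0.9, 0: 0.01},  # P(kong = 1 | hong = 1), P(kong = 1 | hong = 0)
}
# Marginals implied by the tree: P(kong = 1) = 0.1 * 0.9 + 0.9 * 0.01 = 0.099
marginal = {"hong": 0.1, "kong": 0.099}

both_present = {"hong": 1, "kong": 1}
p_tree = tree_joint_probability(both_present, parent, cpt)          # about 0.09
p_indep = independent_joint_probability(both_present, marginal)     # about 0.0099
```

With these made-up numbers, the independence assumption underestimates the probability that both terms occur together by roughly a factor of nine, although both models agree on the marginal probability of each term. The tree model needs only one extra parameter per non-root term, which is what keeps it simple, but each conditional probability still has to be estimated from data.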


© 2008 Cambridge University Press
2009-04-07