Top-down clustering is conceptually more complex than
bottom-up clustering since we need a second, flat clustering
algorithm as a ``subroutine''. It has the advantage of being
more efficient if we do not generate a complete hierarchy
all the way down to individual document leaves. For a fixed
number of top levels, using an efficient flat algorithm like
-means, top-down algorithms are linear in the number of
documents and clusters. So they run much faster than HAC
algorithms, which are at least quadratic.
There is evidence that divisive algorithms produce more accurate
hierarchies than bottom-up algorithms in some
circumstances. See the references on
bisecting -means in Section 17.9 . Bottom-up methods
make clustering decisions based on local patterns without
initially taking into account the global distribution. These
early decisions cannot be undone. Top-down clustering
benefits from complete information about the global
distribution when making top-level partitioning decisions.