Our discussion of DNS resolution (Section 20.2.2 ) uses the current convention for internet addresses, known as IPv4 (for Internet Protocol version 4) - each IP address is a sequence of four bytes. In the future, the convention for addresses (collectively known as the internet address space) is likely to use a new standard known as IPv6 (http://www.ipv6.org/).
Tomasic and Garcia-Molina (1993) and Jeong and Omiecinski (1995) are key early papers evaluating term partitioning versus document partitioning for distributed indexes. Document partitioning is found to be superior, at least when the distribution of terms is skewed, as it typically is in practice. This result has generally been confirmed in more recent work (MacFarlane et al., 2000). But the outcome depends on the details of the distributed system; at least one thread of work has reached the opposite conclusion (Ribeiro-Neto and Barbosa, 1998, Badue et al., 2001).
Sornil (2001) argues for a partitioning scheme that is a hybrid between term and document partitioning.
Barroso et al. (2003) describe the distribution methods used at Google. The first implementation of a connectivity server was described by Bharat et al. (1998). The scheme discussed in this chapter, currently believed to be the best published scheme (achieving as few as 3 bits per link for encoding), is described in a series of papers by Boldi and Vigna (2004b;a).