Report Number: CS-TN-98-70
Institution: Stanford University, Department of Computer Science
Title: Merging Ranks from Heterogeneous Internet Sources
Author: Garcia-Molina, Hector
Author: Gravano, Luis
Date: May 1998
Abstract: Many sources on the Internet and elsewhere rank the objects in query results according to how well these objects match the original query. For example, a real-estate agent might rank the available houses according to how well they match the user's preferred location and price. In this environment, ``meta-brokers'' usually query multiple autonomous, heterogeneous sources that might use varying result- ranking strategies. A crucial problem that a meta-broker then faces is extracting from the underlying sources the top objects for a user query according to the meta-broker's ranking function. This problem is challenging because these top objects might not be ranked high by the sources where they appear. In this paper we discuss strategies for solving this ``meta-ranking'' problem. In particular, we present a condition that a source must satisfy so that a meta-broker can extract the top objects for a query from the source without examining its entire contents. Not only is this condition necessary but it is also sufficient, and we show an efficient algorithm to extract the top objects from sources that satisfy the given condition.