Report Number: CS-TN-93-1
Institution: Stanford University, Department of Computer Science
Title: Incremental Updates of Inverted Lists for Text Document
Retrieval
Author: Tomasic, Anthony
Author: Garcia-Molina, Hector
Author: Shoens, Kurt
Date: December 1993
Abstract: With the proliferation of the world's "information
highways" a renewed interest in efficient document indexing
techniques has come about. In this paper, the problem of
incremental updates of inverted lists is addressed using a
new dual-structure index data structure. The index
dynamically separates long and short inverted lists and
optimizes the retrieval, update, and storage of each type of
list. To study the behavior of the index, a space of
engineering trade-offs which range from optimizing update
time to optimizing query performance is described. We
quantitatively explore this space by using actual data and
hardware in combination with a simulation of an information
retrieval system. We then describe the best algorithm for a
variety of criteria.
http://i.stanford.edu/pub/cstr/reports/cs/tn/93/1/CS-TN-93-1.pdf