The New RUFUS Data Model
Dr. Peter Schwarz
IBM Almaden
RUFUS is a tool for finding and exploiting the information contained
in semi-structured data such as documents, mail, and memos. Unlike
traditional database applications, for which an appropriate schema
can be designed in advance, RUFUS must contend with an evolving set of
formats and datatypes. The New RUFUS Data Model (NRDM) is a flexible,
object-oriented data model that accommodates change. Instead of an
explicit type hierarchy, the relationship between types in NRDM is
implicit, based on similarity of the types' interfaces. NRDM also
supports change at the level of individual objects, by allowing a
single object to support multiple interfaces. NRDM maintains strict
separation between types and implementations, allowing one to define
useful general-purpose types whose interfaces are supported by many
implementations.
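As a rough illustration of implicit, interface-based typing, the sketch below uses Python's structural Protocol classes. It is not NRDM or NRSL code: the interface names (Document, Message), the classes (MailItem, Memo), and the helper conforms() are all hypothetical, chosen only to show one object supporting several interfaces without any declared type hierarchy.

```python
from typing import Protocol, runtime_checkable


# Hypothetical interfaces (not taken from NRDM). Each is defined purely
# by the operations it requires, with no declared parent type.
@runtime_checkable
class Document(Protocol):
    def text(self) -> str: ...


@runtime_checkable
class Message(Protocol):
    def sender(self) -> str: ...
    def text(self) -> str: ...


class MailItem:
    """One implementation. It names no interface it claims to support."""

    def __init__(self, sender: str, body: str) -> None:
        self._sender = sender
        self._body = body

    def sender(self) -> str:
        return self._sender

    def text(self) -> str:
        return self._body


class Memo:
    """A second, unrelated implementation that only offers text()."""

    def __init__(self, body: str) -> None:
        self._body = body

    def text(self) -> str:
        return self._body


def conforms(obj: object, interface: type) -> bool:
    # Conformance is decided by the operations the object offers,
    # not by where its class sits in an inheritance tree.
    return isinstance(obj, interface)


mail = MailItem("schwarz", "Talk at 2pm")
memo = Memo("Budget notes")
```

Here mail supports both the Document and Message interfaces, while memo supports only Document. Neither class inherits from the other or from the interfaces, so a new format can be added without revising an existing hierarchy.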
Both types and their implementations are described in the New RUFUS
Schema Language (NRSL), an extension of C that features type checking
and multiple inheritance. This talk will describe the features of
both NRDM and NRSL, as well as our initial implementation of the NRDM
runtime system.
Publications Related to Talk
"Managing Change in the Rufus System." In Proceedings of the
Eleventh International Conference on Data Engineering, Taipei, Taiwan,
March 1995. (with K. Shoens)
"The Rufus System: Information Organization for Semi-Structured Data." In
Proceedings of the Nineteenth International Conference on
Very Large Data Bases, August 1993. (with K. Shoens, A. Luniewski,
J. Stamos, and J. Thomas)
Additional papers from the IBM Almaden database group can be obtained via
anonymous ftp from host www-i.almaden.ibm.com in directory
/pub/cs/reports/database/.