Scalability Issues in Data Integration

By Arnon Rosenthal, Ph.D., and Dr. Len Seligman

Data integration efforts often aim to give users access to multiple data sources through queries (and other requests) against a global schema. As sources change, new ones become available, and others become unavailable (at least temporarily), it becomes very burdensome to maintain the necessary mappings and other metadata. We compare the administrative labor and data accessibility for two popular approaches: federated databases that derive each table in the global schema as a view over sources, and source-profile systems that describe each source's offerings as a view over the global tables. We then propose a hybrid process that combines their advantages.
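The two mapping directions can be illustrated with a toy sketch. This is only an illustration under stated assumptions, not the authors' systems or their hybrid process: the source names (hr_db, payroll), the global table Employee, and both functions are hypothetical, and a real source-profile system would describe sources with view definitions and rewrite queries rather than use simple column mappings.

```python
# Hypothetical toy data: two sources that both hold employee records,
# each with its own column names.
SOURCES = {
    "hr_db": [{"emp_name": "Ada", "division": "R&D"}],
    "payroll": [{"name": "Bob", "dept": "Sales"}],
}


# Federated style: the global Employee table is written as ONE view
# that names every contributing source explicitly.
def global_employee_federated(sources):
    rows = [{"name": r["emp_name"], "dept": r["division"]} for r in sources["hr_db"]]
    rows += [{"name": r["name"], "dept": r["dept"]} for r in sources["payroll"]]
    return rows


# Source-profile style: each source carries its own description, stated
# in terms of the global table (simplified here to a column mapping).
PROFILES = {
    "hr_db": {"table": "Employee", "columns": {"name": "emp_name", "dept": "division"}},
    "payroll": {"table": "Employee", "columns": {"name": "name", "dept": "dept"}},
}


def global_table_from_profiles(table, sources, profiles):
    rows = []
    for src, profile in profiles.items():
        # Skip sources that describe another table or are not currently present.
        if profile["table"] != table or src not in sources:
            continue
        cols = profile["columns"]
        rows += [{g: r[s] for g, s in cols.items()} for r in sources[src]]
    return rows


if __name__ == "__main__":
    print(global_employee_federated(SOURCES))
    print(global_table_from_profiles("Employee", SOURCES, PROFILES))
```

In this toy version, adding or removing a source means editing the single federated view, whereas in the profile style it means adding or removing one entry in PROFILES. How that difference plays out in administrative labor and data accessibility at scale is the subject of the paper's comparison.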