16. Database Management

The INGRES project was initiated in 1973 to design and implement a full-function relational database management system. When the INGRES system was first released in 1976, it contained over 100,000 lines of code in the C programming language and ran on PDP-11s under the UNIX operating system. INGRES, along with several other research prototypes, proved that the relational model was a suitable basis for an efficient, easy-to-use database management system. The system was distributed to over 150 sites around the world and provided the foundation for several commercial products.

By the late 1970s, the INGRES project shifted its focus to distributed database systems and application-development environments. A distributed database management system allows users to query and update data stored on different host computers in a computer network. The prototype distributed INGRES system, running on two VAX machines, was demonstrated in 1982. During this period, a group also worked on the problem of connecting heterogeneous database systems (e.g., hierarchical, network, and relational) that might be geographically dispersed. A prototype forms-based programming environment, the Forms Application Development System (FADS), was also demonstrated; it allowed end users to code their applications by filling in forms displayed on a terminal screen. During the 1980s, all of this research was transferred to commercial products. Distributed databases, gateways to heterogeneous database systems, and forms-based application development tools are now available from many vendors.

In the mid-1980s, the focus of the INGRES project again shifted to a new topic: database support for complex, dynamic data. Early database systems, particularly those based on the relational model, could solve many of the problems facing data-processing organizations that dealt with rigidly structured, regular data (e.g., business data). However, they had not solved the problems posed by more dynamic data, such as documents, geographical data, CAD/CAM data, and programs.

By 1985, the INGRES project had embarked on the design and implementation of a new database management system called POSTGRES (for "Post INGRES") and a new application development environment called PICASSO to provide support for these applications. In 1987, the focus of POSTGRES was expanded to include effective execution on a shared-memory multiprocessor. The resulting system, XPRS (eXtended POSTGRES on RAID and Sprite), attempts to provide both high performance and high availability.



Mariposa: An Economic Paradigm for Query Processing and Data Migration

Paul M. Aoki, Marcel Kornacker, Adam Sah, Jeff Sidell, and Rex Winterbottom

(Professor M. R. Stonebraker)

(ARO) DAAH04-94-G-0223, (ARPA) DABT63-92-C-0007, Microsoft, and (NSF) IRI-9107455

Mariposa is a distributed database system designed to provide high performance in an environment of high data mobility over thousands of autonomous sites and on memory hierarchies with very large capacity. The complexity in scheduling distributed actions in such a large system stems from the large number of possible choices for each action, the expense of global synchronization, and the dynamically changing network environment.

To deal with the complexity of these issues, we have elected to reformulate all issues relating to shared resources (query optimization and processing, storage management, and naming services) into a microeconomic framework. The advantages of this approach over traditional solutions are: (1) the decision process is inherently decentralized, which is a prerequisite for achieving scalability; (2) prices in a market system fluctuate in accordance with the demand and supply of resources, allowing the system to adapt dynamically to resource contention; and (3) queries, storage managers, and name servers can be integrated into a single market-based economy, simplifying resource management algorithms. This will also result in an efficient allocation of every available resource.
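As a concrete illustration of the approach, the following minimal sketch (in Python) shows how load-sensitive bidding decentralizes query placement: each site prices work according to its own contention, so demand shifts toward underutilized sites with no global coordinator. The class names, pricing rule, and budget check are expository assumptions, not Mariposa's actual interfaces or bid protocol.

    # Illustrative sketch of bid-based query brokering in an economic
    # model. Names and the pricing rule are assumptions for exposition.
    from dataclasses import dataclass

    @dataclass
    class Bid:
        site: str
        price: float      # currency units demanded to run the fragment
        delay: float      # promised completion time (seconds)

    class Site:
        def __init__(self, name, base_cost):
            self.name = name
            self.base_cost = base_cost
            self.load = 0.0   # fraction of capacity currently in use

        def bid(self, est_work):
            # Price rises with local contention, so demand shifts
            # toward underutilized sites without central coordination.
            price = est_work * self.base_cost * (1.0 + self.load)
            return Bid(self.name, price, est_work * (1.0 + self.load))

    def broker(sites, est_work, budget):
        # Collect bids; accept the cheapest within the query's budget.
        bids = [s.bid(est_work) for s in sites]
        best = min(bids, key=lambda b: b.price)
        return best if best.price <= budget else None

    sites = [Site("ucb", 0.5), Site("mit", 0.4)]
    print(broker(sites, est_work=10.0, budget=8.0))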

Our most recent work extends the economic model to support replica management and describes our mechanisms for propagating updates among replicas [1]. We show how our replica control mechanism provides consistent, although potentially stale, views of data across many machines without expensive per-transaction synchronization.
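The flavor of this mechanism can be sketched as follows: writes commit locally and are streamed to peers later, so every read is served locally and may therefore be stale. The structures and the last-writer-wins rule below are illustrative assumptions, not the replica-control implementation described in [1].

    # Sketch of asynchronous update propagation among replicas:
    # local commits, deferred streaming to peers, stale-but-consistent
    # local reads. Illustrative only.
    import time

    class Replica:
        def __init__(self, name):
            self.name = name
            self.data = {}    # key -> (timestamp, value)
            self.log = []     # buffered updates awaiting propagation

        def write(self, key, value):
            ts = time.time()
            self.data[key] = (ts, value)
            self.log.append((ts, key, value))

        def read(self, key):
            # No cross-site synchronization; may return a stale value.
            return self.data.get(key, (None, None))[1]

        def propagate(self, peer):
            # Periodically ship the buffered update stream to a peer.
            for ts, key, value in self.log:
                peer.apply(ts, key, value)
            self.log.clear()

        def apply(self, ts, key, value):
            # Last-writer-wins: apply only if newer than what we hold.
            if key not in self.data or self.data[key][0] < ts:
                self.data[key] = (ts, value)

    a, b = Replica("a"), Replica("b")
    a.write("x", 1)
    print(b.read("x"))   # None: update not yet propagated
    a.propagate(b)
    print(b.read("x"))   # 1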

We have constructed a prototype system and conducted preliminary performance measurements in both local area and wide area network environments [2]. We expect to make our prototype software publicly available by early 1996.

[1] J. Sidell, P. M. Aoki, S. Barr, A. Sah, C. Staelin, M. R. Stonebraker, and A. Yu, "Data Replication in Mariposa," Proc. Int. Conf. Data Engineering, New Orleans, LA, February 1996.

[2] M. R. Stonebraker, P. M. Aoki, W. Litwin, A. Pfeffer, A. Sah, J. Sidell, C. Staelin, and A. Yu, "Mariposa: A Wide-Area Distributed Database System," VLDB J., Vol. 5, No. 1, January 1996.


Non-Quiescing Bulk Data Movement Mechanisms for Distributed Databases

Paul M. Aoki

(Professor M. R. Stonebraker)

(ARO) DAAH04-94-G-0223, (ARPA) DABT63-92-C-0007, Microsoft, and (NSF) IRI-9107455

Future distributed data managers will require efficient, high-concurrency mechanisms for reorganizing the physical layout of the distributed database. Examples of such reorganizations include movement of tables between hosts and fragmentation (splitting) of tables in preparation for multi-site declustering. The Mariposa DDBMS, currently under development at UC Berkeley, requires high-performance reorganization algorithms in order to perform automated, dynamic load balancing [1].

This research focuses on reducing the cost of distributed database reorganization by reducing the processor, I/O, and concurrency control costs of reorganization operations. For example, we have implemented and benchmarked mechanisms for the efficient (re)construction of secondary index structures when moving tables between heterogeneous computers [2]. We are currently working on concurrency control issues.
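One standard way to reduce such costs is sketched below: as tuples stream to the destination, the receiver collects (key, tuple-id) pairs and builds the secondary index with a single bulk sort rather than per-tuple insertions, replacing n random-access index insertions with one O(n log n) sort. The names and structures are assumptions for exposition; this illustrates the general bulk-load idea only, not necessarily the recycling mechanism of [2].

    # Sketch of bulk secondary-index reconstruction during a table
    # move. Illustrative structures, not the mechanism of [2].
    def receive_tuples(stream):
        """Write incoming tuples to the new heap, noting each one's
        new physical tuple id, and emit (index_key, new_tid) pairs."""
        pairs, heap = [], []
        for tup in stream:
            new_tid = len(heap)   # stand-in for a real page/slot address
            heap.append(tup)
            pairs.append((tup["name"], new_tid))  # index on "name"
        return heap, pairs

    def bulk_build_index(pairs):
        # One sort replaces n random-access B-tree insertions; a real
        # system would then pack the sorted runs into leaf pages.
        return sorted(pairs)

    heap, pairs = receive_tuples(iter([{"name": "zed"}, {"name": "amy"}]))
    print(bulk_build_index(pairs))   # [('amy', 1), ('zed', 0)]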

[1] M. R. Stonebraker, P. M. Aoki, R. Devine, W. Litwin, and M. Olson, "Mariposa: A New Architecture for Distributed Data," Proc. Int. Conf. Data Engineering, Houston, TX, February 1994.

[2] P. M. Aoki, Recycling Secondary Index Structures, Sequoia 2000 Technical Report No. 95-66, July 1995.


Automatic Replication in a Wide Area Distributed Database

Adam Sah

(Professor M. R. Stonebraker)

(NASA) PR10-77553

The File Allocation Problem (FAP) is a 25-year-old problem: how to automatically place copies of data on the computers in a network. The goals are to improve read performance and availability while minimizing write overhead. This work addresses a more difficult (and more useful) version in which (1) the elements of data are horizontal fragments of tables in a relational database, and (2) the network consists of local area networks (LANs) connected in a wide area network (WAN) supporting multicast. Currently, data placement is performed manually, a laborious process that is not only expensive but also prone to poor placements [1].

As simple as this sounds, virtually every variation on the FAP is NP-complete or NP-hard in the number of computers and/or network links [1]. Since realistic wide area networks have thousands of potential servers, an optimal solution is intractable in practice. This work instead focuses on practical solutions in the framework of the Mariposa wide area distributed database project. My goal is to demonstrate a heuristic that: (1) incurs little overhead when run as part of normal operation; (2) empirically produces optimal placements for small networks and isolated LANs; (3) improves upon previous results for larger networks; and (4) preserves the requirements of a wide area database, such as local host autonomy, no global synchronization, and resiliency to host and network failure [2].
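To make the trade-off concrete, the following sketch shows the kind of greedy heuristic involved: a replica is added at a site only while the read cost it saves exceeds the write-propagation cost it adds. The cost model and numbers are illustrative assumptions, not the heuristic under development here.

    # Illustrative greedy heuristic for the File Allocation Problem.
    # The cost model is a stand-in for exposition.
    def place_replicas(sites, read_rate, write_rate, remote_cost,
                       local_cost, propagate_cost):
        placement = []
        for site in sorted(sites, key=lambda s: -read_rate[s]):
            saving = read_rate[site] * (remote_cost - local_cost)
            overhead = write_rate * propagate_cost
            # Replicate only where read traffic justifies the
            # extra cost of propagating every write to this copy.
            if saving > overhead:
                placement.append(site)
        return placement

    sites = ["lan-a", "lan-b", "lan-c"]
    reads = {"lan-a": 100.0, "lan-b": 2.0, "lan-c": 40.0}
    print(place_replicas(sites, reads, write_rate=10.0,
                         remote_cost=2.0, local_cost=0.1,
                         propagate_cost=0.5))
    # ['lan-a', 'lan-c']: the low-read site gets no replica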

[1] L. Dowdy and D. Foster, "Comparative Models of the File Allocation Problem," Computing Surveys, Vol. 14, No. 2, June 1982.

[2] M. R. Stonebraker, P. M. Aoki, R. Devine, W. Litwin, and M. Olson, "Mariposa: A New Architecture for Distributed Data," Proc. Int. Conf. Data Engineering, Houston, TX, February 1994.


Query Processing in Tertiary Memory Databases

Sunita Sarawagi

(Professor M. R. Stonebraker)

(ARO) 91-G-0183, (ARPA) DABT63-92-C-0007, Digital Equipment Corporation, and (NSF) IRI-9107455

With the rapid increase in the number of applications that require access to large amounts of data, it is becoming increasingly important for database systems to handle tertiary storage devices. The characteristics of tertiary memory devices are very different from those of the secondary storage devices for which conventional database systems are designed. This requires new approaches to managing data location and movement, together with query execution, in a unified framework. We present methods of scheduling queries, caching, and controlling the order of data retrieval for efficient operation in a tertiary memory environment. We show how careful interspersing of queries and informed cache management can achieve remarkable reductions in access time compared to conventional methods. Our algorithms use a few model parameters for each tertiary memory device and are thus designed to be portable across a wide variety of tertiary memory devices and database types. We have extended the POSTGRES database system to implement the new query processing strategies. Initial measurements of the Sequoia 2000 benchmark on the prototype yield impressive results [1,2].
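The core scheduling idea can be sketched as follows: because media exchanges (tape or platter loads) dominate access time, pending queries are batched by the media unit they touch, and each loaded unit is drained completely before switching. The cost constants and the most-demanded-first order below are illustrative assumptions, not the policy of the POSTGRES prototype.

    # Sketch of media-aware query scheduling for tertiary storage.
    # Constants and ordering are illustrative assumptions.
    from collections import defaultdict

    EXCHANGE_COST = 20.0   # seconds to swap a platter (assumed)
    READ_COST = 0.5        # seconds per block once loaded (assumed)

    def schedule(queries):
        """queries: list of (query_id, media_unit, blocks_needed)."""
        by_unit = defaultdict(list)
        for qid, unit, blocks in queries:
            by_unit[unit].append((qid, blocks))
        order, cost = [], 0.0
        # Serve the most-demanded unit first to amortize each exchange
        # over as many queries as possible.
        for unit in sorted(by_unit, key=lambda u: -len(by_unit[u])):
            cost += EXCHANGE_COST
            for qid, blocks in by_unit[unit]:
                cost += blocks * READ_COST
                order.append(qid)
        return order, cost

    queries = [("q1", "tapeA", 10), ("q2", "tapeB", 4),
               ("q3", "tapeA", 6)]
    print(schedule(queries))
    # (['q1', 'q3', 'q2'], 50.0): q1 and q3 share one load of tapeA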

[1] S. Sarawagi, "Query Processing in Tertiary Memory Databases," Int. Conf. Very Large Databases, Zurich, Switzerland, September 1995.

[2] S. Sarawagi, "Database Systems for Efficient Access to Tertiary Memory," Proc. IEEE Mass Storage Symp., Monterey, CA, September 1995.