OctopusDB

Project Description

Over the past ten years, we have seen considerable evidence that there is no one-size-fits-all database architecture, and data management systems are now splitting into several specialized solutions. In data warehousing, for instance, database engineers understood as early as the mid-nineties that the DBMSs of that time were ill-equipped to cope with the size of the datasets and the complexity of OLAP queries. Therefore, a separate type of system was forked from the one-size-fits-all DBMS code line. That system is based on a column store, and it became one of the most popular and successful approaches for OLAP; products include SAP BI Accelerator, InfiniDB, and ParAccel. At the same time, other types of systems were forked, including data stream management systems (DSMS) such as StreamBase. As a consequence, today's companies have to manage and integrate several types of data management systems. Data has to be copied from one database system to another, which requires gluing together complex, ETL-style data pipelines. The different database systems may also use different query languages or dialects. All of this adds development, maintenance, and DBA costs.

Contribution

We are currently building a new type of database system that fits several use cases while reducing costs, boosting performance, and improving ease of use at the same time. We present the research challenges in building such a system. We believe that by dropping the assumption of a fixed store, as made by traditional systems such as row stores and column stores, and adopting a flexible storage scheme instead, we can achieve much better performance without compromising on cost. We outline OctopusDB as our plan for such a system and discuss how it can mimic several existing systems as well as newer ones. To do so, we present the concept of a storage view as an abstraction of all storage layouts in OctopusDB. We discuss how the heterogeneous optimization problems in OctopusDB can be reduced to a single problem, namely storage view selection, and describe how a Holistic Storage View Optimizer can deal with it. We present simulation results to justify our core idea and experimental evidence from our initial prototype to demonstrate our approach. Initial experiments show very promising results.
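To make the idea of storage views and storage view selection more concrete, the following is a minimal Python sketch under stated assumptions. It is not OctopusDB code. The class names RowView and ColumnView, the functions toy_cost and select_view, the sample data, and the cost formulas are all hypothetical and were chosen only for illustration. The sketch keeps the same records in two layouts and picks the layout with the lowest estimated cost for a given workload.

```python
from collections import Counter

# Hypothetical sample relation; not data from OctopusDB.
COLUMNS = ("id", "name", "price")
RECORDS = [(1, "apple", 10), (2, "pear", 20), (3, "plum", 30)]


class RowView:
    """Row layout: each record is stored as one tuple."""
    layout = "row"

    def __init__(self, records):
        self.rows = list(records)

    def sum_column(self, column):
        i = COLUMNS.index(column)
        return sum(row[i] for row in self.rows)

    def get_record(self, position):
        return self.rows[position]


class ColumnView:
    """Column layout: each attribute is stored as its own list."""
    layout = "column"

    def __init__(self, records):
        self.cols = {c: [r[i] for r in records] for i, c in enumerate(COLUMNS)}

    def sum_column(self, column):
        return sum(self.cols[column])

    def get_record(self, position):
        return tuple(self.cols[c][position] for c in COLUMNS)


def toy_cost(layout, query_kind, n_rows, n_cols):
    """Assumed cost in arbitrary units; not OctopusDB's cost model."""
    if query_kind == "scan":      # aggregate over one column
        return n_rows * n_cols if layout == "row" else n_rows
    if query_kind == "lookup":    # fetch one whole record
        return 1 if layout == "row" else n_cols
    raise ValueError(f"unknown query kind: {query_kind}")


def select_view(views, workload, n_rows, n_cols):
    """Pick the view with the lowest total toy cost for the workload."""
    counts = Counter(workload)

    def total(view):
        return sum(n * toy_cost(view.layout, kind, n_rows, n_cols)
                   for kind, n in counts.items())

    return min(views, key=total)


views = [RowView(RECORDS), ColumnView(RECORDS)]
# A scan-heavy (OLAP-like) and a lookup-heavy (OLTP-like) workload.
olap = select_view(views, ["scan"] * 9 + ["lookup"], len(RECORDS), len(COLUMNS))
oltp = select_view(views, ["lookup"] * 9 + ["scan"], len(RECORDS), len(COLUMNS))
print(olap.layout, oltp.layout)  # prints "column row"
```

Both views answer the same queries with the same results, so the choice between them is purely a matter of cost. In this toy setting, that is what allows the layout decision to be phrased as a single selection problem over candidate views.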

Patents

We just filed a patent on OctopusDB.