Key Concepts

When a database comprises millions of records spread over many tables, and business intelligence or science is the prevalent application domain, a column-oriented database management system is called for. Unlike traditional row-oriented systems, such as MySQL and PostgreSQL, a column-store provides a modern and scalable solution without demanding substantial hardware resources.

MonetDB has pioneered column-store solutions for high-performance data warehouses in business intelligence and eScience contexts since 1993. Our innovations are evident within every layer of a database management system (DBMS) and include, but are not limited to, a storage model based on vertical fragmentation, a modern CPU-tuned query execution architecture, automatic and adaptive indices, run-time query optimization and a modular software architecture. The SQL front end complies with the SQL:2003 standard with full support for foreign keys, joins, views, triggers and stored procedures. Furthermore, database transactions are fully ACID compliant and a rich spectrum of programming interfaces is supported, namely JDBC, ODBC, PHP, Python, RoR, C/C++ and Perl.

MonetDB is distributed via source tarball packages for installation as well as binary installers for a variety of platforms. The latest release has been tested on Fedora Linux, Red Hat Enterprise Linux, Debian, Ubuntu, Gentoo Linux, macOS, Windows 7, Windows Server 2012 and Windows 10. A periodic release schedule ensures that the latest functional improvements regularly reach the community.

MonetDB is the focus of database research pushing the envelope in many technical areas. Its three-level software stack, comprising a SQL front end, tactical optimizers and a columnar abstract machine kernel, provides a flexible environment that can be customized in many different ways. A rich collection of linked-in libraries provides functionality for temporal, geometric, JSON, URL and UUID data types, as well as mathematical routines and user-defined functions (UDFs) written in Python, R or C/C++. In-depth information on the technical innovations in the design and implementation of MonetDB can be found in our 'Science Library'.

Last but not least, the MonetDB suite is distributed under an open-source license. This allows anyone to modify and extend the source code as desired and subsequently redistribute it in open-source, as well as proprietary, products. Bug fixes and functional enhancements to the MonetDB code base are highly appreciated.

To summarize, the MonetDB suite exhibits the following features:

A column-store database kernel. MonetDB is built on the canonical representation of database relations as columns, a.k.a. arrays. They are sizeable entities (up to gigabytes) swapped into memory by the operating system.
A high-performance system. MonetDB excels in applications where the database hot set (the part actually touched) can be largely held in main memory, or where a few columns of a broad relational table are sufficient to handle individual requests. Further exploitation of cache-conscious algorithms has proved the validity of these design decisions.
A multi-core power engine. MonetDB is designed for multi-core parallel execution on desktops to reduce response time for complex query processing. Several techniques for distributed processing are explored, but as many have found out, there is no silver bullet to improve parallel processing performance. For simple data-parallel problems, a map-reduce scheme suffices, but more complex cases call for careful database design and (partial) replication.
A versatile algebraic database kernel. MonetDB is designed to accommodate different query languages through its own algebraic language, called the MonetDB Assembly Language (MAL). It paves the route from a declarative expression received from a query compiler up to and including the distributed processing protocols needed to steer execution on the individual database servers. The primary front end being distributed is a SQL-to-MAL compiler.
A size for all. The maximal database size supported by MonetDB depends on the underlying processing platform, e.g. a 32- or 64-bit operating system, and storage device, e.g. the file system and disk RAIDs. The number of columns per table is practically unlimited. Each column is mapped onto a file, whose size limit is dictated by the operating system and hardware platform. The number of concurrent user threads is a configuration parameter.
An extendable platform. MonetDB has been strongly influenced by scientific experiments aimed at understanding the interplay between algorithms and application requirements. This has turned MonetDB into an extensible database system with hooks at all levels in the software stack. It allows for extension of the optimizer pipeline with domain-specific rules, extension of the bulk operations in the kernel with domain-specific algorithms, as well as traditional encapsulation of operations taken from existing science libraries.
A broad application scope. MonetDB supports a broad palette of application domains by hooking up externally supplied libraries, e.g. pcre, raptor, libxml and geos. Several external file formats are encapsulated into data vaults, which creates a symbiosis and natural bridge between database processing and the legacy file-based processing prevalent in some science domains.
An open-source solution. MonetDB has been developed over many years of research at CWI, whose charter ensures that results are easily accessible to others. The MonetDB forum and mailing list are the access point to the development team. Turn-key extensions, high-end technical consultancy and joint-venture projects can be accommodated through the MonetDB Solutions company.
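
Three points from the feature list (relations stored as columns, each column mapped onto a file, and queries that only need a few columns of a broad table) can be illustrated with a small sketch. This is not MonetDB code and does not reflect its actual on-disk format: the table contents, the ".col" file suffix and all function names are assumptions made up for the example, which uses only the Python standard library.

```python
import os
import tempfile
from array import array

# Hypothetical table, invented for illustration only.
ROWS = [
    (1, 30, 52000.0),
    (2, 41, 61000.0),
    (3, 35, 58000.0),
]
# (column name, array typecode) pairs; the names are assumptions.
SCHEMA = [("id", "i"), ("age", "i"), ("salary", "d")]


def store_columns(rows, schema, directory):
    """Vertically fragment the rows and write each column to its own file."""
    for position, (name, typecode) in enumerate(schema):
        column = array(typecode, (row[position] for row in rows))
        with open(os.path.join(directory, name + ".col"), "wb") as handle:
            column.tofile(handle)


def load_column(directory, name, typecode, count):
    """Read one column back; files of other columns are never opened."""
    column = array(typecode)
    with open(os.path.join(directory, name + ".col"), "rb") as handle:
        column.fromfile(handle, count)
    return column


def average_salary_older_than(directory, threshold, count):
    """Answer the query by scanning only the 'age' and 'salary' columns."""
    age = load_column(directory, "age", "i", count)
    salary = load_column(directory, "salary", "d", count)
    selected = [salary[i] for i, a in enumerate(age) if a > threshold]
    return sum(selected) / len(selected) if selected else None


with tempfile.TemporaryDirectory() as workdir:
    store_columns(ROWS, SCHEMA, workdir)
    stored_files = sorted(os.listdir(workdir))
    result = average_salary_older_than(workdir, 32, len(ROWS))
```

The query never reads the "id" column, which is the effect the list describes for broad tables: the cost of a request depends on the columns it touches rather than on the full width of the table. A real column-store adds much more on top of this (memory mapping, indices, parallel execution), none of which is modelled here.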