Skip to main content

MonetDB SQL transaction management scheme

The transaction management scheme deployed in MonetDB sometimes creates confusion with application developers. In particular how are transactions committed, persisted and isolated to guarantee the ACID properties of SQL. Most of the confusion stems from the difference between OnLine-Analytical-Processing (OLAP) and OnLine-Transaction-Processing (OLTP).

MonetDB is primarily an analytical (OLAP) database, where most operations are SELECT-FROM-WHERE-GROUPBY queries. With that in mind, its transaction management scheme is tuned for reading large chunks of data, rather than writing  small chunks of data at high speed concurrently. Nevertheless, MonetDB SQL obeys the transaction semantics of the SQL standard.

Transaction management scheme

The MonetDB transaction management is based on optimistic concurrency control (OCC). OCC is particularly useful for read-heavy applications, with the design that transaction management overhead should only be paid when necessary. While OCC does provide each transaction with a consistent view on the database, updates are collected in an addendum. The addendum is then processed on transaction commit. If at commit time the tables affected by the transaction have not changed, the results are merged. Otherwise the transaction is aborted and the pending updates discarded. As a result, OCC can negatively impact long running transactions, as these can be affected by concurrent updates on their underlying tables. The same holds for applications that try to perform concurrent updates to the same tables from multiple application threads or to create/modify the database schema concurrently. To avoid such unexpected transaction aborts, the transactions should be internally serialized by the application developer.

Transaction log

The MonetDB SQL layer runs in auto-commit mode by default. This means that the database will commit transactions as frequently as possible, usually after each and every SQL statement or catalog change. Auto-commit can be disabled by setting the session variable auto_commit to false. By disabling auto-commit, one must manually mark the beginning of a transaction with START TRANSACTION and terminate it with COMMIT. Alternative, calling ROLLBACK will revert all operations since the beginning of the transaction. This mode is useful for bulk-loading large datasets with simple insert statements.

When a transaction is committed, its changes are written out to the MonetDB transaction log or Write-Ahead Log (WAL). Actually the commit will not succeed until the transaction is fully persisted in the WAL. Committed transactions are also immediately applied to the logical store, thus visible to all following transactions and all clients querying the database. At certain intervals, if the database is idle, the store manager will merge the  updates from the WAL into the primary storage - the Binary Association Tables (BATs). By default, persisted transactions are cleaned up from the WAL. During database startup, all transaction still left in the WAL are read and any that are not yet applied are written into the persistent BAT storage. Hence, in case of failure, all transactions that were reported as committed will indeed be present, once the database is made available to the user.

The WAL is written to disk, where every transaction is stored in a separate file. By default there is only one transaction file in the WAL directory - the current, not yet committed one. The WAL directory is by default found under <database directory>/sql_logs. Inside it is a directory sql for the SQL module transactions. Different modules/top-level interfaces can have different WAL directories. The inner directory contains the following files:

  • log - the WAL catalog files
    • The first line of the file is the internal MonetDB kernel WAL version
    • The next line is empty
    • The third line is the latest transaction id, stored as a Long Long number.
  • log.<transaction id> - each of these files stores a single MonetDB transaction
    • Part of the file content is binary

It is highly advisable never to manually edit the MonetDB WAL files.

Transaction replication

There is ongoing work on a transaction replication scheme for MonetDB. With transaction replication, replica instances of MonetDB can be configured, ensuring higher availability of the system and improved level of concurrency. More information about the architecture of transaction replication in MonetDB can be found in the paper "Single-click to data insights: transaction replication and deployment automation made simple for the cloud age". We plan to have the feature ready in the feature release following Jul2015.