Server Administration


Once you have a running MonetDB setup, you might want to customize it according to your specific needs.

This section explains how to administer a MonetDB instance.

Installation


Although there are pre-built images for MonetDB, it is also possible to download and build it from source code.

This section covers the binary installers, building from source code, and setup on Windows and Mac OS.

Install from Source


Installing MonetDB from source is often considered a tedious job. Compilation of MonetDB on Linux platforms follows the bootstrap-configure-make pattern. It can be compiled in any location on your file system and does not require special (root) permissions.

While the core task of compiling MonetDB itself is extremely simple (on UNIX-based platforms), inexperienced users often run into problems caused by the build environment. The catch is the differences between target platforms: not all required libraries may have been installed by your system administrator. Typical missing components on a binary distribution such as Fedora or Ubuntu are the xxx-dev packages that are necessary to compile MonetDB against the package in question.

MonetDB is a rather large C-based program, and it sometimes interacts closely with the host system it runs on to achieve the best end results for the user.  This makes the code not always as generic as we ultimately would like it to be.  Because we understand that these parts of the code are likely suspects for malfunctioning, we test our sources on a nightly basis on as many different platforms as we can access.  This mainly focuses on different architectures, kernels and libcs, but we also test a couple of Linux distributions (see Nightly testing).

To do the right thing on your system, we use autoconf, automake and libtool to build the software.  We aim to be very standard in this respect.  In a nutshell, building the software follows the usual ritual:

% configure

% make

% make install

This process most often fails for users in the configure step, but the make step may also give problems.  We discuss some of the most frequent issues in detail below.

For Mac OS X instructions, see the OS X Install section below.

Configure

Calling the configure script will check the system for required files and packages, and determine the capabilities of the system, such that the right approach can be taken in certain parts of the code.

Before calling configure, it is wise to examine all available options this script has by using

% configure --help

In particular, the --prefix argument needs to be set in most cases.  If you don't know what value to use for the prefix argument, a location in your home directory will typically do, e.g. --prefix=$HOME/MonetDB.

When configure finishes, it prints a summary at the end of its output.  Be sure to read it carefully: if a component is disabled, it will not be built and installed.  At the time of this writing, you need at least the monetdb5 and sql components.  Unless you know what you're doing, you should not continue as long as configure reports these components as disabled.

When a component is disabled, it is usually due to a missing dependency.  A dependency is often a library that the component in question relies on for its functionality.  The following dependencies are known trouble makers on most systems that aren't fairly recent: openssl, libxml2 and pcre (libpcre).  The first is usually available, but often not in a recent enough version.  The latter two often lack the development headers on systems.  You have to install missing dependencies either through the native package management system of your host system, or by compiling and installing them manually.  Since this in general differs from package to package, we cannot provide generalized help on this topic.  On binary distributions, make sure you install the xxx-dev versions of the necessary packages as well!

Make

After configure has successfully found all it is looking for, make usually churns away for some time, producing quite a lot of output.  It can happen that make aborts with a message like:

make[3]: *** [something] Error 1

The real error is usually right above this message.  If you're looking for help, or want to report a bug, always copy at least 20 lines above this message, as they give the necessary context to investigate the problem.

If you passed --enable-strict to configure, then reconfigure with --disable-strict and try make again.  This allows certain types of problems to be ignored, since they are typically non-critical.

Make can fail on a compilation or linking step.  The former usually manifests itself by a message from the compiler like

somefile.c:12: 'rl' may be used uninitialized in this function

the latter typically manifests itself as "undefined references".  For compilation issues, searching the web for the message is usually the best way to get a clue about what is going on.  Undefined references during linking are often caused by missing libraries, due to missing -L/path/to/dir flags in LDFLAGS.  If you use libraries that are in a non-standard location (for example because you installed them yourself), make sure you have the proper includes (-I/path/to/include/dir) in your CPPFLAGS environment variable, and the library search paths (-L/path/to/lib/dir and -rpath=/path/to/lib/dir on ELF-based systems, e.g. Linux) in your LDFLAGS, such that they can be properly found by configure, libtool, the compiler and the linker.
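For example, here is a minimal sketch assuming a dependency such as libxml2 was installed under the hypothetical prefix $HOME/local (on ELF-based systems the rpath is typically passed through the compiler driver as -Wl,-rpath,<dir>):

% export CPPFLAGS="-I$HOME/local/include"
% export LDFLAGS="-L$HOME/local/lib -Wl,-rpath,$HOME/local/lib"
% configure --prefix=$HOME/MonetDB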

OS X Install


Install Xcode

To be able to build (almost) any software package on OS X, one needs to have Apple Xcode installed, or at least the Xcode command-line tools. Xcode comes with a C/C++ compiler toolset, which (as of the more recent versions) is based on Clang/LLVM and not GCC.

If you want the most recent version of Xcode you can obtain it from the Apple Developer Website (if you are a registered Apple developer) or directly from the Mac App Store.

Run the following command in Terminal in order to install the Xcode command-line tools:

xcode-select --install

Beware: OS X usually ships with a BSD toolset, unlike most Linux distributions, which use the GNU toolset. This affects not only the compiler, but also tools (such as make, pkg-config, awk, sed) used in the build scripts of MonetDB.

Installing required packages

There are two community-supported package managers that one can use to install the dependencies for building MonetDB on OS X: Homebrew or MacPorts. Pick one of the two package managers and install only that one.

Using Homebrew

To install Homebrew please follow the up-to-date instructions for your OS X version, which you can find on the Homebrew website.

Minimal modules build-set

After Homebrew is installed, install the packages required for building the absolute minimal set of MonetDB modules. Run the command below to install them: pkg-config, pcre and openssl.

brew install pkg-config pcre openssl

Default modules build-set

Keep reading/installing if you want to build the other MonetDB modules included in the default build-set.

* JDBC & control
Install JDK 7, which you can get from the Oracle website. Note: You need the JDK, not the JRE.
After the JDK is installed, set the JAVA_HOME variable and add the JDK bin directory to your shell's PATH. To do this, add the lines below to your .profile, .bashrc or .bash_profile (or the corresponding config file for your shell).

export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=${JAVA_HOME}/bin:$PATH

Install Apache Ant

brew install ant

* ODBC
Install unixODBC

brew install unixodbc

* GEOM
Install geos

brew install geos

Other

* sphinxclient
Install libsphinxclient

brew install libsphinxclient

* GSL
Install gsl

brew install gsl

* FITS
Install cfitsio

brew install cfitsio

Before building

You most likely need to install automake, autoconf, libtool, gettext and readline  from Homebrew as well. These will also be required to run the bootstrap script (see below).

brew install autoconf automake libtool gettext readline

Putting it all together

brew install autoconf automake libtool gettext readline pkg-config pcre openssl unixodbc geos gsl cfitsio

Using MacPorts

As an alternative to Homebrew, you can also use MacPorts to obtain the required packages. To install MacPorts please follow the up-to-date instructions for your OS X version, which you can find on the MacPorts website. Do not forget to restart your command prompt shell (since MacPorts will add new entries to your PATH) and run:

sudo port -v selfupdate

Minimal modules build-set

After MacPorts is installed, install the packages required for building the absolute minimal set of MonetDB modules. Run the command below to install them (their dependencies are listed in brackets): pkgconfig (libiconv); openssl (zlib); pcre (bzip2, ncurses, libedit); libxml2 (expat, gettext, xz).

sudo port install pkgconfig openssl pcre libxml2

Default modules build-set

Keep reading/installing if you want to build the other MonetDB modules included in the default build-set.

* JDBC & control
Install JDK 7, which you can get from the Oracle website. Note: You need the JDK, not the JRE.
After the JDK is installed, set the JAVA_HOME variable and add the JDK bin directory to your shell's PATH. To do this, add the lines below to your .profile, .bashrc or .bash_profile (or the corresponding config file for your shell).

export JAVA_HOME=$(/usr/libexec/java_home)
export PATH=${JAVA_HOME}/bin:$PATH

Install Apache Ant, which can be downloaded from its website [https://ant.apache.org/bindownload.cgi]. Unpack the package in a directory in your home space, e.g. in /Users/<username>/tools/apache-ant. After that, set the ANT_HOME variable and update the PATH like this:

export ANT_HOME=/Users/<username>/tools/apache-ant
export PATH=${ANT_HOME}/bin:$PATH

* ODBC
Install unixODBC (libtool, readline)

sudo port install unixodbc

Add the line below when configuring the build to point the tool to the library (see Configuring below).

--with-unixodbc=/opt/local --with-readline=/opt/local

* GEOM
Install geos

sudo port install geos

Other

* curl
Install curl (curl-ca-bundle, libidn)

sudo port install curl

* libmicrohttpd
Install libmicrohttpd (gmp, libtasn1, nettle, libffi, glib2, popt, desktop-file-utils, libxslt, p11-kit, gnutls, libgpg-error, libgcrypt)

sudo port install libmicrohttpd

* liburiparser
Install uriparser

sudo port install uriparser

* sphinxclient

Install libsphinxclient

sudo port install libsphinxclient

Add the line below when configuring the build to point the tool to the library (see Configuring below).

--with-sphinxclient=/opt/local

* GSL
Install gsl

sudo port install gsl

* FITS
Install cfitsio

sudo port install cfitsio

Before building

You will probably need to install automake (gdbm, perl5.16, perl5) and autoconf from MacPorts as well. These will also be required to run the bootstrap script (see below).

sudo port install automake autoconf

Putting it all together

sudo port install automake autoconf pkgconfig openssl pcre libxml2 unixodbc bison geos gsl cfitsio curl libmicrohttpd uriparser libsphinxclient

Obtaining sources and bootstrapping

Sources of the latest released and testing versions of MonetDB can be obtained from the download repository (released or testing). Download the selected file and unpack it in a directory.

Cloning MonetDB Mercurial repository

Optionally, bleeding-edge sources can be obtained directly from the MonetDB Mercurial repository. To do that, make sure you have Mercurial installed on your system first. Then clone the MonetDB repository. Note that this can take some time and a lot of data will be transferred.

hg clone http://dev.monetdb.org/hg/MonetDB/

Bootstrapping

A source check-out from Mercurial needs to be bootstrapped before configure can be run. To do that, go into the directory in which you cloned the repository and run:

./bootstrap

* If you are using Homebrew, you will also need to set M4DIRS to the directory containing the Homebrew gettext M4 macros. The same holds for the OpenSSL libraries and includes:

export M4DIRS=/usr/local/opt/gettext/share/aclocal
export LDFLAGS=-L/usr/local/opt/openssl/lib
export CPPFLAGS=-I/usr/local/opt/openssl/include

Configuring build

With Homebrew

Before starting the configure tool, go into the directory where the MonetDB sources are, and create a subdirectory BUILD. Go into the BUILD directory and run the following command, where the prefix is the location where you would like MonetDB installed.

../configure --prefix=<full-path>

With MacPorts

Before starting the configure tool, go into the directory where the MonetDB sources are, and create a subdirectory BUILD. Go into the BUILD directory and run the following command, where the prefix is the location where you would like MonetDB installed.

../configure --prefix=<full-path> --with-libiconv-prefix=/opt/local

* If you want to build the ODBC driver or enable sphinxclient support, do not forget to add the appropriate arguments, e.g.:

../configure --prefix=<full-path> --with-libiconv-prefix=/opt/local --with-sphinxclient=/opt/local --with-unixodbc=/opt/local --with-readline=/opt/local

Note: If you don't have root/admin permission on your system you will not be able to deploy the RubyGem. If you don't need the RubyGem, simply disable it with the option --without-rubygem.

Building

To build the configured modules, simply run make from the BUILD directory, optionally using the -j<number> argument for a parallel build, e.g.

make -j4

Once the build process finishes, install MonetDB in the prefix location set during the configure process.

make install

Note: If you have built the RubyGem, on OS X you will need to use sudo make install to deploy it.

Windows Install


On Windows the first step is to initialize a MonetDB server by simply opening: 'Start -> Programs -> MonetDB -> Start server'. 

Subsequently you can start the textual interface (mclient) by opening: 'Start -> Programs -> MonetDB -> SQL client'.  The commands entered there are identical to those found on other platforms. To stop the server, you can simply close the MonetDB SQL Server window. 

Note that the server by default only accepts connections originating from the local host. If you need other machines to access your database, change the configuration by setting mapi_open=yes.
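As a sketch only (the configuration file name and location depend on your MonetDB version and installation directory), the relevant setting is a single line:

mapi_open=yes

After restarting the server, a client on another machine should then be able to connect with something like the following, where the host name and database name are hypothetical:

mclient -h dbhost.example.com -p 50000 -d demo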

Exploring the wealth of functionality offered by MonetDB/SQL is best started using a toy database. For this we use the VOC database which provides a peephole view into the administrative system of an early multi-national company, the VOC (or Dutch East India Company).

Download the VOC dataset (voc_dump.zip, also available as a gzip or bzip2 archive), which is a compressed file with SQL statements (an SQL database dump). After the file has been extracted, load its contents into MonetDB using mclient.exe. From within mclient run:

sql> \< voc_dump.sql

Since the SQL interface is the same on all platforms, from here on you can follow the later part of the regular SQL tutorial.

Changing database location

If you need (or want) to change the location of the database, then you have to edit the MonetDB scripts. Change the mclient.bat and M5server.bat files in 'C:\Program Files\MonetDB\MonetDB5'. In the M5server.bat file you need to change the setting of the MONETDBFARM variable, and in mclient.bat you need to add a -d option to the call of mclient.exe.  Note: when you upgrade, such changed files will be overwritten.

About the VOC dataset

The data for the tutorial is published in the book J.R. Bruijn, F.S. Gaastra and I. Schaar, Dutch-Asiatic Shipping in the 17th and 18th Centuries, which gives an account of the trips made to the East and the ships that returned safely (or were wrecked on the way) by the VOC (the Dutch East India Company). A total of 8000 records are provided. They include information about ship name and type, captain, the arrival/departure of harbors along the route, personnel accounts, and anecdotal information. You can find more about the VOC on Wikipedia.

Server Setup and Configuration


Starting a MonetDB database server is as simple as starting the daemon program monetdbd.  At the time of this writing, monetdbd is only available on UNIX-like platforms. The MonetDB daemon program encapsulates an arbitrary number of database servers and takes care of their configuration.  The general concept of the MonetDB daemon program is that it builds upon a given directory, often referred to as the dbfarm.  There can be at most one instance of monetdbd monitoring a dbfarm, but multiple monetdbd processes can be running on the same system.

The next (optional) step is to create a .monetdb configuration file with information to ease the use of the MonetDB client interface, called mclient.

shell> cat ~/.monetdb
user=<yourname>
password=<yourpassword>
language=<"sql" or "mal">

In the simple case, monetdbd is started by the system through some init script, and databases are created by the local DBA, and assigned to users.  A more flexible approach for users, however, is the case where a user runs monetdbd itself.  Because a dbfarm can be anywhere in the filesystem, and needs nothing more than just being an ordinary directory, a user can use monetdbd to initialise his/her own dbfarm in an accessible place.  For example:

% monetdbd create ~/my-dbfarm
% monetdbd get all ~/my-dbfarm
property         value
hostname         vomac.ins.cwi.nl
dbfarm           /Users/fabian/my-dbfarm
status           no monetdbd is serving this dbfarm
mserver          unknown (monetdbd not running)
logfile          merovingian.log
pidfile          merovingian.pid
sockdir          /tmp
port             50000
exittimeout      60
forward          proxy
discovery        true
discoveryttl     600
control          no
passphrase       <unknown>
mapisock         /tmp/.s.monetdb.50000
controlsock      /tmp/.s.merovingian.50000

After creation of the personal dbfarm directory, the get command can be used to inspect the dbfarm-wide settings of the newly created environment.  Note that the above output differs per system, per user and per dbfarm.  An important setting is the port used to open up services for clients.  In particular for a user installation, the default port may conflict with another running monetdbd, hence it is wise to change it to a private port in situations where such conflicts may arise:

% monetdbd set port=54321 ~/my-dbfarm
% monetdbd get all ~/my-dbfarm
property         value
hostname         vomac.ins.cwi.nl
dbfarm           /Users/fabian/my-dbfarm
status           no monetdbd is serving this dbfarm
mserver          unknown (monetdbd not running)
logfile          merovingian.log
pidfile          merovingian.pid
sockdir          /tmp
port             54321
exittimeout      60
forward          proxy
discovery        true
discoveryttl     600
control          no
passphrase       <unknown>
mapisock         /tmp/.s.monetdb.54321
controlsock      /tmp/.s.merovingian.54321

Note that control defaults to no.  For remote control connections, control has to be set to yes, and a passphrase has to be set as well.  When set, it will show as a hash in get's output.

When the settings look ok, the next step is to simply start monetdbd:

% monetdbd start ~/my-dbfarm

If the above command returned without any message, the monetdbd process was successfully started.  The logfile (default merovingian.log) should show that the daemon has started, bound some ports, and discovered itself.

From this part on, monetdbd can be given commands through the monetdb command (without the trailing 'd').  The monetdb client utility knows commands much like monetdbd, but it is much more focussed towards giving information about what is going on inside monetdbd.  The monetdb client utility is the main tool to use to manage the dbfarm.

The dbfarm we created is empty, as such there are no databases inside it.  With monetdb we can create a new database:

% monetdb create my-first-db
created database in maintenance mode: my-first-db

This simple command created the necessary bits inside the dbfarm for a database called my-first-db.  It is created in maintenance mode, since otherwise anyone could access it before it is properly set up.  If the command gives an access-denied error, most probably another monetdbd is running on the system.  In that case try adding -p54321 (the port) as an argument to monetdb in order to tell it which monetdbd to contact, as in the status example.  In the status view of monetdb the newly created database will show up as a locked database:

% monetdb -p54321 status
     name        state     uptime       health
my-first-db     locked             

Although the database is locked, it can be started, such that the monetdb superuser can access the database:

% monetdb start my-first-db
starting database 'my-first-db'... done

% monetdb status
     name        state     uptime       health
my-first-db     locked           18s  100%,  0s

The database remains locked, but the uptime field indicates the database is actually running.  Once setting up the database is done, the database can be made available for other users (if any, see the MonetDB SQL tutorial) to connect to it using the release command:

% monetdb release my-first-db
taken database out of maintenance mode: my-first-db

% monetdb status
     name        state     uptime       health
my-first-db     running       2m 48s  100%,  0s

Next, making a connection to the database using mclient requires the -d argument to specify which database to connect to.  It will ask for a username and password when none are given or provided via a .monetdb file.  The default username/password is monetdb/monetdb:

% mclient -dmy-first-db
user(fabian):monetdb
password:<monetdb>
Welcome to mclient, the MonetDB/SQL interactive terminal (Dec2011-SP1)
Database: MonetDB v11.7.7 (Dec2011-SP1), 'mapi:monetdb://vomac.ins.cwi.nl:54321/my-first-db'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>

Like for the monetdb command, if this fails with some database does not exist error, try giving the port (not the controlport!) to mclient such that it finds the correct monetdbd, e.g.:

% mclient -p54321 -dmy-first-db

It may be necessary to stop individual databases, or even make them unavailable.  The monetdb commands stop and lock perform these tasks (see the example further below).  To completely stop the monetdbd daemon process for a dbfarm, use the stop command of monetdbd:

% monetdbd stop ~/my-dbfarm

This should stop monetdbd and all the databases that were started inside it.  To verify that monetdbd has indeed stopped, review the end of the merovingian.log file.  It should report all databases being shut down, and the daemon shutting down.
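To illustrate the per-database commands mentioned above: stopping only my-first-db, or putting it back into maintenance mode, would look like this (with monetdbd still running):

% monetdb stop my-first-db
% monetdb lock my-first-db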

Cluster Management


Topic: how to set up a cluster of three MonetDB servers belonging to the same group.

Building a distributed system using the MonetDB infrastructure requires a number of machines that run monetdbd (see its man-page for details). Assume we have three worker machines, called cruquius, lijnden and leeghwater. Provided MonetDB/SQL has been compiled and installed on each of the workers, perform the following steps on each worker to get monetdbd running:

worker% monetdbd create /fullpath/to/mydbfarm
worker% monetdbd set passphrase=mypassphrase /fullpath/to/mydbfarm
worker% monetdbd start /fullpath/to/mydbfarm

This creates the location for the database on each worker and sets the private control passphrase for the MonetDB daemon to "mypassphrase". The final step is to activate the daemon itself. This is basically all you need to initialise a cluster of MonetDB instances.

Assume you have likewise initialized MonetDB on your desktop, named willem, from which you will control the MonetDB instances. On the desktop machine you can use the following command to show which machines guarded with monetdbd can be reached:

willem% monetdb discover

Amongst others it will show you the workers you just initialized. From our machine willem we can further work with the cluster by adding a database called pump on each node as follows:

willem% monetdb -h cruquius -P mypassphrase create pump
willem% monetdb -h lijnden -P mypassphrase create pump
willem% monetdb -h leeghwater -P mypassphrase create pump

The discover command will not show them yet, as they are still in maintenance mode.

Beware, these three identically named databases are not related to each other in any way, i.e. access control and database content can be different for each individual database pump. They are logical names to be refined using role tags to designate their function. Suppose we use two server instances to act as harvey[1,2] and one as fox. How these functionalities behave is beyond the control of the MonetDB daemon. It suffices to set up the shared property tag to designate that they all belong to the same group.

willem% monetdb -h cruquius -P mypassphrase set shared=pump/2/harvey pump
willem% monetdb -h lijnden -P mypassphrase set shared=pump/2/fox pump
willem% monetdb -h leeghwater -P mypassphrase set shared=pump/1/harvey pump

Now that we have set them up, they can be made available for use:

willem% monetdb -h cruquius -P mypassphrase release pump
willem% monetdb -h lijnden -P mypassphrase release pump
willem% monetdb -h leeghwater -P mypassphrase release pump

An mclient connection can now go to any of the hosts directly, or use a pattern to find the right machine. Given the example above with our tags, the following two mclient calls connect to lijnden, and round-robin between cruquius and leeghwater, respectively:

willem% mclient -d '*/pump/2'
willem% mclient -d '*/pump/*/harvey'

Client Authentication


Under construction

MonetDB supports password-based client authentication.
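As a brief, hedged illustration (check the SQL reference of your release for the exact syntax), a user can change their own password, and the administrator can reset the password of another user; the user name and passwords below are placeholders:

sql>ALTER USER SET PASSWORD 'new_secret' USING OLD PASSWORD 'old_secret';
sql>ALTER USER "voc" WITH PASSWORD 'voc_secret';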

Database Roles, Privileges, Sessions


The user access grants are organized by authorization groups.

sys.auths
name        type     references                      description
"id"        INTEGER                                  The unique authorization identifier.
"name"      VARCHAR                                  The SQL authorization name of the user or role.
"grantor"   INTEGER  sys.auths.id when grantor > 0   The corresponding authorization grantor.

 

sys.users   and   sys.db_user_info
name               type     references       description
"name"             VARCHAR  sys.auths.name   The unique user login name.
"fullname"         VARCHAR                   Full name of user.
"default_schema"   INTEGER  sys.schemas.id   The default schema for this user.

 

sys.roles
name        type     references                      description
"id"        INTEGER                                  The unique role identifier.
"name"      VARCHAR                                  The role name.
"grantor"   INTEGER  sys.auths.id when grantor > 0   The corresponding authorization grantor.

 

sys.user_role
name         type     references     description
"login_id"   INTEGER  sys.auths.id   The corresponding user identifier.
"role_id"    INTEGER  sys.roles.id   The corresponding role identifier.

 

sys.privileges
name           type     references                              description
"obj_id"       INTEGER  sys.ids.id                              The identifier of the object for which privilege(s) are given to a certain user or role.
"auth_id"      INTEGER  sys.auths.id                            The id of the user or role.
"privileges"   INTEGER  sys.privilege_codes.privilege_code_id   Privileges classification code, see table sys.privilege_codes.
"grantor"      INTEGER  sys.auths.id when grantor > 0           The corresponding grantor authorization identifier.
"grantable"    INTEGER                                          Permission to pass privilege rights to other users or roles.

sys.privilege_codes
name                    type     references   description
"privilege_code_id"     INTEGER               The unique internal code of a privilege or combination of privileges.
"privilege_code_name"   VARCHAR               A description of the privilege(s) associated with this code.

 

sys.sessions
name               type        references       description
"user"             VARCHAR     sys.users.name   The user who created the session.
"login"            TIMESTAMP                    The login start timestamp of the session.
"sessiontimeout"   BIGINT                       The session timeout time in seconds, 0 means timeout is disabled.
"lastcommand"      TIMESTAMP                    The start timestamp of the last issued command.
"querytimeout"     BIGINT                       The query execution timeout time in seconds, 0 means timeout is disabled.
"active"           BOOLEAN                      Whether the session is active or not.

Distributed Query Processing


Support for distributed query processing has been gradually introduced since the Jul2015 release of MonetDB. It adopts a straightforward master - worker architecture according to the following general concepts:

  • Users can transparently query data that is spread (either replicated or sharded) over multiple MonetDB instances in a local cluster. 
  • Every instance in the cluster is a full-fledged MonetDB server, so they have equal functionality. The system can be set up in such a way that a MonetDB client can establish a connection to any one of those instances and run all supported queries.
  • Distributed queries are queries which involve data on remote MonetDB instances. The instance on which the query was started automatically becomes the master (for this particular query); the instances containing remote data addressed by this query are workers.
  • The master is responsible for parsing the query, generating the distributed query plan, sending the subqueries to the workers and merging the subquery results into the final result.
  • Every instance can act as a master and a worker, depending on data placement and where a distributed query has been started.

To identify data distributed over multiple MonetDB instances, we have extended MonetDB with three new types of SQL TABLEs:

  • Remote table: one can create a REMOTE TABLE on one MonetDB server to refer to a table that physically exists on another (i.e. remote) MonetDB server, by giving the URL of the remote server and the name of the referred table on the remote server.
  • Replica table: one can create a REPLICA TABLE and add both local tables (i.e. normal SQL tables that were created on the same MonetDB server as this REPLICA TABLE) and remote tables (i.e. the aforementioned REMOTE TABLEs) to it. The keyword REPLICA indicates that all members are an exact replica of each other and the system can use whichever member is most convenient.
    • NB: for performance reasons, MonetDB does not check if the replica tables are indeed identical. This responsibility is currently left to the database users.
  • Merge table: one can create a MERGE TABLE and add both local and remote tables to it to form a single big table. The result is a UNION ALL of all partition tables in this MERGE TABLE and the system will potentially have to access all members to answer a query. More details about merge tables can be found in Data Partitioning.

With these extensions, users now have different ways to query data on a remote server: directly on the remote server as a normal table, or through the local server as a REMOTE TABLE or as part of a REPLICA TABLE/MERGE TABLE.

Queries involving REMOTE TABLEs are automatically split into subqueries by the master and executed on remote workers. The combination of MERGE, REPLICA and REMOTE tables enables fine control of how to distribute data and query workloads to maximally benefit from the available CPU and RAM resources.

Examples

Here we show several simple examples of how distributed queries on REPLICA and MERGE tables containing both normal local and REMOTE tables work.

Step 1: set up a small cluster of three MonetDB servers mdb1, mdb2 and mdb3 running on ports 60001, 60002 and 60003, respectively.

$ mserver5 --dbpath=/tmp/mdb1 --set mapi_port=60001 --set monet_daemon=yes &
$ mserver5 --dbpath=/tmp/mdb2 --set mapi_port=60002 --set monet_daemon=yes &
$ mserver5 --dbpath=/tmp/mdb3 --set mapi_port=60003 --set monet_daemon=yes &

Step 2: connect to the MonetDB servers to create and populate two simple tables on each of them. The tables s1, s2 and s3 will be used in a REPLICA TABLE, so we insert the same data into them. The tables t1, t2 and t3 will be used in a MERGE TABLE, so we insert different values into them:

$ mclient -d mdb1 -u monetdb -p 60001
...
sql>CREATE TABLE s1 (i INT);
operation successful
sql>INSERT INTO s1 VALUES (23), (42);
2 affected rows
sql>CREATE TABLE t1 (s VARCHAR(10));
operation successful
sql>INSERT INTO t1 VALUES ('abc'), ('efg');
2 affected rows
$ mclient -d mdb2 -u monetdb -p 60002
...
sql>CREATE TABLE s2 (i INT);
operation successful
sql>INSERT INTO s2 VALUES(23), (42);
2 affected rows
sql>CREATE TABLE t2 (s varchar(10));
operation successful
sql>INSERT INTO t2 VALUES ('foo'), ('bar');
2 affected rows
$ mclient -d mdb3 -u monetdb -p 60003
...
sql>CREATE TABLE s3 (i INT);
operation successful
sql>INSERT INTO s3 VALUES (23), (42);
2 affected rows
sql>create table t3 (s varchar(10));
operation successful
sql>insert into t3 values ('baz'), ('qux');
2 affected rows

Example 1: create and query REMOTE TABLEs

We choose mdb3 as the master, and mdb1 and mdb2 as the workers. So we continue using the connection to mdb3, and create remote tables there to refer to the tables we have created earlier on mdb1 and mdb2.

Important things to know about REMOTE TABLEs:

  • The format of the URL of a REMOTE TABLE is: mapi:monetdb://<host>:<port>/<dbname>, where all three parameters (i.e. host, port and dbname) are compulsory.
  • The declaration of a REMOTE TABLE must match exactly the signature of its counterpart in the remote database, i.e., the same table name, the same columns names and the same column data types.
  • Currently, at the creation time of a remote table, the remote database server is not contacted to verify the existence of the table. When a CREATE REMOTE TABLE statement reports “operation successful”, it merely means that the information about the new remote table has been added to the local SQL catalogue. The check at the remote database server is delayed until the first actual query on the remote table.

sql>CREATE REMOTE TABLE s1 (i int) on 'mapi:monetdb://localhost:60001/mdb1';
operation successful
sql>CREATE REMOTE TABLE t1 (s varchar(10)) on 'mapi:monetdb://localhost:60001/mdb1';
operation successful
sql>CREATE REMOTE TABLE s2 (i int) on 'mapi:monetdb://localhost:60002/mdb2';
operation successful
sql>CREATE REMOTE TABLE t2 (s VARCHAR(10)) ON 'mapi:monetdb://localhost:60002/mdb2';
operation successful
sql>SELECT * FROM s1;
+------+
| i    |
+======+
|   23 |
|   42 |
+------+
2 tuples
sql>SELECT * FROM t1;
+------+
| s    |
+======+
| abc  |
| efg  |
+------+
2 tuples
sql>SELECT * FROM s2;
+------+
| i    |
+======+
|   23 |
|   42 |
+------+
2 tuples
sql>SELECT * FROM t2;
+------+
| s    |
+======+
| foo  |
| bar  |
+------+
2 tuples

Example 2: create and query a REPLICA TABLE

Continuing with the connection to mdb3, we create a REPLICA TABLE repS, and add s1, s2 and s3 to it. So repS contains 3 replicas: one local replica and two remote replicas. Selecting from repS returns results as if it were a single table. The logical query plan shows that only one replica is used. (NB: the local replica should have been chosen. This is a performance bug, see also https://www.monetdb.org/bugzilla/show_bug.cgi?id=6620)

sql>CREATE REPLICA TABLE repS (i INT);
operation successful
sql>ALTER TABLE repS ADD TABLE s2;
operation successful
sql>ALTER TABLE repS ADD TABLE s1;
operation successful
sql>ALTER TABLE repS ADD TABLE s3;
operation successful
sql>SELECT * FROM repS;
+------+
| i    |
+======+
|   23 |
|   42 |
+------+
2 tuples
sql>PLAN SELECT * FROM repS;
+--------------------------------------------------------------------------------------+
| rel                                                                                  |
+======================================================================================+
| project (                                                                            |
| | table                                                                              |
| | | REMOTE(sys.s2)                                                                   |
| | | |   [ "s2"."i" as "reps"."i", "s2"."%TID%" NOT NULL as "reps"."%TID%" ]          |
| | | |   REMOTE mapi:monetdb://localhost:60002/mdb2                                   |
| | | |     [ "reps"."i", "reps"."%TID%" NOT NULL ]                                    |
| ) [ "reps"."i" ]                                                                     |
+--------------------------------------------------------------------------------------+
4 tuples

Example 3: create and query a MERGE TABLE

Continuing with the connection to mdb3, we create a MERGE TABLE mrgT, and add t1, t2 and t3 to it. So mrgT contains 3 partitions: one local partition and two remote partitions. Selecting from mrgT returns a union of all data in its partition tables without duplicate elimination. The logical query plan shows that the WHERE condition is properly pushed down to the workers.

sql>CREATE MERGE TABLE mrgT (s VARCHAR(10));
operation successful
sql>ALTER TABLE mrgT ADD TABLE t2;
operation successful
sql>ALTER TABLE mrgT ADD TABLE t1;
operation successful
sql>ALTER TABLE mrgT ADD TABLE t3;
operation successful
sql>SELECT * FROM mrgT WHERE s <> 'bla';
+------+
| s    |
+======+
| foo  |
| bar  |
| abc  |
| efg  |
| baz  | 
| qux  | 
+------+
6 tuples
sql>PLAN SELECT * FROM mrgT WHERE s <> 'bla';
+-----------------------------------------------------------------------------------+
| rel                                                                               |
+===================================================================================+
| union (                                                                           |
| | union (                                                                         |
| | | table                                                                         |
| | | | project (                                                                   |
| | | | | select (                                                                  |
| | | | | | REMOTE(sys.t2) [ "t2"."s" as "mrgt"."s" ] COUNT                         |
| | | | | ) [ "mrgt"."s" != varchar(10) "bla" ]                                     |
| | | | ) [ "mrgt"."s" ] REMOTE mapi:monetdb://localhost:60002/mdb2 [ "mrgt"."s" ], |
| | | table                                                                         |
| | | | project (                                                                   |
| | | | | select (                                                                  |
| | | | | | REMOTE(sys.t1) [ "t1"."s" as "mrgt"."s" ] COUNT                         |
| | | | | ) [ "mrgt"."s" != varchar(10) "bla" ]                                     |
| | | | ) [ "mrgt"."s" ] REMOTE mapi:monetdb://localhost:60001/mdb1 [ "mrgt"."s" ]  |
| | ) [ "mrgt"."s" ],                                                               |
| | project (                                                                       |
| | | select (                                                                      |
| | | | table(sys.t3) [ "t3"."s" as "mrgt"."s" ] COUNT                              |
| | | ) [ "mrgt"."s" != varchar(10) "bla" ]                                         |
| | ) [ "mrgt"."s" ]                                                                |
| ) [ "mrgt"."s" ]                                                                  |
+-----------------------------------------------------------------------------------+
21 tuples

Example 4: join a MERGE TABLE with a REPLICA TABLE

Continuing with the connection to mdb3, we perform a join between repS and mrgT. It returns the cross product of one set of values in repS and all values in mrgT. The logical query plan shows that the replicas s1 and s2, which are respectively local to t1 and t2, are joined with t1 and t2. (NB: t3 should have been joined with s3, instead of s2. This is a performance bug, see also https://www.monetdb.org/bugzilla/show_bug.cgi?id=6620)

sql>SELECT * FROM repS, mrgT;
+------+------+
| i    | s    |
+======+======+
|   23 | foo  |
|   23 | bar  |
|   42 | foo  |
|   42 | bar  |
|   23 | abc  |
|   23 | efg  |
|   42 | abc  |
|   42 | efg  |
|   23 | baz  |
|   23 | qux  |
|   42 | baz  |
|   42 | qux  |
+------+------+
12 tuples
sql>plan select * from repS, mrgT;
+--------------------------------------------------------------------------------------------------------------+
| rel                                                                                                          |
+==============================================================================================================+
| union (                                                                                                      |
| | union (                                                                                                    |
| | | table                                                                                                    |
| | | | project (                                                                                              |
| | | | | crossproduct (                                                                                       |
| | | | | | REMOTE(sys.s2) [ "s2"."i" as "reps"."i", "s2"."%TID%" NOT NULL as "reps"."%TID%" ],                |
| | | | | | project (                                                                                          |
| | | | | | | REMOTE(sys.t2) [ "t2"."s" as "mrgt"."s" ] COUNT                                                  |
| | | | | | ) [ "mrgt"."s" ]                                                                                   |
| | | | | ) [  ]                                                                                               |
| | | | ) [ "reps"."i", "mrgt"."s" ] REMOTE mapi:monetdb://localhost:60002/mdb2 [ "reps"."i", "mrgt"."s" ],    |
| | | table                                                                                                    |
| | | | project (                                                                                              |
| | | | | crossproduct (                                                                                       |
| | | | | | REMOTE(sys.s1) [ "s1"."i" as "reps"."i", "s1"."%TID%" NOT NULL as "reps"."%TID%" ],                |
| | | | | | project (                                                                                          |
| | | | | | | REMOTE(sys.t1) [ "t1"."s" as "mrgt"."s" ] COUNT                                                  |
| | | | | | ) [ "mrgt"."s" ]                                                                                   |
| | | | | ) [  ]                                                                                               |
| | | | ) [ "reps"."i", "mrgt"."s" ] REMOTE mapi:monetdb://localhost:60001/mdb1 [ "reps"."i", "mrgt"."s" ]     |
| | ) [ "reps"."i", "mrgt"."s" ],                                                                              |
| | project (                                                                                                  |
| | | crossproduct (                                                                                           |
| | | | table                                                                                                  |
| | | | | REMOTE(sys.s2) [ "s2"."i" as "reps"."i", "s2"."%TID%" NOT NULL as "reps"."%TID%" ] REMOTE mapi:monet |
: db://localhost:60002/mdb2 [ "reps"."i", "reps"."%TID%" NOT NULL ],                                           :
| | | | project (                                                                                              |
| | | | | table(sys.t3) [ "t3"."s" as "mrgt"."s" ] COUNT                                                       |
| | | | ) [ "mrgt"."s" ]                                                                                       |
| | | ) [  ]                                                                                                   |
| | ) [ "reps"."i", "mrgt"."s" ]                                                                               |
| ) [ "reps"."i", "mrgt"."s" ]                                                                                 |
+--------------------------------------------------------------------------------------------------------------+
31 tuples

 

Data Partitioning


The prime method to enable finer control of the locality of data access during query evaluation is to horizontally partition the tables in a database. This is made possible by the concept of a MERGE TABLE, which allows a virtual table to be defined as the union of its partitions.

Creating MERGE TABLE-s

The SQL statements below show how to create a MERGE TABLE mt1 containing two partitions t1 and t2. The partition tables can then be queried both individually and jointly.

    -- Create partition tables with identical column types, and add some values.
    -- The column names are not significant.
    CREATE TABLE t1 (i int);
    CREATE TABLE t2 (j int);
    INSERT INTO t1 VALUES (11), (13);

    INSERT INTO t2 VALUES (23), (27);

    -- Create a MERGE TABLE
    CREATE MERGE TABLE mt1 (t int);

    -- Add the partition tables into the MERGE TABLE
    ALTER TABLE mt1 ADD TABLE t1;
    ALTER TABLE mt1 ADD TABLE t2;

    sql> -- sanity check
    sql> SELECT count(*) FROM t1;
    +------+
    | L1   |
    +======+
    |    2 |
    +------+
    1 tuple (2.231ms)
    sql> SELECT count(*) FROM t2;
    +------+
    | L1   |
    +======+
    |    2 |

    +------+
    1 tuple (2.226ms)
    sql> SELECT count(*) FROM mt1;
    +------+
    | L1   |
    +======+
    |    4 |
    +------+
    1 tuple (3.455ms)

Two MERGE TABLEs can contain overlapping partition tables. A MERGE TABLE can also contain other MERGE TABLEs:

    CREATE TABLE t3 (k int);
    INSERT INTO t3 VALUES (31), (37);

    CREATE TABLE t4 (l int);
    INSERT INTO t4 VALUES (41), (47);

    -- An overlapping MERGE TABLE (with mt1)
    CREATE MERGE TABLE mt2 (t int);

    ALTER TABLE mt2 ADD TABLE t1;
    ALTER TABLE mt2 ADD TABLE t3;

    -- A MERGE TABLE of MERGE TABLE
    CREATE MERGE TABLE mt3 (t int);
    ALTER TABLE mt3 ADD TABLE mt1;

    ALTER TABLE mt3 ADD TABLE t4;

    sql> -- sanity check
    sql> SELECT * FROM mt2;

    +------+
    | t    |
    +======+
    |   11 |
    |   13 |
    |   31 |
    |   37 |
    +------+
    4 tuples (2.677ms)
    sql> SELECT * FROM mt3;
    +------+
    | t    |
    +======+
    |   11 |
    |   13 |
    |   23 |
    |   27 |
    |   41 |
    |   47 |
    +------+
    6 tuples (2.496ms)

Updating MERGE TABLE-s

As of release Apr2019 (11.33.3) we support updatable merge tables, see updatable-merge-tables for details.

Previously, a MERGE TABLE could not be directly updated. All updates had to be done on the individual partition tables:

    sql> -- All updates must go directly into the individual partition table
    sql> INSERT INTO mt1 VALUES(1);
    INSERT INTO: cannot insert into merge table 'mt1'
    sql> UPDATE mt1 SET t=10 where t=11;
    UPDATE: cannot update merge table 'mt1'
    sql> UPDATE t1 SET i=10 where i=11;
    1 affected row (4.081ms)

One can remove a partition table from a MERGE TABLE using an ALTER TABLE statement. This also makes it possible to drop a partition table, while keeping the merge table.

    -- Remove a partition table from a MERGE TABLE
    ALTER TABLE mt1 DROP TABLE t1;

    sql> -- sanity check
    sql> SELECT count(*) FROM mt1;
    +------+
    | L1   |
    +======+
    |    2 |
    +------+
    1 tuple (1.455ms)

When dropping a partition table, all MERGE TABLE-s that depend on it must be dropped first:

    sql> -- Drop table must happen in the correct order of dependency
    sql> DROP TABLE t2;
    DROP TABLE: unable to drop table t2 (there are database objects which depend on it)
    sql> DROP TABLE mt1;
    DROP TABLE: unable to drop table mt1 (there are database objects which depend on it)
    sql> DROP TABLE mt3;
    operation successful (3.048ms)
    sql> DROP TABLE mt1;
    operation successful (1.948ms)
    sql> DROP TABLE t2;
    operation successful (2.329ms)

Improving performance of search queries on MERGE TABLE-s

When performing, for example, a simple SELECT ... WHERE query on a merge table, the query logically has to be forwarded to each partition table. However, based on the predicates in the WHERE clause, the query optimizer can decide which partition tables have to be scanned and which can be skipped, provided it has knowledge of certain statistical properties of the partition tables of a MERGE TABLE. To prepare these statistics, the user has to run the ANALYZE command over a MERGE TABLE.
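For instance, for a merge table like the mt1 used above, the statistics could be gathered with the following statement (a sketch; depending on the release, ANALYZE may also accept a column list and a SAMPLE clause):

ANALYZE sys.mt1;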

Database Sampling


Sampling a database is an essential step to improve the response time for database exploration. In the context of the SciBORQ project, we have introduced a number of sampling techniques in the MonetDB software stack. Our goal is to provide methods for performing sampling over a) the result of a query, b) the base tables, and c) the entire database schema. Sampling can be performed during query execution, as well as during data loading in the case of predefined sampling indexes. Eventually, we want to support both uniform and weighted sampling, but in the current release all sampling methods are uniform. The sampling methods included in the distribution are described below.

Sampling operator
A new SQL operator SAMPLE has been added to support sampling the result of a query.

sample_column:
    SELECT ... FROM ... WHERE ... SAMPLE <expr>

If <expr> is a non-negative integer literal, it defines the number of rows to be included in the sample. If <expr> is a real literal in the range [0.0, 1.0], it refers to the fraction of the result set to be sampled. For example, if <expr> is 0.3, then the sample will contain 30% of the rows in the query result.
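For example, using a hypothetical table name:

-- return 100 rows, chosen uniformly from the query result
SELECT * FROM voyages SAMPLE 100;

-- return roughly 30% of the query result
SELECT * FROM voyages SAMPLE 0.3;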

Sampling base tables or subquery results
Because SAMPLE primarily operates on query results, it is treated as the same type of operator as the LIMIT clause, which, according to the SQL:2003 standard, may only be used in the outermost SELECT clause. So, before the Jul2017 release, SAMPLE was not allowed in a subquery; in addition, the SAMPLE operator does not operate on query input data. However, both restrictions can be circumvented using a table-producing function, for example:

CREATE FUNCTION mysample ()
RETURNS TABLE(col a,...)
BEGIN
   RETURN
     SELECT a,...
     FROM name_table
     SAMPLE 100;
end;

Then one can use the function mysample() to create a sampled table, for instance:

INSERT INTO sample_table (SELECT * FROM mysample());

In this way, we can apply SAMPLE on base tables, before running the actual query.

Uniform sampling implementation
The current sampling methods all use uniform sampling, which is based on Algorithm A as described in the paper "Faster Methods for Random Sampling" by Jeffrey Scott Vitter [1]. Algorithm A is not the fastest one, but it only makes <expr> calls to the random() function and it is simpler than the other, more complex and CPU-intensive algorithms in the literature. Instead of performing one random experiment for each row to decide whether it should be included in the sample or not, Algorithm A skips a number of rows and includes the next row found. The algorithm scans the input relation sequentially and maintains the uniqueness and sort properties. The sample is without replacement.

Localization


Under construction

Database migration


The easiest method to move a MonetDB database to a new location, or to create a copy for back-up purposes, runs as follows. First stop the database server by calling the monetdb management tool

bash> monetdb lock dbname

When the server is stopped, you can still connect as a database administrator to ensure that all (update) queries have ceased execution (see the system monitor). Once satisfied that the activity has ceased, you can either follow the safe route of a dump/restore of the database, or perform a file-level rsync or rcp to copy the complete <dbfarm>/<dbname> directory to another dbfarm location. The MonetDB daemon process attached to the remote <dbfarm> will pick it up and make it visible for connections. Finally, perform the following operation to continue processing user requests at the original server.

bash> monetdb release dbname
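For the file-level copy mentioned above, a minimal sketch (with a hypothetical remote host and target dbfarm path) could be:

bash> rsync -a <dbfarm>/<dbname> otherhost:/path/to/other/dbfarm/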

Loading Bulk Data


There are many ways in which you can insert data into an SQL table.  Using the best method for your application makes the difference between a slow and a pleasantly fast experience. The pros and cons of the various methods supported by MonetDB are explained here.

Simple value insertion

The easiest way to insert data into a table defined over three integer columns is to use the INSERT INTO SQL query:

INSERT INTO table_name VALUES (1, 2, 3);

This method is well suited for low-volume inserts. However, it is a seriously inefficient way of doing things in MonetDB when you intend to load thousands or even millions of tuples at the same time. MonetDB, by default, runs in auto-commit mode. This means that each SQL insert query is a full-fledged database transaction, which changes the database and necessarily causes a write to disk to safeguard the change. This can become a serious bottleneck for hard disk drives. A typical 7,200 rpm SATA drive can do about 70 to 100 I/O operations per second (see Wikipedia). Since each INSERT is at least one I/O operation (depending on, among others, the underlying operating system), this limits throughput severely. Using an SSD instead improves the situation somewhat.

A much better way to do lots of inserts is to make sure the inserts are bundled into a single transaction. This can easily be done by surrounding the inserts with a START TRANSACTION; and COMMIT; commands, as in:

START TRANSACTION;
INSERT INTO table VALUES (1, 2, 3);
INSERT INTO table VALUES (7, 8, 9);
...
COMMIT;

Another way to accomplish the same result is to clear the auto-commit mode in the server. How this is done depends on how you connect to the server. In ODBC this can be done with a call to the function SQLSetConnectAttr(), in JDBC it can be done with a call to the setAutoCommit() method. In the mclient tool, it can be done when using it interactively with the \a command, or by using the -a command line option.
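For example, in an interactive mclient session (the output shown is indicative; the exact wording may differ per version), auto-commit can be toggled off before issuing the inserts, which are then committed explicitly:

sql>\a
auto commit mode: off
sql>INSERT INTO table VALUES (1, 2, 3);
sql>INSERT INTO table VALUES (7, 8, 9);
sql>COMMIT;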

Note that the SQL variant

INSERT INTO table VALUES (1, 2, 3), (3, 4, 5), (6, 7, 8);

is also supported by MonetDB. However, this variant is not to be recommended for large numbers of tuples either. It uses a lot of stack space on the server, and stack space is a very limited commodity. In other words, if you try to insert too many tuples in one go, the query will fail.

A major cost factor in all insertion statements is deciphering the call structure and turning the literals into the internal format. The overhead of the former can be removed with a slightly faster way of inserting data using a prepared query. In ODBC you would first call:

SQLPrepare(hdl, "INSERT INTO table VALUES (?, ?, ?)", SQL_NTS);

Then you need to call SQLBindParameter() for each of the parameters in the query, and finally call

SQLExecute(hdl);

for each insert.  As before, all insert queries should be done in a single transaction.

In JDBC a similar sequence using a PreparedStatement can be used. Also consider using the JDBC batching functionality.

Parallel data insertion

The methods discussed so far share another disadvantage: even if the server has multiple CPUs/cores, the server will only use one at a time for this simple query.  A better way is to use the COPY INTO statement.  COPY INTO is a single query that takes a complete file (or stream) of comma-separated values (CSV or user-defined delimiters) and inserts the data in one go using the system cores in parallel.  Since COPY INTO is a single query, whether or not auto-commit mode is enabled doesn't matter. What matters is the intended use of the bulk-loaded data. If you plan to use the data only during a single session or for in-memory processing, then use CREATE LOCAL TEMPORARY TABLE ... ON COMMIT PRESERVE ROWS. It avoids the I/O needed to securely store the data on disk for later use. In most cases, though, using a regular CREATE TABLE, despite being slower upon first load, gives great benefits when you continue querying in the future.

COPY INTO comes in many variants.  The easiest form is:

COPY INTO table FROM 'file';

The 'file' must be an SQL string that contains the absolute path name of the file to be inserted. The file is read directly by the server, and so has to be accessible to the server's file-system.  By default, the field delimiter is '|', i.e. a vertical bar, and the record delimiter is E'\n', i.e. a newline. The delimiters can be changed:

COPY INTO table from 'file' USING DELIMITERS ',', E'\n';

Often CSV-encoded files use quote characters to surround field values. By default, no quote characters are recognized, but by using the following form, they will be recognized:

COPY INTO table from 'file' USING DELIMITERS ',', E'\n', '"';

When using quote characters, the other delimiters can be part of the value, and the quote character itself can be part of the value if it is preceded by a \ (backslash) or if it is doubled.  The backslash itself must also be doubled.
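
For example, with the delimiters ',', E'\n', '"', the second field of the following illustrative record contains both an embedded field separator and an embedded (doubled) quote character, and is still read as a single value:

1,"a value with a , and a doubled ""quote""",3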

Often NULL values need to be inserted as well.  By default, a NULL value is indicated by using NULL in the CSV file, but often the actual value used is different, e.g. 'no value'.  This can be indicated by using:

COPY INTO table from 'file' USING DELIMITERS ',', E'\n', '"' NULL AS '';

When the data is not directly accessible to the server but is accessible to the client, the 'file' in the above queries can be replaced by STDIN:

COPY INTO table FROM STDIN;

When using STDIN, the contents of the CSV file should follow the query.  This form is not directly usable in ODBC or JDBC, but see SQLcopyinto.java for an example of how to use this from Java.  If using the mclient tool and the CSV data is in a file by itself, you can use the following command (also see the mclient manual):

mclient -d database -s "COPY INTO table FROM STDIN USING DELIMITERS ',',E'\\n','\"'" - < file

The COPY INTO forms mentioned so far share one important drawback.  When a lot of data is to be inserted, the server doesn't know how much memory to pre-allocate for the tables, and so will likely allocate too little.  This means that during the insertion process, the server has to grow the allocated memory area.  This can be an expensive operation.  Therefore, it is better to give the server a count of how many records are to be inserted:

COPY n RECORDS INTO table FROM 'file';

Here n should be a number that is at least as large as the actual number of records to be inserted.  If the file contains more than n records, only n will be inserted, if the file contains fewer, all values will be inserted.  Giving a higher number is especially useful if multiple COPY INTO queries are to be done on the same table.  The first COPY INTO, when the table is still empty, should be the total count of to-be-inserted values so that the server will allocate enough memory when the tables are first created (they are only really created once data is inserted). This form is the fastest way of inserting data into MonetDB.
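
For instance, when loading two files into the same (still empty) table one after the other, a sketch could look as follows (the file names and the count of 1000000 are illustrative):

COPY 1000000 RECORDS INTO table FROM 'file1';  -- total expected rows across both files
COPY INTO table FROM 'file2';

The first statement reserves room for the grand total, so the second load does not force the columns to grow.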

One more elaboration of the COPY INTO form is to specify an offset into the CSV file.  Often CSV files contain header information that is not part of the actual data.  This header information can be skipped by using the form

COPY n OFFSET m RECORDS INTO table FROM 'file';

Here n is the number of records to insert and m is the first record to be inserted where the first record in the file is called 1 (i.e. counting is 1 based).

For completeness, if the number of records is unknown but there is a header that needs to be skipped, the form is

COPY OFFSET m INTO table FROM 'file';

Error handling

Unfortunately, CSV data files may contain errors. Offending lines are collected in the predefined system table sys.rejects, which contains a copy of the failed line and its line number in the source file. The sys.rejects table should be explicitly cleared via sys.clearrejects() before you load a new file.

SELECT * from sys.rejects;

CALL sys.clearrejects();

CSV Bulk Loads

CSV Bulk Loads zhang Fri, 09/01/2017 - 15:01

The COPY INTO command enables fast insertion of multiple tuples from a csv/tsv text file or the standard input. Each tuple in the input file is terminated by a record-separator (default '\n') and fields are separated by the field separator (default '|'). The field values should adhere to the syntax for value literals. Alternative separators should adhere to the SQL lexical syntax for string values. A different NULL value representation can be specified using the NULL AS null_string option. Furthermore, the fields are optionally enclosed with a user defined quote character. The text file should use character data encoded in UTF-8 if specified as file_name, and the same encoding as mclient is used if data are read FROM STDIN.

The input files must be accessible by the database server. Therefore, they must reside on or be accessible to the machine on which the database server is running. Also, the input files must be identified with an absolute path name. The STDIN file designator reads data streaming from the client application. The end of file (EOF) character signifies the end of the sequence.

The full syntax for the COPY INTO command is the following:

COPY [ int_val [ OFFSET int_val ] RECORDS ]
INTO [ schema_name . ] table_name
     [ '(' column_name [ ',' ... ] ')' ]
FROM { file_name [ ',' ... ] | STDIN }
     [ '(' header [ STRING ] [ ',' ... ] ')' ]
     [ ON { CLIENT | SERVER } ]
     [ [ USING ] DELIMITERS  field_separator
       [',' record_separator [ ',' string_quote ] ] ]
     [ NULL [AS] null_string ]
     [ LOCKED ]
     [ BEST EFFORT ]
     [ NO CONSTRAINT ]
     [ FWF '(' pos [ ',' pos [ ... ] ] ')' ]

By default, fields are separated by the pipe character ('|'), records by the new line ('\n') character.

Note that the ON CLIENT and ON SERVER options were introduced in the Apr2019 (11.33.3) release and are currently only supported by the mclient program, not by other programs or APIs (ODBC, JDBC, Python DBI, etc.). The default behavior is ON SERVER.
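
With ON CLIENT, the client (mclient) reads the file and streams its contents to the server, so the path is resolved on the machine where mclient runs. A minimal sketch (the client-side path is illustrative):

sql>COPY INTO tbl FROM '/path/on/the/client/example1.csv' ON CLIENT;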

Note that the various features must be specified in the order shown above. For instance the query:

sql>COPY INTO tbl FROM STDIN BEST EFFORT NULL AS '';
syntax error, unexpected sql NULL, expecting SCOLON in: "copy into tbl from stdin best effort null"

is not accepted by the SQL parser because the BEST EFFORT feature is specified before the NULL AS feature.

COPY INTO from different input sources

COPY INTO is able to read data from different input sources, including standard input, a single CSV file, multiple CSV files and compressed CSV files. For each example, we create the following table to start with:

sql>CREATE TABLE tbl (i INT, s STRING, d DECIMAL(5,2));
operation successful

From standard input

The most straightforward way is to read data from the standard input STDIN:

sql>COPY INTO tbl FROM STDIN;
more>1|abc|0.1
more>2|def|0.2
more>3|ghi|0.3
more>^D
3 affected rows

sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | abc  |    0.10 |
|    2 | def  |    0.20 |
|    3 | ghi  |    0.30 |
+------+------+---------+
3 tuples

From a CSV file

Assume data are stored in a file named /tmp/example1.csv:

$ cat /tmp/example1.csv
1|dfs|3.1
2|ewe|3.23
3|erw|2.43

Note that the csv file does not contain a header, only 3 data rows.

We can load this data using the following COPY INTO query:

sql>COPY INTO tbl FROM '/tmp/example1.csv';
3 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
+------+------+---------+
3 tuples

From multiple CSV files

If the data are distributed across multiple files and have the same schema, one can load them all at once. Assume we have the following example files /tmp/example2.1.csv and /tmp/example2.2.csv:

$ cat /tmp/example2.1.csv
1|dfs|3.1
2|ewe|3.23
3|erw|2.43
$ cat /tmp/example2.2.csv
4|dfs|3.2
5|ewe|3.3
6|erw|2.3

we can use them in the COPY INTO command as follows:

sql>COPY INTO tbl FROM '/tmp/example2.1.csv', '/tmp/example2.2.csv';
6 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
|    4 | dfs  |    3.20 |
|    5 | ewe  |    3.30 |
|    6 | erw  |    2.30 |
+------+------+---------+
6 tuples

From compressed CSV files

One can pass compressed files to COPY INTO to be loaded. Supported compressions include bzip2, gzip, xz and lz4.

First, let us compress our example CSV file:

$ bzip2 -k /tmp/example1.csv
$ gzip -k /tmp/example1.csv
$ xz -k /tmp/example1.csv
$ lz4 -q /tmp/example1.csv
$ ls /tmp/example1.csv.*
/tmp/example1.csv.bz2  /tmp/example1.csv.gz  /tmp/example1.csv.lz4  /tmp/example1.csv.xz

Now, we can load compressed files in one COPY INTO statement:

sql>COPY INTO tbl FROM '/tmp/example1.csv.bz2', '/tmp/example1.csv.gz', '/tmp/example1.csv.lz4', '/tmp/example1.csv.xz';
12 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
+------+------+---------+
12 tuples

Specifying the number of records

It is strongly recommended to specify the maximum number of records to be inserted, as this allows the database server to allocate enough space for the table in advance, instead of having to guess and extend the table space repeatedly, which may involve potentially expensive copying. If the exact number is unknown, a (slight) overestimation is generally better for performance than an underestimation. Tip: on Linux you can use the command wc -l /tmp/example1.csv to count the number of lines (i.e. records) in the CSV file.

If the number of actual records in the input source is smaller than the number of records specified, COPY INTO will simply stop at the end of input.

If the number of actual records in the input source is larger than the number of records specified, COPY INTO will only load the number of specified records.

The example below shows how to specify the number of records (the earlier example CSV file is used). First we give an overestimation, then we give an underestimation:

sql>COPY 4 RECORDS INTO tbl FROM '/tmp/example1.csv';
3 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
+------+------+---------+
3 tuples
sql>DELETE FROM tbl;
3 affected rows
sql>COPY 2 RECORDS INTO tbl FROM '/tmp/example1.csv';
2 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
+------+------+---------+
2 tuples

Specifying an offset

A portion of the input file can be skipped using the OFFSET feature. Specifying an offset n directs COPY INTO to start reading at record n. The offset of the first record is 1. More formally, the offset directive counts n unquoted record separators. Here we use a file named /tmp/example2.csv:

$ cat /tmp/example2.csv
i|s|d
1|dfs|3.1
2|ewe|3.23
3|erw|2.43

Since the first line of the file (representing the header) does not have the same schema as the table, it must be ignored. So, we specify that the COPY INTO should start at the second record (i.e. 'OFFSET 2'). OFFSET can be used either stand-alone or in combination with RECORDS:

sql>DELETE FROM tbl;
2 affected rows
sql>COPY OFFSET 2 INTO tbl FROM '/tmp/example2.csv';
3 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
|    2 | ewe  |    3.23 |
|    3 | erw  |    2.43 |
+------+------+---------+
3 tuples
sql>DELETE FROM tbl;
3 affected rows
sql>COPY 1 OFFSET 2 RECORDS INTO tbl FROM '/tmp/example2.csv';
1 affected rows
sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | dfs  |    3.10 |
+------+------+---------+
1 tuple

Specifying columns to load and their order

By default, a CSV file must contain data for all columns specified in a table, and in the same order as when the table was created. Additional syntax is supported to load data from CSV files that do not satisfy these requirements. Here we show with several examples how this feature works.

Scenario 1: skip column in the input file

Assume we have the following input file:

$ cat /tmp/cars.csv
Year|Make|Model
1997|Ford|E350
2000|Mercury|Cougar

However, when loading this file, we want to skip the second column containing the "Make" information:

sql>CREATE TABLE cars (y STRING, make STRING, model STRING);
operation successful
sql>COPY 2 OFFSET 2 RECORDS INTO cars(y, model) FROM '/tmp/cars.csv';
2 affected rows
sql>SELECT * FROM cars;
+------+------+--------+
| y    | make | model  |
+======+======+========+
| 1997 | null | E350   |
| 2000 | null | Cougar | 
+------+------+--------+
2 tuples

Scenario 2: missing column and out of order

In the following CSV file, data for the column "Model" is missing, while the data for the columns "Year" and "Make" are in the reverse order as specified when the table "cars" was created:

$ cat /tmp/cars2.csv
Make|Year
Ford|1997
Mercury|2000

The following example shows how we can tell COPY INTO which columns are contained in the input CSV file and in which order:

sql>COPY 2 OFFSET 2 RECORDS INTO cars(y,make) FROM '/tmp/cars2.csv'(make,y);
2 affected rows
sql>SELECT * FROM cars;
+------+---------+-------+
| y    | make    | model |
+======+=========+=======+
| 1997 | Ford    | null  |
| 2000 | Mercury | null  | 
+------+---------+-------+
2 tuples

Scenario 3: specify a format string for date, time or timestamp

The following example shows how we can tell COPY INTO how to interpret the input strings as timestamp values (note that ^D represents the CTRL-D key combination input):

sql>CREATE TABLE t (ts TIMESTAMP);
operation successful
sql>COPY INTO t FROM STDIN(ts '%d/%m/%Y %H:%M:%S');
more>22/12/2017 6:50:23
more>23/12/2017 0:46:47
more>^D
more>2 affected rows
sql>SELECT * FROM t;
+----------------------------+
| ts                         |
+============================+
| 2017-12-22 06:50:23.000000 |
| 2017-12-23 00:46:47.000000 |
+----------------------------+
2 tuples

Customise delimiters

The COPY INTO command is versatile enough to accept records in many formats. The basic idea is that each field of a record is separated from the other fields by a specified character string, and each record is separated from the other records by another character string. Moreover, since string values can contain either of those character strings, they may need to be quoted so as to be read correctly.

The user can specify the format of the file with

[ [ USING ] DELIMITERS field_separator [ , record_separator [ , string_quote ] ] ]

syntax. The input should then comply with the following grammar:

[ [ [ quote ] [ [ escape ] char ]* [ quote ] ] field_separator ]* record_separator

Quote characters in quoted fields may be escaped with a backslash. Field and record separators can be embedded in quoted fields.

By default, fields are separated by the pipe character ('|'), records by the new line ('\n') character.

The queries below show how to
i) change the default field separator to a different single character;
ii) change the default field separator to a string of two characters; and
iii) change the default string quote:

sql>COPY INTO tbl FROM STDIN DELIMITERS ',';
more>1,abc,0.1
more>2,def,0.2
more>^D
more>2 affected rows

sql>COPY INTO tbl FROM STDIN DELIMITERS ';;';
more>3;;ghi;;0.03
more>4;;jkl;;0.04
more>^D
more>2 affected rows

sql>COPY INTO tbl FROM STDIN DELIMITERS ';;',E'\n','|';
more>5;;|klm|;;0.05
more>6;;|opq|;;0.06
more>^D
more>2 affected rows

sql>SELECT * FROM tbl;
+------+------+---------+
| i    | s    | d       |
+======+======+=========+
|    1 | abc  |    0.10 |
|    2 | def  |    0.20 |
|    3 | ghi  |    0.03 |
|    4 | jkl  |    0.04 |
|    5 | klm  |    0.05 |
|    6 | opq  |    0.06 |
+------+------+---------+
6 tuples

Customise representation of the NULL value

By default the character sequence "NULL" (case insensitive) represents the NULL value:

sql>CREATE TABLE str (s STRING);
operation successful
sql>COPY INTO str FROM STDIN;
more>NULL
more>null
more>Null
more>nUll
more>^D
more>4 affected rows

sql>SELECT * FROM str;
+------+
| s    |
+======+
| null |
| null |
| null |
| null |
+------+
4 tuples
sql>SELECT * FROM str WHERE s = 'null';
+---+
| s |
+===+
+---+
0 tuples
sql>SELECT * FROM str WHERE s IS NULL;
+------+
| s    |
+======+
| null |
| null |
| null |
| null |
+------+
4 tuples

The NULL AS feature allows a user to specify an alternative string as the representation of the NULL value. Whenever the alternative string is read during the loading process, the NULL value is inserted into the table. Any valid string can be specified as the representation of the NULL value.

In the example below, the first COPY INTO treats an empty string ('') as the NULL value, while the second COPY INTO uses a long text to represent the NULL value, which can typically be used to prevent any text in the input from being treated as the NULL value. The keyword AS is optional.

sql>CREATE TABLE nullas (i INT, s STRING);
operation successful
sql>COPY INTO nullas FROM STDIN NULL AS '';
more>1|
more>2|null
more>3|"NuLL"
more>^D
3 affected rows
sql>COPY INTO nullas FROM STDIN NULL 'this-string-does-not-exist-in-the-input';
more>4|
more>5|null
more>6|NULL
more>^D
3 affected rows

sql>SELECT * FROM nullas;
+------+--------+
| i    | s      |
+======+========+
|    1 | null   |
|    2 | null   |
|    3 | "NuLL" |
|    4 |        |
|    5 | null   |
|    6 | NULL   |
+------+--------+
6 tuples
sql>SELECT * FROM nullas WHERE s IS NULL;
+------+------+
| i    | s    |
+======+======+
|    1 | null |
+------+------+
1 tuple
sql>SELECT * FROM nullas WHERE s = 'null';
+------+------+
| i    | s    |
+======+======+
|    2 | null |
|    5 | null |
+------+------+
2 tuples
sql>SELECT * FROM nullas WHERE s = '"NuLL"';
+------+--------+
| i    | s      |
+======+========+
|    3 | "NuLL" |
+------+--------+
1 tuple
sql>SELECT * FROM nullas WHERE s = 'NULL';
+------+------+
| i    | s    |
+======+======+
|    6 | NULL |
+------+------+
1 tuple

Ignore errors in the input with the BEST EFFORT mode

Normally, if there is an error in the input, such as a missing field, COPY INTO will fail. In the BEST EFFORT mode, COPY INTO continues parsing the input, and writes all the records that failed to the sys.rejects table.

sql>CREATE TABLE cars (y STRING, make STRING, model STRING);
operation successful
sql>COPY INTO cars FROM STDIN DELIMITERS ',',E'\n' BEST EFFORT;
more>2000,
more>2004,Honda,Pilot
more>2001,Honda,Civic
more>,,,
more>3 affected rows
sql>SELECT * FROM cars;
+------+-------+-------+
| y    | make  | model |
+======+=======+=======+
| 2004 | Honda | Pilot |
| 2001 | Honda | Civic |
|      |       |       |
+------+-------+-------+
3 tuples

The example above shows that only the valid records are retained; in this scenario the first record is omitted.

sql>SELECT * FROM rejects;
+-------+-------+------------------------+--------+
| rowid | fldid | message                | input  |
+=======+=======+========================+========+
|     0 |     1 | Column value 2 missing | 2000,  |
+-------+-------+------------------------+--------+

Tips to boost performance

Here are several tips to possibly further improve the performance of bulk data loading:

  • Add constraints after data loading: if one needs to load multiple chunks of data, it is advisable to add the integrity constraints to the table after all files of this batch have been loaded. The ALTER statements perform bulk integrity checking, which is often more efficient than conducting the check after each small amount of data is loaded. When adding more data to an existing table, one can temporarily remove the existing constraints on the table, load the new data and finally re-add the constraints (see the sketch after this list).
  • LOCKED mode for single user: in many bulk loading situations, the original data file can be saved as a backup or recreated for disaster handling. This relieves the database system from having to prepare for recovery and saves significant storage space. The LOCKED qualifier can be used in this situation to skip the logging operation that is done by default. (WARNING: this feature must be used with great care. When using the LOCKED mode, it is the responsibility of the database user to make sure that there is currently only a single client connection accessing the database. The database server will not check this.)
  • For various reasons it is not a good idea to run multiple COPY INTO queries concurrently.
    Each COPY INTO query uses multiple threads (up to the number of cores in the system) and does not take into account what else might be running in parallel.
    Each COPY INTO necessarily works on all columns of the table in parallel (that is how the data is provided), so the complete table is basically in memory. If multiple large tables are being loaded in parallel, this puts a lot of pressure on the memory subsystem.
    In addition, if you COPY INTO the same table sequentially, the columns likely have to be resized. A resize operation can be expensive in that it may result in having to copy the data (and then you temporarily have two copies in memory).
    Our advice: do it sequentially, and for the first COPY INTO provide a record count of the total number of rows (not just the number of rows in the first file) so that the columns can be preallocated once.
    If you make sure that the COPY INTO queries are the only queries running, you can also use the LOCKED keyword, which may result in less copying of the data.
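
As a sketch of the first tip, dropping a primary key before a large load and re-adding it afterwards could look as follows (the table, constraint and column names are illustrative):

ALTER TABLE lineitem DROP CONSTRAINT lineitem_pk;
COPY 60000000 RECORDS INTO lineitem FROM '/tmp/lineitem.csv';
ALTER TABLE lineitem ADD CONSTRAINT lineitem_pk PRIMARY KEY (l_orderkey, l_linenumber);

The single ADD CONSTRAINT at the end checks the complete table in bulk instead of validating every individual insert.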

Binary Bulk Loads

Binary Bulk Loads mk Sun, 03/28/2010 - 23:29

A database is often populated with the SQL COPY statement using an ASCII representation of a table with column and record separators, e.g. a CSV file. This scheme is preferred for almost all situations. Its performance is mostly determined by the parsing cost of turning the ASCII value representation into the native binary one. The MonetDB copy process is highly optimized for multi-core systems, where several threads in parallel handle variable length records efficiently.

When large tables are migrated between MonetDB instances, or when large tables are produced by a trusted external program, a slightly faster loading track can be used instead. This saves rendering of the data into ASCII and the subsequent parsing of the data being exchanged. For this to work, the user creates the binary version of the BAT storage and 'attaches' it to the SQL catalog using a variation of the COPY command.

To illustrate, consider the following snippet:

create table Tmp( i integer, f real, s string); 
copy binary into Tmp from ('path_to_file_i', 'path_to_file_f', 'path_to_file_s');

Each attachment file is produced by a program that writes the binary image of the BAT directly, i.e. a binary dump of a C array. The SQL types tinyint (8 bits), smallint (16 bits), int/integer (32 bits), and bigint (64 bits) correspond directly with the C types; real and double are mapped onto the C types float and double, respectively. For variable-length strings, the file must contain one C-based string value per line, terminated by a newline, and it is processed without escape-character conversion. Fixed-length strings are handled the same way. MonetDB assumes that all files are aligned, i.e. the i-th value in each file corresponds to the i-th record in the table.

All other types (including UTF8 and escaped characters) should be handled using the default COPY INTO language construct.

The file names must be absolute paths to the column input files.

In the past, some of the files would be "moved", but nowadays, all data is copied, so the original files remain unchanged.

Apache Pig

Apache Pig mk Fri, 07/19/2013 - 10:31

In many cases, the data that is supposed to be stored in MonetDB is created by a large-scale data aggregation process. Apache Pig is a popular tool for this task. There is a possibility of directly generating MonetDB binary column files using the MonetDB-Pig-Layer.

Inside a Pig script, this is used as follows:

STORE raw INTO './results/' USING nl.cwi.da.monetdb.loader.hadoop.MonetDBStoreFunc;
 
This will generate a SQL schema file, a set of column files, and a load command. Keep in mind that currently, only primitive Java types such as Integer, Long, Float, Double, String etc. are supported.

Availability, Load Balancing and Replication

Availability, Load Balancing and Replication giulia Mon, 02/24/2020 - 10:40

Under construction

Exporting Bulk Data

Exporting Bulk Data zhang Fri, 09/01/2017 - 16:11

The COPY INTO command with a file name argument allows for fast dumping of a result set into an UTF-8 text file, possibly compressed. The file must be accessible by the server and a full path name may be required. The special file STDOUT can be used to direct the result to the primary output channel.

The delimiters and NULL AS arguments provide control over the layout required.

COPY select_query INTO { file_name | STDOUT }
     [ [USING] DELIMITERS field_separator [ ',' record_separator [ ',' string_quote ] ] ]
     [ NULL AS null_string ]

For the output 'file_name', one can specify a file name extension .bz2, .gz, .xz or .lz4 to use the corresponding compression library (if available). Note: the extension .zip is not supported (also not on Windows); use .gz instead.

sql>SELECT * FROM cars;
+------+-------+--------+
| yr   | brand | model  |
+======+=======+========+
| 2000 | Ford  | Focus  |
| 2001 | Honda | Accord |
| 2004 | Honda | Pilot  |
+------+-------+--------+
3 tuples
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out.csv' USING DELIMITERS ',' , '\n' , '"';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out3.csv.bz2' DELIMITERS ',','\n';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out2.csv.gz' DELIMITERS ',','\n';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out4.tsv.xz' DELIMITERS '\t','\n';
3 affected rows
sql>COPY SELECT * FROM cars INTO '/tmp/cars_out4.tsv.lz4' DELIMITERS '\t','\n';
3 affected rows

Let us check some of the results:

$ cat /tmp/cars_out.csv
2000,"Ford","Focus"
2001,"Honda","Accord"
2004,"Honda","Pilot"
$ bzip2 -d cars_out2.csv.bz2 ; cat cars_out2.csv
2000,"Ford","Focus"
2001,"Honda","Accord"
2004,"Honda","Pilot"

Monitoring

Monitoring giulia Mon, 02/24/2020 - 10:42

 

 

Systems Monitor

Systems Monitor giulia Mon, 05/04/2020 - 17:17

The MonetDB kernel maintains an active queue of all running queries. This queue is available for all users to inspect the status of their own queries. The system administrator can inspect it to oversee the complete workload on the system. The queue is made visible as a table producing function, called sys.queue().

sql>select * from sys.queue();
+------+---------+----------------------------+----------+----------+---------+------------+----------------------------+
| qtag | user    | started                    | estimate | progress | status  | tag        | query                      |
+======+=========+============================+==========+==========+=========+============+============================+
|  902 | monetdb | 2013-03-30 10:23:39.000000 | null     |     null | running | 15168688@0 | select * from sys.queue(); |
+------+---------+----------------------------+----------+----------+---------+------------+----------------------------+
1 tuple (0.446ms)

The schema structure is largely self-explanatory. If the query (template) is run multiple times, the system can derive a progress indicator and calculate an estimated time of completion. The 'tag' column references the query log tables, provided this facility has been turned on. The initial column 'qtag' provides a key to each active query.

One can use the SQL functions PAUSE, RESUME and STOP to control the execution of (long) running queries.

NOTE that PAUSE and STOP take effect at the first safe point within the query plan, which often is after the current MAL instruction has finished. For complex queries over large databases this may take seconds up to minutes.

sql> select * from sys.queue;
+------+---------+----------------------------+----------+----------+---------+-----------+--------------------------+
| qtag | user    | started                    | estimate | progress | status  | tag       | query                    |
+======+=========+============================+==========+==========+=========+===========+==========================+
|   52 | monetdb | 2013-12-23 12:34:50.000000 | null     |     null | running | 1349763@0 | select * from sys.queue; |
+------+---------+----------------------------+----------+----------+---------+-----------+--------------------------+
1 tuple (2.242ms)

--- start a long running query in another window
sql>select * from sys.queue;
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
| qtag | user    | started                    | estimate | prog | status  | tag       | query                                                       |
:      :         :                            :          : ress :         :           :                                                             :
+======+=========+============================+==========+======+=========+===========+=============================================================+
|   56 | monetdb | 2013-12-23 12:35:03.000000 | null     | null | running | 1353354@0 | select                                                      |
:      :         :                            :          :      :         :           :  sum(l_extendedprice * l_discount) as revenue               :
:      :         :                            :          :      :         :           : from                                                        :
:      :         :                            :          :      :         :           :  lineitem                                                   :
:      :         :                            :          :      :         :           : where                                                       :
:      :         :                            :          :      :         :           :  l_shipdate >= date \'1994-01-01\'                          :
:      :         :                            :          :      :         :           :  and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :         :                            :          :      :         :           :  and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :         :                            :          :      :         :           :  and l_quantity < 24;                                       :
|   57 | monetdb | 2013-12-23 12:35:10.000000 | null     | null | running | 1349763@0 | select * from sys.queue;                                    |
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
2 tuples (1.404ms)
sql>call sys.pause(56);
sql>select * from sys.queue; 
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
| qtag | user    | started                    | estimate | prog | status  | tag       | query                                                       |
:      :         :                            :          : ress :         :           :                                                             :
+======+=========+============================+==========+======+=========+===========+=============================================================+
|   56 | monetdb | 2013-12-23 12:35:03.000000 | null     | null | paused  | 1353354@0 | select                                                      |
:      :         :                            :          :      :         :           :  sum(l_extendedprice * l_discount) as revenue               :
:      :         :                            :          :      :         :           : from                                                        :
:      :         :                            :          :      :         :           :  lineitem                                                   :
:      :         :                            :          :      :         :           : where                                                       :
:      :         :                            :          :      :         :           :  l_shipdate >= date \'1994-01-01\'                          :
:      :         :                            :          :      :         :           :  and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :         :                            :          :      :         :           :  and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :         :                            :          :      :         :           :  and l_quantity < 24;                                       :
|   65 | monetdb | 2013-12-23 12:36:03.000000 | null     | null | running | 1349763@0 | select * from sys.queue;                                    |
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
2 tuples (3.082ms)
sql>call sys.resume(56);
sql>select * from sys.queue; 
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
| qtag | user    | started                    | estimate | prog | status  | tag       | query                                                       |
:      :         :                            :          : ress :         :           :                                                             :
+======+=========+============================+==========+======+=========+===========+=============================================================+
|   56 | monetdb | 2013-12-23 12:35:03.000000 | null     | null | running | 1353354@0 | select                                                      |
:      :         :                            :          :      :         :           :  sum(l_extendedprice * l_discount) as revenue               :
:      :         :                            :          :      :         :           : from                                                        :
:      :         :                            :          :      :         :           :  lineitem                                                   :
:      :         :                            :          :      :         :           : where                                                       :
:      :         :                            :          :      :         :           :  l_shipdate >= date \'1994-01-01\'                          :
:      :         :                            :          :      :         :           :  and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :         :                            :          :      :         :           :  and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :         :                            :          :      :         :           :  and l_quantity < 24;                                       :
|   65 | monetdb | 2013-12-23 12:36:03.000000 | null     | null | running | 1349763@0 | select * from sys.queue;                                    |
+------+---------+----------------------------+----------+------+---------+-----------+-------------------------------------------------------------+
2 tuples (12.320ms)
sql>call sys.stop(56);
sql>select * from sys.queue;
+------+---------+----------------------------+----------+----------+---------+-----------+--------------------------+
| qtag | user    | started                    | estimate | progress | status  | tag       | query                    |
+======+=========+============================+==========+==========+=========+===========+==========================+
|   67 | monetdb | 2013-12-23 12:36:22.000000 | null     |     null | running | 1349763@0 | select * from sys.queue; |
+------+---------+----------------------------+----------+----------+---------+-----------+--------------------------+
1 tuple (2.799ms)

Table Statistics

Table Statistics giulia Mon, 05/04/2020 - 17:15

Statistics gathered over tables in the database can be found in the system table sys.statistics. This table is initially empty and explicitly filled or updated using the ANALYZE command:

ANALYZE schemaname [ '.' tablename [ '('columnname , ...')' ] ]
      [ SAMPLE size ]
      [ MINMAX ]

You can gather statistics for a) all tables in a schema or b) all columns of a specific table or c) a list of specific columns of one table. You can only gather statistics of tables with real physical column data, so not for views.

Since statistics gathering involves accessing and profiling all table columns data, it can take considerable time, especially if the tables are large or you analyze all tables in a schema. You may consider adding the optional MINMAX keyword, which excludes the expensive unique-count operation. Likewise, a SAMPLE size can be used to provide a quick, but imprecise, impression.
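
For example, a sketch analyzing two columns of a hypothetical table my_schema.my_table on a sample, while skipping the unique count:

ANALYZE my_schema.my_table (col1, col2) SAMPLE 10000 MINMAX;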

You can remove statistics data via the SQL command DELETE FROM sys.statistics. For example, to remove the statistics for table my_schema.my_table:

DELETE FROM sys.statistics
 WHERE column_id IN (SELECT c.id FROM sys.columns c
 WHERE c.table_id IN (SELECT t.id FROM sys.tables t
 WHERE t.schema_id IN (SELECT s.id FROM sys.schemas s
 WHERE s.name = 'my_schema') AND t.name = 'my_table'));

Query timing

Query timing mk Sun, 10/13/2013 - 14:28

Timing a query execution is supported in multiple ways, but largely depends on what you want to measure. Point-to-point wall clock time, or the actual behavior of kernel operations.

(1) The baseline is to use simple command line tools, such as time on Linux, to assess the performance of running a script against mclient. Beware that /bin/time and /usr/bin/time are not the same; they mainly measure and report the wall-clock time spent by the given command/process. See their respective man pages for details.

(2) The next approach is to use the "--interactive" option of the mclient tool, which reports the timing of each individual SQL query in a script in easily human-consumable terms. It returns the wall-clock time between sending the query to the server and receiving the first block of answers. Its rendering can be controlled (see mclient).

(3) The query history can also be maintained in a separate log for post analysis. (see description)

Thus, (1) includes everything from loading the mclient binary and starting the mclient process, parsing the query in mclient, sending it to the server, having the server execute the query and serialize the result, sending the result back to the client, to the client receiving, parsing and rendering the result, and sending the result to /dev/null ("for free"), to a file (I/O), or to a terminal (scrolling). (2) merely includes the time the server spends on receiving and executing the query and creating the result. The abovementioned costs on the client side to receive, parse, render, etc. the result are excluded. The same holds for (3).

A detailed timing of an SQL query can be obtained by prepending the query with the modifier TRACE. It will produce a queryable table with a breakdown of all relational algebra operations (see the TRACE command). The profiling tools stethoscope and tomograph provide further details for those interested in the inner workings of the system. They provide hooks to many system parameters, e.g. input/output, CPU cycles, and threads' activities.
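
For instance, a minimal sketch using the tbl table from the bulk loading examples (any query can be traced):

sql>TRACE SELECT count(*) FROM tbl;

The result set is returned as usual, accompanied by the per-operation timings of the executed plan.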

Timing a database query should be done with care. Often you will notice differences in response time when the same query is run multiple times. The underlying cause can be that the data itself resides on disk (slow) or is already available in the memory caches (fast), or that a single user runs queries (fast) or has to compete with other users (slow), including competing with other processes on your box fighting over CPU, memory, and I/O resources. As a precaution you might want to flush the system caches. The Windows tool flushes the cache; you'll need to press the "Flush Cache WS" and "Flush All Standby" buttons. On Linux you have to create a little job that consumes all memory.

For more general information on running experiments and measuring time, see our performance tutorial.

Query History

Query History mk Sat, 03/09/2013 - 17:36

The SQL implementation comes with a simple query profiler to detect expensive queries. It is centered around two predefined internal tables that store the definitions of all executed SQL queries and their execution time.

Query logging can be started by calling the procedure querylog_enable(), which saves some major compilation information of a query in the 'querylog_catalog' table:

sys.querylog_catalog
name         type        references                description
"id"         OID
"owner"      STRING                                The SQL user who has executed this query.
"defined"    TIMESTAMP                             Time when the query was started
"query"      STRING                                The query that has been executed
"pipe"       STRING                                The optimiser pipe line that has been used
"plan"       STRING                                Name of its MAL plan
"mal"        INTEGER                               Size of its MAL plan in the number of statements
"optimize"   BIGINT                                Time in microseconds for the optimiser pipeline

Query logging can be stopped by calling the procedure querylog_disable().

The query performance is stored in the table 'querylog_calls'. The owner of the query definition is also the one who is referenced implicitly by the 'id' of a call event. The key timing attributes are 'run', i.e. the time to execute the query to produce the result set, and 'ship', i.e. the time to render the result set and send it to the client. All times are in microseconds.

The remaining parameters illustrate resource claims. The 'tuples' attribute denotes the size of the result set in the number of rows. The 'cpu' load is derived from the operating system statistics (Linux only) and is given as a percentage. The same holds for the 'io' waiting time.

sys.querylog_calls
name         type        references                description
"id"         OID         sys.querylog_catalog.id
"start"      TIMESTAMP                             time the statement was started
"stop"       TIMESTAMP                             time the statement was completely finished
"arguments"  STRING                                actual call structure
"tuples"     BIGINT                                number of tuples in the result set
"run"        BIGINT                                time spent (in usec) until the result export
"ship"       BIGINT                                time spent (in usec) to ship the result set
"cpu"        INT                                   average cpu load percentage during execution
"io"         INT                                   percentage time waiting for IO to finish

The view 'sys.querylog_history' includes some useful information from both tables:

create view sys.querylog_history as
    select qd.*, ql."start",ql."stop", ql.arguments, ql.tuples, ql.run, ql.ship, ql.cpu, ql.io
    from sys.querylog_catalog() qd, sys.querylog_calls() ql
    where qd.id = ql.id and qd.owner = user;

The following code snippet illustrates its use:

sql>call sys.querylog_enable();
sql>select 1;
sql>select 1;
sql>select * from sys.querylog_catalog;
sql>select * from sys.querylog_calls;
sql>select * from sys.querylog_history;
sql>select id, query, avg(run) from sys.querylog_history group by id,query;
sql>call sys.querylog_disable();
sql>call sys.querylog_empty();

The 'sys.querylog_enable()' procedure also takes a parameter, 'threshold', which is an integer in milliseconds. When the query log is enabled with this parameter, it will only log those queries whose execution times are longer than the threshold. This feature can be handy to prevent the database from being swarmed by too many short running queries, hence reducing the overhead incurred by the query log (see below), while still helping the DBA detect expensive queries.
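
For example, to log only queries that run for more than 100 milliseconds (a sketch; the threshold value is illustrative):

sql>call sys.querylog_enable(100);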

Disabling the query log will not remove existing query logs; it only prevents subsequent queries to be logged. Once the query log is re-enabled, information of subsequently executed queries will be appended to the existing query logs.

Query logs are stored in persistent tables, i.e. they will survive a MonetDB server restart. They can only be removed by calling 'sys.querylog_empty()'. A downside of this implementation is its relatively high overhead, because every query, including read-only queries, triggers a write transaction to record it.

Storage Model

Storage Model mk Fri, 10/26/2012 - 13:00

The storage footprint for any given database schema can be obtained by inspecting the table producing function storage(). To illustrate, below we see the storage characteristics of the lineitem table in TPCH SF-10. The column width for variable length strings, e.g. l_shipinstruct and l_comment, represents the average length found using a sample over the underlying column storage.

To estimate the storage footprint for a variation of this schema, we first construct a relation with the model input. Thereafter it can be updated to reflect the expected database size and varwidth properties. The footprint then becomes available through the table producing function storagemodel(). Also be aware that dictionary encoding may have taken place, which leads to a much smaller storage footprint (see l_shipinstruct). Dictionary encoding is currently a runtime feature; it depends on the insertion sequence and the dictionary size, and it may lead to an overshoot in the estimated size. Therefore, the size of varchar columns should be taken with a grain of salt.

sql>call storagemodelinit();
sql>update storagemodelinput set  count = 1000000 where "table"='lineitem';
sql>update storagemodelinput set  "distinct" = 1000 where "table"='lineitem' and "type"='varchar';
sql>update storagemodelinput set  "distinct" = 330000 where "table"='lineitem' and "column"='l_comment';
sql>select * from storagemodel() where "table" = 'lineitem';

 

When data updates are done on columns, the changes are first stored in so-called delta structures and later merged (in bulk) into the column storages. For analysis, monitoring and debugging purposes we provide as of release Nov2019 (11.35.3) three new table producing functions:
    sys.deltas("schema" string, "table" string, "column" string)
    sys.deltas("schema" string, "table" string)
    sys.deltas("schema" string)

Each returns a table with 7 output columns:

name       type     description
id         int      reference to the sys.columns.id
cleared    boolean  whether the column's upper table is cleared (true) or not (false)
immutable  bigint   number of the RDONLY deltas of the column
inserted   bigint   number of the RD_INS deltas of the column
updates    bigint   number of the RD_UPD_ID deltas of the column
deletes    bigint   number of deleted values of the column's table
level      int      the level of the current transaction in the transaction level tree

Some examples to query these table producing functions:
    SELECT * FROM sys.deltas('sys','statistics','minval');
    SELECT id, cleared, immutable, inserted, updates, deletes, level
      FROM sys.deltas('sys','statistics');
    SELECT * FROM sys.deltas('tmp');
    SELECT s.name as "schemanm", t.name as "tablenm", c.name as "columnnm", d.*
      FROM sys.schemas s
      JOIN sys.tables t ON s.id = t.schema_id
      JOIN sys.columns c ON t.id = c.table_id
      JOIN sys.deltas('sys') d ON c.id = d.id;

System monitor

System monitor mk Sun, 03/17/2013 - 20:39

The MonetDB kernel maintains an active queue of all running queries. This queue is available for all users to inspect the status of their own queries. The system administrator can inspect it to oversee the complete workload on the system. The queue is made visible as a table producing function, called sys.queue(), or via the system view sys.queue, which selects from sys.queue().

sql>select * from sys.queue();
+------+-----------+----------+----------------------------+---------+----------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                      | progress | workers | memory |
+======+===========+==========+============================+=========+============================+==========+=========+========+
|   50 |         1 | monetdb  | 2013-12-23 12:31:50.000000 | running | select * from sys.queue(); |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+----------------------------+----------+---------+--------+
1 tuple

The initial column 'tag' provides a key to each active query. For more information on the result columns see Query Catalog.

One can use the SQL functions sys.pause(tag#), sys.resume(tag#) and sys.stop(tag#) to control the execution of (long) running queries.

NOTE that sys.pause(tag#) and sys.stop(tag#) take effect at the first safe point within the query plan, which often is after the current MAL instruction has finished. For complex queries over large databases this may take seconds up to minutes.

sql> select * from sys.queue;
+------+-----------+----------+----------------------------+---------+--------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                    | progress | workers | memory |
+======+===========+==========+============================+=========+==========================+==========+=========+========+
|   50 |         1 | monetdb  | 2013-12-23 12:34:50.000000 | running | select * from sys.queue; |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+--------------------------+----------+---------+--------+
1 tuple

--- start a long running query in another session window

sql>select * from sys.queue;
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                          | progress | workers | memory |
+======+===========+==========+============================+=========+================================+==========+=========+========+
|   56 |         2 |  monetdb | 2013-12-23 12:35:03.000000 | running | select                                                       |
:      :           :          :                            :         :  sum(l_extendedprice * l_discount) as revenue                :
:      :           :          :                            :         :  from                                                        :
:      :           :          :                            :         :   lineitem                                                   :
:      :           :          :                            :         :  where                                                       :
:      :           :          :                            :         :   l_shipdate >= date \'1994-01-01\'                          :
:      :           :          :                            :         :   and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :           :          :                            :         :   and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :           :          :                            :         :   and l_quantity < 24;                                       :
|   57 |         1 | monetdb  | 2013-12-23 12:36:11.000000 | running | select * from sys.queue;       |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
2 tuples

sql>call sys.pause(56);
sql>select * from sys.queue;
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                          | progress | workers | memory |
+======+===========+==========+============================+=========+================================+==========+=========+========+
|   56 |         2 |  monetdb | 2013-12-23 12:35:03.000000 | paused  | select                                                       |
:      :           :          :                            :         :  sum(l_extendedprice * l_discount) as revenue                :
:      :           :          :                            :         :  from                                                        :
:      :           :          :                            :         :   lineitem                                                   :
:      :           :          :                            :         :  where                                                       :
:      :           :          :                            :         :   l_shipdate >= date \'1994-01-01\'                          :
:      :           :          :                            :         :   and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :           :          :                            :         :   and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :           :          :                            :         :   and l_quantity < 24;                                       :
|   59 |         1 | monetdb  | 2013-12-23 12:37:21.000000 | running | select * from sys.queue;       |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
2 tuples

sql>call sys.resume(56);
sql>select * from sys.queue;
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                          | progress | workers | memory |
+======+===========+==========+============================+=========+================================+==========+=========+========+
|   56 |         2 |  monetdb | 2013-12-23 12:35:03.000000 | running | select                                                       |
:      :           :          :                            :         :  sum(l_extendedprice * l_discount) as revenue                :
:      :           :          :                            :         :  from                                                        :
:      :           :          :                            :         :   lineitem                                                   :
:      :           :          :                            :         :  where                                                       :
:      :           :          :                            :         :   l_shipdate >= date \'1994-01-01\'                          :
:      :           :          :                            :         :   and l_shipdate < date \'1994-01-01\' + interval \'1\' year :
:      :           :          :                            :         :   and l_discount between 0.06 - 0.01 and 0.06 + 0.01         :
:      :           :          :                            :         :   and l_quantity < 24;                                       :
|   62 |         1 | monetdb  | 2013-12-23 12:38:44.000000 | running | select * from sys.queue;       |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+--------------------------------+----------+---------+--------+
2 tuples

sql>call sys.stop(56);
sql>select * from sys.queue;
+------+-----------+----------+----------------------------+---------+--------------------------+----------+---------+--------+
| tag  | sessionid | username | started                    | status  | query                    | progress | workers | memory |
+======+===========+==========+============================+=========+==========================+==========+=========+========+
|   64 |         1 | monetdb  | 2013-12-23 12:39:55.000000 | running | select * from sys.queue; |        0 |       0 |      0 |
+------+-----------+----------+----------------------------+---------+--------------------------+----------+---------+--------+
1 tuple

Table statistics

Table statistics mk Tue, 09/16/2014 - 14:53

Statistics gathered over tables in the database can be found in the system table sys.statistics. This table is initially empty and explicitly filled or updated using the ANALYZE command:

ANALYZE schemaname [ '.' tablename [ '('columnname , ...')' ] ]
      [ SAMPLE size ]
      [ MINMAX ]

You can gather statistics for a) all tables in a schema, b) all columns of a specific table, or c) a list of specific columns of one table. Statistics can only be gathered for tables with real physical column data, so not for views.

Since statistics gathering involves accessing and profiling the data of all table columns, it can take considerable time, especially if the tables are large or you analyze all tables in a schema. You may consider adding the optional MINMAX keyword, which excludes the expensive unique-count operation. Likewise, a SAMPLE size can be used to get a quick, but imprecise, impression.
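
For example, to gather statistics for a whole schema, for a single table, or for just a few columns, and to inspect the result afterwards (the schema, table and column names below are only illustrations):

ANALYZE voc;                      -- all tables in schema voc
ANALYZE voc.voyages;              -- all columns of the voyages table
ANALYZE voc.voyages (departure_date, arrival_date) SAMPLE 10000 MINMAX;
                                  -- two columns, quick but imprecise

SELECT * FROM sys.statistics;     -- inspect the gathered statistics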

You can remove statistics data with the SQL command DELETE FROM sys.statistics. For example, to remove the statistics for table my_schema.my_table:

DELETE FROM sys.statistics
 WHERE column_id IN (SELECT c.id FROM sys.columns c
 WHERE c.table_id IN (SELECT t.id FROM sys.tables t
 WHERE t.schema_id IN (SELECT s.id FROM sys.schemas s
 WHERE s.name = 'my_schema') AND t.name = 'my_table'));
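
The same cleanup can also be written with explicit joins over the system catalog tables, which some find easier to read (a sketch equivalent to the nested query above):

DELETE FROM sys.statistics
 WHERE column_id IN (
       SELECT c.id
         FROM sys.columns c
         JOIN sys.tables  t ON c.table_id  = t.id
         JOIN sys.schemas s ON t.schema_id = s.id
        WHERE s.name = 'my_schema'
          AND t.name = 'my_table');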


Date and Time functionality

Date and Time functionality zhang Tue, 09/05/2017 - 12:38

MonetDB offers extensive functionality for storing and working with date and time values.

For the examples here we have extended the voyages table in our VOC data set with randomly filled time columns, and also created timestamp columns by appending the random times to the already existing dates. The following columns from the voyages table are used:

Column name           Data type   In original data set   Description
departure_date        date        yes                    The date at which the voyage started
departure_time        time        no                     The time at which the voyage started
departure_timestamp   timestamp   no                     Combination of departure_date and departure_time
arrival_date          date        yes                    The date at which the voyage ended
arrival_time          time        no                     The time at which the voyage ended
arrival_timestamp     timestamp   no                     Combination of arrival_date and arrival_time

The additional columns are created and populated with the following queries:

ALTER TABLE voyages ADD COLUMN departure_time TIME;
ALTER TABLE voyages ADD COLUMN departure_timestamp TIMESTAMP;
ALTER TABLE voyages ADD COLUMN arrival_time TIME;
ALTER TABLE voyages ADD COLUMN arrival_timestamp TIMESTAMP;

-- fill the new time columns with random times of day
UPDATE voyages SET departure_time = STR_TO_TIME(RAND(), '%s');
UPDATE voyages SET arrival_time   = STR_TO_TIME(RAND(), '%s');

-- combine the existing dates with the random times into timestamps
UPDATE voyages SET departure_timestamp =
  str_to_timestamp(date_to_str(departure_date, '%Y-%m-%d') || ' ' ||
  time_to_str(departure_time, '%H:%M'), '%Y-%m-%d %H:%M');
UPDATE voyages SET arrival_timestamp   =
  str_to_timestamp(date_to_str(arrival_date, '%Y-%m-%d')   || ' ' ||
  time_to_str(arrival_time, '%H:%M'), '%Y-%m-%d %H:%M');

Temporal types

In order to store and work with date and time values, MonetDB defines the following temporal data types:

  • DATE: calendar date in the Gregorian calendar, e.g. 1999-12-31.
  • TIME: time of day (24-hour clock) with a certain (configurable) precision and time zone.
  • TIMESTAMP: date and time data type combined.
  • INTERVAL <interval_qualifier>: temporal interval, e.g. 1023.5 seconds, 50 days, 31 months.
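
Values of these types can also be written directly as literals, using the type name followed by a string. A few illustrative examples (not exhaustive):

SELECT DATE '1999-12-31'               AS d,
       TIME '23:59:59'                 AS t,
       TIMESTAMP '1999-12-31 23:59:59' AS ts,
       INTERVAL '50' DAY               AS i;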

Using the current date or time

There are some standard SQL keywords that return the current date, time and timestamp:

Query                       Result type   With time zone   Example result
SELECT CURRENT_DATE;        date          no               2015-05-22
SELECT CURRENT_TIME;        time          yes              14:18:17.780330+02:00
SELECT CURRENT_TIMESTAMP;   timestamp     yes              2015-05-22 14:18:17.780331+02:00
SELECT NOW;                 timestamp     yes              2015-05-22 14:18:17.780332+02:00
SELECT LOCALTIME;           time          no               14:18:17.780333
SELECT LOCALTIMESTAMP;      timestamp     no               2015-05-22 14:18:17.780334

These functions are very useful for many use cases, for example, when we need to store the time a tuple was inserted into MonetDB. In that case we can create a table like this:

CREATE TABLE t (a INT, b TIMESTAMP DEFAULT NOW);

If we then insert data into this table as follows (with a short delay between executing the two statements):

INSERT INTO t (a) VALUES (3), (5);
INSERT INTO t (a) VALUES (1);

Then the tuples in table t will have recorded the timestamp at which they were inserted into MonetDB (for how to bulk load a date, time or timestamp column from a CSV file, please see CSV Bulk Loads):

SELECT * FROM t;
+------+----------------------------+
| a    | b                          |
+======+============================+
| 3    | 2015-11-26 09:17:03.328368 |
| 5    | 2015-11-26 09:17:03.328368 |
| 1    | 2015-11-26 09:17:07.795224 |
+------+----------------------------+

A more interesting use case is counting how many voyages started more than 400 years ago:

SELECT COUNT(*) FROM voyages WHERE departure_timestamp < NOW - INTERVAL '400' YEAR;
+------+
| L1   |
+======+
| 319  |
+------+

The above query uses date computation to subtract 400 years from the current date and time. For more information about date computations, see the section below on date computation.

Generating dates and times from strings

There are many use cases where we want to have a date or a time that is not at all related to the current date or time. In that case, we can construct date and time values by passing MonetDB a string, along with a second string that tells MonetDB how the first string should be interpreted. This is done using so-called date/time format specifiers.

SELECT str_to_date('23-09-1987', '%d-%m-%Y') AS "date",
       str_to_time('11:40', '%H:%M') AS "time",
       str_to_timestamp('23-09-1987 11:40', '%d-%m-%Y %H:%M') AS "timestamp";
+------------+----------+----------------------------+
| date       | time     | timestamp                  |
+============+==========+============================+
| 1987-09-23 | 11:40:00 | 1987-09-23 11:40:00.000000 |
+------------+----------+----------------------------+

For a complete list of date/time format specifiers, including their meaning, see the section below on date/time format specifiers.

Extracting information from dates and times

Dates and times are stored in such a way that MonetDB can do efficient computations on them. However, sometimes this native format is not what we desire. For example, we might want to create strings from date or time values, formatted in a way that we define ourselves. This is where the date/time to string conversion functions come in:

SELECT departure_timestamp,
       date_to_str(departure_date, '%Y-%m-%d') AS "date",
       time_to_str(departure_time, '%H:%M') AS "time",
       timestamp_to_str(departure_timestamp, '%Y-%m-%d %H:%M') AS "timestamp"
FROM voyages LIMIT 3;
+----------------------------+------------+-------+------------------+
| departure_timestamp        | date       | time  | timestamp        |
+============================+============+=======+==================+
| 1595-04-02 10:34:46.000000 | 1595-04-02 | 10:34 | 1595-04-02 10:34 |
| 1595-04-02 00:52:57.000000 | 1595-04-02 | 00:52 | 1595-04-02 00:52 |
| 1595-04-02 08:01:55.000000 | 1595-04-02 | 08:01 | 1595-04-02 08:01 |
+----------------------------+------------+-------+------------------+

You can also extract certain portions from date, time and timestamp fields, using the following keywords in combination with the EXTRACT(<keyword> FROM <date/time/timestamp field>) function:

  • YEAR
  • MONTH
  • DAY
  • HOUR
  • MINUTE
  • SECOND
  • CENTURY
  • DECADE
  • QUARTER
  • WEEK
  • DOW
  • DOY
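
For instance, several of these fields can be extracted in a single query (here applied to a date literal for illustration):

SELECT EXTRACT(YEAR    FROM DATE '1987-09-23') AS "year",
       EXTRACT(QUARTER FROM DATE '1987-09-23') AS "quarter",
       EXTRACT(DOY     FROM DATE '1987-09-23') AS "doy";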

We can use this to compute how many years ago the first and the last recorded departure occurred:

SELECT
  MAX(EXTRACT(YEAR FROM CURRENT_DATE) - EXTRACT(YEAR FROM departure_date)) AS first,
  MIN(EXTRACT(YEAR FROM CURRENT_DATE) - EXTRACT(YEAR FROM departure_date)) AS last
FROM voyages;
+-------+------+
| first | last |
+=======+======+
| 420   | 219  |
+-------+------+

So from the year of this writing (2015), the recorded voyage departures were between 219 and 420 years ago.

Now let us use MonetDB to compute something really interesting. The table "total" contains statistics about the number of deaths on every voyage (in the columns "death_at_cape" and "death_during_voyage"). We can join this table with the voyages table and then use the EXTRACT function to group the total number of deaths by the year in which the voyage started. This result can then be used to create a top 10 of the departure years in which the most deaths occurred.

SELECT EXTRACT(YEAR FROM departure_date) AS "year",
       SUM(death_at_cape + death_during_voyage) AS deaths
  FROM voyages JOIN total
    ON voyages.number = total.number
   AND voyages.number_sup = total.number_sup
 GROUP BY "year"
 ORDER BY deaths DESC
 LIMIT 10;
+------+--------+
| year | deaths |
+======+========+
| 1771 | 2993   |
| 1772 | 1094   |
| 1767 | 1016   |
| 1773 | 380    |
| 1766 | 307    |
| 1775 | 298    |
| 1774 | 158    |
| 1726 | 154    |
| 1619 | 151    |
| 1671 | 149    |
+------+--------+

Apparently, the departure years with the most deaths are concentrated between 1766 and 1775.

Date computation

MonetDB can do native computations on two similar date or time fields, and on a date or time field combined with an interval. Subtracting two similar date or time fields is done by converting the operands to days in the case of date fields, and to seconds otherwise. For example:

SELECT CURRENT_TIME - departure_time AS timediff_sec
     , arrival_date - departure_date AS journey_days
FROM voyages LIMIT 3;
+--------------+--------------+
| timediff_sec | journey_days |
+==============+==============+
| 7227.000     | 431          |
| 42136.000    | 431          |
| 16398.000    | 431          |
+--------------+--------------+

We can use this to get the minimum and maximum number of days taken by a voyage:

SELECT MIN(arrival_date - departure_date) AS voyage_min_days,
       MAX(arrival_date - departure_date) AS voyage_max_days
FROM voyages;
+-----------------+-----------------+
| voyage_min_days | voyage_max_days |
+=================+=================+
| -218944         | 73237           |
+-----------------+-----------------+

This result leads us to believe there must be inconsistencies in the data, since negative journey times should not occur and a voyage that took more than 20 years might be unlikely as well.
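
A quick way to locate such suspect rows is a simple sanity check on the same columns, for example:

SELECT COUNT(*) AS suspect_voyages
  FROM voyages
 WHERE arrival_date < departure_date;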

For other computations, an interval is taken as a second argument and the result is of the same type as the first argument.

We can use this to count the number of voyages that have finished within a year:

SELECT COUNT(*)
FROM voyages
WHERE arrival_date < departure_date + INTERVAL '1' YEAR;
+------+
| L1   |
+======+
| 7339 |
+------+

Time zones

Every MonetDB connection can have an associated time zone:

SELECT NOW;
+----------------------------------+
| current_timestamp                |
+==================================+
| 2015-11-26 13:46:17.643209+01:00 |
+----------------------------------+

This shows the time, assuming we are in time zone GMT +01:00.

We can change our current time zone as follows:

SET TIME ZONE INTERVAL '+03:00' HOUR TO MINUTE;

After changing the time zone, the same query now gives a different result:

SELECT NOW;
+----------------------------------+
| current_timestamp                |
+==================================+
| 2015-11-26 15:46:18.378875+03:00 |
+----------------------------------+

As you can see, it adds two hours to the displayed time compared to the previous time we executed this query.

Interval conversions

As you might have noticed in the previous section, the time zone was specified as an interval that takes an hour/minute string and converts it to a time offset. Strictly speaking, this is an interval literal with a qualifier, and the following qualifier combinations are possible:

  • YEAR TO MONTH
  • DAY TO SECOND
  • HOUR TO MINUTE
  • HOUR TO SECOND

Example usage:

SELECT INTERVAL '2015-09-23' YEAR TO MONTH AS y2m,
       INTERVAL '+01:00' HOUR TO MINUTE AS h2m,
       INTERVAL '+01:30:15' HOUR TO SECOND AS h2s,
       INTERVAL '1 01:30:15' DAY TO SECOND AS d2s;
+-------+----------+----------+-----------+
| y2m   | h2m      | h2s      | d2s       |
+=======+==========+==========+===========+
| 24189 | 3600.000 | 5415.000 | 91815.000 |
+-------+----------+----------+-----------+
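
These interval literals can also be used directly in date and time arithmetic, just like the YEAR intervals shown earlier. A small sketch (the result obviously depends on the current date and time):

SELECT NOW                                       AS right_now,
       NOW + INTERVAL '1 01:30:15' DAY TO SECOND AS a_bit_later;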

Date/time format specifiers

All possible format specifiers are listed below (taken from the Linux Programmer’s Manual).

 %a   The abbreviated name of the day of the week according to the current locale.
 %A   The full name of the day of the week according to the current locale.
 %b   The abbreviated month name according to the current locale.
 %B   The full month name according to the current locale.
 %c   The preferred date and time representation for the current locale.
 %C   The century number (year/100) as a 2-digit integer. (SU)
 %d   The day of the month as a decimal number (range 01 to 31).
 %D   Equivalent to %m/%d/%y. (Yecch—for Americans only. Americans should note that in other countries %d/%m/%y is rather common. This means that in international context this format is ambiguous and should not be used.) (SU)
 %e   Like %d, the day of the month as a decimal number, but a leading zero is replaced by a space. (SU)
 %E   Modifier: use alternative format, see below. (SU)
 %F   Equivalent to %Y-%m-%d (the ISO 8601 date format). (C99)
 %G   The ISO 8601 week-based year (see NOTES) with century as a decimal number. The 4-digit year corresponding to the ISO week number (see %V). This has the same format and value as %Y, except that if the ISO week number belongs to the previous or next year, that year is used instead. (TZ)
 %g   Like %G, but without century, that is, with a 2-digit year (00-99). (TZ)
 %h   Equivalent to %b. (SU)
 %H   The hour as a decimal number using a 24-hour clock (range 00 to 23).
 %I   The hour as a decimal number using a 12-hour clock (range 01 to 12).
 %j   The day of the year as a decimal number (range 001 to 366).
 %k   The hour (24-hour clock) as a decimal number (range 0 to 23); single digits are preceded by a blank. (See also %H.) (TZ)
 %l   The hour (12-hour clock) as a decimal number (range 1 to 12); single digits are preceded by a blank. (See also %I.) (TZ)
 %m   The month as a decimal number (range 01 to 12).
 %M   The minute as a decimal number (range 00 to 59).
 %n   A newline character. (SU)
 %O   Modifier: use alternative format, see below. (SU)
 %p   Either "AM" or "PM" according to the given time value, or the corresponding strings for the current locale. Noon is treated as "PM" and midnight as "AM".
 %P   Like %p but in lowercase: "am" or "pm" or a corresponding string for the current locale. (GNU)
 %r   The time in a.m. or p.m. notation. In the POSIX locale this is equivalent to %I:%M:%S %p. (SU)
 %R   The time in 24-hour notation (%H:%M). (SU) For a version including the seconds, see %T below.
 %s   The number of seconds since the Epoch, 1970-01-01 00:00:00 +0000 (UTC). (TZ)
 %S   The second as a decimal number (range 00 to 60). (The range is up to 60 to allow for occasional leap seconds.)
 %t   A tab character. (SU)
 %T   The time in 24-hour notation (%H:%M:%S). (SU)
 %u   The day of the week as a decimal, range 1 to 7, Monday being 1. See also %w. (SU)
 %U   The week number of the current year as a decimal number, range 00 to 53, starting with the first Sunday as the first day of week 01. See also %V and %W.
 %V   The ISO 8601 week number (see NOTES) of the current year as a decimal number, range 01 to 53, where week 1 is the first week that has at least 4 days in the new year. See also %U and %W. (SU)
 %w   The day of the week as a decimal, range 0 to 6, Sunday being 0. See also %u.
 %W   The week number of the current year as a decimal number, range 00 to 53, starting with the first Monday as the first day of week 01.
 %x   The preferred date representation for the current locale without the time.
 %X   The preferred time representation for the current locale without the date.
 %y   The year as a decimal number without a century (range 00 to 99).
 %Y   The year as a decimal number including the century.
 %z   The +hhmm or -hhmm numeric timezone (that is, the hour and minute offset from UTC). (SU)
 %Z   The timezone name or abbreviation.
 %+   The date and time in date(1) format. (TZ) (Not supported in glibc2.)
 %%   A literal '%' character.
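
As an illustration of these specifiers, the conversion functions introduced earlier can, for instance, parse epoch seconds (%s) and render a verbose timestamp (the epoch value below is chosen arbitrarily):

SELECT str_to_timestamp('1445964000', '%s') AS from_epoch_seconds,
       timestamp_to_str(str_to_timestamp('1445964000', '%s'),
                        '%A %d %B %Y, %H:%M:%S') AS verbose;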


Lazy Logical Replication

Lazy Logical Replication mk Mon, 05/01/2017 - 21:55

Lazy logical replication has been added to the source code to realize an asynchronous logical replication management scheme using change set forwarding. Simplicity and ease of end-user control have been the driving arguments in its development.

What is Lazy Logical Replication?

In a data analytics environment the workload on a database is largely read only. Applications grind the data for business insights and summarizations through visual dashboards. Updates are often collected as (micro-) batches and injected into the data warehouse at regular intervals.

If the ingestion rate increases, updates become more complex, or the number of concurrent data analysis applications rises, it becomes mandatory to create a master/replica infrastructure. The master instance is responsible for handling all updates to the database, and replica instances are created to satisfy the responsiveness required by the applications.

A key observation for this common business scenario is that the replicas may lag a little behind, because data analysts often look at long-term patterns using statistical summarizations, so the outcome is less dependent on what happened during the last minute. Furthermore, the replicated data warehouse is likely to run in a Cloud setting, or a cluster with a shared global filesystem. This creates room to simplify the synchronisation between instances, relying on the services provided by the filesystem. In particular, master and replicas share the same <dbfarm> directory.

The Lazy Logical Replication technique rests on detecting change sets in the persistent tables at the master instance, which are collected in a transactional safe way, and replayed at the replica. Replay can be interrupted to obtain a time-warped copy of the database.

When to consider Lazy Logical Replication?

The goal of this extension module is to ease backup and replication of a complete master database with a time-bounded delay. This means that both the master and the replicas run at a certain beat (eg in seconds) by which information is made available by the master or read by the replicas. Such an instance can be freely used for query workload sharing, database versioning, and (re-)partitioning. For example, a replica can be used to support a web application which also keeps application specific data in the same instance, eg session information.

Tables taken from a master can be protected against updates and inspections in a replica instance using the schema access policies defined by the master. Furthermore, updates against replicated tables are not automatically forwarded to the master. Any transaction change set replay that fails stops the cloning process. By default, only persistent tables are considered for replication, and all constraints maintained by the master are carried over to the replicas. Updates under the 'tmp' schema, ie temporary tables, are ignored.

For a backup, we need either all update logs for the entire lifetime of a database, or a binary database snapshot with a collection of logs that have recorded all changes since the snapshot was created. For a replication, also called a database clone, we take a snapshot and the log files that reflect the recent changes. Then, the logged updates are replayed against the snapshot until a specific point in time or transaction id is reached, as identified by the clone itself.

The underlying assumption of the techniques deployed is that the database resides on a proper (global/distributed) file system to guarantee recovery from most storage-system related failures, e.g. using RAID disks or log-structured file systems.

How to set up a master instance?

The safest way to create a master/replica instance is to start with an empty master database. Alternatively, one can stop the master instance, take a binary copy of the <dbfarm>/<dbname> directory and save it for initialization of the replica. A database instance can be set into the 'master' mode only once using the SQL command:

        CALL wlc.master();

An optional path to the log record directory can be given to reduce the IO latency, e.g. using a nearby SSD, or where there is ample space to keep a long history, such as an HDD or a cold storage location. By default, the command creates a directory /<path-to>/<dbfarm>/<dbname>/wlc_logs to hold all logs, and a configuration file /<path-to>/<dbfarm>/<dbname>/wlc.config to hold the state of the transaction logs (WLC stands for Work Load Capture). It contains the following <key>=<value> pairs:

        snapshot=<path to a snapshot directory>
        logs=<path to the wlc log directory>
        state=<1: started, 2: stopped>
        batches=<next available batch file to be applied>
        beat=<maximal delay between log files, in seconds>
        write=<timestamp of the last transaction recorded>

A missing snapshot path denotes that we can start a replica from an empty database. The log files are stored as <dbname>_<batchnumber> in the "wlc_logs" directory. They belong to the snapshot. Each WLC log file contains a serial log of committed compound transactions. The log records are represented as ordinary MAL statement blocks, which are executed in serial mode. Each transaction is identified by a unique id, its starting time, and the responsible database user. The log records must end with a COMMIT to be allowed for re-execution. Log records with a ROLLBACK tag are merely for off-line analysis by the DBA.

A transaction log file is created by the master using a heartbeat in seconds. A new transaction log file is published, after the system has been collecting transaction records for some time. The beat can be set using the SQL command:

        CALL wlc.beat(<duration>);

Setting the master heartbeat to zero leads to one log file per transaction, which may result in a log directory with a potentially large number of files. A default of 5 minutes should balance the polling overhead in most practical situations. The log file is shared within 'beat' seconds after the first transaction record was written into it.
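
For example, to set the heartbeat to the suggested default of five minutes:

        CALL wlc.beat(300);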

The final step in the life time of a master instance is to stop transaction logging with the SQL command:

        CALL wlc.stop();

This marks the end-of-life time for a snapshot. For example, when planning to do a large bulk load of the database, stopping logging avoids a double write into the database. The database can only be brought back into the master mode using a fresh snapshot.

One of the key challenges for a DBA is to keep the log directory manageable, because it grows with the speed at which updates are applied to the database. This calls for regularly checking the disk footprint of the logs, and taking a new snapshot as a frame of reference. A master instance has no knowledge about the number of clones and their whereabouts. To ensure transaction ACID properties, the log records are stored on disk within the transaction brackets, which may cause extra I/O pressure. This can be alleviated by storing the database and log files on an SSD or a Non-Volatile-Memory (NVM) device.

How to make a replica instance?

Every clone starts off with a copy of the binary snapshot identified by 'snapshot'. A fresh database can be turned into a clone using the call:

        CALL wlr.replicate('mastername');

It will grab the latest snapshot of the master and apply all available log files before releasing the database. Progress of the replication can be monitored using the -fraw option in mclient.

The clone process will iterate in the background through the log files, applying all updating transactions. An optional timestamp or transaction id can be passed to the replicate() command to apply the logs until a specific moment or transaction. This is particularly useful when an unexpected disastrous user action, e.g. dropping a persistent table, has to be recovered from.

        CALL wlr.replicate('mastername');
        ...
        CALL wlr.replicate(NOW()); -- partial roll forward until timestamp
        ...
        CALL wlr.replicate();      -- continue nondisturbed synchronisation
        CALL wlr.replicate(N); -- continue until you have handled transaction N
        SELECT wlr.beat(); -- returns refresh rate. 
        SELECT wlr.clock(); -- returns the timestamp of the last replicated transaction.
        SELECT wlr.tick();  -- returns the transaction id of the last replicated transaction.
        SELECT wlr.accept();  -- skip the last failed transaction.

Any failure encountered during a change set replay terminates the replication process, leaving a message in the merovingian log.

Features to be considered beyond the Alpha release

  • Include master/replica management commands into the monetdb control program.
  • Turn off logging of the replicated tables in the replica instance, since we already have them.
  • Roll forward of SQL session variables, e.g. optimizer_pipe. For updates we don't need special care for this.
  • Agents (message bus) to ship the log and snapshot files between different file systems to improve local access speed.
  • Selective replication by partitioning the log files based on predicates.
  • Deploy parallel updates on the change set files.


Backup

Backup giulia Mon, 05/04/2020 - 14:15

MonetDB offers different options for backing up and dumping the database.

SQL dump and restore

SQL dump and restore mk Sun, 03/28/2010 - 13:56

An SQL dump of a database is a common method to safely store away a snapshot of the database for archival purposes or to migrate data between database instances, e.g. between two major system releases. The content of an SQL dump is a large collection of SQL statements as text. Running the script will recreate the database in the same state as it was when the dump was created. Since MonetDB does not provide global locking schemes, a time-consuming dump operation may become invalidated by a concurrent update query.

The primary tool to consider for making an SQL dump of a database is msqldump, which supports a wide variety of options:

Usage: msqldump [ options ] [ dbname ]

Options are:
 -h hostname | --host=hostname    host to connect to
 -p portnr   | --port=portnr      port to connect to
 -u user     | --user=user        user id
 -d database | --database=database  database to connect to
 -f          | --functions        dump functions
 -t table    | --table=table      dump a database table
 -D          | --describe         describe database
 -N          | --inserts          use INSERT INTO statements
 -q          | --quiet            don't print welcome message
 -X          | --Xdebug           trace mapi network interaction
 -?          | --help             show this usage message
--functions and --table are mutually exclusive

The msqldump program acts as a normal MonetDB client application, i.e. it runs concurrently with all other user sessions, and dumps the generated SQL statements onto standard output. Save this output to a file (by redirecting stdout) for a later restore session, in which you call mclient with the saved file as input argument. Details on the various arguments can be found in the manual page for mclient.

For example, suppose you have already installed the SQL tutorial database voc on a Linux platform and wish to transport it to another machine. Then the following command generates the dump file.

shell> msqldump --database=voc >/tmp/voc.sql

You can inspect the file /tmp/voc.sql to confirm that a readable SQL database dump is indeed available. If storage space is a problem, you can readily pipe the output of this command to a compression tool, e.g. gzip, or send it directly to another machine in your environment using conventional Linux tools, e.g. scp. As a baseline for recovery, move the file over to the new machine. Then the monetdb tool can be used to create the database on the recipient machine. Once done, it suffices to feed the dump file to mclient to populate the database.

Creation of the dump respects your MonetDB server login credentials, which means you only can dump the tables you have access to.

Migration of a database from other database systems follows the same route, but be aware that SQL dialects often differ. A manual or scripted patch of a foreign SQL dump is often needed. An overview of the various SQL dialects can be found in the SQL dialects book and general background on database compatibility on Wikipedia.

shell> monetdb release demo

Windows platforms

On Windows platforms we advise to always dump the old database into a mydbdump.sql file before upgrading to a new MonetDB software release. Subsequently, stop the MonetDB server and rename or remove the dbfarm\demo directory. After upgrading the MonetDB software and restarting the MonetDB server, the dump can be restored.

To dump the SQL database, start the MonetDB SQL Client program (mclient.bat) and type the command:

\>C:\...\mydbdump.sql
\D
\>

The path after \> should be an absolute path name (i.e. start with a drive letter) and point to a safe location. By default the database is located in the dbfarm\demo folder inside %APPDATA%\MonetDB5. After the database dump has been made, this folder can be removed.

Restoring an SQL database can be done using the MonetDB SQL Client program (mclient.bat) with the following command:

\<C:\...\mydbdump.sql

Hot backup

Hot backup dinther Thu, 07/30/2020 - 20:21

Creating a full database backup online

Use the system procedure sys.hot_snapshot(full_path_to_tar_file string)
to quickly create a binary copy of all persisted data files of the currently connected database into one tar file.
To activate it, simply execute the SQL statement:

call sys.hot_snapshot('/tmp/db_backup_2020_07_27.tar');

or on MS Windows (note the R string value prefix to interpret the string characters as Raw):

call sys.hot_snapshot(R'D:\Temp\db_backup_2020_07_27.tar');

This will create a tar file with the given name at the given location on the server.

Warning: the resulting tar file may become very big (terabytes), depending on the size of your database on the file system. Check your database size (in the dbfarm directory) and ensure your file system has enough free disk space to store a copy of it as a tar file.

Warning: always include the .tar part in the file name, as done in the examples above, since some tools (on Windows) use it to associate the proper program for opening it.

Note: Procedure sys.hot_snapshot() was introduced in release Jun2020 (11.37.7) and is not available in older releases.


Optionally the tar file can be compressed by adding an extra file name suffix: .lz4 or .gz or .bz2 or .xz.

call sys.hot_snapshot('/tmp/db_backup_2020_07_27.tar.lz4');

or on MS Windows:

call sys.hot_snapshot(R'D:\Temp\db_backup_2020_07_27.tar.gz');

It is recommended to use no compression or fast lz4 compression, so the tar file is written as fast as possible. This keeps the database locked for as short a time as possible while the data files are being copied.
The gz and bz2 compressions take (much) more time. The xz compression compresses best, but also takes the most time to complete and thus keeps the database locked the longest. It can be useful when you want the smallest backup file size and no other users are connected to the database server.

Note: for releases Jun2020 and Jun2020-SP1 on MS Windows, only the .gz and .bz2 compression formats were supported. Since the Oct2020 release, the .xz and .lz4 compression formats are also supported on MS Windows. We recommend using no compression, or gz or xz compression, on Windows.


By default only the monetdb user (the admin) may execute this system procedure. If needed, the monetdb user may grant the execute privilege to other users of the database, for instance to user mr_backup via SQL:

grant execute on procedure sys.hot_snapshot to mr_backup;


Restoring a database from the (compressed) tar file

The tar file expands to a single directory with the same name as the database that was snapshotted.
This directory can be passed directly as the --dbpath startup argument of mserver5 or it can be copied into an existing dbfarm and started from monetdbd.
For example, on the machine with the database:

$ monetdb status
name  state   health               remarks
tpch  R  8m  100%  0s  mapi:monetdb://hank:50000/tpch
$ mclient -d tpch -s "call sys.hot_snapshot('/tmp/tpch.tar.gz')"
$ ls -l /tmp/tpch.tar.gz
-rw------- 1 jvr jvr 13337600 Dec  4 11:16 /tmp/tpch.tar.gz

On the other machine:

$ monetdbd create $HOME/myfarm
$ monetdbd start $HOME/myfarm
$ tar -C $HOME/myfarm -axf /tmp/tpch.tar.gz
$ monetdb status
name  state   health               remarks
tpch  S                 mapi:monetdb://hank:50000/tpch
$ mclient -d tpch -s "select count(*) from lineitem"
+-------+
| %1    |
+=======+
| 60175 |
+-------+
1 tuple

To expand the tar file into a directory with a different name, for example because there already is a database with the original name, use the `--strip-components` option of (GNU) tar:

$ mkdir $HOME/myfarm/tpch2
$ tar -C $HOME/myfarm/tpch2 --strip-components=1 -axf /tmp/tpch.tar.gz
$ monetdb status
name   state   health               remarks
tpch   R 11m  100%  0s  mapi:monetdb://hank:50000/tpch
tpch2  S                mapi:monetdb://hank:50000/tpch2

In this example, we manually create the directory tpch2 in the dbfarm. The -C option makes tar unpack in this directory. The --strip-components=1 option removes the original directory name tpch/ from the extracted files.
Alternatively, we can avoid the mkdir by using the even more nonstandard --one-top-level option.

$ tar -C $HOME/myfarm --one-top-level=tpch3 --strip-components=1 -axf /tmp/tpch.tar.gz


On MS Windows we recommend users to install the 7-zip utility to open the (compressed) tar file and extract the database files.


File system level backup

File system level backup mk Mon, 03/24/2014 - 22:14

Aside from SQL dumps, one can also make a backup at the file system level by taking the database server out of production using the following commands. Be careful when using shutdown(), because there may be users running transactions. This can be checked using the sys.sessions table or the system monitor.

shell> mclient -d demo
select * from sys.sessions;
call sys.shutdown(10);
<cntrl-d>
shell> monetdb lock demo
shell> monetdb stop demo

The "lock" step is necessary to prevent monetdbd from automatically restarting the database "demo" upon receiving another client connection request.

After the database has been securely stopped, we can create a copy of the database directory (including all subdirectories) in the dbfarm and put it aside in a safe place. You may want to use an archiver (e.g. tar) and a compression program (gzip, 7-zip) to reduce the size of the copied directory. Alternatively, incremental file-system dumps can be used to reduce the time and storage space for a recovery point. Finally, the database is released for production again using:

shell> monetdb release <dbname>