User guide

A MonetDB database instance manages a single database on a single system. It is perfectly possible to have multiple database instances run concurrently on a single system, but its resource claim may negatively affect the performance perceived.

This section takes you from the easy installation to running a tutorial.

Installation

The MonetDB code is available in easy to install packages for system administrators. The platforms supported with their detailed requirements are shown on the download menu. A private install can be obtained by downloading and compiling the source distribution.  You don't need super-user or system administrator priviliges for this.

After installation of MonetDB software you will find a number of programs in the bin sub-directory of your installation directory. Below is an overview of the server and client programs distributed with MonetDB.

Server program Description Documentation
monetdbd The MonetDB Database Server daemon.Once started it runs in the background and facilitates, amongst others, management of local (and remote) mserver5's. The daemon is controlled by the client application monetdb. Note: monetdbd is currently only available on UNIX-like platforms. [See manpage]
mserver5 The MonetDB server, version 5. It runs as a console program. Client programs (except monetdb) connect to this server process. [See manpage]
Client program Description Documentation
monetdb A command-line program to control the monetdbd deamon server by executing a single command. It allows you for instance to start & stop an mserver5 process or create a new database. [See manpage]
mclient A command-line program to interactively communicate to a running mserver5 process. The program is intended for technical persons, e.g. DBA's, SQL developers and system adminstrators. [See manpage]
jdbcclient.jar A Java command-line program to interactively communicate to a running mserver5 process via the MonetDB JDBC driver. The program is intended for technical persons, e.g. SQL developers and DBA's. Note: This program is a separately downloaded. [See manpage]
msqldump A command-line program to export the structure and content of a database or a specific table to a file. This allows you to quickly export and re-create the database or table on a different server. [See manpage]

Please note that all these programs are started from a command shell (such as bash or cmd.exe) or invoked from a script file. They do not have a Graphical User Interface (GUI)

For a user you may prefer to use more user-friendly GUI-programs. Many good GUI-programs are available. Those generic GUI-programs typically allow you to connect to any DBMS via standard database connectivity API such as ODBC or JDBC. MonetDB provides both an ODBC driver and a JDBC driver. The JDBC driver is usable for programs written in Java. There are also MonetDB server connectivity API libraries available for populair programming languages such as Python, PHP, Perl. [More].

If you need the MonetDB ODBC driver or JDBC driver, you will need to download, install and configure them at the computer OS where the GUI-program is installed or started. For the ODBC driver you will need system administrator privileges to install the MonetDB ODBC driver components. For the JDBC driver no sys-admin privileges are needed, you only have to download the single jar (Java ARchive) library file (such as monetdb-jdbc-2.19.jar) from the download area and configure in your GUI-program where the jar file is stored (or add it to the CLASSPATH environment setting or java -cp startup parameter). The MonetDB JDBC driver is a 100% pure Java (type 4) driver, which means the same JDBC driver can be deployed on all platforms which support the Java Runtime Environment (JRE) version 1.7 or higher.

Tutorial

 VOCThe MonetDB/SQL tutorial is based on the material published in the book J.R. Bruijn, F.S. Gaastra and I. Schaar, Dutch-Asiatic Shipping in the 17th and 18th Centuries, which gives an account of the trips made to the East and ships returned safely (or wrecked on the way). A total of 8000 records are provided. They include information about ship name and type, captain, the arrival/departure of harbors along the route, personnel accounts, and anecdotal information.

Get going quickly on Linux involves starting the MonetDB daemon monetdbd, code name meriovingian. It runs in the background and facilitates, amongst others, management of local/remote servers.  The daemon is controlled by the application monetdb. See its documentation for all details. The monetdbd is not available on Windows, but these systems a default database called demo is made ready by default (Windows tutorial is available here). 
 
Here we demonstrate a simple session involving creation of a database, loading data, and querying. 
shell> monetdbd create /path/to/mydbfarm
shell> monetdbd start /path/to/mydbfarm
shell> monetdb create voc
shell> monetdb release voc
shell> mclient -u monetdb -d voc
password:<monetdb>
Welcome to mclient, the MonetDB/SQL interactive terminal (Oct2014-SP1)
Database: MonetDB v11.19.7 (Oct2014-SP1), 'mapi:monetdb://localhost:50000/voc'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>SELECT 'hello world';
+---------------+
| single_value  |
+===============+
| hello world   |
+---------------+
1 tuple (0.530ms)
sql>\q

The command \q or end-of-file <Ctrl-d> signal terminates the connection with the server. Exploring the wealth of functionality offered by MonetDB/SQL is best started using a toy database. For this we use the VOC database which provides a peephole view into the administrative system of an early multi-national company, the Vereenigde geoctrooieerde Oostindische Compagnie (VOC for short - The (Dutch) East Indian Company) established on March 20, 1602. Download the VOC dataset voc_dump.zip (542K) gz (519K) bz2 (371K) which is a compressed file with SQL statements. After the file has been extracted, load its contents into MonetDB using mclient. Before you load the VOC data set, it is advised to introduce a user different from the omnipresent default monetdb. The new user is given his own schema for the MonetDB database. Assuming you have started MonetDB with SQL module, proceed as follows:

shell> mclient -u monetdb -d voc
password:<monetdb>
sql>CREATE USER "voc" WITH PASSWORD 'voc' NAME 'VOC Explorer' SCHEMA "sys";
sql>CREATE SCHEMA "voc" AUTHORIZATION "voc";
sql>ALTER USER "voc" SET SCHEMA "voc";
sql>\q

To illustrate the use of the newly created account and schema, the following example connects and creates a table, which is dropped afterwards by aborting the transaction.

shell> mclient -u voc -d voc
password:<voc>
sql>START TRANSACTION;
auto commit mode: off

sql>CREATE TABLE test (id int, data varchar(30));
operation successful (0.722ms)

sql>\d
TABLE  voc.test

sql>\d test
CREATE TABLE "voc"."test" (
    "id"   INTEGER,
    "data" VARCHAR(30)
);
sql>CREATE TABLE "voc"."test" ("id" int, "data" varchar(30));
sql>ROLLBACK;
auto commit mode: on

sql>\d

Importing the voc_dump.sql file into the database can be done using the textual client interface. Some alternative ways are as follows:

1: shell> mclient -u voc -d voc voc_dump.sql
password:<voc>
2:shell> mclient -u voc -d voc < voc_dump.sql
password:<voc>
3:shell> mclient -u voc -d voc
password:<voc>
sql> \< voc_dump.sql

The VOC data set contains data for around 8000 voyages.
 

sql>\d
TABLE  voc.craftsmen
TABLE  voc.impotenten
TABLE  voc.invoices
TABLE  voc.passengers
TABLE  voc.seafarers
TABLE  voc.soldiers
TABLE  voc.total
TABLE  voc.voyages

Let's check out the data we just loaded. First, count the number of rows the voyages table.

sql>SELECT count(*) FROM voyages;
+------+
| L1   |
+======+
| 8131 |
+------+

The set consists of 8 tables, which are all bound to each other using FOREIGN KEY relationships. The voyages table is the main table, which all others refer to. Every table, except invoices has a PRIMARY KEY defined over the columns number and number_sup. Since the invoices table holds zero or more invoices per voyage (identified by number, number_sub) a PRIMARY KEY constraint is not possible. The tables craftsmen, impotenten, passengers, seafarers, and soldiers all share the same columns. We can define a VIEW that combines them all into one big table, to make them easier to access.

sql>CREATE VIEW onboard_people AS
SELECT * FROM (
SELECT 'craftsmen' AS type, craftsmen.* FROM craftsmen
UNION ALL
SELECT 'impotenten' AS type, impotenten.* FROM impotenten
UNION ALL
SELECT 'passengers' AS type, passengers.* FROM passengers
UNION ALL
SELECT 'seafarers' AS type, seafarers.* FROM seafarers
UNION ALL
SELECT 'soldiers' AS type, soldiers.* FROM soldiers
UNION ALL
SELECT 'total' AS type, total.* FROM total
) AS onboard_people_table;

The new view will show up and we can just use it as a normal table, to for instance calculate the number of records for each group of people:

sql>\d
TABLE  voc.craftsmen
TABLE  voc.impotenten
TABLE  voc.invoices
VIEW   voc.onboard_people
TABLE  voc.passengers
TABLE  voc.seafarers
TABLE  voc.soldiers
TABLE  voc.total
TABLE  voc.voyages
sql>SELECT type, COUNT(*) AS total 
FROM onboard_people GROUP BY type ORDER BY type;
+------------+-------+
| type       | total |
+============+=======+
| craftsmen  |  2349 |
| impotenten |   938 |
| passengers |  2813 |
| seafarers  |  4468 |
| soldiers   |  4177 |
| total      |  2467 |
+------------+-------+

Let's see how many impotenten we have in the dataset:

sql>SELECT count(*) FROM impotenten;
+------+
| L1   |
+======+
|  938 |
+------+

It is possible to play with the set in many ways, to find out several things that took place during the voyages of the ships, or the money that was earned. A few examples are shown below:

sql>SELECT COUNT(*) FROM voyages
WHERE particulars LIKE '%_recked%';
+------+
| L1   |
+======+
|  358 |
+------+
sql>SELECT chamber, CAST(AVG(invoice) AS integer) AS average
FROM invoices
WHERE invoice IS NOT NULL
GROUP BY chamber
ORDER BY average DESC;
+---------+---------+
| chamber | average |
+=========+=========+
| A       |  282996 |
| Z       |  259300 |
| H       |  150182 |
| R       |  149628 |
| D       |  149522 |
| E       |  149518 |
| null    |   83309 |
+---------+---------+
sql>CREATE VIEW extended_onboard AS
SELECT number, number_sup, trip, trip_sup,
    onboard_at_departure, death_at_cape,
    left_at_cape, onboard_at_cape,
    death_during_voyage, onboard_at_arrival,
    death_during_voyage - left_at_cape AS death_at_arrival
FROM onboard_people;
sql>\d
TABLE  voc.craftsmen
VIEW   voc.extended_onboard
TABLE  voc.impotenten
TABLE  voc.invoices
VIEW   voc.onboard_people
TABLE  voc.passengers
TABLE  voc.seafarers
TABLE  voc.soldiers
TABLE  voc.total
TABLE  voc.voyages

Windows tutorial

On Windows the first step is to initialize a MonetDB server by simply opening: 'Start -> Programs -> MonetDB -> Start server'. 

Subsequently you can start the textual interface (mclient) by opening: 'Start -> Programs -> MonetDB -> SQL client'.  The commands entered there are identical to those found on other platforms. To stop the server, you can simply close the MonetDB SQL Server window. 

Note the server by default only accepts connections originating from the local host. If you need other machines to access your database, change the configuration file by setting mapi_open=yes.

Exploring the wealth of functionality offered by MonetDB/SQL is best started using a toy database. For this we use the VOC database which provides a peephole view into the administrative system of an early multi-national company, the VOC (or Dutch East India Company).

Download the VOC dataset voc_dump.zip (542K) gz (519K) bz2 (371K) which is a compressed file with SQL statements (SQL database dump). After the file has been extracted, load its contents into MonetDB using the mclient.exe file. From within mclient run:

sql> \< voc_dump.sql

Since the SQL interface is the same on all platforms, from here on you can follow the later part of the regular SQL tutorial.

Changing database location

If you need (or want) to change the location of the database then editing of the monetdb script is required. Change the mclient.bat and M5server.bat files in 'C:\Program Files\MonetDB\MonetDB5'. In the M5server.bat file you would need to change the setting of the MONETDBFARM variable, and in mclient.bat you would need to add a -d option to the call of mclient.exe.  Note: you upgrade such changed files will be overwritten.

About the VOC dataset

The data for the tutorial is published in the book J.R. Bruijn, F.S. Gaastra and I. Schaar Dutch-Asiatic Shipping in the 17th and 18th Centuries, which gives an account of the trips made to the East and ships returned safely (or wrecked on the way) by the VOC (or Dutch East India Company). A total of 8000 records are provided. They include information about ship name and type, captain, the arrival/departure of harbors along the route, personnel accounts, and anecdotal information. You can find more about the VOC on Wikipedia.

MonetDB.R Tutorial

The MonetDB.R tutorial is designed to familiarise R users with the functionality of the MonetDB.R connector and dplyr library for data manipulation in R. 

Before stating with this tutorial, make sure you have a running instance on MonetDB to which you can connect. As such, the MonetDB/SQL tutorial is a recommended reading, even more, given that this page was adapted from the SQL tutorial.

To start, in case you have not done it yet, create a DB farm using the MonetDB daemon monetdbd and a new database called 'voc'. The daemon is controlled by the application monetdb. See its documentation for all details. Note: The monetdbd is not available on Windows. For that reason, on these systems a default database called demo is available by default (Windows-specific MonetDB tutorial is available here). 

From that moment on, you can start your favourite R shell, setup a MonetDB.R connection and process your data in R.
 
Now, install MonetDB.R from your R shell.
> install.packages("MonetDB.R")

At this time you will be prompted to select a mirror.

After MonetDB.R is installed you must load it.

> library(MonetDB.R)
Loading required package: DBI
Loading required package: digest

Now, let's create a connection to the database.

Note that by default MonetDB runs on port 50000, but MonetDB.R already knows that, thus there is no need to set the port. Also, in the tutorial we've use the demo database, which is the default on Windows.

> conn <- dbConnect(MonetDB.R(), host="localhost", dbname="demo", user="monetdb", password="monetdb")

Let's execute a dummy 'hello world' SQL query and in our database directly from R.

> dbGetQuery(conn,"SELECT 'hello world'")
  single_value
1  hello world

To illustrate the use of the newly created account and schema, the following example connects and creates a table, which is dropped afterwards by aborting the transaction.

> dbTransaction(conn)
> dbSendUpdate(conn,"CREATE TABLE test (id int, data varchar(30))")
> dbListTables(conn)
[1] "test"
> dbExistsTable(conn,"test")
[1] TRUE
> dbRollback(conn)
[1] TRUE
> dbListTables(conn)
character(0)

For the purpose of the tutorial we have prepared a TSV (tab-separated values) version of the VOC dataset, which contains data for around 8000 voyages. After unpacking the .zip you can import the data into the directly from R.
Using the monetdb.read.csv function we can load the specified file as a new table 'craftsmen', specifying the upper bound on rows to import (2500), the delimiter (\t). The last parameter indicates that at least 1474 rows need to be read, before before the function can safely auto-deretmine the datatype of the columns.

> file <- "voc_craftsmen.tsv"
> monetdb.read.csv(conn, file, "craftsmen", 2350, delim="\t", nrow.check=1474)
voc_craftsmen.tsv /Users/monetdb/voc_craftsmen.tsv 
    L1
1 2348

Now, let's repeat the same for the other files in the dataset.

> file <- "voc_impotenten.tsv"
monetdb.read.csv(conn, file, "impotenten", 1000, delim="\t", nrow.check=160)
file <- "voc_invoices.tsv"
monetdb.read.csv(conn, file, "invoices", 4000, delim="\t", nrow.check=1000)
file <- "voc_passengers.tsv"
monetdb.read.csv(conn, file, "passengers", 3000, delim="\t", nrow.check=1000)
file <- "voc_seafarers.tsv"
monetdb.read.csv(conn, file, "seafarers", 4500, delim="\t", nrow.check=4500)
file <- "voc_soldiers.tsv"
monetdb.read.csv(conn, file, "soldiers", 4500, delim="\t", nrow.check=4000)
file <- "voc_total.tsv"
monetdb.read.csv(conn, file, "total", 2500, delim="\t", nrow.check=2000)
file <- "voc_voyages.tsv"
monetdb.read.csv(conn, file, "voyages", 8200, delim="\t", nrow.check=5000)

dplyr

Next, let's install and load dplyr, which is designed to facilitate efficient datasets manipulation in R.

> install.packages("dplyr")
> library(dplyr)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’:
    filter, lag
The following objects are masked from ‘package:base’:
    intersect, setdiff, setequal, union

Make a new connection to the database for dplyr, specifying that the source is a MonetDB instance.

> monetdb_conn <- src_monetdb("demo")

Now, lets create non-matirialized views for all tables from the database we are going to need later in R. Note that none of these will actaully move data from MonetDB into R, rather we are just creating a reference to the tables.

> craftsmen <- tbl(monetdb_conn, "craftsmen")
impotenten <- tbl(monetdb_conn, "impotenten")
invoices <- tbl(monetdb_conn, "invoices")
passengers <- tbl(monetdb_conn, "passengers")
seafarers <- tbl(monetdb_conn, "seafarers")
soldiers <- tbl(monetdb_conn, "soldiers")
total <- tbl(monetdb_conn, "total")
voyages <- tbl(monetdb_conn, "voyages")

Let's check out the data we just loaded. First, let's see the dimensions and the column names of the voyages table.

> dim(voyages)
[1] 8131   22
> colnames(voyages)
 [1] "number"            "number_sup"        "trip"             
 [4] "trip_sup"          "boatname"          "master"           
 [7] "tonnage"           "type_of_boat"      "built"            
[10] "bought"            "hired"             "yard"             
[13] "chamber"           "departure_date"    "departure_harbour"
[16] "cape_arrival"      "cape_departure"    "cape_call"        
[19] "arrival_date"      "arrival_harbour"   "next_voyage"      
[22] "particulars"      

As you can see, the respose is actaully an R array.

Since we have more than 8000 rows, let's have a look at the first few.

> head(voyages)

If you are interested in what queries are executed in MonetDB you can turn on the query debug

> options(monetdb.debug.query=T)
> head(voyages)
QQ: 'SELECT "number", "number_sup", "trip", "trip_sup", "boatname", "master", "tonnage", "type_of_boat", "built", "bought", "hired", "yard", "chamber", "departure_date", "departure_harbour", "cape_arrival", "cape_departure", "cape_call", "arrival_date", "arrival_harbour", "next_voyage", "particulars"
FROM "voyages"
LIMIT 6'
QQ: Query result for query 29 with 6 rows and 22 cols, 6 rows.

As you can see, the connector uses the native LIMIT SQL function for the head query.

Disable the debug view for now.

> options(monetdb.debug.query=F)

Let's see how many impotenten we have in the dataset:

> summarise(impotenten, count = n())

   count
1  938
.. ...

It is possible to play with the set in many ways, to find out several things that took place during the voyages of the ships, or the money that was earned. A few examples are shown below:

> invoices_grouped_per_chamber <- group_by(invoices, chamber)
> average_invoices_per_chamber <- summarise(invoices_grouped_per_chamber, average = mean(invoice))
> arrange(average_invoices_per_chamber, desc(average))
Arrange: desc(average) 

   chamber  average
1        A 282996.6
2        Z 259300.5
3        H 150182.4
4        R 149628.3
5        D 149522.8
6        E 149518.0
7       NA  83309.0
..     ...      ...

See also

If you want to lear more about MonetDB.R connector or the dplyr, follow the links below:

The MonetDB also supports embedded R queries in the database. This feature was introduced in the Oct2014 release. Check out this blog post for more information.

About the VOC dataset

VOCThe data for the tutorial is published in the book J.R. Bruijn, F.S. Gaastra and I. Schaar Dutch-Asiatic Shipping in the 17th and 18th Centuries, which gives an account of the trips made to the East and ships returned safely (or wrecked on the way) by the VOC (or Dutch East India Company). A total of 8000 records are provided. They include information about ship name and type, captain, the arrival/departure of harbors along the route, personnel accounts, and anecdotal information. You can find more about the VOC on Wikipedia.

MonetDB Server

Starting a MonetDB database server is as simple as starting the deamon program monetdbd.  At the time of this writing, monetdbd is only available on UNIX-like platforms. The MonetDB daemon program encapsulates an arbitrary number of database servers, and takes care of their configuration.  The general concept of the MonetDB daemon program is that it builds upon a given directory, often referred to as the dbfarm.  There can be at most one instance of monetdbd monitoring a dbfarm, but multiple monetdbd processes can be running on the same system.

The next (optional) step is to create a .monetdb configuration file with information to ease using the Monetdb client interface, called mclient.

shell> cat ~/.monetdb
user=<yourname>
password=<yourpassword>
language=<"sql" or "mal">

In the simple case, monetdbd is started by the system through some init script, and databases are created by the local DBA, and assigned to users.  A more flexible approach for users, however, is the case where a user runs monetdbd itself.  Because a dbfarm can be anywhere in the filesystem, and needs nothing more than just being an ordinary directory, a user can use monetdbd to initialise his/her own dbfarm in an accessible place.  For example:

% monetdbd create ~/my-dbfarm
% monetdbd get all ~/my-dbfarm
property         value
hostname         vomac.ins.cwi.nl
dbfarm           /Users/fabian/my-dbfarm
status           no monetdbd is serving this dbfarm
mserver          unknown (monetdbd not running)
logfile          merovingian.log
pidfile          merovingian.pid
sockdir          /tmp
port             50000
exittimeout      60
forward          proxy
discovery        true
discoveryttl     600
control          no
passphrase       <unknown>
mapisock         /tmp/.s.monetdb.50000
controlsock      /tmp/.s.merovingian.50000

After creation of the personal dbfarm directory, the get command can be used to inspect the dbfarm-wide settings of the newly created environment.  Note that the above output differs per system, per user and per dbfarm.  Important setting is the port to use to open up services for clients.  In particular for a user installation, these often may conflict with another running monetdbd, hence it is wise to change these to a private port in situations where such conflicts may arise:

% monetdbd set port=54321 ~/my-dbfarm
% monetdbd get all ~/my-dbfarm
property         value
hostname         vomac.ins.cwi.nl
dbfarm           /Users/fabian/my-dbfarm
status           no monetdbd is serving this dbfarm
mserver          unknown (monetdbd not running)

logfile          merovingian.log
pidfile          merovingian.pid
sockdir          /tmp
port             54321
exittimeout      60
forward          proxy
discovery        true
discoveryttl     600
control          no
passphrase       <unknown>
mapisock         /tmp/.s.monetdb.54321
controlsock      /tmp/.s.merovingian.54321

Note that control defaults to no.  For remote control connections, control has to be set to yes, and a passphrase has to be set as well.  When set, it will show as a hash in get's output.

When the settings look ok, the next step is to simply start monetdbd:

% monetdbd start ~/my-dbfarm

If the above command returned without any message, the monetdbd process was successfully started.  The logfile (default merovingian.log) should show that the daemon has started, bound some ports, and discovered itself.

From this part on, monetdbd can be given commands through the monetdb command (without the trailing 'd').  The monetdb client utility knows commands much like monetdbd, but it is much more focussed towards giving information about what is going on inside monetdbd.  The monetdb client utility is the main tool to use to manage the dbfarm.

The dbfarm we created is empty, as such there are no databases inside it.  With monetdb we can create a new database:

% monetdb create my-first-db
created database in maintenance mode: my-first-db

This simple command created the necessary bits inside the dbfarm for a database called my-first-db.  It is created in maintenance mode, since otherwise anyone could access it before it is properly setup.  If the command gives an access-denied error, most probably another monetdbd is running on the system.  In that case try adding -p54321(the port) as arguments to monetdb in order to tell it which monetdbd to contact as in the status example.  In the status view of monetdb the newly created database will show up as a locked database:

% monetdb -p54321 status
     name        state     uptime       health
my-first-db     locked             

Albeit the database is locked, it can be started, such that the monetdb superuser can access the database:

% monetdb start my-first-db
starting database 'my-first-db'... done

% monetdb status
     name        state     uptime       health
my-first-db     locked           18s  100%,  0s

The database remains locked, but the uptime field indicates the database is actuallty running.  If setting up the database is ready, the database can be made available for other users (if any, see MonetDB SQL tutorial) to connect to it using the release command:

% monetdb release my-first-db
taken database out of maintenance mode: my-first-db

% monetdb status
     name        state     uptime       health
my-first-db     running       2m 48s  100%,  0s

Next thing, making a connection to the database using mclient requires the -d argument to specify which database to connect to.  It will ask for username and password when none given or provided via a .monetdb file.  The default username/password is monetdb/monetdb:

% mclient -dmy-first-db
user(fabian):monetdb
password:<monetdb>
Welcome to mclient, the MonetDB/SQL interactive terminal (Dec2011-SP1)
Database: MonetDB v11.7.7 (Dec2011-SP1), 'mapi:monetdb://vomac.ins.cwi.nl:54321/my-first-db'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>

Like for the monetdb command, if this fails with some database does not exist error, try giving the port (not the controlport!) to mclient such that it finds the correct monetdbd, e.g.:

% mclient -p54321 -dmy-first-db

It may be necessary to stop individual databases, or even make them unavailable.  The monetdb commands stopand lockcan perform these tasks.  To completely stop the monetdbd daemon process for a dbfarm, stop it using the stop command of monetdbd:

% monetdbd stop ~/my-dbfarm

This should stop the monetdbd and all the databases that are started inside it.  To verify that monetdbd has indeed stopped, review the end of the merovingian.log file.  It should report all databases being shut down, and the deamon shutting down.

mclient man-page

NAME

mclient −
the MonetDB command-line tool

SYNOPSIS

mclient
[ options ] [ file or database [ file
... ] ]

mclient −−help

DESCRIPTION

MonetDB is a
database management system that is developed from a
main-memory perspective with use of a fully decomposed
storage model, automatic index management, extensibility of
data types and search accelerators, and a SQL frontend.

Mclient
is the command-line interface to the MonetDB server.

If the
−−statement=query (−s
query) option is given, the query is executed. If any
files are listed after the options, queries are read from
the files and executed. The special filename
refers to standard input. Note that if there is both a
−−statement option and filename
arguments, the query given with
−−statement is executed first. If no
−−statement option is given and no files
are specified on the command line, mclient reads
queries from standard input.

When reading
from standard input, if standard input is a terminal or if
the −−interactive (−i)
option is given, mclient interprets lines starting
with \ (backslash) specially. See the section
BACKSLASH COMMANDS below.

Before
mclient starts parsing command line options, it reads
a .monetdb file. If the environment variable
DOTMONETDBFILE is set, it reads the file pointed to
by that variable instead. When unset, mclient
searches for a .monetdb file in the current working
directory, and if that doesn’t exist, in the current
user’s home directory. This file can contain defaults
for the flags user, password, language,
save_history, format, and width. For
example, an entry in a .monetdb file that sets the
default language for mclient to mal looks like this:
language=mal. To disable reading the .monetdb
file, set the variable DOTMONETDBFILE to the empty
string in the environment.

OPTIONS

General
Options

−−help
(−?)

Print usage information and
exit.

−−version
(−v)

Print version information and
exit.

−−encoding=encoding
(−E encoding)

Specify the character encoding
of the input. The option applies to both the standard input
of mclient and to the argument of the
−−statement (−s) option but
not to the contents of files specified on the command line
(except for which refers to standard input)
or files specified using the \< command (those
must be encoded using UTF-8). The default encoding is taken
from the locale.

−−language=language
(−l language)

Specify the query language. The
following languages are recognized: mal and
sql. A unique prefix suffices. When the
−−language option is omitted, the default
of sql is assumed.

−−database=database
(−d database)

Specify the name or URI of the
database to connect to. The -d can be omitted if an
equally named file does not exist in the current directory.
As such, the first non-option argument will be interpreted
as database to connect to if the argument does not exist as
file. Valid URIs are as returned by ‘monetdb
discover‘, see monetdb(1), and look like
mapi:monetdb://hostname:port/database.

−−host=hostname
(−h hostname)

Specify the name of the host on
which the server runs (default: localhost). When the
argument starts with a forward slash (/), host is assumed to
be the directory where the UNIX sockets are stored for
platforms where these are supported.

−−port=portnr
(−p portnr)

Specify the portnumber of the
server (default: 50000).

−−interactive[=timermode]
(−i[timermode])

When reading from standard
input, interpret lines starting with \ (backslash)
specially. See the section BACKSLASH COMMANDS below. This is
the default if standard input is a terminal. The optional
timermode argument controls the format of the time
reported for queries. Note that no space is allowed between
−i and timermode. The default mode is
human which adjusts the time precision to the
measured value. The modes ms, s and m
force millisecond, second and minute + second precision
respectively.

−−user=user
(−u user)

Specify the user to connect as.
If this flag is absent, the client will ask for a user name,
unless a default was found in .monetdb file.

−−format=format
(−f format)

Specify the output format. The
possible values are sql, expanded, x,
csv, tab, raw, and xml.
csv is comma-separated values, tab is
tab-separated values, raw is no special formatting
(data is dumped the way the server sends it to the client),
sql is a pretty format which is meant for human
consumption where columns are clearly shown, expanded
and x are synonyms and are another pretty format
meant for human consumption where column values are printed
in full and below each other, and xml is a valid (in
the XML sense) document. In addition to plain csv,
two other forms are possible. csv=c uses
c as column separator; csv+c uses
c as column separator and produces a single header
line in addition to the data.

−−echo
(−e)

Echo the query. Note that using
this option slows down processing.

−−history
(−H)

Load and save the command line
history (default off).

−−log=logfile
(−L logfile)

Save client/server interaction
in the specified file.

−−statement=stmt
(−s stmt)

Execute the specified query.
The query is run before any queries from files specified on
the command line are run.

−−timezone
(−z)

Do not tell the client’s
timezone to the server.

−−Xdebug
(−X)

Trace network interaction
between mclient and the server.

−−pager=cmd
(−| cmd)

Send query output through the
specified cmd. One cmd is started for each
query. Note that the | will have to be quoted or else
the shell will interpret it.

SQL Options


−−null=
nullstr (−n
nullstr)

Set the string to be used as
NULL representation when using the sql, csv, or tab output
formats. If not used, NULL values are represented by the
string "null" in the sql output format, and as the
empty string in the csv and tab output formats. Note that an
argument is required, so in order to use the empty string,
use −n "" (with the space) or
−−null=.

−−autocommit
(−a)

Switch autocommit mode off. By
default, autocommit mode is on.

−−rows=nr
(−r nr)

If specified, query results
will be paged by an internal pager at the specified number
of lines.

−−width=nr
(−w nr)

Specify the width of the
screen. The default is the (initial) width of the
terminal.

−−dump
(−D)

Create an SQL dump.

−−inserts
(−N)

Use INSERT INTO statements
instead of COPY INTO + CSV values when dumping the data of a
table. This option can be used when trying to load data from
MonetDB into another database, or when e.g. JDBC
applications are used to reload the dump.

BACKSLASH COMMANDS

General
Commands

cellspacing="0" cellpadding="0">

\?

Show a help message explaining the backslash
commands.

\q

Exit mclient.

\< file

Read input from the named
file.

\> file

Write output to the named
file. If no file is specified, write to
standard output.

\| command

Pipe output to the given
command. Each query is piped to a new invocation of
the command. If no command is given, revert to
writing output to standard output.

cellspacing="0" cellpadding="0">

\h

Show the readline(3) history.

\L file

Log client/server interaction
in the given file. If no file is specified,
stop logging information.

cellspacing="0" cellpadding="0">

\X

Trace what mclient is doing. This is mostly for
debugging purposes.

\e

Echo the query in SQL formatting mode.

\f format

Use the specified format
mode to format the output. Possible modes the same as for
the −−format (−f)
option.

\w width

Set the maximum page width for
rendering in the sql formatting mode. If width
is −1, the page width is unlimited, when
width is 0, use the terminal width. If
width is greater than 0, use the given
width.

\r rows

Use an internal pager using
rows per page. If rows is −1,
stop using the internal pager.

SQL
Commands

cellspacing="0" cellpadding="0">

\D

Dump the complete database. This is equivalent to using
the program msqldump(1).

\D table

Dump the given
table.

cellspacing="0" cellpadding="0">

\d

Alias for \dvt.

\d[Stvsfn]+

List database objects of the
given type. Multiple type specifiers can be used at the same
time. The specifiers S, t, v, s,
f and n stand for System, table, view,
sequence, function and schema respectively. Note that
S simply switches on viewing system catalog objects,
which is orthogonal to the other specifiers.

\d[Stvsfn]+
object

Describe the given
object in the database using SQL statements that
reconstruct the object. The same specifiers as above can be
used, following the same rules. When no specifiers are
given, vt is assumed. The object can be given with or
without a schema, separated by a dot. The object name can
contain the wildcard characters * and _ that
represent zero or more, and exactly one character
respectively. An object name is converted to lowercase,
unless the object name is quoted by double quotes
("). Examples of this, are e.g.
*.mytable, tabletype*, or
"myschema.FOO". Note that wildcard
characters do not work in quoted objects. Quoting follows
SQL quoting rules. Arbitrary parts can be quoted, and two
quotes following each other in a quoted string represent the
quote itself.

cellspacing="0" cellpadding="0">

\A

Enable auto commit mode.

\a

Disable auto commit mode.

EXAMPLES

Efficiently
import data from a CSV (comma-separated values) file into a
table. The file must be readable by the server. $file
is the absolute path name of the file, $table is the
name of the table, $db is the name of the
database.

mclient -d $db
-s "COPY INTO $table FROM ’$file’ USING
DELIMITERS
’,’,’\\n’,’\"’"

Efficiently
import data from a CSV file into a table when the file is to
be read by mclient (e.g. the server has no access to the
file). $file is the (absolute or relative) path name
of the file, $table is the name of the table,
$db is the name of the database.

mclient -d $db
-s "COPY INTO $table FROM STDIN USING DELIMITERS
’,’,’\\n’,’\"’"
- < $file

Note that in
this latter case, if a count of records is supplied, it
should be at least as large as the number of records
actually present in the CSV file. This, because otherwise
the remainder of the file will be interpreted as SQL
queries.

See
http://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto
for more information about the COPY INTO query.

SEE ALSO

msqldump(1),
mserver5(1)

monetdb man-page

NAME

monetdb −
control a MonetDB Database Server instance

SYNOPSIS

monetdb
[ monetdb_options ] command [
command_options ] [ command_args ]

DESCRIPTION

Monetdb
allows an administrator of the MonetDB Database Server to
perform various operations on the databases in the server.
It relies on monetdbd(1) running in the background
for all operations.

OPTIONS

Monetdb_options
affect all commands and control the general behavior of
monetdb.

cellspacing="0" cellpadding="0">

−q

Suppresses all standard progress messages, only writing
output to stderr if an error occurred.

−h
hostname

Connect to hostname
instead of attempting a connection over the local UNIX
socket. This allows monetdb to connect to a remote
monetdbd(1). The use of this option requires
−P (see below).

−p port

Connects to the given
portnumber instead of the default (50000). Requires
−h to be given as option too.

−P
passphrase

Specifies the passphrase
necessary to login to a remote monetdbd(1). This
option requires −h to be given as well. A bad
passphrase causes monetdb to fail to login, and hence
fail to perform any remote action.

cellspacing="0" cellpadding="0">

−v

Show version, equal to monetdb version.

COMMANDS

The commands
for the monetdb utility are create,
destroy, lock, release, status,
start, stop, kill, set,
get, inherit, discover, help,
and version. The commands facilitate adding,
removing, maintaining, starting and stopping a database
inside the MonetDB Database Server.

For all
commands, database arguments can be glob-like expressions.
This allows to do wildcard matches. For details on the
syntax, see EXPRESSIONS.

create
[−m pattern] [−p
password] database
[database ...]

Initializes a new database in
the MonetDB Database Server. A database created with this
command makes it available under its database name, but not
yet for use by clients, as the database is put into
maintenance mode. This allows the database administrator to
perform initialization steps before releasing it to users,
unless the −p argument is supplied. See also
monetdb lock. The name of the database must match the
expression [A−Za−z0−9−_]+.

−m
pattern

With the −m flag,
instead of creating a database, a multiplex-funnel is
created. See section MULTIPLEX-FUNNEL in
monetdbd(1). The pattern argument is not fully the
same as a pattern for connecting or discovery. Each parallel
target for the multiplex-funnel is given as
username+password@pattern
sequence, separated by commas. Here the pattern is an
ordinary pattern as would be used for connecting to a
database, and can hence also be just the name of a
database.

−p
password

The −p flag allows
to create a database with the given password for the monetdb
user. Since this protects the database from being accessed
via well-known credentials, the created database is not
locked after creation. This way, a new database can be
created and used right away using the password supplied.

destroy
[−f] database [database ...]

Removes the given
database, including all its data and logfiles. Once
destroy has completed, all data is lost. Be careful when
using this command.

cellspacing="0" cellpadding="0">

−f

By default, a confirmation question is asked, however
the −f option, when provided, suppresses this
question and removal is executed right away. Note that
without this option you cannot destroy a running database,
bring it down first using the stop command.

lock database
[database ...]

Puts the given database in
maintenance mode. A database under maintenance can only be
connected to by an administrator account (by default the
monetdb account). A database which is under
maintenance is not started automatically by
monetdbd(1), the MonetDB Database Server, when
clients request for it. Use the release command to
bring the database back for normal usage. To start a
database which is under maintenance for administrator
access, the start command can be used.

release database
[database ...]

Brings back a database from
maintenance mode. A released database is available again for
normal use by any client, and is started on demand. Use the
lock command to take a database under
maintenance.

status
[−lc] [−s states]
[database ...]

Shows the state of the given
database, or, when none given, all known databases. Three
modes control the level of detail in the displayed output.
By default a condensed one-line output per database format
is used. This output resembles pretty much the output of
various xxxstat programs, and is ideal for quickly
gaining an overview of the system state. The output is
divided into four columns, name, state,
health, and remarks. The state column
contains two characters that identify the state of the
database, based on Booting (starting up), Running, Stopped,
Crashed and Locked (under maintenance). This is followed by
the uptime when running. The health column contains
the percentage of successful starts and stops, followed by
the average uptime. The remarks column can contain
arbitrary information about the database state, but usually
contains the URI the database can be connected to.

cellspacing="0" cellpadding="0">

−c

The −c flag shows the most used properties
of a database. This includes the state of the database
(running, crashed, stopped), whether it is under maintenance
or not, the crash averages and uptime statistics. The crash
average is the number of times the database has crashed over
the last 1, 15 or 30 starts. The lower the average, the
healthier the database is.

−l

Triggered by the −l flag, a long listing is
used. This listing spans many rows with on each row one
property and its value separated by a colon (:). The
long listing includes all information that is available.

−s

The −s flag controls which databases are
being shown, matching their state. The required argument to
this flag can be a combination of any of the following
characters. Note that the order in which they are put also
controls the order in which the databases are printed.
b, r, s, c, and l are
used to print a starting up (booting), started (running),
stopped, crashed and locked database respectively. The
default order which is used when the −s flag is
absent, is rbscl.

start [−a]
database [database ...]

stop
[−a] database [database
...]

kill
[−a] database [database
...]

Starts, stops or kills the
given database, or, when −a is supplied, all
known databases. The kill command immediately sends a
SIGKILL and should only be used as last resort for a
database that doesn’t respond any more. Killing a
database may result in (partial) data loss. It is more
common to use the stop command to stop a database. It
will first attempt to stop the database, waiting for
mero_exittimeout seconds and if that fails, kill the
database. When using the start command,
monetdb(1) will output diagnostic messages if the
requested action failed. When encountering an error, one
should always consult the logfile of monetdbd(1) for
more details. For the kill command a diagnostic
message indicating the database has crashed is always
emitted, due to the nature of that command. Note that in
combination with −a the return code of
monetdb(1) indicates failure if one of the databases
had a failure, even though the operation on other databases
was successful.

get <all |
property[,property[,..]]> [database
...]

Prints the requested
properties, or all known properties, for the given database.
For each property its source and value are printed. Source
indicates where the current value comes from, e.g. the
configuration file, or a local override.

set
property=value database
[database ...]

Sets property to value for the
given database. For a list of properties, run monetdb get
all
. Most properties require the database to be stopped
when set.

shared=<yes|no|tag>

Defines if and how the database
is being announced to other monetdbds or not. If not set to
yes or no the database is simply announced or
not. Using a string, called tag the database is
shared using that tag, allowing for more sophisticated
usage. For information about the tag format and use, see
section REMOTE DATABASES in the monetdbd(1)
manpage. Note that this property can be set for a running
database, and that a change takes immediate effect in the
network.

nthreads=number

Defines how many worker threads
the server should use to perform main processing. Normally,
this number equals the number of available CPU cores in the
system. Reducing this number forces the server to use less
parallelism when executing queries, or none at all if set to
1.

optpipe=string

Each server operates with a
given optimizer pipeline. While the default usually is the
best setting, for some experimental uses the pipeline can be
changed. See the mserver5(1) manpage for available
pipelines. Changing this setting is discouraged at all
times.

readonly=<yes|no>

Defines if the database has to
be started in readonly mode. Updates are rejected in this
mode, and the server employs some read-only optimizations
that can lead to improved performance.

nclients=number

Sets the maximum amount of
clients that can connect to this database at the same time.
Setting this to a high value is discouraged. A
multiplex-funnel may be more performant, see
MULTIPLEX-FUNNEL below.

inherit property
database
[database ...]

Like set, but unsets the
database-local value, and reverts to inherit from the
default again.

discover
[expression]

Returns a list of remote
monetdbds and database URIs that were discovered by
monetdbd(1). All databases listed can be connected to
via the local MonetDB Database Server as if it were local
databases using their database name. The connection is
redirected or proxied based on configuration settings. If
expression is given, only those discovered databases
are returned for which their URI matches the expression. The
expression syntax is described in the section
EXPRESSIONS. Next to database URIs the hostnames and
ports for monetdbds that allow to be controlled remotely can
be found in the discover list masked with an asterisk. These
entries can easily be filtered out using an expression (e.g.
"mapi:monetdb:*") if desired. The control entries
come in handy when one wants to get an overview of available
monetdbds in e.g. a local cluster. Note that for monetdbd to
announce its control port, the mero_controlport
setting for that monetdbd must be enabled in the
configuration file.

cellspacing="0" cellpadding="0">

−h

help
[command]

Shows general help, or short
help for a given command.

cellspacing="0" cellpadding="0">

−v

version

Shows the version of the
monetdb utility.

EXPRESSIONS

For various options, typically
database names, expressions can be used. These expressions
are limited shell-globbing like, where the * in any position
is expanded to an arbitrary string. The * can occur multiple
times in the expression, allowing for more advanced matches.
Note that the empty string also matches the *, hence
"de*mo" can return "demo" as match. To
match the literal ’*’ character, one has to
escape it using a backslash, e.g. "\*".

RETURN VALUE

The monetdb utility
returns exit code 0 if it successfully performed the
requested command. An error caused by user input or database
state is indicated by exit code 1. If an internal
error in the utility occurs, exit code 2 is
returned.

SEE ALSO

monetdbd(1),
mserver5(1)

monetdbd man-page

NAME

monetdbd
− the MonetDB Database Server daemon

SYNOPSIS

monetdbd
command [ command_args ] dbfarm

DESCRIPTION

Monetdbd
is the MonetDB Database Server daemon. The program is mainly
meant to be used as daemon, but it also allows to setup and
change the configuration of a dbfarm. The use of
monetdbd is either as user-oriented way to configure,
start and stop a database farm, or to be started from a
startup script, such as from /etc/init.d/ on Linux
systems or smf(5) on Solaris systems, as part of a
system startup.

Monetdbd
is the system formerly known as merovingian. It was
renamed to monetdbd since the name merovingian
proved to be confusing to most regular end-users.
Internally, monetdbd uses the name merovingian
at many places for historical reasons.

A
monetdbd instance manages one local cluster based,
which is a directory in the system, referred to as the
dbfarm. Nowadays, the dbfarm location always has to
be given as argument to monetdbd.

Within its
local cluster monetdbd takes care of starting up
databases when necessary, and stopping them either upon
request via monetdb(1) or when being shut down.
Client database connections are made against monetdbd
initially which redirects or proxies the client to the
appropriate database process, started on the fly when
necessary.

When started,
monetdbd runs by default in the background, sending
log messages to merovingian.log, until being sent a
stop, terminate or interrupt signal, possibly using the stop
command of monetdbd.

monetdbd
uses a neighbor discovery scheme to detect other
monetdbd processes running in the local network.
Databases from those remote instances are made available to
a locally connecting client. Remote databases never override
local databases, and their availability is controlled by the
remote monetdbd process. See also the sharing
capabilities of monetdb(1) and the REMOTE
DATABASES
section below.

COMMANDS

The commands
for monetdbd are create, start,
stop, get, set, version, and
help. The commands facilitate initializing a dbfarm,
starting and stopping the MonetDB Database Server, and
retrieving or setting options.

create dbfarm

Initializes a new database
farm, such that a MonetDB Database Server can be started on
that location. All necessary directories are attempted to be
created, and an initial properties file is created in the
directory itself. dbfarm must be a location
addressable in the local filesystem hierarchy.

start [−n]
dbfarm

Starts monetdbd, the
MonetDB Database Server, on the given dbfarm. When the
−n flag is given, monetdbd will not fork
into the background, but instead remain attached to the
calling environment, until given a stop signal.

stop dbfarm

Sends a stop signal to the
monetdbd process responsible for the given
dbfarm.

get <all |
property[,property[,..]]> dbfarm

Prints the requested
properties, or all known properties, for the given dbfarm.
For each property, its value is printed. Some properties are
virtual, and given for information purposes only, they
cannot be modified using the set command.

set
property=value dbfarm

Sets property to value for the
given database. For a list of properties, run monetdbd
get all
. Some properties require a restart of the
MonetDB Database Server in order to take effect. The
set command, will however always write the property,
and tell the running monetdbd to reload the
properties file (if running). For an explanation of the
properties, see the CONFIGURATION section below.

CONFIGURATION

Monetdbd
reads its properties from the .merovingian_properties
file inside the dbfarm. This file is created by the
create command. This file is not meant to be edited
manually, instead it should be updated using the set
command. The following properties can be set:

logfile

This property points to the
file where all log messages are written to. It is relative
to the dbfarm directory, but can be absolute to point to
e.g. another medium. Changing this property takes effect
immediately at runtime.

pidfile

Monetdbd stores the
process ID of the background server in the file pointed to
by this property. The same rules apply as for the
logfile property.

sockdir

For faster access,
monetdbd uses UNIX domain sockets for its control
mechanism and regular database connections. The sockets are
placed as files in the filesystem hierarchy. The
sockdir property controls in which directory they are
placed. In general this setting should not be changed.

cellspacing="0" cellpadding="0">

port

This property specifies which TCP port monetdbd
should listen to for connection requests. Defaults to
50000.

listenaddr

This property specifies an
address that is allowed to connect to the server. The user
can specify one IP(v6) address, or use the 0.0.0.0 notation
to allow connections from everywhere. Defaults to localhost
(127.0.0.1).

control

For remote management of
monetdbd, the control property specifies
whether or not to enable remote management. Note that for
remote management, a passphrase is required, see below. It
defaults to false for security reasons. Changing this
property takes effect immediately at runtime.

passphrase

To control monetdbd from
a remote machine, a passphrase is necessary, to be given to
monetdb(1). The passphrase can be either given as
hashed value prefixed by the hash type in curly braces (e.g.
{SHA512}xxx...) or as plain text value which will be hashed
automatically. Note that the only hash accepted is the one
specified at configure time, which is SHA512. Changing this
property takes effect immediately at runtime.

discovery

Specifies whether neighbor
discovery is to be enabled using UDP broadcasts or not. The
broadcasts are done on the same portnumber as the port
setting.

discoveryttl

Monetdbd publishes
locally available databases to others periodically. The
interval used here, defined in seconds, depends on the
time-to-live of the databases before they need to get
refreshed. The default is 600 seconds (10 minutes), which
should keep traffic in your network fairly low. Additions
and removals are processed immediately regardless of this
timeout. If you are in a network environment where physical
network links disappear often, you may want to decrease this
value to more quickly remove no longer reachable
databases.

exittimeout

mservers that were started by
the MonetDB Database Server are shut down when
monetdbd is shut down. Setting the exittimeout
property to a positive non-zero value will shut down each
running mserver with the given time-out in seconds. If the
time-out expires, the mserver process is killed using the
SIGKILL signal. A time-out value of 0 means no mservers will
be shut down, and hence they will continue to run after
monetdbd has shut down. Note that this particular
configuration is extremely inconvenient. The default
time-out is 60 seconds. If your databases are rather large
and find your databases consistently being killed by
monetdbd upon shutdown, you may want to increase this
time-out. Changing this property takes effect immediately at
runtime.

forward

monetdbd has two ways in
which it can "attach" a connecting client to the
target database. The first method, redirect, uses a
redirect sent to the client with the responsible mserver
process. The second method, proxy, proxies the client
to the mserver over monetdbd. While redirect
is more efficient, it requires the connecting client to be
able to connect to the mserver. In many settings this may be
undesirable or even impossible, since a wide range of open
ports and routing are necessary for this. In such case the
proxy technique of monetdbd is a good
solution, which also allows a monetdbd instance on
the border of a network to serve requests to nodes in the
local (unreachable) network. Note that for local databases,
the proxy method uses a UNIX domain socket feature to
pass file-descriptors to the local mserver. This effectively
is as efficient as the redirect approach, but still
hides away the mservers properly behind monetdbd.
Hence, in practice it is only relevant for connections to
remote databases to use redirects instead of proxies.
Changing this property takes effect immediately at
runtime.

REMOTE DATABASES

The neighbor
discovery capabilities of monetdbd allow a user to
contact a remote database transparently, as if it were a
local database. By default, all local databases are
announced in the network, such that neighbors can pick them
up to make them available for their local users. This
feature can be disabled globally, or on database level. For
the latter, the monetdb(1) utility can be used to
change the share property of a database.

While neighbor
discovery in itself is sufficient to locate a database in a
cluster, it is limited in expressiveness. For instance,
database names are assumed to be unique throughout the
entire system. This means local databases overshadow remote
ones, and duplicate remote entries cannot be distinguished.
To compensate for this, monetdbd allows to adds a
tag to each database that is being shared. This tag
is sent in addition to the database name, and only
understood by other monetdbds.

Tags are
arbitrary ASCII-strings matching the pattern
[A−Za−z0−9./]+. There are no assumed
semantics in the tag, which allows for multiple approaches
when using the tag. The tag is always used in combination
with the database name. For this, the ‘/’
character is used as separator, which hence suggests the
user to use that character as separator for multilevel tags.
Monetdbd allows common path globbing using
‘*’ on tags, which allows for many use-cases.
Consider for instance the following three databases with
their tag:

dbX/master/tableQ


dbY/slave/tableQ

dbZ/slave/tableQ

A default match
has implicit ‘/*’ added to the search, making
more generic search strings match more specific ones. Hence,
a connect with database dbX is the same as
dbX/* and hence matches dbX/master/tableQ.
Similar, a database connect for */master matches the
same database as before. Note that the implicit
‘/*’ is not added if that would cause no
matches, such as for */master/tableQ which would
return all masters for tableQ, which in our
hypothetical example is only dbX. In contrast, a
database connect for */slave/tableQ matches with
either dbY or dbZ. Monetdbd returns the
two options to the client in a round-robin fashion, such
that subsequent connects for the same pattern result in a
load-balanced connect to either of both databases.

With tags in
use, one can possibly make distinction between databases, if
setup like that. The previous example could hence also be
setup like this:

tableQ/master


tableQ/slave

tableQ/slave

Connecting to
tableQ/slave would now return either of both
databases even though they are not unique (apart from the
host they are located on, which is not shown in the
example). While being confusing for humans, for
monetdbd it is the same situation as in the previous
example. However, because globbing allows to make things
easier to understand, tags for both slaves could be changed
to slaveX or slave/X and use the necessary
pattern to match them. It is up to the user to decide how to
use the tags.

MULTIPLEX-FUNNELS

Monetdbd
implements multiplex-funnel capabilities. As the name
suggests two techniques are combined, the multiplexer and
the funnel.

The
funnel capability limits the access to the database
to one client at a time. That is, if multiple clients
connect to the funnel, their queries will be serialized such
that they are executed one after the other. An effect of
this approach is that clients no longer have an exclusive
channel to the database, which means that individual queries
from one client may have been interleaved with queries from
others. This most notably makes SQL transaction blocks
unreliable with a funnel. The funnel, hence, is meant to
scale down a large amount of clients that perform
short-running (read-only) queries, as typically seen in
web-based query loads.

When a funnel
is defined to use multiple databases, the funnel adds a
multiplexer to its query channel. A multiplex-funnel
sends each query to all of the defined databases. This
behavior can be quite confusing at first, but proves to be
useful in typical sharding configurations, where in
particular simple selection queries have to be performed on
each of the shards. The multiplexer combines the answers
from all defined databases in one single answer that it
sends back to the client. However, this combining is without
any smart logic, that is, the multiplexer does not evaluate
the query it is running, but just combines all answers it
receives from the databases. This results in e.g. as many
return tuples for a SELECT COUNT(*) query, as there
are databases defined.

Due to the two
above mentioned characteristics, a multiplex-funnel has some
limitations. As mentioned before, transactions over multiple
queries are likely not to result in the desired behavior.
This is due to each query to the funnel is required to be
self-contained. Further, since for each query, the results
from multiple servers have to be combined into one, that
query must only return a single response, i.e.
multi-statement queries are most likely causing the funnel
to respond with an error, or return garbled results. Last,
the size of each query is limited to currently about 80K.
While this size should be sufficient for most queries, it is
likely not enough for e.g. COPY INTO statements. Apart from
the data transfer implications, such statements should not
be used with the funnel, as the results will be undefined
due to the limited query buffer. Applications using the
funnel should aim for short and single-statement queries
that require no transactions.

See the
create command in the monetdb(1) man-page for
details on how to setup a multiplex-funnel.

SIGNALS

Monetdbd
acts upon a number of signals as is common for a daemon.


SIGINT, SIGTERM, SIGQUIT

Any of these signals make
monetdbd enter the shutdown sequence. This sequence
involves cleanly shutting down listener sockets, shutting
down all started databases and finally terminating
itself.

cellspacing="0" cellpadding="0">

SIGHUP

When this signal is received by monetdbd it will
reopen the logfile as pointed to by the logfile
setting. Before it reopens the logfile, it will re-read the
properties file from the dbfarm, which might result in
opening a different file to continue logging.

RETURN VALUE

Monetdbd
returns exit code 0 if it was able to successfully
perform the requested action, e.g. start, stop, etc. When an
error occurs during the action, that prevents
monetdbd from successfully performing the action, the
exit code 1 is returned.

SEE ALSO

monetdb(1),
mserver5(1)

mserver5 man-page

NAME

mserver5
− the MonetDB server version 5

SYNOPSIS

mserver5
[ options ]

DESCRIPTION

Mserver5
is the current MonetDB server that performs all processing
on request of clients for a certain database.

Note that while
mserver5 is the process that does the actual work, it
is usually more common to start, monitor and connect to the
mserver5 process through monetdbd(1).

This man-page
describes the options that mserver5 understands. It
is intended for people who really need to work with
mserver5 itself. In regular cases, the programs
monetdbd(1) and monetdb(1) control the many
options, and allow to adjust them to appropriate values
where sensible. For normal usage, it is preferred to apply
any configuration through these programs.

OPERATION

When the
build-time configuration did not disable this, a
mserver5 process presents the user with a console
prompt. On this prompt, MAL commands can be executed. The
architecture is setup to handle multiple streams of
requests. The first thread started represents the server,
which is the console prompt, reading from standard input and
writing to standard output.

The server
thread started remains in existence until all other threads
die. The server is stopped by Ctrl-D on its console, typing
quit() or by sending it a termination signal (SIGINT,
SIGTERM, SIGQUIT).

MSERVER5 OPTIONS

Mserver5
can be started with options and scripts as arguments. The
MAL scripts will be executed directly after startup on the
console, which eases e.g. testing of MAL scripts directly,
without starting a client.

−−dbpath=
path

Path where mserver5
should find a database. Shorthand for option
gdk_dbpath. Default value:
/var/monetdb5/dbfarm/demo.

−−dbextra=path

Path where mserver5 should
store transient data. Default value is the value of the
−−dbpath option.

−−dbinit=stmt

MAL statement to execute as
part of the startup of the server.

−−config=file

Config file to read options
from. This file can contain all options as can be set with
the --set flag. See CONFIG FILE FORMAT.

−−daemon=yes

−−daemon=no
Disable the console prompt, do not read commands from
standard input. Default: no.

−−set
option=value

Set individual configuration
option. For possible options, see PARAMETERS
sections.

cellspacing="0" cellpadding="0">

−−help

Print list of options.

−−version

Print version and compile
configuration.

GDK PARAMETERS

GDK (Goblin
Database Kernel) is the current columnar storage kernel
engine of the MonetDB 5 database. It is the component that
manages and performs operations on BATs (Binary Association
Tables), single columns. The parameters here affect the
behaviour of GDK which may nagatively impact performance if
set wrongly. The kernel tries to choose the values for
optimal performance. Changing these parameters is
discouraged.

gdk_vmtrim

Enable or disable the vmtrim
thread which tries to unload memory that is not in use.
Default: yes on 32 bit platforms, no on 64 bit
platforms

gdk_debug

You can enable debug output for
specific kernel operations. By default debug is switched off
for obvious reasons. The value of gdk_debug is an integer,
which value can be (a combination of):

          1 = THRDMASK     = thread-specific debug output
          2 = CHECKMASK    = property enforcing on new BATs
          4 = MEMMASK      = memory allocation
          8 = PROPMASK     = property checking on all values:
                             tells about wrongly set properties
         16 = IOMASK       = major IO activity
         32 = BATMASK      = BAT handling
        128 = PARMASK      = Thread management
        256 = HEADLESSMASK = Warn about BAT heads that are not "headless-ready"
        512 = TMMASK       = Transaction management
       1024 = TEMMASK      = Locks and Triggers
       4096 = PERFMASK     = BBP Performance (?)
       8192 = DELTAMASK    = Delta debugging (?)
      16384 = LOADMASK     = Module loading
    2097152 = ALGOMASK     = show join/select algorithm chosen
    4194304 = ESTIMASK     = show result size estimations
                             (for join, select)
   16777216 = JOINPROPMASK = disable property checking with
                             join & outerjoin (e.g., for
                             performance measurements)
   33554432 = DEADBEEFMASK = disable "cleaning" of freed memory
                             in GDKfree() (e.g., for performance
                             measurements)
   67108864 = ALLOCMASK    = exhaustive GDK malloc & free tracing
                             for debugging (GDK developers, only)
  134217728 = OPTMASK      = trace the actions, decisions and
                             effects of MAL optimizers
  268435456 = HEAPMASK     = trace/debug HEAPextend;
                             used only for development & debugging
  536870912 = FORCEMITOMASK = forcefully activate mitosis even on
                              small tables, i.e., split small tables
                              in as many (tiny) pieces as there are
                              cores (threads) available;
                              this allows us to test mitosis
                              functionality without requiring large
                              data sets (— at the expense of a
                              potentially significant interpretation
                              overhead for unnecessarily large plans);
                              used only for development & testing;
                              set automatically by Mtest.py

Note that
mserver5 recognizes a series of command line options
that sets one or more of these debug flags as well:

  −−threads       (THRDMASK | PARMASK)
  −−memory        (MEMMASK | ALLOCMASK)
  −−properties    (CHECKMASK | PROPMASK | BATMASK)
  −−io            (IOMASK | PERFMASK)
  −−heaps         (HEAPMASK)
  −−transactions  (TMMASK | DELTAMASK | TEMMASK)
  −−modules       (LOADMASK)
  −−algorithms    (ALGOMASK | ESTIMASK)
  −−performance   (JOINPROPMASK | DEADBEEFMASK)
  −−optimizers    (OPTMASK)
  −−forcemito     (FORCEMITOMASK)
  −−trace[=stethoscope argument]

Default:
0.

MSERVER5 PARAMETERS

Mserver5
instructs the GDK kernel through the MAL (MonetDB Assembler
Language) language. MonetDB 5 contains an extensive
optimiser framework to transform MAL plans into more optimal
or functional (e.g. distributed) plans. These parameters
control behaviour on the MAL level.

monet_vault_key

The authorisation tables inside
mserver5 can be encrypted with a key, such that
reading the BATs does not directly disclose any credentials.
The monet_vault_key setting points to a file that
stores a secret key to unlock the password vault. It can
contain anything. The file is read up to the first null-byte
(’ ’), hence it can be padded to any length with
trailing null-bytes to obfuscate the key length. Generating
a key can be done for example by using a tool such as
pwgen and adding a few of the passwords generated.
Make sure not to choose a too small key. Note that on
absence of a vault key file, some default key is used to
encrypt the authorisation tables. Changing this setting
(effectively changing the key) for an existing database
makes that database unusable as noone is any longer able to
login. If you use monetdbd(1), a per-database vault
key is set.

max_clients

Controls how many client slots
are allocated for clients to connect. This settings limits
the maximum number of connected clients at the same time.
Note that MonetDB is not designed to handle massive amounts
of connected clients. The funnel capability from
monetdbd(1) might be a more suitable solution for
such workloads.

Default
64.

SQL PARAMETERS

The SQL
component of MonetDB 5 runs on top of the MAL environment.
It has its own SQL-level specific settings.

sql_debug

Enable debugging using a mask.
This option should normally be disabled (0). Default:
0.

sql_optimizer

The default SQL optimizer
pipeline can be set per server. See the optpipe setting in
monetdb(1) when using monetdbd. During SQL
initialization, the optimizer pipeline is checked against
the dependency information maintained in the optimizer
library to ensure there are no conflicts and at least the
pre-requisite optimizers are used. The setting of
sql_optimizer can be either the list of optimizers to run,
or one or more variables containing the optimizer pipeline
to run. The latter is provided for readability purposes
only. Default: default_pipe.

The following
are possible pipes to use:

minimal_pipe

The minimal pipeline necessary
by the server to operate correctly.
minimal_pipe=inline,remap,deadcode,multiplex,garbageCollector

default_pipe

The default pipe line contains
the mitosis-mergetable-reorder optimizers, aimed at large
tables and improved access locality.
default_pipe=inline,remap,costModel,coercions,evaluate,emptySet,aliases,pushselect,mitosis,mergetable,deadcode,commonTerms,joinPath,reorder,deadcode,reduce,matpack,dataflow,querylog,multiplex,garbageCollector

no_mitosis_pipe

The no_mitosis pipe line is
identical to the default pipeline, except that optimizer
mitosis is omitted. It is used mainly to make some tests
work deterministically, and to check/debug whether
"unexpected" problems are related to mitosis
(and/or mergetable).
no_mitosis_pipe=inline,remap,costModel,coercions,evaluate,emptySet,aliases,pushselect,mergetable,deadcode,commonTerms,joinPath,reorder,deadcode,reduce,matpack,dataflow,querylog,multiplex,garbageCollector

sequential_pipe

The sequential pipe line is
identical to the default pipeline, except that optimizers
mitosis & dataflow are omitted. It is use mainly to make
some tests work deterministically, i.e., avoid ambigious
output, by avoiding parallelism.
sequential_pipe=inline,remap,costModel,coercions,evaluate,emptySet,aliases,pushselect,mergetable,deadcode,commonTerms,joinPath,reorder,deadcode,reduce,matpack,querylog,multiplex,garbageCollector

CONFIG FILE FORMAT

The
configuration file readable by mserver5 consists of
parameters of the form "name=value".
The file is line-based, each newline-terminated line
represents either a comment or a parameter. Only the first
equals sign in a parameter is significant. Whitespace before
or after the first equals sign is not stripped. Trailing
whitespace in a parameter value is retained verbatim. Any
line beginning with a hash (#) is ignored, as are
lines containing only whitespace. The values following the
equals sign in parameters are all a string where quotes are
not needed, and if written be part of the string.

SEE ALSO

monetdbd(1),
monetdb(1), mclient(1)

msqldump man-page

NAME

msqldump
− dump a MonetDB/SQL database

SYNOPSIS

msqldump
[ options ] [ dbname ]

DESCRIPTION

MonetDB is a
database management system that is developed from a
main-memory perspective with use of a fully decomposed
storage model, automatic index management, extensibility of
data types and search accelerators, SQL- and XML-
frontends.

Msqldump
is the program to dump an MonetDB/SQL database. The dump can
be used to populate a new MonetDB/SQL database.

Before
msqldump starts parsing command line options, it
reads a .monetdb file. If the environment variable
DOTMONETDBFILE is set, it reads the file pointed to
by that variable instead. When unset, msqldump
searches for a .monetdb file in the current working
directory, and if that doesn’t exist, in the current
user’s home directory. This file can contain defaults
for the flags user and password. To disable
reading the .monetdb file, set the variable
DOTMONETDBFILE to the empty string in the
environment.

OPTIONS

−−help
(−?)

Print usage information and
exit.

−−database=database
(−d database)

Specify the name of the
database to connect to. The −d can be omitted
if it is the last option.

−−host=hostname
(−h hostname)

Specify the name of the host on
which the server runs (default: localhost).

−−port=portnr
(−p portnr)

Specify the portnumber of the
server (default: 50000).

−−user=user
(−u user)

Specify the user to connect as.
If this flag is absent, the client will ask for a user
name.

−−describe
(−D)

Only dump the database
schema.

−−inserts
(−N)

When dumping the table data,
use INSERT INTO statements, rather than COPY INTO + CSV
values. INSERT INTO statements are more portable, and
necessary when the load of the dump is processed by e.g. a
JDBC application.

−−functions
(−f)

Only dump functions
definitions.

−−table=table
(−t table)

Only dump the specified
table.

−−quiet
(−q)

Don’t print the welcome
message.

−−Xdebug
(−X)

Trace network interaction
between mclient and the server.

SEE ALSO

mclient(1),
mserver5(1)

Readonly database

When working in a regular mode, the query processing spans the basic column storage and the data changed by transactions (inserts, updates, and deletes). For instance, the inserted data, kept in a separate delta structure, are added to the column before other operations in the plan. In a 'read-only mode' modifying transactions are not allowed and the query processing spans only over the basic column storage. One of the consequences is that query execution may become faster.

The readonly mode is administered with a boolean variable, which  can be set by the system administrator using the monetdb tool:

shell> monetdb set readonly=yes <mydatabasename>

It ensures that all tables are accessed in read only mode.  Moreover, the database is protected against any catalog change, the user can not even create a temporary table for keeping an intermediate result around.

The alternative route is to deploy SQL schemas in combination with user authentication and access control grants to selectively allow users access to the database. See the corresponding description in the SQL manual.

System resources

MonetDB has been designed to use the available resources without many tuning knobs. Functional enhancements are collected in autoloading scripts, which are executed when a database is created.

Other resources are mostly gathered upon need from the operating system environment. MonetDB aims to use as much of the main memory available, as many cores as can be practically deployed in parallel processing of queries, and trying to avoid going to a slow disk. As such, it is a rather aggressive system, which find immediate competition with ongoing heavy demand applications, e.g. a browser with a ton of open windows.

We recommend to use MonetDB in production settings on a dedicated server or a sizeable Cloud server.

Autoloading scripts

Autoloading  is used to initialize a new SQL catalog with scripts containing tables, functions, and procedures while creating a new database with monetdb. Such scripts are kept in the MonetDB library area ./lib/monetdb5/createdb/ . Look them up for the details. The prefix number is used to direct the order of their execution.  A summary of the current collection is illustrated below

Script Description
09_like.sql String regular pattern matching using PCRE
10_math.sql Polar arithmetic
11_times.sql OS timer and io
12_url.sql URL manipulations (see manual )
13_date.sql POSIX formatted date coercions ((see man strptime/strftime)
14_inet.sql PostgreSQL inet data type (see manual)
15_querylog.sql Query history management (see manual)
16_tracelog.sql Query trace table (see manual)
19_cluster.sql Table clustering functionality (experimental!)
20_vacuum.sql Table vacuum commands (experimental!)
21_dependency_functions.sql Schema analysis
22_clients.sql Client administration procedures
23_skyserver.sql Application support library
24_zorder.sql Application support library (experimental!)
25_debug.sql Debugging tools for optimziers
26_sysmon.sql Active SQL query queue management
30_rdf.sql RDF support module
39_analytics.sql Analytic functions
40_jstore.sql Json support functions
41_jsonstore.sql Minimalistic Json functionality
45_uuid.sql UUID data type functionality
40_geom.sql GIS extension library (see manual)
75_storagemodel.sql Storage footprint analysis and prediction.
80_statistics.sql Analyze tables for optimizer statistics
99_system.sql System administration (internal)

Along the same line, MonetDB library area ./lib/monetdb5/autoload/ contains MAL scripts to be executed each time the server is started.

 

Memory footprint

MonetDB requires all data that needs to be active at any given point in time to fit into the address space --- and of course to fit on the storage device, (i.e., your disk system). On 32-bit systems, the address space is at most 32-bit (4 GB); in practice, it is actually limited to 3 GB or even 2 GB on most systems. On 64-bit systems, the address space can theoretically be 64-bit, but in practice is often "limited" to 48-bit or so --- not that that makes any difference ...

MonetDB excessively uses main memory for processing, but does not require that all data fit in the available physical memory. To handle a dataset that exceeds the available physical memory, MonetDB does not (only) rely on the available swap space, but (also) uses memory-mapped files to exploit disk storage beyond the swap space as virtual memory.

For example, while bulk-loading data (preferably using a COPY INTO statement from a (possibly compressed) CSV file), MonetDB needs to have all columns of the table that is currently being loaded "active", i.e., accessible in the address space. However, during loading, parts of the data are continuously written to the persistent files on disk, i.e., the whole table does not have to fit into main memory. E.g., loading a 100 GB table works fine on a system with 8 GB RAM and 16 GB swap -- provided there is sufficient free disk space.

During query processing, MonetDB requires for each single MAL operation during the query execution that all its inputs, its outputs, and possible temporary data structures fit in the address space. This won't be a problem on 64-bit machines, but may be experienced if you are close to the limit on 32-bit machines. MonetDB automatically resorts to virtual memory and memory-mapped files for large intermediate results. Also (large) persistent tables are accessed using memory mapping.

While running, you might see your mserver5 process' virtual size grow well beyond the available physical memory and possibly also well beyond your swap space.  In principle, this is not a problem at all.  Most of this virtual size is due to the fact that large base tables (or intermediate results) that reside as files on disk are memory-mapped into the address space. Those parts of the data that are currently not accessed do not consume any physical memory (except possibly for caching purposes).

However, if individual columns of your table(s) and/or individual columns of intermediate results exceed the size of the available physical memory, the performance of MonetDB might (will) decrease due to increased I/O requirements.

Multi-core usage

Modern systems come with many CPUs on a single die and further augmented with hyperthreading to increase parallel processing. MonetDB will use as much of the cores as possible for both inter- and intra- parallelism.

A command line argument gdk_nr_threads=<number> can be set to  experiment with a limited number of cores assigned to each user session.

Data compression

A distinctive feature of column stores is to apply aggressive data compression. However, compression is often a two-edged sword, where movement of large data files over relative slow networks, disk access or memory interconnects is compensated for by applying CPU cycles. The effectiveness of which strongly depends on the mapping from database schema into data structures,  their high maintenance cost, the relational algorithms,  the system architecture, and the data distribution. The compression ratios cited  depends on the input size, which is commonly assumed to be in CSV format, the data distribution, and the database storage footprint, with or without auxiliary data structures like indices.

MonetDB applies different compression techniques automatically at many levels.

The column store representation is highly optimized, where the basic storage structure is a dense array, i.e. without holes to accommodate future insertions or overhead caused by the data structure itself (e.g. B-trees). This dense representation allows for direct mapping of the database files into memory. The storage width ranges from 1 (byte) to 8 bytes (doubles).  NULL values are part of the domain space, which avoids auxiliary bit masks at the expensive of 'loosing' a single value from the domain.

All strings are stored using dictionary encoding. This significantly reduces their storage space, but with larger dictionaries the maintenance cost may become expensive. Therefore for really large dictionary tables, MonetDB resort to non-compressed string representation. The references into the dictionary table occupy anywhere from 1 to 8 bytes, depending on the number of elements.

During query evaluation, a dense range of results is represented by a column view. This is a small footprint representation of the result set. It avoids both copying the result and storing it in its private column structure.

Disk space

The disk space footprint is determined by the way columns are being stored. MonetDB uses dictionary encoding for string columns, but aside from this there is no default compression applied to reduce the disk footprint. The prime reason being the two-headed sword of compression. It saves (cheap) disk space and IO bandwidth at the cost of expensive CPU (de)compression overhead (See compression). Since all columns are memory mapped upon access, i.e. they need not be decompressed.  If disk space comes at a premium and memory residency can be guaranteed for a long time, then a compressed file system, e.g. BTRFS, may become helpful. Its compression behavior is often as good as  dedicated algorithms executed within the critical path of a query execution.

The disk footprint can be assessed using the (Linux) command 'du' on the dbfarm directory or to run the query 'select * from storage();', provided the sql extensions are pre-loaded into your database. (See storage model)

Running out of diskspace

One of the features of MonetDB's execution model is that all intermediates are materialized as memory mapped files. A consequence of this approach is that when memory is too small to keep them around, they will be swapped to disk by the operating system. This can be seen as a decaying free space and ultimately a full disk. In turn this (should) lead to a single transaction abort and removing its disk claim. Evidently, filling your disk depends on the number of users and the complexity of their queries. It aids to the cost of running COPY INTO and queries concurrently.

If you have limited resources for concurrent access then the monetdb funnel may be an option to serialize the user requests.  A single server can have multiple funnels.

OOM Killer

The MonetDB server may become a victim of Linux kernel functionality  called Out Of Memory Killer (or OOM Killer) responsible for dealing with excessive memory requirements.
If the system reaches a point where it may  run out of memory, OOM Killer looks for victim process and ends its life the hard way.

In most cases, a simple restart of the server will suffice, but when the killer enters during a complex update transaction then the database may end-up in an inconsistent state.

Server debugging

In most cases, the system produces  informative error messages. However, there are situations where MonetDB enters an area not covered by the test suite or previous use and a segmentation fault occurs. These cases are hard to analyze outside the development lab. To isolate and resolve the issue we need at least the following information.

Sent us the result of the command mserver5 --version --dbname=<databasename> or the equivalent using monetdb --version <databasename>

Is the error reproducible with a small (5-10 line) script/query/program? Trim your experiment to the minimal size that demonstrates the erroneous behavior. Such a script is the best we can hope for, because it will end up in the nightly testing.

In addition, follow the steps, assuming that you are logged onto the same Linux (!) machine as where the server will run:

  1. Sent the tail of the merovingian log (100 lines) located in the dbfarm directory.
  2. Start the database server either with monetdb  or as a full command line mserver --dbinit="..." .
  3. Lookup the process id for this server using the command  ps -al |grep mserver5
  4. Attach the debugger with gdb mserver5 <processid>
  5. Run your experiment again until it hits the segmentation fault, which you see emerging in window.
  6. Then obtain a stack trace with the command where. It shows of the context of the crash using the list command and the variables used in the context of the crash.
  7. Collect the status of all threads using the command thr app all bt.

 

SQL dump and restore

An SQL dump of a database is a common method to safely store away a snapshot of the database for archival purposes or to migrate data between database instances, e.g. between two major system releases. The content of a SQL dump is a large collection of SQL commands in ASCII. Running the script will recreate the database in the same state as it was when the dump was created. Since MonetDB does not provide global locking schemes, a time-consuming dump operation may become invalidated by a concurrent update query.

The primary tool to consider for making an ASCII dump is msqldump, which includes a wide variety of options:.

Usage: msqldump [ options ] [ dbname ]

Options are:
 -h hostname | --host=hostname    host to connect to
 -p portnr   | --port=portnr      port to connect to
 -u user     | --user=user        user id
 -d database | --database=database  database to connect to
 -f          | --functions        dump functions
 -t table    | --table=table      dump a database table
 -D          | --describe         describe database
 -N          | --inserts          use INSERT INTO statements
 -q          | --quiet            don't print welcome message
 -X          | --Xdebug           trace mapi network interaction
 -?          | --help             show this usage message
--functions and --table are mutually exclusive

It act as a normal database application, e.g. it runs concurrently with all other user sessions,  and dumps the generated SQL statements onto standard output. Safe this to a file for later a later restore session by calling mclient with the saved session as argument. Details on the various arguments can be found in the manual page for mclient.

For example, consider you have already installed the SQL tutorial database voc on a Linux platform and wishes to transport it to another machine. Then the following  command line option generates the dump file.

shell> msqldump --database=voc >/tmp/voc.sql

You can inspect the file /tmp/voc.sql to confirm that indeed a readable database dump is available. If storage space is a problem, you can readily pipe the output of this command to a compression tool, e.g. gzip, or sent it directly to another machine in your environment using conventional Linux tools, e.g. scpy. As a baseline for recovery, move the file over to the new machine. Then the monetdb tool can be used to create the database on the recipient machine. Once done, it suffices to feed the dump file to mclient to populate the database.

Creation of the dump respects your credentials, which means you only can dump the tables you have access to.

Migration of a database from other database systems follow the same route, but be aware that SQL dialects often differ. A manual or scripted patch of a foreign SQL dump is often needed. An overview of the various SQL dialects can be found in the SQL dialects book and general background on database compatibility on Wikipedia.

shell> monetdb release demo

Windows platforms

On Windows platforms we recommend to always dump the old database into ASCII before installing a new MonetDB release. Subsequently remove the dbfarm. After installation, the dump can be restored.

To dump the SQL database, start the MonetDB SQL Client program and type the command

\>...\databasedump.sql
\D
\>

The path after \> should be an absolute path name (i.e. start with a drive letter) and be in a save location. By default the database is located in  %APPDATA%\MonetDB5. After having made a database dump it can be removed. This folder is located inside the dbfarm\demofolder.

Restoring the SQL database can be done using the MonetDB SQL Client program with the following command

\<...\databasedump.sql

 

File system level backup

Aside from the ASCII dumps, one can also take the database server out of production using the following commands. Be careful in using shutdown(), because there may be users running transactions. This can be seen using the sys.sessions table or the system monitor.

shell> mclient -d demo
select * from sys.sessions;
call sys.shutdown(10);
<cntrl-d>
shell> monetdb stop demo
shell> monetdb lock demo

After the database has been securely stopped, we can create a copy of the database directory in the dbfarm and put it aside in a safe place. Alternatively, incremental file-system dumps can be used to reduce the time and storage space for a recovery point. Finally, the database is released for production again using:

shell>monetdb release <dbname>