Rethinking containers

Our previous post (Running MonetDB in containers) described how to use the MonetDB images we publish on Docker Hub. However, the process explained there is not very straightforward. Ideally, we would like to start the container and then manage the databases in it using the tools installed on the host machine.

To do this, we need to be able to set a few properties of the database farm (dbfarm). That in turn requires the farm to be created when the container starts for the first time, not when the image is built.

Properties and environment variables

This is implemented by changing the entrypoint of the image to a script that creates the dbfarm and sets the needed properties. You pass these properties to the script as environment variables.

You can specify the following properties:

Passphrase: The passphrase used to contact the MonetDB daemon remotely, i.e. from outside the container. By default it is monetdb, but you should seriously consider changing it. You can change it with the environment variable MDB_DAEMONPASS.
Logfile: The file where the daemon writes its log messages. By default it is the file merovingian.log, relative to the database farm directory. You can change it with the environment variable MDB_LOGFILE.
Snapshotdir: The directory where the daemon writes database snapshots. If no value is set, the daemon does not produce snapshots. You can set it with the environment variable MDB_SNAPSHOTDIR.
Snapshotcompression: The compression scheme used to store the snapshots. The default is .tar.lz4; the other possible values are .tar, .tar.gz, .tar.xz and .tar.bz2. You can change it with the environment variable MDB_SNAPSHOTCOMPRESSION.

Finally, there is a special environment variable MDB_SHOW_VARS that, if defined, displays all the properties associated with the database farm in the container.
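To make the mechanism concrete, here is a minimal sketch of how an entrypoint script could map these variables to daemon properties. It is not the script shipped in the image: the function names, the first-start check on the .merovingian_properties file, the control=yes setting and the "start" argument are assumptions made for illustration, and details such as the listen address are left out.

```shell
#!/bin/sh
# Hypothetical entrypoint sketch; NOT the script shipped in the image.
DBFARM=/var/monetdb5/dbfarm

# Print one "property=value" line for each daemon property to set.
mdb_properties() {
    # The passphrase defaults to "monetdb", as described above.
    echo "passphrase=${MDB_DAEMONPASS:-monetdb}"
    # Assumption: remote management also requires control=yes.
    echo "control=yes"
    if [ -n "${MDB_LOGFILE:-}" ]; then
        echo "logfile=${MDB_LOGFILE}"
    fi
    if [ -n "${MDB_SNAPSHOTDIR:-}" ]; then
        echo "snapshotdir=${MDB_SNAPSHOTDIR}"
    fi
    if [ -n "${MDB_SNAPSHOTCOMPRESSION:-}" ]; then
        echo "snapshotcompression=${MDB_SNAPSHOTCOMPRESSION}"
    fi
}

start_farm() {
    # Create the farm only on first start. Assumption: an initialised
    # farm is recognised by its .merovingian_properties file.
    if [ ! -e "$DBFARM/.merovingian_properties" ]; then
        monetdbd create "$DBFARM" || return 1
    fi
    # Apply every property printed by mdb_properties.
    mdb_properties | while IFS= read -r prop; do
        monetdbd set "$prop" "$DBFARM"
    done
    # Show the farm properties if MDB_SHOW_VARS is defined at all.
    if [ -n "${MDB_SHOW_VARS+x}" ]; then
        monetdbd get all "$DBFARM"
    fi
    # -n keeps monetdbd in the foreground so the container stays alive.
    exec monetdbd start -n "$DBFARM"
}

# Start the daemon only when asked to, so the functions above can also
# be sourced and inspected on their own.
if [ "${1:-}" = "start" ]; then
    start_farm
fi
```

Keeping the variable-to-property mapping in its own function makes it easy to check what would be applied without touching a real farm.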

You pass these values to the MonetDB daemon in the container through the environment variable mechanism of the container runtime. For example, with Docker you would run:

docker run --rm -p 50000:50000 -it \
    -e MDB_SHOW_VARS=1 \
    -e MDB_SNAPSHOTDIR=/snapshots \
    -e MDB_SNAPSHOTCOMPRESSION=.tar \
    -v snapshots:/snapshots \
    -v data-vol:/var/monetdb5/ \
    monetdb/monetdb:latest

This invocation mounts two Docker volumes, snapshots and data-vol, in the container at /snapshots and /var/monetdb5/ respectively. It also sets the variables MDB_SHOW_VARS, MDB_SNAPSHOTDIR and MDB_SNAPSHOTCOMPRESSION. Notice that we mount a separate Docker volume for the snapshots and point MDB_SNAPSHOTDIR at its mount point.
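When the list of variables grows, or when you prefer to keep the passphrase out of your shell history, the same settings can be kept in a file and passed with Docker's --env-file option. The following is a sketch: the file name mdb.env, the placeholder passphrase and the run_monetdb function are illustrative choices, not something the image requires.

```shell
# Write the settings to a file, one VAR=value pair per line.
# mdb.env is a hypothetical name; replace the placeholder passphrase.
cat > mdb.env <<'EOF'
MDB_DAEMONPASS=change-me
MDB_SNAPSHOTDIR=/snapshots
MDB_SNAPSHOTCOMPRESSION=.tar.gz
EOF

# The file contains a passphrase, so restrict access to the owner.
chmod 600 mdb.env

# Same invocation as before, with the -e options replaced by --env-file.
run_monetdb() {
    docker run --rm -p 50000:50000 -it \
        --env-file mdb.env \
        -v snapshots:/snapshots \
        -v data-vol:/var/monetdb5/ \
        monetdb/monetdb:latest
}
```

Calling run_monetdb then starts the container with the values read from the file.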

Keep in mind that the database farm is always located in the directory /var/monetdb5/dbfarm inside the container, so mount your external volume either at that path or, as in the example above, at its parent directory /var/monetdb5.
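Once the container is running with port 50000 published, the monetdb tool installed on the host can talk to the daemon by giving it the host, port and passphrase. A small wrapper saves retyping them. This is a sketch: the function name mdb and the variables MONETDB_HOST and MONETDB_PORT are hypothetical, and it assumes the host shell's MDB_DAEMONPASS holds the same passphrase that was given to the container.

```shell
# Hypothetical connection settings for the container started above.
MONETDB_HOST=localhost
MONETDB_PORT=50000

# Forward any monetdb subcommand to the daemon inside the container.
# Falls back to the default passphrase "monetdb" if MDB_DAEMONPASS is unset.
mdb() {
    monetdb -h "$MONETDB_HOST" -p "$MONETDB_PORT" \
        -P "${MDB_DAEMONPASS:-monetdb}" "$@"
}
```

With the wrapper defined, mdb create demo followed by mdb release demo creates a database named demo and makes it available, and mdb status lists the databases in the farm.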

Epilogue

Hopefully these changes will make it even easier to use containerized MonetDB.