[MonetDB-users] Dual instances of MonetDB for 1 dbfarm - is it possible?

Fabian Groffen Fabian.Groffen at cwi.nl
Thu Sep 4 20:01:25 CEST 2008

On 04-09-2008 12:11:49 -0400, McKennirey.Matthew wrote:
> We are trying to create a deployment with as much redundancy (failover) as we 
> can. We assume hardware will fail (drives, hardware network interfaces, 
> memory, etc, etc) and we may lose a machine, a switch, etc.


> (As an aside my understanding is that merovingian starts and monitors mserver5 
> processes on the same machine - I do not see a way to configure merovingian 
> to start mserver5 processes on other machines.)

(it can't start, but it *does* discover neighbour databases)

> Our plan is to use multiple instances of MonetDB (running on multiple 
> machines) each serving an architecturally distinct portion of the system's 
> data such that the failure of one instance would not prevent other parts of 
> the system from functioning. However, we would dearly like to have a failover 
> capability on each instance of MonetDB. Again, only one instance of MonetDB 
> would be interacting with a specific dbfarm and dbname at a time, but if it 
> (or the machine it is on) failed to respond, we would redirect the work to 
> a 'backup' instance on another machine. The merovingian daemon of 
> the 'backup' would be started but there would be no activity until needed.

An interesting opportunity here is the merovingian "network".  Each
merovingian announces its databases and listens to the announcements of
others.  This makes remote databases known at the local merovingian.
The current branch has code to also list this remote information
(instead of peeking in merovingian's logs).  Currently, it is a very
simple idea: a database is announced, and as such stored by the other
merovingians that receive the message.  Each database received can be
redirected to.  Merovingian will transparently do that when a remote
database name is requested.  The rules of "resolving" are simple: always
look for a local database first, and if it is not present, look in the
remote list.  This remote list can be in any order and can contain
duplicates.  The first match is taken.  Currently no priorities are
encoded in here.  However, it is not impossible to think of a priority
scheme (like DHCP authority, or WinNT PDC master negotiations) in this
picture.  It would allow the same database to be installed on more
machines, with the primary always being first in merovingian's remote
list.  As such, a stand-alone merovingian could do the fail-over step
once the primary drops out.
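
To make the resolving rule concrete, here is a minimal sketch in Python.
It is not merovingian's actual code: the function name `resolve`, the
dict/list data structures and the example host names are all assumptions
made for illustration.  It only shows the rule described above (local
first, then the first match in the unordered remote list).

```python
from typing import Dict, List, Optional, Tuple


def resolve(name: str,
            local: Dict[str, str],
            remote: List[Tuple[str, str]]) -> Optional[str]:
    """Return the target to use for database `name`, or None if unknown.

    `local` maps database names to a local target (hypothetical shape).
    `remote` is the list of (name, target) announcements in the order
    they were received; it may contain duplicates.
    """
    # Rule 1: a local database always wins.
    if name in local:
        return local[name]
    # Rule 2: otherwise take the first matching remote announcement.
    # No priorities are involved, so list order alone decides.
    for remote_name, target in remote:
        if remote_name == name:
            return target
    return None


# Hypothetical example data, not real hosts.
local_dbs = {"orders": "local:orders"}
remote_dbs = [
    ("billing", "hostB:50000"),
    ("billing", "hostC:50000"),  # duplicate announcement, never reached
    ("orders", "hostB:50000"),   # shadowed by the local database
]
```

With this data, `resolve("billing", local_dbs, remote_dbs)` yields the
hostB entry simply because it was received first, which is why a
priority scheme would be needed before this could serve as a reliable
primary/backup ordering.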

> So I guess the question is, when instance 1 of merovingian or a mserver5 
> process locks a dbfarm and dbname when does it release the lock? and if it 
> fails (software or hardware failure) I presume the locks still exist 
> preventing instance 2 from using that dbfarm and dbname? In which case we are 
> out of luck.

The operating system should release all locks as soon as the program is
terminated.  The lock is only active as long as the file descriptor is
held open, and the OS closes all file descriptors when it cleans up a
terminated or crashed process.  Locks cannot be "stored", so that should
be safe too.
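
The behaviour can be checked with a small, Unix-only experiment.  The
sketch below is an illustration in Python using advisory locks via
`fcntl.lockf`; I am not claiming this is the exact locking call MonetDB
uses, and the helper names (`try_lock`, `demo`) and the temporary file
are made up for the example.  A child process takes an exclusive lock,
is killed with SIGKILL (simulating a crash), and the parent then tries
to take the same lock.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child program: lock the file given as argv[1], report, then hang.
CHILD = r"""
import fcntl, sys, time
f = open(sys.argv[1], "a")
fcntl.lockf(f, fcntl.LOCK_EX)
print("locked", flush=True)
time.sleep(60)
"""


def try_lock(path):
    """Try a non-blocking exclusive lock; return the open file or None."""
    f = open(path, "a")
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        # Lock is held by another process.
        f.close()
        return None
    return f


def demo():
    """Return (held_while_alive, released_after_kill)."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    child = subprocess.Popen([sys.executable, "-c", CHILD, path],
                             stdout=subprocess.PIPE, text=True)
    try:
        child.stdout.readline()           # wait until the child holds the lock
        held_while_alive = try_lock(path) is None
        child.kill()                      # SIGKILL: no cleanup code runs
        child.wait()
        f = try_lock(path)                # the OS has closed the child's fds
        released = f is not None
        if f is not None:
            f.close()
        return held_while_alive, released
    finally:
        child.stdout.close()
        if child.poll() is None:
            child.kill()
            child.wait()
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

Note that this only exercises a local filesystem on a single machine.
Whether locks are released equally promptly when the dbfarm lives on
shared or network storage and the whole machine goes down depends on
that filesystem, and is not something this sketch shows.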
