Merge/Remote tables causing remote server to segfault

Anderson, David B david.b.anderson at citi.com
Mon Apr 25 23:43:56 CEST 2016


All,

I am seeing odd behavior with remote and merge tables. I have a pair of servers, one of which has a merge table that combines 12 remote and 12 local tables (24 total) into a single large table. Queries issued from the main server against that merge table cause the remote server to crash with a SIGSEGV. If I instead create a merge table that contains only the remote tables (loan2), the remote server doesn't crash, but the same query gives different responses within the same session. The first run fails with an error and the second returns a result:

/opt/flash1/gnmaloan$ mclient -d loan
Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2015-SP4)
Database: MonetDB v11.21.19 (Jul2015-SP4), 'mapi:monetdb://malxcs1p:50000/loan'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>select asofdate,sum(balance),count(loanseqnum) from loan2 group by asofdate;
(mapi:monetdb://monetdb@malxcs2p/loan) unable to find sys.loan201402(loanseqnum)
sql>select asofdate,sum(balance),count(loanseqnum) from loan2 group by asofdate;
+------------+--------------------------+----------+
| asofdate   | L1                       | L2       |
+============+==========================+==========+
| 2014-03-01 |          124075979043384 |  9177029 |
| 2014-05-01 |      1.2634454179935e+14 |  9264814 |
| 2014-07-01 |          126798740013672 |  9335352 |
| 2014-09-01 |          126526301326951 |  9407070 |
| 2014-11-01 |      1.2604434513281e+14 |  9478920 |
| 2015-01-01 |      1.2644932715183e+14 |  9529639 |
| 2015-03-01 |          127602153039423 |  9579604 |
| 2015-05-01 |          126615476286119 |  9654842 |
| 2015-07-01 |          125415108251273 |  9756258 |
| 2015-09-01 |          124427385478338 |  9855118 |
| 2015-11-01 |          126150962049943 |  9945919 |
| 2016-01-01 |          129313947667802 | 10011556 |
+------------+--------------------------+----------+
12 tuples (0.9s)
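
For context, the remote-only merge table is set up roughly along these lines on the main server. This is only a sketch: the column types are placeholders rather than the real schema, only the three columns used in the query are shown, and loan201402 (the name from the error message above) stands in for each of the 12 remote tables.

```sql
-- Sketch only: column types are assumed, and the real tables have more columns.

-- Remote table definition on the main server, pointing at the second server.
CREATE REMOTE TABLE loan201402 (
    asofdate   DATE,
    balance    DOUBLE,
    loanseqnum BIGINT
) ON 'mapi:monetdb://malxcs2p:50000/loan';

-- Merge table with the same column layout as its parts.
CREATE MERGE TABLE loan2 (
    asofdate   DATE,
    balance    DOUBLE,
    loanseqnum BIGINT
);

-- Attach the remote table; repeated once per remote table.
ALTER TABLE loan2 ADD TABLE loan201402;
```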

The only thing I see in the crashing server's log is:

2016-04-25 16:55:50 MSG merovingian[28226]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying
2016-04-25 16:55:50 MSG merovingian[28226]: database 'loan' (2706) was killed by signal SIGSEGV
2016-04-25 16:56:37 MSG merovingian[28226]: starting database 'loan', up min/avg/max: 2m/1h/2h, crash average: 1.00 0.90 0.33 (12-2=10)
2016-04-25 16:56:37 MSG loan[28022]: arguments: /opt/mr/monetdb/bin/mserver5 --dbpath=/opt/flash2/monetdb2/loan --set merovingian_uri=mapi:monetdb://malxcs2p:50000/loan --set mapi_open=false --set mapi_port=0 --set mapi_usock=/opt/flash2/monetdb2/loan/.mapi.sock --set monet_vault_key=/opt/flash2/monetdb2/loan/.vaultkey --set gdk_nr_threads=24 --set max_clients=64 --set sql_optimizer=default_pipe --set monet_daemon=yes

I'm not sure where to start debugging this. As a first step, I will check whether it still occurs with only a small number of rows in the base tables.

Thanks,
Dave

