Hi,

The usual questions:
Which version of MonetDB are you running?
And what kind of workload triggers this?

Best,
Stefan


-------- Original message --------
From: "Murthy, Gautham" <Gautham.Murthy@harman.com>
Date: 13/08/2018 18:06 (GMT+01:00)
To: users-list@monetdb.org
Subject: #!ERROR: THRnew: too many threads

Hello,

 

We had a DB crash reported with the messages below. Do we need to alter any system-level parameters, or make other changes, to avoid this issue?

The server has 128 CPUs and 1 TB of memory.

 

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

 

/etc/security/limits.conf

 

monetadmin soft nproc 102400

monetadmin hard nproc 102400

monetadmin soft nofile 500000

monetadmin hard nofile 500000

monetadmin soft stack 10240

monetadmin soft memlock unlimited

monetadmin hard memlock unlimited

 

 

[monetadmin@lnx1535 DBFARM_TSV_P_B]$ /monet_binaries/MonetDB-11.27.13_PY/bin/monetdb -p 50010 get all TSV_PROD_DB_B

     name          prop     source           value

TSV_PROD_DB_B    name      -        TSV_PROD_DB_B

TSV_PROD_DB_B    type      default  database

TSV_PROD_DB_B    shared    default  yes

TSV_PROD_DB_B    nthreads  default  128

TSV_PROD_DB_B    optpipe   local    sequential_pipe

TSV_PROD_DB_B    readonly  local    yes

TSV_PROD_DB_B    embedr    default  no

TSV_PROD_DB_B    embedpy   local    yes

TSV_PROD_DB_B    embedpy3  default  no

TSV_PROD_DB_B    nclients  local    2048

TSV_PROD_DB_B    dbextra   default  <unknown>

[monetadmin@lnx1535 DBFARM_TSV_P_B]$

 

 

 

2018-08-13 08:46:30 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying

2018-08-13 08:46:30 MSG merovingian[36790]: proxying client 10.106.5.250:29929 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:46:30 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:31 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:33 MSG merovingian[36790]: proxying client 10.106.5.250:29957 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:46:33 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying

2018-08-13 08:46:36 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:46:40 MSG merovingian[36790]: proxying client 10.106.5.250:64830 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:46:40 MSG merovingian[36790]: target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying

2018-08-13 08:46:42 MSG TSV_PROD_DB_B[57408]: #!ERROR: THRnew: too many threads

2018-08-13 08:47:47 MSG discovery[36790]: new neighbour lnx1536.ch3.prod.i.com (lnx1536.ch3.prod.i.com)

2018-08-13 08:47:48 MSG discovery[36790]: new database mapi:monetdb://lnx1536.ch3.prod.i.com:50010/TSV_PROD_DB_B (ttl=660s)

2018-08-13 08:48:55 MSG merovingian[36790]: database 'TSV_PROD_DB_B' (57408) has crashed (dumped core)

2018-08-13 08:49:12 MSG merovingian[36790]: database 'TSV_PROD_DB_B' has crashed after start on 2018-08-13 05:35:17, attempting restart, up min/avg/max: 2h/5d/3w, crash average: 1.00 0.70 0.43 (23-10=13)

2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]: arguments: /monet_binaries/MonetDB-11.27.13_PY/bin/mserver5 --dbpath=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B --set merovingian_uri=mapi:monetdb://lnx1535.ch3.prod.i.com:50010/TSV_PROD_DB_B --set mapi_open=false --set mapi_port=0 --set

2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]:  mapi_usock=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock --set monet_vault_key=/monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.vaultkey --set gdk_nr_threads=128 --set max_clients=2048 --set sql_optimizer=sequential_pipe --set embedded_py=true --readonly --set monet_daemon=yes

2018-08-13 08:49:12 MSG TSV_PROD_DB_B[72455]:

2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:16467 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:45439 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:49:12 MSG merovingian[36790]: starting a proxy failed: cannot connect: Connection refused

2018-08-13 08:49:12 ERR control[36790]: !monetdbd: an internal error has occurred 'cannot connect: Connection refused'

2018-08-13 08:49:12 ERR merovingian[36790]: client error: cannot connect: Connection refused

2018-08-13 08:49:12 MSG merovingian[36790]: proxying client 10.106.5.250:45447 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B

2018-08-13 08:49:12 ERR control[36790]: !monetdbd: an internal error has occurred 'cannot connect: Connection refused'

2018-08-13 08:49:12 MSG merovingian[36790]: starting a proxy failed: cannot connect: Connection refused

2018-08-13 08:49:13 ERR merovingian[36790]: client error: cannot connect: Connection refused

2018-08-13 08:49:13 MSG merovingian[36790]: proxying client 10.106.5.250:36605 for database 'TSV_PROD_DB_B' to mapi:monetdb:///monet_data02/DBFARM_TSV_P_B/TSV_PROD_DB_B/.mapi.sock?database=TSV_PROD_DB_B
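
For triage, a log like the one above can be summarised per second to see how the "too many threads" errors line up with incoming client connections. This is only a rough sketch: the line pattern is inferred from the excerpt shown here rather than from any documented log format, and the name summarise_log is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the excerpt above (an assumption, not a documented format):
#   "<date> <time> <level> <source>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (MSG|ERR) ([^\[\s]+)\[(\d+)\]: ?(.*)$"
)


def summarise_log(lines):
    """Count THRnew errors and proxied client connections per timestamp.

    Returns two Counters keyed by the "YYYY-MM-DD HH:MM:SS" string:
    (errors, clients). Lines that do not match the pattern are skipped.
    """
    errors = Counter()
    clients = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # blank line or continuation of a wrapped message
        timestamp, _level, _source, _pid, message = match.groups()
        if "THRnew: too many threads" in message:
            errors[timestamp] += 1
        elif message.startswith("proxying client"):
            clients[timestamp] += 1
    return errors, clients
```

Passing it the lines of the merovingian log and printing errors.most_common(5) would show which seconds saw the most failures; comparing those with the clients counter gives a first impression of how connection bursts relate to the errors.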

 

 
