[Monetdb-developers] glib - double free or corruption

Agustin Schapira schapira at cs.umass.edu
Wed Aug 8 21:52:00 CEST 2007


Dear Monet developers,

We're having trouble with a script that uses the {} operator to
compute the avg of a large (4M rows) table: we get a 'double free'
error from glibc, and Monet crashes. This is Monet 4.16.2, compiled
for 64-bit with 32-bit oids, on Linux. (BTW, it also happens if we
use {sum}, but not if we use {count}.)

The code takes two tables, links and attr, and does the equivalent of

SELECT avg(attr.value)
FROM   links, attr
WHERE  links.id = attr.id
GROUP BY links.from

Here's the MIL code:

# Get the BATs
var var_attr:=bat(bat("prox_link_attr").fetch(2)).find(oid(0));
var var_attr_id:=bat(bat(var_attr).fetch(0));
var var_attr_val:=bat(bat(var_attr).fetch(1));

var var_link_id:=bat(bat("prox_link").fetch(0));
var var_link_from:=bat(bat("prox_link").fetch(1));

# Join ATTR x LINK, keep LINK.from, ATTR.val
var var_1:=var_link_id.join(var_attr_id.reverse());
var var_2:=var_1.mark(0@0);
var var_3:=var_1.reverse().mark(0@0);
var var_5:=var_2.reverse().join(var_link_from);
var var_8:=var_3.reverse().join(var_attr_val);

# GROUP BY LINK.from and compute AVG(ATTR.val)
var var_9:=var_5.reverse().join(var_8);
var_5.info().print();
var_8.info().print();
var_9.info().print();
var var_10:={avg}(var_9);
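
For readers less familiar with MIL, the pipeline above can be modelled
in a few lines of Python. This is only a toy sketch: a BAT is
represented as a list of (head, tail) pairs, the helper names reverse,
join, mark and avg_by_head are made up for illustration, and the sample
data is invented. It says nothing about how Monet implements these
operators or why {avg} crashes; it only shows the result the script is
meant to produce.

```python
# Toy model of the MIL pipeline. A "BAT" here is a list of (head, tail)
# pairs. Helper names and data are hypothetical, not MonetDB APIs.
from collections import defaultdict


def reverse(bat):
    """Swap head and tail of every pair."""
    return [(t, h) for h, t in bat]


def join(left, right):
    """Match left.tail against right.head; emit (left.head, right.tail)."""
    index = defaultdict(list)
    for h, t in right:
        index[h].append(t)
    return [(lh, rt) for lh, lt in left for rt in index.get(lt, [])]


def mark(bat, base=0):
    """Keep the head, replace the tail by a dense sequence from base."""
    return [(h, base + i) for i, (h, _) in enumerate(bat)]


def avg_by_head(bat):
    """Group by head and average the tails (the role {avg} plays above)."""
    groups = defaultdict(list)
    for h, t in bat:
        groups[h].append(t)
    return {h: sum(v) / len(v) for h, v in groups.items()}


# Invented sample data: row number -> column value
link_id = [(0, 10), (1, 11), (2, 12)]
link_from = [(0, 100), (1, 100), (2, 200)]
attr_id = [(0, 12), (1, 10), (2, 11)]
attr_val = [(0, 9), (1, 4), (2, 6)]

var_1 = join(link_id, reverse(attr_id))   # link row -> attr row
var_2 = mark(var_1)                       # link row -> n
var_3 = mark(reverse(var_1))              # attr row -> n
var_5 = join(reverse(var_2), link_from)   # n -> link.from
var_8 = join(reverse(var_3), attr_val)    # n -> attr.value
var_9 = join(reverse(var_5), var_8)       # link.from -> attr.value
var_10 = avg_by_head(var_9)
print(var_10)
```

On this sample data the two links with from = 100 carry the values 4
and 6, so their group averages to 5.0, and the single link with
from = 200 averages to 9.0.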

When the final {avg} statement runs, Monet crashes with the error:

*** glibc detected *** double free or corruption (!prev): 0x0000000000632080 ***


I've run this under gdb; below is the trace, including the
information printed by the .info() calls at the end of the script,
and some extra output printed with debugmask(32 + 131072).
Any thoughts? Is there any more information that you would need to
debug this problem?

Thanks a lot,

-- Agustin

------------
$ gdb --args /usr/local/Monet-venus-debug/bin/Mserver --dbname xxx ~/test.mil
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) run
Starting program: /usr/local/Monet-venus-debug/bin/Mserver --dbname xxx /home/aschapira/test.mil
[Thread debugging using libthread_db enabled]
[New Thread 182918188384 (LWP 14037)]
[New Thread 1082132832 (LWP 14040)]
# Monet Database Server V4.16.2
# Copyright (c) 1993-2007, CWI. All rights reserved.
# Compiled for x86_64-redhat-linux-gnu/64bit with 32bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
#-----------------------------------------#
# h                     t                 # name
# str                   str               # type
#-----------------------------------------#
[ "version",              "25105"         ]
[ "batId",                "tmp_41"        ]
[ "batCacheid",           "-33"           ]
[ "batParentid",          "0"             ]
[ "batSharecnt",          "0"             ]
[ "head",                 "oid"           ]
[ "tail",                 "oid"           ]
[ "batPersistence",       "transient"     ]
[ "batRestricted",        "read-only"     ]
[ "batRefcnt",            "1"             ]
[ "batLRefcnt",           "1"             ]
[ "batDirty",             "dirty"         ]
[ "batSet",               "0"             ]
[ "void_tid",             "0"             ]
[ "void_cnt",             "0"             ]
[ "hsorted",              "65"            ]
[ "hident",               "t"             ]
[ "hdense",               "1"             ]
[ "hseqbase",             "0@0"           ]
[ "hkey",                 "1"             ]
[ "hloc",                 "4"             ]
[ "hvarsized",            "0"             ]
[ "halign",               "1008981"       ]
[ "hnosorted",            "0"             ]
[ "hnosorted_rev",        "0"             ]
[ "hnodense",             "0"             ]
[ "hnokey[0]",            "0"             ]
[ "hnokey[1]",            "0"             ]
[ "tident",               "h"             ]
[ "tdense",               "0"             ]
[ "tseqbase",             "nil"           ]
[ "tsorted",              "0"             ]
[ "tkey",                 "0"             ]
[ "tloc",                 "0"             ]
[ "tvarsized",            "0"             ]
[ "talign",               "1005793"       ]
[ "tnosorted",            "58570"         ]
[ "tnosorted_rev",        "0"             ]
[ "tnodense",             "0"             ]
[ "tnokey[0]",            "0"             ]
[ "tnokey[1]",            "1"             ]
[ "batInserted",          "0"             ]
[ "batDeleted",           "0"             ]
[ "batFirst",             "0"             ]
[ "top",                  "4180239"       ]
[ "batStamp",             "-73"           ]
[ "lastUsed",             "24012"         ]
[ "curStamp",             "86"            ]
[ "batCopiedtodisk",      "0"             ]
[ "batDirtydesc",         "dirty"         ]
[ "batDirtybuns",         "clean"         ]
[ "batBuns.free",         "33441912"      ]
[ "batBuns.size",         "33441912"      ]
[ "batBuns.maxsize",      "40173552"      ]
[ "batBuns.storage",      "malloced"      ]
[ "batBuns.filename",     "41.buns"       ]
[ "hheapdirty",           "clean"         ]
[ "theapdirty",           "clean"         ]
#-----------------------------------------#
# h                     t                 # name
# str                   str               # type
#-----------------------------------------#
[ "version",              "25105"         ]
[ "batId",                "tmp_42"        ]
[ "batCacheid",           "34"            ]
[ "batParentid",          "0"             ]
[ "batSharecnt",          "0"             ]
[ "head",                 "oid"           ]
[ "tail",                 "int"           ]
[ "batPersistence",       "transient"     ]
[ "batRestricted",        "read-only"     ]
[ "batRefcnt",            "1"             ]
[ "batLRefcnt",           "1"             ]
[ "batDirty",             "dirty"         ]
[ "batSet",               "0"             ]
[ "void_tid",             "-1"            ]
[ "void_cnt",             "0"             ]
[ "hsorted",              "65"            ]
[ "hident",               "h"             ]
[ "hdense",               "1"             ]
[ "hseqbase",             "0@0"           ]
[ "hkey",                 "1"             ]
[ "hloc",                 "0"             ]
[ "hvarsized",            "0"             ]
[ "halign",               "1009002"       ]
[ "hnosorted",            "0"             ]
[ "hnosorted_rev",        "0"             ]
[ "hnodense",             "0"             ]
[ "hnokey[0]",            "0"             ]
[ "hnokey[1]",            "0"             ]
[ "tident",               "t"             ]
[ "tdense",               "0"             ]
[ "tseqbase",             "0@0"           ]
[ "tsorted",              "0"             ]
[ "tkey",                 "0"             ]
[ "tloc",                 "4"             ]
[ "tvarsized",            "0"             ]
[ "talign",               "1009003"       ]
[ "tnosorted",            "0"             ]
[ "tnosorted_rev",        "0"             ]
[ "tnodense",             "0"             ]
[ "tnokey[0]",            "0"             ]
[ "tnokey[1]",            "0"             ]
[ "batInserted",          "0"             ]
[ "batDeleted",           "0"             ]
[ "batFirst",             "0"             ]
[ "top",                  "4180239"       ]
[ "batStamp",             "-84"           ]
[ "lastUsed",             "24028"         ]
[ "curStamp",             "86"            ]
[ "batCopiedtodisk",      "0"             ]
[ "batDirtydesc",         "dirty"         ]
[ "batDirtybuns",         "clean"         ]
[ "batBuns.free",         "33441912"      ]
[ "batBuns.size",         "35202008"      ]
[ "batBuns.maxsize",      "42270704"      ]
[ "batBuns.storage",      "malloced"      ]
[ "batBuns.filename",     "42.buns"       ]
[ "hheapdirty",           "clean"         ]
[ "theapdirty",           "clean"         ]
#-----------------------------------------#
# h                     t                 # name
# str                   str               # type
#-----------------------------------------#
[ "version",              "25105"         ]
[ "batId",                "tmp_43"        ]
[ "batCacheid",           "-35"           ]
[ "batParentid",          "0"             ]
[ "batSharecnt",          "0"             ]
[ "head",                 "oid"           ]
[ "tail",                 "int"           ]
[ "batPersistence",       "transient"     ]
[ "batRestricted",        "read-only"     ]
[ "batRefcnt",            "1"             ]
[ "batLRefcnt",           "1"             ]
[ "batDirty",             "dirty"         ]
[ "batSet",               "0"             ]
[ "void_tid",             "6122832"       ]
[ "void_cnt",             "0"             ]
[ "hsorted",              "0"             ]
[ "hident",               "t"             ]
[ "hdense",               "0"             ]
[ "hseqbase",             "nil"           ]
[ "hkey",                 "0"             ]
[ "hloc",                 "4"             ]
[ "hvarsized",            "0"             ]
[ "halign",               "1005793"       ]
[ "hnosorted",            "58570"         ]
[ "hnosorted_rev",        "0"             ]
[ "hnodense",             "0"             ]
[ "hnokey[0]",            "0"             ]
[ "hnokey[1]",            "1"             ]
[ "tident",               "h"             ]
[ "tdense",               "0"             ]
[ "tseqbase",             "0@0"           ]
[ "tsorted",              "0"             ]
[ "tkey",                 "0"             ]
[ "tloc",                 "0"             ]
[ "tvarsized",            "0"             ]
[ "talign",               "1009003"       ]
[ "tnosorted",            "0"             ]
[ "tnosorted_rev",        "0"             ]
[ "tnodense",             "0"             ]
[ "tnokey[0]",            "0"             ]
[ "tnokey[1]",            "0"             ]
[ "batInserted",          "0"             ]
[ "batDeleted",           "0"             ]
[ "batFirst",             "0"             ]
[ "top",                  "4180239"       ]
[ "batStamp",             "-85"           ]
[ "lastUsed",             "24043"         ]
[ "curStamp",             "86"            ]
[ "batCopiedtodisk",      "0"             ]
[ "batDirtydesc",         "dirty"         ]
[ "batDirtybuns",         "clean"         ]
[ "batBuns.free",         "33441912"      ]
[ "batBuns.size",         "33441912"      ]
[ "batBuns.maxsize",      "40173552"      ]
[ "batBuns.storage",      "malloced"      ]
[ "batBuns.filename",     "43.buns"       ]
[ "hheapdirty",           "clean"         ]
[ "theapdirty",           "clean"         ]
*** double free or corruption (!prev): 0x0000000000632080 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 182918188384 (LWP 14037)]
0x0000003a7b72e21d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000003a7b72e21d in raise () from /lib64/tls/libc.so.6
#1  0x0000003a7b72fa1e in abort () from /lib64/tls/libc.so.6
#2  0x0000003a7b763451 in __libc_message () from /lib64/tls/libc.so.6
#3  0x0000003a7b76906e in _int_free () from /lib64/tls/libc.so.6
#4  0x0000003a7b7693b6 in free () from /lib64/tls/libc.so.6
#5  0x0000002a96665c43 in GDKfree (blk=0x632088)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_utils.mx:1121
#6  0x0000002a965ea1cf in HEAPfree (h=0x631940)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_heap.mx:264
#7  0x0000002a966dc693 in BATdelete (b=0x631828)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_storage.mx:716
#8  0x0000002a965e5e6d in BBPaddtobin (b=0x631828)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:2366
#9  0x0000002a965e252a in BBPdestroy (b=0x631828)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:1665
#10 0x0000002a965e12b1 in BBPreclaim (b=0x631828)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:1514
#11 0x0000002a95ef5e0f in interpret_setaggr (name=0x62a038 "{avg}", argc=2, argv=0x620c18, res=0x7fbffff790, tt=0x631078,
     stk=1) at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_multiplex.mx:1819
#12 0x0000002a95e95c57 in interpret (stk=1, lt=0x625090, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1202
#13 0x0000002a95e9da04 in interpret_assignment (stk=1, lt=0x625040, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1842
#14 0x0000002a95e97349 in interpret_var (stk=1, lt=0x625018, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1329
#15 0x0000002a95e93e30 in interpret (stk=1, lt=0x625018, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:832
#16 0x0000002a95e9dca3 in interpret_seqblock (stk=1, lt=0x5ca8b8, res=0x7fbffff790, scope=0)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1892
#17 0x0000002a95e93606 in interpret (stk=1, lt=0x629478, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:770
#18 0x0000002a95e9173d in interpret_str (stk=0,
     buf=0x629908 "# Get the BATs\nvar var_attr:=bat(bat(\"prox_link_attr\").fetch(2)).find(oid(0));\nvar var_attr_id:=bat(bat(var_attr).fetch(0));\nvar var_attr_val:=bat(bat(var_attr).fetch(1));\n\nvar var_link_id:=bat(bat(\"p"...,
     res=0x7fbffff790) at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:246
#19 0x0000002a95e91bf0 in interpret_file (stk=0, lt=0x5411c8, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:291
#20 0x0000002a95e93f39 in interpret (stk=0, lt=0x5411c8, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:858
#21 0x0000002a95ebf65e in handleRequest (t=0x2a96a46840, q=0x64d5a8, res=0x7fbffff790)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_queue.mx:537
#22 0x0000002a95ebfadb in doRequest (t=0x2a96a46840, preference=0x0)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_queue.mx:563
#23 0x0000002a95f1108c in monetInterpreter (status=0x7fbffff7f8)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_process.mx:112
#24 0x0000000000402674 in main (argc=4, av=0x7fbffff908)
     at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/tools/Mserver.mx:403
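
Two numbers in the session above can be cross-checked with simple
arithmetic. The sketch below only restates values printed by .info()
and by gdb. The interpretations are assumptions, not something the
trace confirms: that "top" is a BUN count, and that the gap between
the pointer passed to GDKfree and the address glibc reports is a small
GDK block header.

```python
# Cross-check of values copied from the gdb session above.
# The interpretations in the comments are assumptions, not confirmed facts.

gdkfree_blk = 0x632088          # blk argument of GDKfree in frame #5
glibc_reported = 0x632080       # address in the glibc error message

# Distance between what GDKfree was given and what free() complained about.
header_offset = gdkfree_blk - glibc_reported

top = 4180239                   # "top" from .info(), assumed to be a BUN count
buns_free = 33441912            # "batBuns.free" from .info()

# If top is a BUN count, this is the size of one BUN in bytes.
bytes_per_bun, remainder = divmod(buns_free, top)

print(header_offset, bytes_per_bun, remainder)
```

Both come out as 8 with no remainder: an 8-byte gap between the two
addresses, and 8 bytes per BUN, which fits a 4-byte oid head plus a
4-byte tail in a build with 32-bit oids.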



When I run this with debugmask(32 + 131072), I get the following  
debug output as well. This is right after the calls to .info(),  
during the processing of {avg}:

#interpret_unpin(print) on bat(59) refcnt = 0
##60 = new tmp_74(int,int)
##BBPreclaim: bat(60) view=0 lrefs=0 ref=1 stat=1
#interpret_pin({avg}) on bat(-35) refcnt = 1
##61 = new tmp_75(oid,int)
##62 = new tmp_76(oid,void)
##65 = new tmp_101(oid,void)
##66 = new tmp_102(oid,void)
##67 = new tmp_103(oid,void)
##BBPreclaim: bat(67) view=0 lrefs=0 ref=1 stat=1
##clear 67 (tmp_103)
##uncache 67 (tmp_103)
##BBPreclaim: bat(66) view=0 lrefs=0 ref=1 stat=1
##clear 66 (tmp_102)
##uncache 66 (tmp_102)
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##clear 65 (tmp_101)
##uncache 65 (tmp_101)
##BBPreclaim: bat(62) view=0 lrefs=0 ref=1 stat=1
##clear 62 (tmp_76)
##uncache 62 (tmp_76)
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1
##clear 61 (tmp_75)
##uncache 61 (tmp_75)

#setaggr impl: non-optimized hash

#interpret_pin(count) on bat(61) refcnt = 2
#interpret_unpin(count) on bat(61) refcnt = 1
#interpret_pin(sum_lng) on bat(61) refcnt = 2
#interpret_unpin(sum_lng) on bat(61) refcnt = 1
...
[these 4 lines repeat 20,000 times, which corresponds to the number  
of unique values in the HEAD of var_9]
...
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##clear 65 (tmp_101)
##uncache 65 (tmp_101)
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1
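
One way to read this output is to tally the BBPreclaim lines per BAT
id. The Python sketch below does that for the BBPreclaim lines copied
from the excerpt above; the function name reclaim_counts is made up
for illustration. A repeated id is only a hint, not a diagnosis: the
elided part of the log may contain further "new" lines that reuse an
id, which this excerpt cannot show.

```python
# Count BBPreclaim events per BAT id in the debugmask output above.
# reclaim_counts is a hypothetical helper, not part of MonetDB.
import re
from collections import Counter

LOG = """\
##BBPreclaim: bat(60) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(67) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(66) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(62) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1
"""


def reclaim_counts(log):
    """Return a Counter mapping BAT id -> number of BBPreclaim lines."""
    pattern = re.compile(r"BBPreclaim: bat\((-?\d+)\)")
    return Counter(int(m.group(1)) for m in pattern.finditer(log))


counts = reclaim_counts(LOG)
repeated = sorted(bat for bat, n in counts.items() if n > 1)
print(repeated)
```

In this excerpt bat(61) and bat(65) are each reclaimed twice, and the
second BBPreclaim on bat(61) is the last line printed before the crash.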


I hope this helps. Thanks again.







