SELECT failure: "insufficient space for HEAP ..."

Clarkson, Edward C. Edward.Clarkson at gtri.gatech.edu
Wed May 14 17:25:28 CEST 2014


Also maybe one more clue--we are executing lots of SELECT queries against
these tables--even quite similar CASE statements--but the query that fails
I believe is the only one that uses a CASE statement with arithmetic (in
this case, multiplication/division) in the THEN clause.  I.e., we have
plenty of arithmetic combinations of fields and constants, and plenty of
CASE statements (using both equality and LIKE comparisons), but I think
this is the only query we use that has both.

On 5/13/14, 1:13 PM, "Clarkson, Edward C."
<Edward.Clarkson at gtri.gatech.edu> wrote:

>Hi Stefan,
>
>Thanks for the response, see below:
>
>>Could you check how large your merver5 process is at the time the error
>>occurs?
>
>40-60 MB (private working set), which is pretty much in line with normal
>usage.
>
>>At that very time, how much main memory and how much disk space (on the
>>partition/drive
>>where your dbfarm is located) are still free?
>
>I misspoke below:  there are only 4 GB on the machine I'm using; however,
>there seems to be plenty of memory (>1.2 GB) and disk (>100 GB) free when
>the failure happens.
>
>>Instead of the (statically generated) explain, a "live" trace of a
>>failing run
>>would be more useful (if at all).
>
>A difficulty is that I'm actually triggering this issue in a closed lab
>(i.e., heavy restrictions on data coming out, and no network access in or
>out), so I can't feasibly get the full output to attach to an email or put
>online.  I transcribed the last few lines below (hopefully with no
>errors...) in case that helps.  I did save the full trace log in the lab,
>so if you want me to look for anything specific, I'm happy to do so and
>report back.
>
>In reproducing this, I did notice something I hope is a clue:  the number
>of bytes that the error consistently correlates with the table row count
>used in the query.  Specifically:
>
>Table size	|	Byte Count in error
> 846721			 847842
>1016065			1017856
>1693441			1693696
>
>>Also, if there is any more output before/after the error message you
>>send,
>>that might be helpful to analyze the problem.
>
>I wish there were something more, but just that message.  Just to clarify:
> the error is *not* fatal.  I get the error in response to my SELECT on
>the client side, but the server stays up and doesn't report anything on
>its console, and all subsequent queries (unless of the same form) work as
>expected.
>
>Thanks for the help,
>
>Ed 
>
>Trace output follows:
>--------------------------------------------
>... 
>[ 16900,		"done",	"01:13:25.287000",		7,		11613,
>		"X_4227=<tmp_2556>[105840] :=
>batcalc./(X3688=<tmp_10547>[105840],X4216=<tmp_3655>[105840]);", ]
>
>[ 16901,		"done",	"01:13:25.288000",		4,		40532,
>			"X_2080=<tmp_15234>[105840] :=
>batstr.like(X_1175=<tmp2455>[105840],\"Static%\";",	]
>
>[ 16902,		"done",	"01:13:25.289000",		5,		39936,
>		"X_2024=<tmp_2236>[105840] :=
>batstr.like(X_1175=<tmp2455>[105840],\"Static%\";",	]
>[ 16903,	"done",	"01:13:25.302000",		11,		57964,
>		"X_2199=<tmp_5501>[105840] :=
>batstr.like(X_1175=<tmp2455>[105840],\"OTM%\";",	]
>
>[ 16904,	"done",	"01:13:25.302000",	3,	2756824, 	"barrier X_4923 :=
>language.dataflow();", ]
>
>>----- Original Message -----
>>> Hi,
>>> 
>>> We have an application (http://tmt.gtri.gatech.edu/lytic.html) based on
>>>a
>>> relatively modest, single (~1-10M row, ~20 column) MonetDB table (Win7,
>>>64
>>> bit, Jan14-SP1 server release, quad-core, 8GB RAM) and have been
>>> encountering sporadic SELECT query failures with the error:
>>>"Insufficient
>>> space for HEAP of 1017856 bytes".  Since I've been trying in earnest to
>>> diagnose this, that byte value has been consistent, FWIW.
>>> 
>>> The failures are frustratingly intermittent; if I rerun a failing query
>>> (by hand, using mclient) it sometimes fails again, and sometimes works
>>> correctly.  I've been unable to figure out any sort of pattern.
>>> 
>>> Unfortunately, I can't post the exact table and query, but the general
>>> form of the queries (which have ~5-10 case clauses) are:
>>> 
>>> SELECT
>>> CASE
>>>   WHEN Foo = 'ValueA' and Bar LIKE 'ValueB%' THEN 100*(Baz-1234)/1234
>>>   WHEN Foo = 'ValueC' and Bar LIKE 'ValueD%' THEN 100*(Baz-3456)/3456
>>>   ...
>>> END
>>> FROM table
>>> WHERE
>>>   <same case statement as above> IS NOT NULL
>>> LIMIT 1
>>> 
>>> Any ideas what I might do to narrow this down or other information I
>>>might
>>> provide?  I captured an explain of a failing query--it's many hundreds
>>> (1000s?) of lines, so I didn't want to include it here in full.  All
>>> suggestions appreciated.  Thanks,
>>> 
>>> Ed
>>> 
>>> P.S.  MonetDB has been a fantastic complement to our application,
>>>thanks
>>> to everyone involved!
>
>_______________________________________________
>users-list mailing list
>users-list at monetdb.org
>https://www.monetdb.org/mailman/listinfo/users-list



More information about the users-list mailing list