[Monetdb-developers] [MonetDB-users] too deep recursion

Jan Rittinger jan.rittinger at uni-tuebingen.de
Tue Mar 10 09:50:44 CET 2009


On Mar 9, 2009, at 23:35, Martin Kersten wrote:

>
> Please run it also against the HEAD, because most of the problems
> may have been resolved there.
>
> Jan Rittinger wrote:
>> Hi Martin and others,
>> I just tested what part the Pathfinder code generation plays and  
>> generated MIL code for the Aug2008 (0.24), the Nov2008, and the  
>> Feb2009 release branches. I ran all queries using the newest stable  
>> version (Feb2009) on Mac OS X.
>> The observations are:
>> * The problem with gdk_heap.mx, mmap, and Mac OS X still persists
>> (all queries run in 10 seconds instead of 2 seconds); Peter knows
>> what I'm talking about.
>> * As Nils reported, the queries are getting slower.
>> * The main performance decrease in my scenario is the document  
>> loading.
>> * The problem does not stem from Pathfinder's MIL code generation.
>> For more details see the attached file...
>> ------------------------------------------------------------------------
>> BTW: For today's HEAD version the results are even worse...

       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Attached you find a bundle with the queries and the test results.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: server-performance-Feb2009.tar.gz
Type: application/x-gzip
Size: 5833 bytes
Desc: not available
URL: <http://www.monetdb.org/pipermail/developers-list/attachments/20090310/2c1ba1de/attachment.bin>
-------------- next part --------------


(I'm using a MacBook Pro 2.6GHz Intel Core 2 Duo with 4GB of RAM
running Mac OS X 10.5.6.)

>>
>> Jan
>> On Mar 9, 2009, at 18:08, Martin Kersten wrote:
>>> For all interested. Indeed there are performance differences
>>> between the various releases. Some can be traced back to
>>> functional enhancements, others are a result of internal
>>> administrative activities.
>>>
>>> Recent experiments with the TPC-H scale-factor 2 on Feb 2009
>>> branch show a performance degradation compared to Aug 2008,
>>> as reported on the website.
>>>
>>> It appears that some low-level actions related to allocation
>>> of BATs and their management in memory-scarce situations are
>>> to blame for this situation.
>>>
>>> Solutions are integrated into the HEAD, and may (depending
>>> on our resources) be backported into a bugfix release
>>> of the Feb 2009 version.
>>>
>>> Nils Grimsmo wrote:
>>>> On Wed, Mar 04, 2009 at 11:08:40PM +0100, Jan Rittinger wrote:
>>>>> Hi Nils,
>>>>>
>>>>> I just ran your queries with the latest (not yet announced) Feb2009
>>>>> release (http://monetdb.cwi.nl/downloads/sources/Feb2009/) and
>>>>> received an answer in 1.5 (Q1) and 2.5 (Q2) seconds. If you still have
>>>>> problems with the new version, then please let us know.
>>>>
>>>> Thank you for your answer, Jan.  Feb2009 is indeed faster than Nov2008,
>>>> but on my computer it is still slower than Aug2008.  I also see some
>>>> strange and unfavorable performance characteristics on subsequent queries
>>>> for Nov2008 and Feb2009 (see below).
>>>>
>>>>
>>>> Aug2008:
>>>> # MonetDB Server v4.24.0
>>>> # based on GDK   v1.24.0
>>>> # PF/Tijah module v0.5.0 loaded. http://dbappl.cs.utwente.nl/pftijah
>>>> # MonetDB/XQuery module v0.24.0 loaded (default back-end is 'algebra')
>>>>
>>>> Nov2008-SP2:
>>>> # MonetDB Server v4.26.4
>>>> # based on GDK   v1.26.4
>>>> # PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
>>>> # MonetDB/XQuery module v0.26.4 loaded (default back-end is 'algebra')
>>>>
>>>> Feb2009:
>>>> # MonetDB Server v4.28.0
>>>> # Based on GDK   v1.28.0
>>>> # PF/Tijah module v0.9.0 loaded. http://dbappl.cs.utwente.nl/pftijah
>>>> # MonetDB/XQuery module v0.28.0 loaded (default back-end is 'algebra')
>>>>
>>>>
>>>> I run the queries multiple times in different scenarios.
>>>>
>>>> A - Have just indexed the document, first run.
>>>> B - Second run (subsequent have similar timing).
>>>> C - Restart the server (Mserver), then first run.
>>>> D - Second run (subsequent have similar timing).
>>>>
>>>>
>>>> Query Q0:
>>>>      Aug2008    Nov2008    Feb2009
>>>> A       1101       3687       1760
>>>> B       1031       4510       3015
>>>> C       1350       5216       3390
>>>> D       1035      12620       9533
>>>>
>>>>
>>>> Query Q1:
>>>>      Aug2008    Nov2008    Feb2009
>>>> A       2161      15119       3013
>>>> B       2099      19292       4072
>>>> C       2526      18523       4567
>>>> D       2117      42555      10602
>>>>
>>>>
>>>> This seems very strange to me.  The timings make sense for Aug2008, where
>>>> the query is slightly slower right after restarting the server (C).  For
>>>> Nov2008 and Feb2009, the second (and subsequent) runs are slower than the
>>>> first.  How can this be?  It can make sense for the first run after
>>>> restarting the server (C) to be slower (reading stuff from disk etc.), but
>>>> why is the second (D) terribly slower?  If I just keep running the query,
>>>> the timings are similar to D.
>>>>
>>>> Note:  If I start mixing Q0 and Q1 after step D, they are both as slow as
>>>> in step D.
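
To make the slowdown easier to read, the timings above can be expressed
relative to Aug2008. The following is only an illustrative Python sketch:
the dictionary layout and the helper name slowdown are assumptions made for
this example, while the numbers are copied from the two tables above
(milliseconds, scenarios A to D in order).

```python
# Timings in milliseconds, copied from the Q0 and Q1 tables in this thread.
# Each list holds the scenarios A, B, C, D in that order.
TIMINGS_MS = {
    "Q0": {
        "Aug2008": [1101, 1031, 1350, 1035],
        "Nov2008": [3687, 4510, 5216, 12620],
        "Feb2009": [1760, 3015, 3390, 9533],
    },
    "Q1": {
        "Aug2008": [2161, 2099, 2526, 2117],
        "Nov2008": [15119, 19292, 18523, 42555],
        "Feb2009": [3013, 4072, 4567, 10602],
    },
}
SCENARIOS = ["A", "B", "C", "D"]


def slowdown(query, release, baseline="Aug2008"):
    """Return the per-scenario slowdown factor of `release` over `baseline`.

    `slowdown` is a hypothetical helper for this example only.
    """
    base = TIMINGS_MS[query][baseline]
    other = TIMINGS_MS[query][release]
    return [round(t / b, 2) for t, b in zip(other, base)]


for query in TIMINGS_MS:
    for release in ("Nov2008", "Feb2009"):
        factors = dict(zip(SCENARIOS, slowdown(query, release)))
        print(query, release, factors)
```

In both queries the largest factor shows up in scenario D, which matches
the observation that the second run after a server restart is the worst
case.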
>>>>
>>>>
>>>> I hope this feedback is helpful.  Is there something strange with my
>>>> setup, or is this a "bug"?  (My timings in step (A) seem similar to Jan's
>>>> timings).
>>>>
>>>>
>>>> If I want to compare MonetDB/XQuery to other implementations in a
>>>> scientific paper, I typically want to warm up the system, then run the
>>>> query multiple times to get an average timing.  It is kind of inconvenient
>>>> not to be able to close down Mserver between experiments...
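
The warm-up-then-average procedure described above could be scripted
roughly as follows. This is a sketch under assumptions: it measures
wall-clock time around the mclient invocation quoted further down in this
thread instead of parsing the output of --time, and the helper names
run_query and benchmark are made up for illustration.

```python
import statistics
import subprocess
import time


def run_query(query_file):
    """Feed one query file to mclient and return wall-clock milliseconds.

    Uses the command line quoted in this thread; assumes mclient is on
    PATH and a server is already running.
    """
    with open(query_file, "rb") as stdin:
        start = time.perf_counter()
        subprocess.run(
            ["mclient", "--language=xquery", "--algebra", "--time"],
            stdin=stdin,
            stdout=subprocess.DEVNULL,
            check=True,
        )
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0


def benchmark(run, warmup=1, repeats=5):
    """Call `run` `warmup` times unmeasured, then summarise `repeats` runs.

    `run` is any zero-argument callable that returns a timing in ms.
    """
    for _ in range(warmup):
        run()  # warm-up runs are executed but their timings are discarded
    timings = [run() for _ in range(repeats)]
    return {
        "timings": timings,
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }
```

For example, benchmark(lambda: run_query("q1.xq")) would run a
hypothetical query file q1.xq once as warm-up and five more times for the
statistics. Reporting the individual timings next to the mean helps to
spot the drift between first and later runs discussed in this thread.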
>>>>
>>>>
>>>>> P.S.: The E-Mail subject seems slightly off topic here :)
>>>>
>>>> Yes, thought I'd avoid touching the mouse to copy the email address.  Cut
>>>> away In-Reply-To:, but forgot to change Subject:...
>>>>
>>>>
>>>> Thank you for your assistance!
>>>>
>>>>
>>>> Klem fra Nils
>>>>
>>>>> On Mar 4, 2009, at 16:30, Nils Grimsmo wrote:
>>>>>
>>>>>> Hi, I just upgraded from the August to the November super-ball, and the
>>>>>> performance has worsened badly.
>>>>>>
>>>>>> Example queries on dblp.xml (441 MB):
>>>>>>
>>>>>> Q0: count(/dblp//author[text()="Michael Stonebraker"])
>>>>>> Q1: count(/dblp/*/author[text()="Michael Stonebraker"])
>>>>>>
>>>>>> Query time in milliseconds:
>>>>>>
>>>>>>      August     November
>>>>>> Q0     1100        4867
>>>>>> Q1     3993       17999
>>>>>>
>>>>>> I have compiled with --enable-optimise both times.  I query with:
>>>>>>
>>>>>> mclient --language=xquery --algebra --time < $QUERYFILE
>>>>>>
>>>>>> Is this performance degradation expected?  If so, why?
>>>>>>
>>>>>>
>>>>>> BTW:  Is there any way of finding how much disk space a collection
>>>>>> uses?
>>>>>>
>>>>>>
>>>>>> Thank you for contributing free software!
>>>>>>
>>>>>>
>>>>>> Klem fra Nils

-- 
Jan Rittinger
Lehrstuhl Datenbanken und Informationssysteme
Wilhelm-Schickard-Institut für Informatik
Eberhard-Karls-Universität Tübingen

http://www-db.informatik.uni-tuebingen.de/team/rittinger


