[Monetdb-developers] Single column select vs aggregation

Martin Kersten Martin.Kersten at cwi.nl
Thu Feb 18 19:31:26 CET 2010


Stefan Manegold wrote:
> On Thu, Feb 18, 2010 at 07:05:47PM +0100, Stefan de Konink wrote:
>> Op 18-02-10 18:23, Stefan Manegold schreef:
>>> On Thu, Feb 18, 2010 at 05:46:31PM +0100, Stefan de Konink wrote:
>>>> Attached is a trace of a single column select operation, with a max,
>>>> avg and equality.
>>>>
>>>> The equality takes about 4 times more time than the avg operation.
>>>>
>>>> From the plan I see that the operation is executed paralel, but I do
>>>> wonder why suddenly the uselect takes such a high amount of time.
>>> Which version of MonetDB?
>> CVS HEAD, 2 am.
>>
>>> How configured / compiled?
>> --enable-assert --enable-debug
>>
>>> How was the mserver5 started (which command line options were given)?
>> /opt/monetdb-head/bin/mserver5 --dbfarm=/home/skinkie/monetb
>> --dbname=kvk --dbinit="include sphinx; include sql;" --set
>> mapi_port=50001
>>
>>> Any changes compared to the default monetdb5.conf?
>> No, not changed anything in there.
>>
>>> Is the "kvk" column sorted?
>> No, it does include a partial sort (msb are sorted). A 32bit column
>> (kvks) that is (or either should be...) ends up with between 170ms ~
>> 220ms.
>>
>>> Could you run the select query in the MAL debugger
>>> (DEBUG select kvk from kvk where kvk = 412657690010;),
>>> single step through it,
>>> set the debug mask to 2097152
>>> (debug 2097152)
>>> just before the algebra.uselect() is executed,
>>> and report the output on the server console after the algebra.uselect() has
>>> been executed?
>> Do you mean this?
> 
> indeed
> 
>> mdb>#    _44 := algebra.uselect(_40=<tmp_5710>:bat[:oid,:lng][1196532],A0=412657690010:lng);
>> mdb>
>> #BAT_select_(b=tmp_5710): sampling: tmp1 = BATslice(b=tmp_5710, _lo=598266, _hi=598371);
>> #BAT_select_(b=tmp_5710): sampling: tmp2 = BAT_select_(tmp1=tmp_5704, tl, th, tail);
>> #BAT_select_(b=tmp_5704): BAT_scanselect(b=tmp_5704, bn=tmp_5703, tl, th, equi=1, nequi=0, lval=1, hval=1, nocheck=1);
>> #seqscan_eq_lng_void_tloc_oid_vid_nocheck_noinc[if ( simple_EQ(tl ,v,lng) ),v,oid_ctr,oid_ctr++;](b=tmp_5704, bn=tmp_5703, tl, th, oid_ctr=598266, str_idx=0);
>> #BAT_select_(b=tmp_5704): tmp_5703: hkey=1, tkey=0, hsorted=139659451564097, tsorted=0.
>> #BAT_select_(b=tmp_5710): BAT_hashselect(b=tmp_5710, bn=tmp_5703, tl); (building hash-table on the fly)
>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> "It" (in fact we) choose to do a hash select, and since there is no hash
> table, yet, we need to build it, which is infact more expensive than a
> simple scan select for this very operation (later operation *might* then
> benefit from the hash table ...):
> ========
> $ grep -9n --color 'building hash-table on the fly' MonetDB/src/gdk/gdk_batop.mx
> 1210-		int nocheck = (estimate >= batcnt);
> 1211-
> 1212-		if (!preserve_order && equi && b->T->hash) {
> 1213-			ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_hashselect(b=%s, bn=%s, tl); (using existing hash-table)\n", BATgetId(b), BATgetId(b), BATgetId(bn));
> 1214-
> 1215-			bn = BAT_hashselect(b, bn, tl);
> 1216-		} else if (!preserve_order && equi && ATOMsize(b->ttype) > 1 && estimate * 100 < batcnt && batcnt * 2 * sizeof(int) < (GDK_mem_maxsize / 4)) {
> 1217-			/* Build a hash-table on the fly for equi-select if the selectivity is low
> 1218-			 * and it is not too big */
> 1219:			ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_hashselect(b=%s, bn=%s, tl); (building hash-table on the fly)\n", BATgetId(b), BATgetId(b), BATgetId(bn));
> 1220-
> 1221-			bn = BAT_hashselect(b, bn, tl);
> 1222-		} else {
> 1223-			ALGODEBUG THRprintf(GDKout, "#BAT_select_(b=%s): BAT_scanselect(b=%s, bn=%s, tl, th, equi=%d, nequi=%d, lval=%d, hval=%d, nocheck=%d);\n", BATgetId(b), BATgetId(b), BATgetId(bn), equi, nequi, lval, hval, nocheck);
> 1224-
> 1225-			bn = BAT_scanselect(b, bn, tl, th, li, hi, equi, nequi, lval, hval, nocheck);
> 1226-		}
> 1227-	}
> 1228-	if (bn == NULL) {
> ========
> 
> In fact, I doubt whether investing in building a hash table is a good
> decision in such cases where we do/can not know whether it will ever pay off
> ...
> 
> Hence, I'd propose to simply drop the choise to build a hash table on the
> fly, and rather fall through to the basic scan select also in this (rare?) case.
I support it. It might pay off to build the hash when the column is key, though, because
that is a good indicator for future point select. Provided the table is persistent
and we can predict it to be used.

> 
> For now, you can just locally disable/remove that alternative in the above code,
> try again, and report the result.
> 
> Stefan
> 
> 
>> #BAThash: create hash(1196532);
>> #BAT_select_(b=tmp_5710): tmp_5703: hkey=1, tkey=0, hsorted=139659451564032, tsorted=139659451564032.
> 





More information about the developers-list mailing list