[Monetdb-developers] FW: [Monetdb-checkins] MonetDB/src/gdk gdk_atoms.mx, MonetDB_1-20, 1.134, 1.134.6.1 gdk_posix.mx, MonetDB_1-20, 1.143, 1.143.2.1

Peter Boncz P.Boncz at cwi.nl
Wed Oct 17 16:07:53 CEST 2007


Thanks Stefan,

> I'll leave the further interpretation of the above results to 
> the interested recipient / reader.

I will give it a try, then.

> S08-64:
>      SR        QR       SU       QU[1]     QU[2]
> mh   659m34s   1m09s    81m06s    ERROR    -
> mH   -         -       383m17s    ERROR    -
> Mh   -         -        77m03s   344m32s   1m34s
> MH   644m01s   0m49s   390m42s   342m47s   1m36s

These h/SU results are strange: a 5x faster time from a hash function with more
collisions is puzzling. In any case, with 64-bit OIDs on an 8 GB machine, the
shredding causes very intense swapping, and Linux performance under swapping is
known to be highly variable, so it is difficult to draw conclusions.
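
For anyone who wants to check the "5x" figure, here is a small sketch that
converts the SU timings from the quoted tables to seconds and computes the
new-hash / old-hash ratio per machine. The helper names (to_seconds,
new_over_old) are made up for this illustration; only the timings come from
Stefan's tables.

```python
import re

def to_seconds(t: str) -> int:
    """Convert a wall-clock string like '383m17s' to seconds."""
    m = re.fullmatch(r"(\d+)m(\d+)s", t)
    if m is None:
        raise ValueError(f"unexpected time format: {t!r}")
    return int(m.group(1)) * 60 + int(m.group(2))

# SU (updatable shredding) times copied from the two quoted tables.
su_times = {
    "S08-64": {"mh": "81m06s", "mH": "383m17s", "Mh": "77m03s", "MH": "390m42s"},
    "S16-32": {"mh": "43m14s", "mH": "26m26s", "Mh": "44m00s", "MH": "25m43s"},
}

def new_over_old(system: str, mmap_flag: str) -> float:
    """Ratio of new-hash (H) to old-hash (h) SU time for one mmap setting."""
    row = su_times[system]
    return to_seconds(row[mmap_flag + "H"]) / to_seconds(row[mmap_flag + "h"])

for system in su_times:
    for mmap_flag in ("m", "M"):
        ratio = new_over_old(system, mmap_flag)
        print(f"{system} {mmap_flag}: H/h = {ratio:.2f}")
```

A ratio above 1 means the new hash was slower. On S08-64 the ratio is close
to 5 with or without the mmap fix, while on S16-32 it is below 1.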

I am still investigating the behavior of MT_mmap under load and it may well be
that shredding performance can be improved by better memory management (less
swapping). From the result below we know that it can be done in 25m.

The improvements in index construction that are not yet checked in will strongly
reduce the QU[1] time by taking less memory, to something on the order of the
100m seen below.


> S16-32:
>      SR        QR       SU       QU[1]     QU[2]
> mh   127m59s   0m17s    43m14s    ERROR    -
> mH   110m33s   0m16s    26m26s    ERROR    -
> Mh   128m11s   0m18s    44m00s   100m50s   1m15s
> MH   191m42s   0m17s    25m43s   101m37s   1m21s

These numbers show that MonetDB prefers to operate on RAM resident data.

I am glad you reproduced my 25-minute H/SU result.

It must be a coincidence that the mini hash self-join benchmark on 50M string
numbers ran in 26s with the new hash and 43s with the old hash (we see the same
numbers here, in minutes, for shredding).
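
To make the shape of such a test concrete, here is a scaled-down sketch of a
hash self-join on string numbers with a pluggable hash function. This is not
the benchmark I ran and neither hash below is the one in gdk_atoms.mx: the
byte-sum hash is a deliberately poor toy and djb2 is a well-known public
string hash, both used only as stand-ins. The function names and the bucket
count are assumptions for this illustration.

```python
from collections import defaultdict

def byte_sum_hash(s: str) -> int:
    """Toy hash with many collisions: the sum of the bytes."""
    return sum(s.encode())

def djb2_hash(s: str) -> int:
    """Well-known djb2 string hash (h = h*33 + byte), truncated to 32 bits."""
    h = 5381
    for b in s.encode():
        h = (h * 33 + b) & 0xFFFFFFFF
    return h

def hash_self_join(keys, hash_fn, n_buckets=1 << 14):
    """Build a chained hash table on keys, then probe it with the same keys.

    Returns (number of matching pairs, length of the longest bucket chain).
    """
    buckets = defaultdict(list)
    for k in keys:                          # build phase
        buckets[hash_fn(k) % n_buckets].append(k)
    matches = 0
    for k in keys:                          # probe phase
        for candidate in buckets[hash_fn(k) % n_buckets]:
            if candidate == k:
                matches += 1
    longest = max(len(chain) for chain in buckets.values())
    return matches, longest

# "String numbers", scaled down from 50M to keep the sketch quick.
keys = [str(i) for i in range(10_000)]
for fn in (byte_sum_hash, djb2_hash):
    matches, longest = hash_self_join(keys, fn)
    print(f"{fn.__name__}: matches={matches}, longest chain={longest}")
```

Both runs find the same matches; the difference shows up in the chain
lengths, which is what drives the probe cost when the table fits in memory.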

In my current version, QU[2] with the ordered indices is back at 17s. The reason
for the 1m15s is that you tend to lose the hash table on the index bats due to
swapping, so that time includes hash table creation.

Whether QU[1] is acceptable is highly questionable. Starting the database takes
1.5 hours!! I am seriously considering switching to persistent ordered indices
for the updatable case. For that to happen, the working set needs to be
extended with persistent delta-bats on those indices, and changes to these
deltas need to be logged (in order to make the indices recoverable).
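
A minimal sketch of the idea, assuming unique keys and in-memory lists in
place of persistent bats: a sorted base index, insert/delete deltas on top of
it, and a log of delta changes that can be replayed after a restart. The
class and method names are hypothetical; none of this is existing MonetDB
code.

```python
import bisect
import heapq

class DeltaOrderedIndex:
    """Hypothetical sketch: sorted base + insert/delete deltas + change log."""

    def __init__(self, base_keys):
        self.base = sorted(set(base_keys))  # stands in for the persistent ordered index
        self.inserted = []                  # delta: new keys, kept sorted
        self.deleted = set()                # delta: base keys that are logically removed
        self.log = []                       # stands in for the logged delta changes

    @staticmethod
    def _in_sorted(seq, key):
        i = bisect.bisect_left(seq, key)
        return i < len(seq) and seq[i] == key

    def contains(self, key):
        if self._in_sorted(self.inserted, key):
            return True
        return self._in_sorted(self.base, key) and key not in self.deleted

    def insert(self, key):
        self.log.append(("ins", key))       # log first, then apply
        self._apply("ins", key)

    def delete(self, key):
        self.log.append(("del", key))
        self._apply("del", key)

    def _apply(self, op, key):
        if op == "ins":
            if key in self.deleted:
                self.deleted.discard(key)   # base key becomes visible again
            elif not self.contains(key):
                bisect.insort(self.inserted, key)
        else:
            i = bisect.bisect_left(self.inserted, key)
            if i < len(self.inserted) and self.inserted[i] == key:
                del self.inserted[i]        # key only ever lived in the delta
            elif self._in_sorted(self.base, key):
                self.deleted.add(key)       # hide the base key

    def keys(self):
        """Ordered view: base minus deletions, merged with insertions."""
        live_base = (k for k in self.base if k not in self.deleted)
        return list(heapq.merge(live_base, self.inserted))

    @classmethod
    def recover(cls, base_keys, log):
        """Rebuild the deltas by replaying the log against the base."""
        idx = cls(base_keys)
        for op, key in log:
            idx._apply(op, key)
        idx.log = list(log)
        return idx

idx = DeltaOrderedIndex([10, 20, 30])
idx.insert(25)
idx.delete(20)
print(idx.keys())  # [10, 25, 30]
```

The point of the sketch is that startup then only needs the base index plus a
replay of the (hopefully short) log, instead of rebuilding the index from
scratch as QU[1] does now.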


> -----Original Message-----
> From: Stefan Manegold [mailto:Stefan.Manegold at cwi.nl] 
> Sent: Wednesday, October 17, 2007 1:54 PM
> To: monetdb-developers at lists.sourceforge.net
> Cc: Peter Boncz; Peter Boncz
> Subject: Re: [Monetdb-checkins] MonetDB/src/gdk gdk_atoms.mx, 
> MonetDB_1-20, 1.134, 1.134.6.1 gdk_posix.mx, MonetDB_1-20, 
> 1.143, 1.143.2.1
> 
> 
> Just for the record:
> 
> I finally managed to finish my experiments regarding
> [ 1811229 ] [ADT] Adding large document, with update support
> http://sourceforge.net/tracker/index.php?func=detail&aid=1811229&group_id=56967&atid=482468
> and the related code changes. For those interested, here's the detailed
> story:
> 
> 
> "S08-64" System (beo-24):
> - 2x 64-bit Dual-Core Opteron270 @ 2 GHz
> - 8 GB memory
> - MonetDB/XQuery 0.20, 64-bit, 64-bit OIDs, --enable-optimize (gcc 4.1.2)
> 
> "S16-32" System (core-1):
> - 4x 64-bit Dual-Core Opteron870 @ 2 GHz
> - 16 GB memory
> - MonetDB/XQuery 0.20, 64-bit, 32-bit OIDs, --enable-optimize (gcc 4.1.2)
> 
> Document:
> http://mirror.openstreetmap.nl/planet/planet-071003.osm.bz2
> (extracted: 19 GB XML file)
> 
> "SR" Shredding read-only:
> pf:add-doc(".../planet-071003.osm","planet-071003.osm")
> 
> "SU" Shredding updateable:
> pf:add-doc(".../planet-071003.osm","planet-071003.osm","planet-071003.osm",5)
> 
> "QR"/"QU" Count query:
> count(doc("planet-071003.osm")//*)
> 
> Configurations:
> m: without Peter's mmap fix in gdk_posix.mx
>    (i.e., using rev. 1.143 of gdk_posix.mx)
> M: with Peter's mmap fix in gdk_posix.mx
>    (i.e., using rev. 1.143.2.1 of gdk_posix.mx)
> h: without Peter's new string hash function in gdk_atoms.mx
>    (i.e., using rev. 1.134 of gdk_atoms.mx)
> H: with Peter's new string hash function in gdk_atoms.mx
>    (i.e., using rev. 1.134.6.1 of gdk_atoms.mx)
> 
> 
> Results (wall-clock times):
> 
> S08-64:
>      SR        QR       SU       QU[1]     QU[2]
> mh   659m34s   1m09s    81m06s    ERROR    -
> mH   -         -       383m17s    ERROR    -
> Mh   -         -        77m03s   344m32s   1m34s
> MH   644m01s   0m49s   390m42s   342m47s   1m36s
> 
> S16-32:
>      SR        QR       SU       QU[1]     QU[2]
> mh   127m59s   0m17s    43m14s    ERROR    -
> mH   110m33s   0m16s    26m26s    ERROR    -
> Mh   128m11s   0m18s    44m00s   100m50s   1m15s
> MH   191m42s   0m17s    25m43s   101m37s   1m21s
> 
> (NB: "SR" includes building of indices, while "SU" does not; consequently,
>  "QR" can exploit the indices built during "SR", while "QU[1]" has to build
>  the indices first, and only "QU[2]" can exploit them.)
> 
> 
> Apparently, the mmap fix in gdk_posix.mx is sufficient to prevent the
> remap-ERROR reported (for a system & configuration similar to "S08-64") in
> [ 1811229 ] [ADT] Adding large document, with update support
> http://sourceforge.net/tracker/index.php?func=detail&aid=1811229&group_id=56967&atid=482468
> 
> I'll leave the further interpretation of the above results to the interested
> recipient / reader.
> 
> 
> Stefan
> 
> 




