[Monetdb-developers] Fixing the update issue for Large XML documents

Stefan de Konink skinkie at xs4all.nl
Thu Oct 11 17:45:30 CEST 2007

Hi Peter,

Peter Boncz wrote:
> In case I find that there is something wrong here, a bug should be opened at
> sourceforge. That is in http://sf.net/projects/monetdb, choosing Tracker ->
> Bugs.

That was already done, though also with only the 'small' amount of information.

> I am currently importing the dataset -- the shredding is already taking an hour.
> This is due to many hash collisions on the attributes that have a purely numeric
> value. Your document has no text nodes, but lots of those attributes. I already
> noted that the MonetDB hash function is very fragile in this domain. 

Without enough GBs of RAM, this took me at least 6 hours. The total
amount was around 40GB in read-only mode.
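
To illustrate the kind of collision behaviour Peter describes for purely
numeric attribute values, here is a minimal sketch. It does not use MonetDB's
actual hash function, which is not shown in this thread; additive_hash is a
hypothetical, deliberately weak function chosen only to make the effect
visible, and FNV-1a is used as an arbitrary better-mixing comparison. The key
set and bucket count are also assumptions.

```python
from collections import Counter


def additive_hash(s: str) -> int:
    # Hypothetical weak hash: sum of the byte values.
    # This is NOT MonetDB's hash function, only an illustration.
    return sum(s.encode("ascii"))


def fnv1a_32(s: str) -> int:
    # 32-bit FNV-1a, used here only as a better-mixing comparison.
    h = 0x811C9DC5
    for b in s.encode("ascii"):
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def longest_chain(hash_fn, keys, n_buckets: int) -> int:
    # Size of the fullest bucket, i.e. the worst-case chain length.
    counts = Counter(hash_fn(k) % n_buckets for k in keys)
    return max(counts.values())


# Assumed workload: 100,000 purely numeric attribute values.
keys = [str(i) for i in range(100_000)]
n_buckets = 1 << 17

weak = longest_chain(additive_hash, keys, n_buckets)
strong = longest_chain(fnv1a_32, keys, n_buckets)
print("longest chain, additive hash:", weak)
print("longest chain, FNV-1a:       ", strong)
```

Digit strings use only ten distinct byte values, so a hash that mixes them
poorly maps many keys to the same bucket, and every insert or lookup then has
to walk a long chain. That is one plausible way shredding time can grow so
much on this kind of data.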

> I think I will open a bug report on that one. Bad thing is that fixing it will
> alter our binary repository format. But that has been done before.

I'm sorry for breaking your binary format ;)

> Will keep you posted about my progress in reproducing your remap error. I am
> trying with 64-bits compilation and 64-bits oids, so there should be no
> scalability problems. Thing is, I will be using Fedora Core, not Gentoo.

I never believed in distribution differences. If the hashing is fragile,
I presume fixing it should have general priority. Finding out where the
breaking point is would be interesting too. If your machine is slower
than mine, I could offer a virtual machine with Fedora Core.