[Monetdb-developers] Fixing the update issue for Large XML documents
Stefan de Konink
skinkie at xs4all.nl
Thu Oct 11 17:45:30 CEST 2007
Peter Boncz wrote:
> In case I find that there is something wrong here, a bug should be opened at
> sourceforge. That is in http://sf.net/projects/monetdb, choosing Tracker ->
That was already done, but again with only the 'small' amount of information.
> I am currently importing the dataset -- the shredding is already taking an hour.
> This is due to many hash collisions on the attributes that have a purely numeric
> value. Your document has no text nodes, but lots of those attributes. I already
> noted that the MonetDB hash function is very fragile in this domain.
Without enough gigabytes of RAM, this took me at least 6 hours. The total
amount was around 40 GB in read-only mode.
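
To illustrate why purely numeric values can hurt a fragile hash function, here
is a minimal sketch. It does not reproduce the MonetDB hash function:
char_sum_hash is a deliberately weak stand-in, fnv1a_hash is the well-known
32-bit FNV-1a used only for comparison, and the table size of 1024 buckets and
the key range are assumptions made for the example.

```python
from collections import Counter

BUCKETS = 1024  # hypothetical table size, chosen only for this illustration


def char_sum_hash(key: str) -> int:
    """Deliberately weak hash (NOT MonetDB's): adds up the character codes."""
    return sum(ord(c) for c in key) % BUCKETS


def fnv1a_hash(key: str) -> int:
    """32-bit FNV-1a, a common general-purpose string hash."""
    h = 0x811C9DC5  # FNV offset basis
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
    return h % BUCKETS


def bucket_stats(hash_fn, keys):
    """Return (number of buckets used, length of the longest chain)."""
    load = Counter(hash_fn(k) for k in keys)
    return len(load), max(load.values())


# Purely numeric attribute values, like the ones described in the quoted text.
keys = [str(i) for i in range(100_000)]

weak_used, weak_longest = bucket_stats(char_sum_hash, keys)
fnv_used, fnv_longest = bucket_stats(fnv1a_hash, keys)

print(f"char-sum: {weak_used} buckets used, longest chain {weak_longest}")
print(f"FNV-1a:   {fnv_used} buckets used, longest chain {fnv_longest}")
```

Digit strings draw on only ten distinct characters, so the character-sum hash
can reach just a small part of the table and the chains grow long. Whether the
real hash function fails in a similar way is something only the bug report can
settle.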
> I think I will open a bug report on that one. Bad thing is that fixing it will
> alter our binary repository format. But that has been done before.
I'm sorry for breaking your binary format ;)
> Will keep you posted about my progress in reproducing your remap error. I am
> trying with 64-bits compilation and 64-bits oids, so there should be no
> scalability problems. Thing is, I will be using Fedora Core, not Gentoo.
I never believed in distribution differences. If the hashing is fragile,
I presume fixing it should have general priority. Finding out what the
breaking point is would be interesting too. If your machine is slower
than mine, I could offer a virtual machine with Fedora Core.