[Monetdb-developers] Fixing the update issue for Large XML documents

Stefan de Konink skinkie at xs4all.nl
Thu Oct 11 03:00:56 CEST 2007

A short introduction: I'm Stefan de Konink, one of the students in the
ADT course at the University of Amsterdam. I have a degree in Advanced
Computer Science, so I know what C looks like ;) Currently I can use
MonetDB on a rather heavy machine, basically to benchmark the storage of
this machine and to look for bottlenecks.

I am also a participant in OpenStreetMap NL, as a data collector working
from existing GIS data. OpenStreetMap runs on MySQL in the UK, and I
wanted to find out if I could run it more efficiently on MonetDB. For
compatibility reasons OSM publishes a daily XML file containing the
current view of the database. My 'job' is to see if I can get the same
results on this XML file (using XQuery) as the active MySQL/Ruby setup.

Importing a 20GB XML file went great, but now that the 'lessons' have
progressed we should update the data. So I took a recent version of the
document (they migrated from one format to another this week), allocated
5% slack space and ran pf:add-doc. This went OK, but the trick to count
all nodes ended in:

> xquery>
> more> fn:count(doc("planet.osm")//*)
> more>MAPI  = monetdb@localhost:50000
> QUERY =  fn:count(doc("planet.osm")//*)
> ERROR = !ERROR: [remap]: 5 times inserted nil due to errors at tuples
> 1@0, 2@0, 3@0, 4@0, 5@0.
>         !ERROR: [remap]: first error was:
>         !ERROR: CMDremap: operation failed.
>         !ERROR: interpret_unpin: [remap] bat=492,stamp=-729 OVERWRITTEN
>         !ERROR: BBPdecref: 1000000001_rid_nid does not have pointer fixes.
>         !ERROR: interpret_params: leftfetchjoin(param 2): evaluation error.

(As posted to the bugtracker and privately to the lecturer.)
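
For a reference count that does not depend on MonetDB, the elements can
also be counted with a streaming parser. A minimal sketch in Python,
assuming a local copy of planet.osm; the function name count_elements is
made up for illustration and was not part of the runs described here. It
counts every element, including the root, which is what
fn:count(doc("planet.osm")//*) should return.

```python
import io
import xml.etree.ElementTree as ET


def count_elements(source):
    """Count all element nodes in an XML file (path or file object)."""
    count = 0
    depth = 0
    root = None
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            count += 1
            depth += 1
            if root is None:
                root = elem  # the first "start" event is the root element
        else:
            depth -= 1
            if depth == 1:
                # A direct child of the root is complete; drop it so
                # memory use stays flat on a multi-GB document.
                root.clear()
    return count


if __name__ == "__main__":
    # Tiny hypothetical stand-in for planet.osm.
    sample = io.BytesIO(
        b"<osm><node id='1'><tag k='a' v='b'/></node><way/></osm>"
    )
    print(count_elements(sample))  # prints 4
```

Passing a file name instead of the BytesIO object works the same way,
e.g. count_elements("planet.osm").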

My first guess was that the 1..5 in the error might have something to do
with the 'slack space', which was also five (percent). Today I imported
the same document read-only without any problems.

To confirm that 'import-updating' works in general, I took the data for
the Netherlands only (143MB) and imported it, which worked like a charm.

I'm running MonetDB4/5 (SR3) on a quad Xeon with 8GB of memory. Storage
is a bit variable ;) but currently it is on local RAID 5.

Are there any suggestions from the developers on how to catch this bug?
Trying again with CVS is an option, trying again with a smaller document
too, or if someone instantly comes up with a solution that would be nice
too ;)

Yours sincerely,

Stefan de Konink