[Monetdb-developers] Unable to shred 1gb xml file (OS: Not enough space)

Stefan Manegold Stefan.Manegold at cwi.nl
Tue Aug 8 11:24:59 CEST 2006


Hi Klarinda,

On Fri, Aug 04, 2006 at 11:54:47AM +0800, kla gw wrote:
> Hi Stefan,
> 
> Thanks for your reply.
> 
> >Concerning your problem, it looks as if the 1GB document requires more
> >(virtual) memory (and/or disk space) to load than is available on your
> >system. The error indicates that MonetDB/XQuery fails to allocate
> >442499072 bytes (422 MB), while 794361856 bytes (757 MB) of your
> >machine's virtual memory (512 MB physical memory plus the size of your
> >Windows installation's "page file") are already used by MonetDB. Most
> >probably, your virtual memory is less than 422 MB + 757 MB = 1179 MB,
> >right?
> >It could also be that your hard disk (at least the partition used by
> >MonetDB) has less than 422 MB free at the time that MonetDB/XQuery tries
> >to allocate the extra 422 MB.
> 
> I'm sorry, I don't really understand the memory issue. But when
> I check under: Right Click My Computer -> Properties -> Advanced ->
> Performance Settings -> Advanced -> Virtual Memory Change, it says
> that
>    Drive C: Paging File Size 768-1536 MB
>    Space Available: 3156MB
> 
> My Hard Disk free space in Drive C is 2.33 GB (total is 12 GB).
> 
> I use windows XP Professional version 2002 service pack 2, Pentium 4
> CPU, 2.40GHz, 512 MB of RAM.
> 
> How much memory and space does monetDB use?

MonetDB tries to use as much memory as is available or as is necessary for
the task at hand, whichever is less. Obviously, the latter depends (among
other things) on the amount of data that has to be processed.

In principle, your memory settings look fine to me.
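
For reference, the arithmetic behind that statement can be sketched as
follows. This is only a rough check under simplifying assumptions: "MB" is
taken as 2^20 bytes, virtual memory is taken as physical RAM plus page file
(as described in the quoted text), and memory used by other processes or
any per-process address-space limit is ignored. The variable names are
purely illustrative.

```python
MIB = 1024 * 1024  # bytes per MB, as used for the figures in this thread

failed_alloc = 442_499_072  # bytes MonetDB/XQuery failed to allocate
already_used = 794_361_856  # bytes already in use by MonetDB at that point

ram_mb = 512                # physical memory reported by Klarinda
page_file_mb = (768, 1536)  # paging file range (initial, maximum) on drive C:

# Total MonetDB would hold if the failed allocation had succeeded.
needed_mb = (failed_alloc + already_used) / MIB

# Simplified model: virtual memory = RAM + page file (assumption, see above).
virtual_min_mb = ram_mb + page_file_mb[0]
virtual_max_mb = ram_mb + page_file_mb[1]

fits_min = needed_mb <= virtual_min_mb

print(f"failed allocation : {failed_alloc / MIB:.1f} MB")
print(f"already in use    : {already_used / MIB:.1f} MB")
print(f"total needed      : {needed_mb:.1f} MB")
print(f"virtual memory    : {virtual_min_mb} to {virtual_max_mb} MB")
print(f"fits even with the initial page file size: {fits_min}")
```

Under this simplified model, the total stays below RAM plus the initial
page file size, which is why the settings alone do not explain the failure.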

I just tried to load your 1.1GB DC1000catalog.xml document on my laptop
(Pentium M (Dothan) 1.5 GHz, 2 GB RAM, 2.1 GB free disk space; running
Fedora Core 5 Linux (Kernel 2.6.17)) into MonetDB/XQuery 0.12.0, and it did
work. MonetDB (Mserver) never required more than 1 GB of (virtual) memory,
but it took quite long (2552.377s ~= 42.5 min; for the experts: it is
I/O-bound, and I haven't analysed yet why it is that slow):

========
$ Mserver --dbinit='module(pathfinder);'
# Monet Database Server V4.12.0
# Copyright (c) 1993-2006, CWI. All rights reserved.
# Compiled for i686-redhat-linux-gnu/32bit with 32bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
MonetDB>shred_doc("/data/StM/DC1000catalog.xml","DC1000catalog.xml");
# Shredded XML doc("DC1000catalog.xml"), total time after commit=2552.377s
MonetDB>
========

Hence, I'm not quite sure why it fails in your case.
Unfortunately(??), I'm anything but a Windows expert, and I don't have a
Windows machine to test on.

Maybe someone else could try to load the document into MonetDB/XQuery 0.12.0
on Windows?

The document is now also available at
	http://www.cwi.nl/~manegold/Public/DC1000catalog.xml.zip

> How to change the MonetDB/XQuery setting to shred the file in
> "D:\MyFolder\MonetDB" instead of the default location "C:\Documents
> and Settings\klarinda\Application Data\MonetDB\dbfarm"?

Well, you have to start the MonetDB/XQuery server (Mserver) with
"--dbfarm=D:\MyFolder\MonetDB". On a Unix system, one simply starts
Mserver from a command line and adds the option there. AFAIK, on Windows,
the MonetDB/XQuery server is started by clicking on some icon, hence I'm
not sure how to pass extra arguments... sorry...
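
One possible way around the icon, sketched here in Python and not tested on
Windows: build the command line yourself and start Mserver from a script or
command prompt. The sketch assumes that the Mserver executable can be found
on the PATH and that "--dbfarm" can be combined with the "--dbinit" option
used in the session above. The helper name mserver_command is made up for
this illustration.

```python
import subprocess


def mserver_command(dbfarm, dbinit="module(pathfinder);", executable="Mserver"):
    """Build the argument list for starting Mserver with a custom dbfarm.

    'executable' is assumed to be reachable via PATH; pass a full path
    to the installed binary otherwise.
    """
    return [executable, f"--dbfarm={dbfarm}", f"--dbinit={dbinit}"]


cmd = mserver_command(r"D:\MyFolder\MonetDB")
print(" ".join(cmd))

# To actually start the server, uncomment the next line. Passing a list
# avoids shell quoting problems with the backslashes and the semicolon.
# subprocess.run(cmd, check=True)
```

The target directory has to exist and be writable, and there must be enough
free space on that drive for the shredded document.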

Sjoerd,
could you please help and enlighten me/us?

> >To test this, we'd need to have your document. The pure size of the
> >serialized document is not enough information for us to estimate how big
> >the internal data structure will/must be; we also need to know (among
> >other things) how many nodes the document has, what the structure looks
> >like, etc.
> 
> You can download the file (DC1000catalog.xml) from here:
> http://www.yousendit.com/transfer.php?action=download&ufid=1604D48611A31D6D
> (the file is only available for 7 days)
> The size after zipping is around 214mb.
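
For anyone who wants to collect the structural numbers asked for above
without loading the whole document into memory, here is a minimal sketch
using Python's standard streaming parser. Note the assumptions: it counts
element nodes only (no text, attribute or comment nodes), so it is a lower
bound on what MonetDB/XQuery has to store, and the function name
element_stats is made up for this illustration.

```python
import io
import xml.etree.ElementTree as ET


def element_stats(source):
    """Return (element_count, max_depth) for an XML file path or file object."""
    count = 0
    depth = 0
    max_depth = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            count += 1
            depth += 1
            max_depth = max(max_depth, depth)
        else:
            depth -= 1
            # Drop text and children of the finished element to limit memory;
            # the emptied element itself stays attached to its parent.
            elem.clear()
    return count, max_depth


# Tiny in-memory sample; for the real file, pass its path instead.
sample = io.BytesIO(b"<catalog><item><title>a</title></item><item/></catalog>")
print(element_stats(sample))
```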

> >I tried to generate your document myself in order to analyse the problem,
> >but (while working fine for Text-Centric documents) the XBench/ToXgene
> >generator fails for me with Document-Centric documents (at least with the
> >"large" and "huge" ones):
> 
> I don't know why it throws an error. I got it from my friend and it's
> working fine. Have you tried the small and normal ones? Actually, I have
> never tried the huge one either.

For small and normal documents, it works fine for me.
Thank you very much for providing the large document.

Stefan

> >One final question (for now): Which version of MonetDB/XQuery are you
> >using?
> 
> I'm using version 0.12.0
> FYI, previously I installed 0.10.2, then I uninstalled it and installed
> the new one.
> 
> Thanks,
> 
> Klarinda

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold at cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |



