Hello,
I am interested in evaluating MonetDB/XQuery. I have some XML collections
for a project I work on that are of 1-2 million individual XML documents
for which I use XQuery to access for OLAP style reporting operations. This
is a Java-based project, and while working on this project I developed a
thin XQuery-centric API so that I could evaluate/use many different XQuery
implementations, which my company released as an open source project
called Xaj (http://sourceforge.net/projects/xaj). Currently I am using a
Berkeley DB XML implementation, but am interested in exploring other
options.
I'd like to implement a MonetDB version of this API now, but I am having
some difficulty finding out how to effectively access MonetDB through
Java. So far I was able to use the MonetDB JDBC driver to construct XQuery
statements and get XML results, and also I was able to construct MonetDB
pf:add-doc() statements to add documents, but I am wondering if this is
the best approach (and if so, how to add documents to a remote MonetDB
server using the pf:add-doc() mechanism).
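For concreteness, this is roughly what my JDBC-based approach looks like. Note that the connection URL, the `language=xquery` property, and the exact `pf:add-doc()` argument list are my assumptions from the documentation, not verified behaviour:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class XQueryViaJdbc {

    // Build a pf:add-doc() call; the two-argument form (source URI,
    // document name) is an assumption on my side.
    static String addDocStatement(String uri, String docName) {
        return "pf:add-doc(\"" + uri + "\", \"" + docName + "\")";
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical URL; the language=xquery property is an assumption.
        String url = "jdbc:monetdb://localhost:50000/demo?language=xquery";
        try (Connection con = DriverManager.getConnection(url, "monetdb", "monetdb");
             Statement st = con.createStatement()) {
            // Register a document, then query it.
            st.execute(addDocStatement("file:///data/doc1.xml", "doc1.xml"));
            try (ResultSet rs = st.executeQuery(
                    "for $t in doc(\"doc1.xml\")//title return $t")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // one XML fragment per row
                }
            }
        }
    }
}
```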
To give you an idea of what I'm trying to do, you can view the API I am
trying to implement here:
http://xaj.svn.sourceforge.net/viewvc/xaj/xaj/src/net/sf/xaj/XmlDb.java?rev…
It was somewhat modeled after the defunct XML:DB API, but focused just on
add/store/XQuery operations. Any help/advice would be greatly appreciated!
-- m@
Is there any method, besides restarting the database,
of shrinking the SQL log file?
When I am performing a large set of large inserts (via
COPY INTO), the log file grows very large. I wish there
were a way to cycle the log file.
Also, during these large batches of inserts the
database server becomes less and less responsive, until
I am forced to restart it. After the restart and the
normal recovery, MonetDB returns to a very responsive state.
Hi all,
I am new to MonetDB and ran into the following problem (I'll try to be as
verbose as I can; perhaps that makes it easier to help me):
I've installed MonetDB server 5.0.0, clients and commons 1.18.2 and sql
2.18.2 on Ubuntu Feisty. Then I've taken the JDBC example which is available
online and modified it to be able to insert data into a test database. More
specifically I'm using prepared statements and executeBatch() together with
the monetdb-1.6 JDBC driver.
This works quite well until about 5 million records have been inserted;
after that, the speed of the inserts drops dramatically.
I've experimented with different batch sizes (10000 records seems to work
ok), and I've tried to understand the chapter about memory tuning for
MonetDB (not sure yet about how to use mem_cursize() though).
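For reference, my insert loop is roughly the following sketch. The table name and column layout are invented for the example, and committing after each executeBatch() is simply the pattern I am using, not an official recommendation:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsert {

    // Number of executeBatch() calls needed for a given row count.
    static long batches(long rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) throws Exception {
        String url = "jdbc:monetdb://localhost:50000/test"; // hypothetical database
        int batchSize = 10_000;                             // the size that worked best for me
        try (Connection con = DriverManager.getConnection(url, "monetdb", "monetdb")) {
            con.setAutoCommit(false);
            try (PreparedStatement ps =
                     con.prepareStatement("INSERT INTO measurements VALUES (?, ?)")) {
                for (long i = 0; i < 5_000_000L; i++) {
                    ps.setLong(1, i);
                    ps.setDouble(2, Math.random());
                    ps.addBatch();
                    if ((i + 1) % batchSize == 0) {
                        ps.executeBatch(); // flush the batch to the server
                        con.commit();      // keep each transaction small
                    }
                }
                ps.executeBatch();         // flush the remainder
                con.commit();
            }
        }
    }
}
```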
In any case, at first inserting 2 million records only takes about 5 minutes
(measured with Java's System.currentTimeMillis()); later on this time
increases and the server becomes rather unresponsive.
I found only two posts concerning the speed of JDBC inserts (from January
2007, I think) and would be interested to know whether a solution or
explanation has been found off-list.
Thank you
--
View this message in context: http://www.nabble.com/JDBC-insert-performance-tf4589411.html#a13100422
Sent from the monetdb-users mailing list archive at Nabble.com.
Dear all,
I am wondering whether it is possible with MonetDB4/XQuery (PF/Tijah)
to blacklist certain XML element names from being "shredded", and thus
from being indexed. If so, could someone tell me how? I hope this is the
correct mailing list for this question...
Best,
Junte
Stefan Manegold wrote:
> 32-bit system on 8 GB machines does not make much sense, does it?
> A 32-bit system can (at least per process) address at most 4 GB, usually
> only 2 GB in practice...
True, but it doesn't seem like I am hitting the maximum process memory limit (not even approaching 1.5 GB).
> With 9M+ records of >= 438 byte each, your data is >=3.9 GB in size.
> I'm not completely sure whether MonetDB temporarily needs to hold all data
> concurrently in its address space during bulk loading; but if so, your data
> size might simply exceed the 32-bit address space.
> (Obviously, MonetDB should better give an error than crash ...)
I am bulk loading only 250K rows at a time, at which point they are copied from the staging table
into the permanent final table. Am I allowed to have a table with a total data size larger than the
maximum process memory size?
> Did you try on a 64-bit system using 64-bit MonetDB?
I have, but there I am hitting another, higher limit: at 100M rows the load ends with corruption on 64 bits; see the first message in this thread.
-----Original Message-----
From: Martin Kersten <Martin.Kersten(a)cwi.nl>
Sent: Friday, February 22, 2008 17:07
To: Communication channel for MonetDB users <monetdb-users(a)lists.sourceforge.net>
Subject: Re: [MonetDB-users] COPY TO corrupting data
mobigital1 wrote:
> FYI, i am using 500K rows at a time.
>
> I've seen the mserver5.exe process memory utilization go up to 13GB (free
> memory was around 1.5 GB).
> but then it winds down pretty quickly.
Thanks. This means that the expected database chunk size tops out at
500K * (7*8 + 2*20 + 6*10 + 15 + 144*8) bytes,
i.e. 500K * 1323 ≈ 0.66 GB.
Chunk-size-wise there does not seem to be a problem.
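The estimate can be reproduced directly from the per-column byte counts above (a quick sanity check, nothing more):

```java
public class ChunkSize {
    public static void main(String[] args) {
        // Per-row byte estimate from the column widths listed above:
        // 7 int, 2 varchar(20), 6 varchar(10), 1 varchar(15), 144 real.
        long rowBytes = 7 * 8 + 2 * 20 + 6 * 10 + 15 + 144 * 8; // 1323
        long chunkBytes = 500_000L * rowBytes;                  // one COPY batch
        System.out.println(rowBytes + " bytes/row, "
                + chunkBytes / 1e9 + " GB per 500K-row chunk");
    }
}
```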
However, the virtual-memory footprint quickly increases,
and I am wondering if we are hitting an OS limit regarding
memory-mapped files.
I would check a single batch for consistency.
Then build a merge table over the smaller batches.
In a later stage you can then glue the pieces together,
until we hit another limit.
Thanks for the info and help in improving MonetDB.
_______________________________________________
MonetDB-users mailing list
MonetDB-users(a)lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/monetdb-users
I am trying to load about 500M rows into a table with about 160 columns:
column types:
7 columns of type "int"
2 columns of type "varchar(20)"
6 columns of type "varchar(10)"
1 columns of type "varchar(15)"
144 columns of type "real"
on a system with 16 GB RAM, running 64-bit Windows Server 2003 SP2.
It takes some 20 hours to load the first 100M rows, but when I reach that
boundary, the server seems to crash and the database becomes corrupt:
about 95M rows end up with 0 in the numeric columns and blanks in the
varchars. The remaining 5M rows have real data.
does anyone have any suggestions on what can be done to make it possible to
load this data?
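For reference, the load is driven from JDBC with statements of roughly the following shape, one chunk file at a time. The table and file names here are just examples, and I am not sure this is the best way to drive COPY INTO:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ChunkedCopy {

    // Build one COPY INTO statement for a chunk file; the leading RECORDS
    // count tells the server how many rows to expect. Names are examples.
    static String copyStatement(long records, String table, String file) {
        return "COPY " + records + " RECORDS INTO " + table
                + " FROM '" + file + "' USING DELIMITERS ',', '\\n'";
    }

    public static void main(String[] args) throws Exception {
        String url = "jdbc:monetdb://localhost:50000/demo"; // hypothetical
        try (Connection con = DriverManager.getConnection(url, "monetdb", "monetdb");
             Statement st = con.createStatement()) {
            // 1000 chunks of 500K rows each = the 500M-row target.
            for (int chunk = 0; chunk < 1000; chunk++) {
                st.execute(copyStatement(500_000,
                        "staging", "/data/chunk" + chunk + ".csv"));
            }
        }
    }
}
```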
--
View this message in context: http://www.nabble.com/COPY-TO-corrupting-data-tp15634267p15634267.html
Sent from the monetdb-users mailing list archive at Nabble.com.
Hi,
I'm trying to run a benchmark using the latest February stable release (on
Ubuntu 7.04, Feisty Fawn), and I want to clean up the database entirely
between two runs. Basically, my benchmark script is as follows:
//////////////////// START CODE
// kill all processes and remove databases)
// (I know this is not the intended way, but using
// > monetdb stop test
// > monetdb destroy -f test
// while not restarting merovingian basically results in
// the same error)
> killall -9 merovingian
> killall -9 mserver5
> killall -9 mclient
> rm -Rf /usr/local/src/MonetDB/var/MonetDB5/dbfarm/*
// start the merovingian server
> merovingian
> sleep 2
// wait until ports 50000 and 50001 are free,
// as they might still be blocked (I inserted this piece
// of code because I thought busy port allocation might
// be the cause of failure)
> [code works fine (omitted)]
// create and start the database
> monetdb create test;
> sleep 2;
> monetdb start test;
> sleep 2;
// load the document into the database (this part tends to fail)
> mclient -lsql -dtest < document.sql
//////////////////// END CODE
The strange thing is that every second run fails with a "Connection
terminated" error when loading the document, i.e. mserver5 crashes.
Did anyone experience similar problems? Any hints would be appreciated.
Thanks in advance,
Michael