Hello,
I am interested in evaluating MonetDB/XQuery. I have some XML collections
for a project I work on, consisting of 1-2 million individual XML documents,
which I access via XQuery for OLAP-style reporting operations. This
is a Java-based project, and while working on this project I developed a
thin XQuery-centric API so that I could evaluate/use many different XQuery
implementations, which my company released as an open source project
called Xaj (http://sourceforge.net/projects/xaj). Currently I am using a
Berkeley DB XML implementation, but am interested in exploring other
options.
I'd like to implement a MonetDB version of this API now, but I am having
some difficulty finding out how to effectively access MonetDB through
Java. So far I was able to use the MonetDB JDBC driver to construct XQuery
statements and get XML results, and also I was able to construct MonetDB
pf:add-doc() statements to add documents, but I am wondering if this is
the best approach (and if so, how to add documents to a remote MonetDB
server using the pf:add-doc() mechanism).
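In case it helps to see the shape of what I'm doing: below is a minimal sketch of building a pf:add-doc() statement for the JDBC driver. The class and helper names are my own inventions, not MonetDB API, and the quote-doubling escape is simply XQuery's string-literal rule, so treat this as an assumption rather than the official mechanism:

```java
// Sketch only: AddDoc, xqEscape, and buildAddDoc are hypothetical helper
// names, not part of the MonetDB API. The quote-doubling escape follows
// XQuery's string-literal rules.
public class AddDoc {

    /** Escape double quotes for use inside a double-quoted XQuery string literal. */
    static String xqEscape(String s) {
        return s.replace("\"", "\"\"");
    }

    /** Build the pf:add-doc() call that shreds the document at `url` under `name`. */
    static String buildAddDoc(String url, String name) {
        return "pf:add-doc(\"" + xqEscape(url) + "\", \"" + xqEscape(name) + "\")";
    }

    public static void main(String[] args) {
        String q = buildAddDoc("C:/Data/doc1.xml", "doc1.xml");
        System.out.println(q);
        // Sending it over JDBC would then look roughly like this
        // (connection details depend on your setup):
        //
        //   try (Connection c = DriverManager.getConnection(url, user, pass);
        //        Statement st = c.createStatement()) {
        //       st.execute(q);
        //   }
    }
}
```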
You can view the API I am trying to implement here, to give you an idea
of what I'm trying to do:
http://xaj.svn.sourceforge.net/viewvc/xaj/xaj/src/net/sf/xaj/XmlDb.java?rev…
It was somewhat modeled after the defunct XML:DB API, but focused just on
add/store/XQuery operations. Any help/advice would be greatly appreciated!
-- m@
My seasonal compliments..
This is Ashok. I am using MonetDB with Java for my application. I need
to copy a MonetDB table to a CSV file.
In an SQL manual I read the following syntax for copying from a table to a CSV file:
Syntax:
COPY <subquery> INTO <file_name> [ [USING] DELIMITERS
tuple_separator [ ','
record_separator [ ','
string_quote ]]]
[ NULL AS null_string ];
Operating System: Windows XP
My query:
copy select * from table_name into 'D:/file_name.csv' using delimiters '|';
but when I execute this, I get the following error. Any suggestions (as
soon as possible, please)?
Error: Syntax error, unexpected SELECT, expecting INTO in: "copy select"..
Also, the query below, which adds a NULL AS 'null_string' clause, is not
working:
copy select * from table_name into 'D:/file_name.csv' using delimiters '|'
with null as ' ';
Is there a way to copy MonetDB tables to CSV files?
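For reference, this is how the manual's syntax would look with all three delimiters spelled out. Whether COPY <subquery> INTO a file is accepted may depend on the MonetDB version, and the values below are example choices, not verified against this installation:

```sql
-- Filled-in form of the syntax quoted above (example values): field
-- separator '|', record separator '\n', string quote '"', and NULLs
-- written as the empty string.
COPY SELECT * FROM table_name
INTO 'D:/file_name.csv'
USING DELIMITERS '|', '\n', '"'
NULL AS '';
```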
--
View this message in context: http://www.nabble.com/monetdb-copy-command-tp18752294p18752294.html
Sent from the monetdb-users mailing list archive at Nabble.com.
Thank you for your response!
> Were you able to see any (error-) message in the server window?
I could not see any message in the server window, as it was closed immediately.
> On you 64-bit hardware, are you running 64-bit or 32-bit windows?
> If 64-bit, did you install the 64-bit or 32-bit MonetDB/XQuery?
I am using 32-bit windows.
> The serialized size ("1 GB") does not say much about the actual complexity
> of the XML document, e.g., how many nodes, attributes, etc. does the
> document contain?
> Do you know these statistics, or could you share your document with us for
> testing?
I tested using XBench DCSD. The total numbers of nodes and attributes were
almost 22-23 million and 1.5 million, respectively.
Thanks!
John
I have an application with constant bulk inserts (about 50,000 rows of
1 KB each) every minute, and I am wondering which is the fastest and most
efficient way to insert data.
I am currently using the COPY command, but it is kind of awkward since I
need to dump my data as CSV and then execute COPY to insert it. Is there
any streaming method or a special API for bulk inserts?
Also, when I issue a COPY command it uses only a little CPU, around 2% (it
is a quad-core setup with Windows, so I guess this means 8% of one core),
and it keeps allocating and releasing memory (I have 8 GB of RAM) even
though it doesn't get near 2 GB.
Is this OK? What is it actually doing?
P.S.: Are inserts/selects multi-threaded?
Thanks for this awesome piece of software!
-Uriel
>>>>
>>>> OK, but why doesn't it utilize one core to the max? It seems to keep
>>>> using 8% of the core and takes about 17 seconds to insert 200,000
>>>> records of 1 KB each (the actual file is 64 MB).
>>>>
>>>>
>>>
>>> That depends on the total setup. Especially when you are exceeding
>>> memory, Linux is not the best operating system. One of the issues is
>>> that dirty pages are not properly flushed.
>>> A BATsave in the new bulk loader at critical points helped to avoid
>>> this case.
>>> If you do a COPY into a clean BAT, then the log overhead is negligible.
>>>
>>>>
>>>> Can I somehow skip the logs and make it insert right into the BAT
>>>> files?
>>>> Also, how can I stream the tuples programmatically from C Mapi or ODBC?
>>>>
>>>>
>>>
>>> See the C Mapi library. Or you might look at the Stethoscope, which
>>> streams tuples
>>> from server to an application.
>>>
>>
>> i am running MonetDB on windows with a 8GB of ram.
>>
>
> That should be more than enough ; )
>
>> what is Stethoscope?
>>
>
> it is part of the source distribution and a Linux utility that picks up a
> stream of tuples from the server.
Does it work on Windows, and how does it work? Does it use special MIL/MAL
syntax, or is it a plugin in the server for fast loads?
Also, how do you load data into the benchmark databases, which (according
to papers I have seen) are 1-100 GB of data?
>
> The information you provide is hard to track down to a possible cause.
> a COPY operation most likely reads the complete file into its buffers
> before processing it.
>
> One pitfall you may have stumbled upon is the following.
> Did you indicate the number of records that you are about to copy into the
> table?
> If not, then the system has to guess and will repeatedly adjust this guess,
> which
> involves quite some overhead.
> Please use: COPY 50000 RECORDS INTO .....
Even if I want to read the whole file?
>>>
>>> Do you happen to be a Ruby-on-Rails expert?
>>>
>>
>> I am not a Ruby on Rails expert; sorry, I am a Python guy :)
>>
>>>>
>>>> thanks again.
>>>>
>>>> Martin Kersten wrote:
>>>>
>>>>
>>>>>
>>>>> uriel katz wrote:
>>>>>
>>>>>
>>>>>>
>>>>>> I have an application with constant bulk inserts (about 50,000
>>>>>> rows of 1 KB each) every minute, and I am wondering which is the
>>>>>> fastest and most efficient way to insert data.
>>>>>>
>>>>>>
>>>>>
>>>>> COPY into is the fastest
>>>>>
uriel katz wrote:
>>> OK, but why doesn't it utilize one core to the max? It seems to keep
>>> using 8% of the core and takes about 17 seconds to insert 200,000
>>> records of 1 KB each (the actual file is 64 MB).
>>>
>>>
>> That depends on the total setup. Especially when you are exceeding memory, Linux
>> is not the best operating system. One of the issues is that dirty pages are not properly flushed.
>> A BATsave in the new bulk loader at critical points helped to avoid this case.
>> If you do a COPY into a clean BAT, then the log overhead is negligible.
>>
>>> Can I somehow skip the logs and make it insert right into the BAT files?
>>> Also, how can I stream the tuples programmatically from C Mapi or ODBC?
>>>
>>>
>> See the C Mapi library. Or you might look at the Stethoscope, which streams tuples
>> from server to an application.
>>
> i am running MonetDB on windows with a 8GB of ram.
>
That should be more than enough ; )
> what is Stethoscope?
>
it is part of the source distribution and a Linux utility that picks up
a stream of tuples from the server.
The information you provide is hard to track down to a possible cause.
a COPY operation most likely reads the complete file into its buffers
before processing it.
One pitfall you may have stumbled upon is the following.
Did you indicate the number of records that you are about to copy into
the table?
If not, then the system has to guess and will repeatedly adjust this
guess, which
involves quite some overhead.
Please use: COPY 50000 RECORDS INTO .....
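A filled-in example of that form (the table name, file path, and delimiters below are placeholders, not taken from the thread):

```sql
-- Declaring the record count up front lets the loader allocate space once
-- instead of repeatedly adjusting its guess.
COPY 50000 RECORDS INTO my_table
FROM 'C:/data/batch.csv'
USING DELIMITERS ',', '\n';
```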
>> Do you happen to be a Ruby-on-Rails expert?
>>
> I am not a Ruby on Rails expert; sorry, I am a Python guy :)
>
>>> thanks again.
>>>
>>> Martin Kersten wrote:
>>>
>>>
>>>> uriel katz wrote:
>>>>
>>>>
>>>>> I have an application with constant bulk inserts (about 50,000 rows of 1 KB each) every minute, and I am wondering which is the fastest and most efficient way to insert data.
>>>>>
>>>>>
>>>> COPY into is the fastest
>>>>
>>>>
>>>>> I am currently using the COPY command, but it is kind of awkward since I need to dump my data as CSV and then execute COPY to insert it. Is there any streaming method or a special API for bulk inserts?
>>>>>
>>>>>
>>>>>
>>>> I think it is possible to inline the tuples into the sql stream as well. The SQL developer will answer this one.
>>>>
>>>>
>>>>> Also, when I issue a COPY command it uses only a little CPU, around 2% (it is a quad-core setup with Windows, so I guess this means 8% of one core), and it keeps allocating and releasing memory (I have 8 GB of RAM) even though it doesn't get near 2 GB.
>>>>> Is this OK? What is it actually doing?
>>>>>
>>>>>
>>>> We have recently upgraded our bulk loader to utilize as many cores as possible.
>>>> This is not yet in the release.
>>>> You can see a preview of the effects at http://monetdb.cwi.nl/projects/monetdb//SQL/Benchmark/TPCH/
>>>>
>>>>
>>>>> P.S.: Are inserts/selects multi-threaded?
>>>>>
>>>>> Thanks for this awesome piece of software!
>>>>>
>>>>> -Uriel
>>>>>
>>>>>
>>>
I just installed MonetDB/XQuery 0.24.0 (released on 30 June 2008) and wanted
to store some XML documents. When I tried to shred an XML document with
almost 1GB filesize, I received the following error message.
> MAPI = monetdb@localhost:50000
> ACTION= read_line
> QUERY = pf:add-doc("C:/Data/TestData-10.xml","TestData-10.xml")
> ERROR = Connection terminated
After receiving this error on my mclient window, I noticed that the MonetDB
XQuery Server and MClient were terminated. FYI, I used a Windows XP Pro SP3
PC with Intel Core2 Duo E6550 processor and 3.25GB of RAM. The size of my
hard disk is 232 GB. May I know how I can shred this file?
Another question is about testing query performance. I tried to execute
an XQuery using the following command:
> mclient.bat -lxq -t -G XQuery-1.xq
This command will print out the query result and return the "Timer".
a. Can I hide the result? I tried to add "-f none" to the above command, but
it did not work.
b. What is the "Timer"? How can I get only the query time?
Thanks
John
Are there any functions like date_format in MonetDB? I want to do something like date_format(zncsrq, '%Y%m%d'), which is MySQL's function, in MonetDB.
Best regards!
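One workaround that relies only on standard SQL is to compose the parts numerically with EXTRACT. The column name zncsrq is taken from the question above, but the table name my_table is a placeholder, and this is a sketch, not a confirmed MonetDB idiom:

```sql
-- Emulate MySQL's date_format(zncsrq, '%Y%m%d') without a formatting
-- function, e.g. 2008-08-15 becomes 20080815.
SELECT EXTRACT(YEAR FROM zncsrq) * 10000
     + EXTRACT(MONTH FROM zncsrq) * 100
     + EXTRACT(DAY FROM zncsrq) AS yyyymmdd
FROM my_table;
```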
Hello everybody,
are there any results of the official XML Query Test Suite on
MonetDB/XQuery available, or is there an easy way to generate them?
Regards,
Richard