Hi,
I was wondering if you can give me some ideas on how to retrieve data from
MonetDB at very high throughput. The query will be very simple, issued
against a single table. This table contains billions of records and I'd
like to retrieve the data at a rate of at least one million records per
second. From my naive tests with mclient running on the same 16-core
machine as mserver5, I am only able to extract the data at about 20,000
records per second with the Feb2010 release.
As a baseline case, with the same data in compressed (gzip'd) ASCII form
stored in a regular file on the same (SAN) file system as MonetDB, I am able
to read at the desired speed of one million records per second.
I understand that there is communication overhead between mserver5 and
mclient (or whatever client I use). Therefore, one possibility is to embed
my application within mserver5. The embedded application basically just
needs to be able to issue a SQL (or even MAL) query against the enclosing
mserver5 and process the result set. If this is a viable approach, I'd like
some guidance on where the hooks are.

Thanks.
Hering Cheng
Hi,
The first query after a load action takes longer because the new data is
still 'cold'.
Is there a way in MonetDB to load data directly with a 'hot' status?
Thanks in advance.
Hi Niels:
I don't care if the log gets written on table creation. I tried the flag,
but it didn't do anything. The default value is 0. In the FebPS2 version
it's a debug flag, but actually setting it to 128 doesn't help. On closer
examination it appears that the code checks if (debug & 1) before it writes
a log entry. Setting debug to 128 would imply that debug & 1 == 0, and so no
log output is written. Can you tell me where the log writer code is? Is it
gdk_logger.c? If yes, which functions write the log: log_write_format and
log_write_string? If not, which ones?
-- Shailendra
------------------------------
>
> Message: 5
> Date: Wed, 28 Jul 2010 08:40:06 +0200
> From: Niels Nes <Niels.Nes(a)cwi.nl>
> Subject: Re: [MonetDB-users] Creating table with nologging.
> To: Communication channel for MonetDB users
> <monetdb-users(a)lists.sourceforge.net>
> Message-ID: <20100728064006.GA6626(a)cwi.nl>
> Content-Type: text/plain; charset=utf-8
>
> On Tue, Jul 27, 2010 at 04:28:56PM -0700, Shailendra Mishra wrote:
> > I have the need for creating a table with nologging enabled. I do
> > understand that MonetDB is not meant to be used this way. However, I am
> > trying to use it to ingest data off data-feeds. My db lives on an SSD,
> > and the only DML that I execute is INSERT. I would appreciate it if folks
> > familiar with the server code could point me to the source files where I
> > could make the change to achieve this. Thanks in advance for your help.
> >
> > -- Shailendra
>
> Shailendra
>
> I haven't tried this in a while, but it used to be possible to switch
> off logging of inserts using a debug flag (passed to the logger code).
>
> See the file src/storage/bat/bat_logger.mx
> function bl_create. The first argument to logger_create should
> be 128 to switch off logging.
>
> This will only disable the writing of each inserted value to the log.
> It still creates a log entry for your initial create table
> statement.
>
> Indeed this is undocumented and far from normally supported usage of
> MonetDB.
>
> Niels
>
>
I'm a newbie MonetDB user and have two issues out of the box. (I have
downloaded and installed the latest stable version, on Mac OS X
10.6.3.)
1. I cannot get the command line history function to work with XQuery.
I invoke the server with:
Mserver --dbinit="module(pathfinder);"
and invoke a client with:
mclient -lx --history
However, the command line history does not work; hitting the up arrow
after a command results in "^[[A" being displayed:
shell> mclient -lx --history
Welcome to mclient, the MonetDB/XQuery interactive terminal
Type \q to quit, \? for a list of available commands
xquery>\<test1.xq
<hello>World</hello>
xquery>^[[A
2. When loading namespaces and modules, and making an error, the error
can only be corrected *after* I restart the client and server
processes. For example, when I try to import a module with namespace
"http://www.foo.edu" but accidentally type "http://www.food.edu", I
receive an error message. This is expected. However, if I try to
import the module again, this time with the correct namespace, I get
the exact same error message (complete with "http://www.food.edu").
This continues unless I restart both the client and server.
Is this a "feature", or am I doing something incorrectly?
Thanks for your help,
Steve
I have the need for creating a table with nologging enabled. I do understand
that MonetDB is not meant to be used this way. However, I am trying to use
it to ingest data off data-feeds. My db lives on an SSD, and the only DML
that I execute is INSERT. I would appreciate it if folks familiar with the
server code could point me to the source files where I could make the change
to achieve this. Thanks in advance for your help.
-- Shailendra
Hello,
I ran into some problems running a remote XQuery from mclient.
The query is the following (it works fine with the Java XRPC API but not
with mclient).
mclient -lx -s "execute at {'192.168.0.12'} {pf:collections()}"
I get the following error.
user(win32):
password:
MAPI = win32@localhost:50000
QUERY = execute at {'192.168.0.12'} {pf:collections()}
ERROR = !ERROR: interpret: no matching MIL operator to 'reverse(void)'.
!MAYBE YOU MEAN:
! reverse(BAT[any::1,any::2]) : BAT[any::2,any::1]
!ERROR: interpret_params: sort(param 1): evaluation error.
!ERROR: interpret_params: reverse(param 1): evaluation error.
Has anyone already successfully run this kind of query?
PS1: there are 2 Mservers running: one on the local host, one on the remote
host.
The Java XRPC API does not need a Mserver running on the local host, but
mclient does (otherwise I get an "initiating connection socket failed" error).
PS2:I have Mserver 4.39.0 on the remote linux server and Mserver 4.36.5 on
the windows client (The query also fails with client/linux/4.39.0 &
server/linux/4.39.0)
PS3: the default example provided in the XRPC doc fails (no MIL operator)
the same way:
import module namespace test = "xrpc-test-function" at "
http://192.168.0.12:50001/export/xrpc-mod.xq";
execute at {"192.168.0.12"} {test:add(100, 200)}
Regards
Ryad
Dear Kun Ren,
first of all, I would highly appreciate if you could stick to using the
MonetDB mailing lists for such questions about MonetDB instead of sending me
private email. On the one hand, this increases the chances for you to get
quick and suitable answers, as more people read the mailing lists. On the
other hand, other users might have similar questions and benefit from the
answers sent to the mailing lists.
Having said that, given that you mention neither the version of MonetDB
you're using, nor any specification of your Linux & Windows machine(s) (in
particular whether MonetDB, the OS, and the hardware architecture are 32- or
64-bit), nor whether you get any error or not (What exactly does "auto
stop" mean? What exactly happens?), I can only guess that your Linux system
is 64-bit while your Windows system (or at least the MonetDB version you
installed on it) is 32-bit.
For a detailed discussion of the impact of 32-bit vs. 64-bit on the
scalability of MonetDB, please see
http://sourceforge.net/mailarchive/message.php?msg_name=20100319145945.GA91…
Hope this helps.
Stefan
On Tue, Jul 20, 2010 at 05:19:40PM +0800, kun ren wrote:
> Dear professor Stefan,
> Recently, I used TPC-H to test MonetDB on Windows, so I used "COPY
> ... RECORDS ... INTO ... FROM ..." to load the data into MonetDB. When the
> size of the file is about 700 MB, it is OK. But when the size is 1.4 GB,
> the MonetDB server stops automatically during loading. On Linux there is
> no problem. So I want to ask why it goes wrong on Windows when the file is
> so large?
> Best regards,
--
| Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl |
| CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ |
| 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 |
| The Netherlands | Fax : +31 (20) 592-4199 |
The MonetDB team at CWI/MonetDB BV is pleased to announce the
Jun2010-SP1 bug fix release of the MonetDB suite of programs.
Lots of problems have been fixed, the most important one being the fix
in the handling of database upgrades of databases created with the
Feb2010 release to the current version.
More information (including release notes) on this release is available
at <http://monetdb.cwi.nl/Development/Releases/Jun2010/>.
The download location has changed to
<http://dev.monetdb.org/downloads/>. Please fix any bookmarks you may have.
--
Sjoerd Mullender
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Hi,
I have a rather general question: I would like to know something
about normalisation in MonetDB in relation to the underlying storage model.
Since my first courses in SQL (with PostgreSQL at the time), the strong
advice has always been to normalise away duplicated data, because looking
up numerical keys always performs better than string comparisons.
Given that in MonetDB text strings are deduplicated in a best-effort
way, and thus effectively become numbers, how does this compare to the
additional costs of:
- enums
- foreign key relations
For example, some database schemas use enums (or worse: varchars) for
two values, for example 'accessible' and 'inaccessible'. It is clear that
a BOOLEAN NOT NULL field would be sufficient for this, with a storage size
of one bit in the best case.
Now, given that MonetDB already optimises string storage, would it
cost additional time to create a secondary table to join against, versus
simply using a string field?
Thus, does normalisation for 'manual' deduplication hurt or not?
Stefan
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEAREKAAYFAkw9uQAACgkQYH1+F2Rqwn2sjACeNnvDQS+cXjzs1USpomkL6rz8
7wkAn2rF2ZRvEJbQ1cX+oxJsWPcMoTk6
=RzDh
-----END PGP SIGNATURE-----