Hi Stefan,

Since all the data in RAM has to be written to disk once the RAM is full: if I add more RAM but keep the same hard disk, the volume of data to be written at once becomes larger. Will that make COPY BINARY INTO much slower than with less RAM?

Conversely, if there is less RAM, will COPY BINARY INTO become quicker?

I just want to lower the peak COPY time, i.e., the 10+ seconds that occur roughly once every hundred runs.

Most of the time COPY BINARY INTO takes less than 1 second. Is that because the data is written to RAM rather than to disk while the RAM is not full? If so, can I write to disk more frequently, before the RAM is full, so that I can lower the peak COPY time?


Regards,
Meng

------------------ Original ------------------
From:  "Stefan Manegold"<Stefan.Manegold@cwi.nl>;
Date:  Fri, Jul 26, 2013 11:10 PM
To:  "Communication channel for MonetDB users"<users-list@monetdb.org>;
Subject:  Re: Monetdb copy binary time varys very much!

Hi,

I'm not sure whether I understand correctly what you are doing.

If you repeat the test 10,000 times, does that mean that (1) 10,000 times you
re-create (or empty) the table and thus always copy into an empty table, or
(2) 10,000 times you copy into the same (growing) table, i.e., resulting in a
table of 10,000 times 200,000 rows, i.e., 2,000,000,000 rows, i.e., ~16 GB
per column, i.e., ~336 GB in total?
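
For reference, the arithmetic above can be re-checked with a short Python
sketch. The 8 bytes per value and the 21 columns are not stated anywhere in
this thread; they are assumptions inferred from the quoted ~16 GB per column
and ~336 GB in total, so adjust them to the real table definition.

```python
# Back-of-the-envelope check of the data volume in case (2).
ROWS_PER_COPY = 200_000   # rows per COPY BINARY INTO, as stated above
NUM_COPIES = 10_000       # number of repetitions, as stated above
BYTES_PER_VALUE = 8       # assumption: inferred from ~16 GB / 2,000,000,000 rows
NUM_COLUMNS = 21          # assumption: inferred from ~336 GB / ~16 GB per column


def data_volume(rows_per_copy, num_copies, bytes_per_value, num_columns):
    """Return (total rows, bytes per column, total bytes) for the grown table."""
    total_rows = rows_per_copy * num_copies
    bytes_per_column = total_rows * bytes_per_value
    total_bytes = bytes_per_column * num_columns
    return total_rows, bytes_per_column, total_bytes


total_rows, per_column, total = data_volume(
    ROWS_PER_COPY, NUM_COPIES, BYTES_PER_VALUE, NUM_COLUMNS
)
print(f"rows:       {total_rows:,}")
print(f"per column: {per_column / 1e9:.0f} GB")
print(f"total:      {total / 1e9:.0f} GB")
```

With these assumed values the script reproduces the 2,000,000,000 rows,
~16 GB per column and ~336 GB in total quoted above.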

(Only) in case (1) the binary files to be imported are simply moved at zero
cost.
In case (2), only the first copy into (into the empty table) can simply move
the files at zero cost; all subsequent copy into's (into a no longer empty
table) must copy the files (and delete them afterwards to mimic the same
behavior as the initial copy into), which is of course not "for free".
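
To illustrate the difference between the two cases, here is a minimal Python
sketch. It is not MonetDB's actual code; the function and file names
(import_by_move, import_by_append, column.bin, ...) are hypothetical and only
mimic the behavior described above on plain files.

```python
import os
import shutil
import tempfile


def import_by_move(src, dst):
    """Empty-table case: adopt the input file by renaming it.

    On a single filesystem a rename only touches metadata, so its cost
    does not depend on the file size.
    """
    os.replace(src, dst)


def import_by_append(src, dst, chunk_size=1 << 20):
    """Non-empty-table case: every byte has to be read and written again.

    The input file is deleted afterwards, so that it disappears just as
    it does in the move case.
    """
    with open(src, "rb") as fin, open(dst, "ab") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    os.remove(src)


def demo():
    """Import two small batches into one hypothetical column file."""
    with tempfile.TemporaryDirectory() as tmp:
        column = os.path.join(tmp, "column.bin")
        first = os.path.join(tmp, "batch1.bin")
        second = os.path.join(tmp, "batch2.bin")
        with open(first, "wb") as f:
            f.write(b"\x01" * 8)
        with open(second, "wb") as f:
            f.write(b"\x02" * 8)

        import_by_move(first, column)     # first import: rename only
        import_by_append(second, column)  # later import: real data copy

        with open(column, "rb") as f:
            data = f.read()
        return data, os.path.exists(first), os.path.exists(second)


if __name__ == "__main__":
    data, first_left, second_left = demo()
    print(len(data), first_left, second_left)
```

In both cases the input file is gone afterwards, but only the append path has
a cost that grows with the amount of data imported.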

Also, as Martin explained, unless your machine has (significantly) more RAM
than the ~336 GB of data you copy, the data needs to be written to disk in
between, making some copy into's "slower" than others. There's not much to
do about that other than (a) getting more RAM, or (b) improving I/O
bandwidth by using either a high performance RAID or SSDs.

Stefan
