Monetdb copy binary time varys very much!

Stefan Manegold Stefan.Manegold at cwi.nl
Tue Jul 30 08:33:55 CEST 2013


The difference would be that you would then measure the time required for each copy into in a setting that realistically mimics your real-world scenario. Only running this experiment can tell whether, and if so how, these times differ from the ones you measured in the less realistic scenario.
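
A minimal sketch of such a paced experiment in Python. `load_batch` is a hypothetical placeholder for whatever call issues one COPY BINARY INTO through your client; it is not part of MonetDB. The sketch also assumes the gap (some 15 seconds were mentioned) is counted from the end of one load to the start of the next, which may not match how your data actually arrives.

```python
import statistics
import time
from typing import Callable, Dict, List


def run_paced_loads(
    load_batch: Callable[[int], None],
    n_batches: int,
    gap_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Run n_batches loads and return the wall-clock duration of each one.

    Assumption: the gap is counted from the end of one load to the start
    of the next; adjust if your batches arrive on a fixed schedule instead.
    """
    durations: List[float] = []
    for i in range(n_batches):
        start = time.perf_counter()
        load_batch(i)  # hypothetical: issue one COPY BINARY INTO here
        durations.append(time.perf_counter() - start)
        if i < n_batches - 1:
            sleep(gap_seconds)  # idle period between consecutive loads
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Report the typical time and the peak, which is the value of interest."""
    return {
        "n": len(durations),
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


if __name__ == "__main__":
    # Stand-in loader so the sketch runs without a database.
    def fake_load(i: int) -> None:
        time.sleep(0.001)

    print(summarize(run_paced_loads(fake_load, n_batches=5, gap_seconds=0.01)))
```

Running it once with `gap_seconds=0` and once with the real gap, against the same sequence of batches, gives two sets of durations whose medians and peaks can be compared directly.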

Stefan


integrity <357416268 at qq.com> wrote:
>Hi Stefan,
>Just as you said, we issue the next copy into as soon as the previous
>one ended.
>
>
>What would be different if I mimic the real-world scenario and respect
>these gaps between each two consecutive copy into's?
>
>Thanks!
>Meng
>------------------ Original ------------------
>From:  "Stefan Manegold"<Stefan.Manegold at cwi.nl>;
>Date:  Mon, Jul 29, 2013 02:45 PM
>To:  "Communication channel for MonetDB users"<users-list at monetdb.org>;
>
>
>Subject:  Re: Monetdb copy binary time varys very much!
>
>
>
>Hi Meng,
>
>My analysis was mainly an (educated) guess of what (most probably)
>happens.
>To be sure, you need to profile your system in detail, e.g., monitor
>CPU and I/O activities.
>
>Having said that, with less RAM, you might force the system to write
>the loaded data to disk instantly with each copy into,
>making each copy into slower than merely loading the data into RAM,
>but the worst case might become better since each time
>less data has to be written than with more RAM.
>Again, this is my guess of what is happening; the behaviour you observe
>might be caused by something else;
>only a detailed profiling analysis can tell.
>
>Also, if you eventually want to query your 300+ GB (or even more?)
>efficiently, you might want to have a suitable system,
>in particular sufficient RAM. (Would you mind sharing the hardware
>characteristics of your machine?).
>
>Moreover, what was the time gap between two consecutive copy into's in
>your experiment, i.e., did you issue the next copy into as soon as the
>previous ended?
>Does this mimic your "real-world" scenario realistically?
>Or would there be a time gap between two copy into's in reality? I
>recall you mentioned some 15 seconds?
>If so, you should rerun your experiment respecting these gaps between
>each two consecutive copy into's.
>
>Best,
>Stefan
>
>
>----- Original Message -----
>> Hi Stefan,
>> 
>> 
>> Since all the data in RAM has to be written to disk when RAM is full:
>> if the RAM grows larger and I don't change the hard disk, the data
>> volume to be written becomes larger. Will that make COPY BINARY INTO
>> much slower compared to a smaller RAM?
>> 
>> 
>> Conversely, if the RAM gets smaller, will it make COPY BINARY INTO
>> quicker?
>> 
>> 
>> I just want to lower the peak value of COPY, the 10+ seconds that occur
>> about every hundred times.
>> 
>> 
>> Most of the time COPY BINARY INTO takes less than 1 second. Is that
>> because the data is written to RAM, not disk, while the RAM is not full?
>> If so, can I write to disk more frequently, before the RAM is full, so
>> that I can lower the peak value of COPY?
>> 
>> 
>> 
>> 
>> Regards,
>> Meng
>> 
>> 
>> ------------------ Original ------------------
>> From:  "Stefan Manegold"<Stefan.Manegold at cwi.nl>;
>> Date:  Fri, Jul 26, 2013 11:10 PM
>> To:  "Communication channel for MonetDB users"<users-list at monetdb.org>;
>> 
>> Subject:  Re: Monetdb copy binary time varys very much!
>> 
>> 
>> 
>> Hi,
>> 
>> I'm not sure whether I understand correctly what you are doing.
>> 
>> If you repeat the test 10,000 times, does that mean that (1) 10,000
>> times you re-create (or empty) the table and thus always copy into an
>> empty table, or (2) 10,000 times you copy into the same (growing)
>> table, i.e., resulting in a table of 10,000 times 200,000 rows, i.e.,
>> 2,000,000,000 rows, i.e., ~16 GB per column, i.e., ~336 GB in total?
>> 
>> (Only) in case (1) the binary files to be imported are simply moved at
>> zero cost.
>> In case (2), only the first copy into (into the empty table) can simply
>> move the files at zero cost; all subsequent copy into's (into a no
>> longer empty table) must copy the files (and delete them afterwards to
>> mimic the same behavior as the initial copy into), which is of course
>> not "for free".
>> 
>> Also, as Martin explained, unless your machine has (significantly) more
>> RAM than the ~336 GB of data you copy, the data needs to be written to
>> disk in between, making some copy into's "slower" than others. There's
>> not much to do about that other than (a) getting more RAM, or (b)
>> improving I/O bandwidth by using either a high performance RAID or SSDs.
>> 
>> Stefan
>> 
>> .
>> _______________________________________________
>> users-list mailing list
>> users-list at monetdb.org
>> http://mail.monetdb.org/mailman/listinfo/users-list
>> 
>
>-- 
>| Stefan.Manegold at CWI.nl | DB Architectures   (DA) |
>| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
>| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |
>

-- 
| Stefan.Manegold at CWI.nl | Database Architectures   (DA) |
|  www.CWI.nl/~manegold  | Science Park 123 (L321) |
|   +31 (0)20 592-4212   | 1098 XG Amsterdam  (NL) |