MonetDB copy binary time varies very much!

Stefan Manegold Stefan.Manegold at cwi.nl
Mon Jul 29 08:45:15 CEST 2013


Hi Meng,

My analysis was mainly an (educated) guess of what (most probably) happens.
To be sure, you need to profile your system in detail, e.g., monitor CPU and I/O activities.

Having said that, with less RAM you might force the system to write the loaded data to disk immediately with each copy into,
making each copy into slower than merely loading the data into RAM, but the worst case might improve, since each time
less data has to be written than with more RAM.
Again, this is my guess at what is happening; the behaviour you observe might be caused by something else;
only a detailed profiling analysis can tell.
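
As a rough illustration of such profiling, here is a minimal sketch (my addition, assuming a Linux system, where /proc/stat exposes per-mode CPU jiffy counters) for measuring how much time the machine spends waiting on I/O around a copy into:

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into named jiffy counters."""
    fields = line.split()
    names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]
    return dict(zip(names, map(int, fields[1:8])))

def iowait_fraction(before, after):
    """Fraction of elapsed jiffies spent waiting on I/O between two samples."""
    delta = {k: after[k] - before[k] for k in before}
    total = sum(delta.values())
    return delta["iowait"] / total if total else 0.0

def sample_cpu():
    """Take one sample; Linux-only, as /proc/stat does not exist elsewhere."""
    with open("/proc/stat") as f:
        return parse_cpu_line(f.readline())
```

Calling sample_cpu() before and after a copy into and feeding both samples to iowait_fraction() shows whether the slow runs coincide with heavy I/O waiting; iostat and vmstat give the same information at the command line.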

Also, if you eventually want to query your 300+ GB (or even more?) efficiently, you might want to have a suitable system,
in particular sufficient RAM. (Would you mind sharing the hardware characteristics of your machine?).

Moreover, what was the time gap between two consecutive copy into's in your experiment, i.e., did you issue the next copy into as soon as the previous ended?
Does this mimic your "real-world" scenario realistically?
Or would there be a time gap between two copy into's in reality? I recall you mentioned some 15 seconds?
If so, you should rerun your experiment respecting these gaps between each two consecutive copy into's.
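
Such a rerun could look like this minimal sketch (my addition; the executor callback and the COPY statement in the comment are placeholders, e.g. a pymonetdb cursor issuing the actual COPY BINARY INTO):

```python
import time

def timed_copies(execute_copy, n_batches, gap_seconds):
    # Issue n_batches loads via execute_copy(batch_index), sleeping
    # gap_seconds between them, and return per-batch wall-clock times.
    durations = []
    for i in range(n_batches):
        start = time.monotonic()
        execute_copy(i)  # e.g. cur.execute("COPY BINARY INTO t FROM ...")
        durations.append(time.monotonic() - start)
        if i < n_batches - 1:
            time.sleep(gap_seconds)  # the ~15 s real-world gap
    return durations

# Dummy run with a no-op executor; replace with a real database call:
times = timed_copies(lambda i: None, n_batches=3, gap_seconds=0.01)
```

Comparing the recorded durations with and without the gap would show whether the background flushing between loads smooths out the 10+ second peaks.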

Best,
Stefan


----- Original Message -----
> Hi Stefan,
> 
> 
> Since all the data in RAM has to be written to disk once RAM is full: if
> the RAM grows larger while I don't change the hard disk, so that a larger
> data volume accumulates, will that make the COPY BINARY INTO much slower
> compared to a small RAM?
> 
> 
> Conversely, if the RAM gets smaller, will it make the COPY BINARY INTO
> quicker?
> 
> 
> I just want to lower the peak time of COPY, the 10+ seconds that occur
> roughly every hundred runs.
> 
> 
> Most of the time, COPY BINARY INTO took less than 1 second. Is that
> because the data is written to RAM, not disk, while RAM is not full? If so,
> can I write to disk more frequently, before RAM is full, so that I could
> lower the peak time of COPY?
> 
> 
> 
> 
> Regards,
> Meng
> 
> 
> ------------------ Original ------------------
> From:  "Stefan Manegold"<Stefan.Manegold at cwi.nl>;
> Date:  Fri, Jul 26, 2013 11:10 PM
> To:  "Communication channel for MonetDB users"<users-list at monetdb.org>;
> 
> Subject:  Re: MonetDB copy binary time varies very much!
> 
> 
> 
> Hi,
> 
> I'm not sure whether I understand correctly what you are doing.
> 
> If you repeat the test 1000 times, does that mean that (1) 10000 times you
> re-create (or empty) the table and thus always copy into an empty table, or
> (2) 10000 times you copy into the same (growing) table, i.e., resulting in a
> table of 10,000 times 200,000 rows, i.e., 2,000,000,000 rows, i.e., ~16 GB
> per column, i.e., ~336 GB in total?
> 
> (Only) in case (1) can the binary files to be imported simply be moved at
> zero cost.
> In case (2), only the first copy into (into the empty table) can simply move
> the files at zero cost; all subsequent copy into's (into a no longer empty
> table) must copy the files (and delete them afterwards to mimic the same
> behavior as the initial copy into), which is of course not "for free".
> 
> Also, as Martin explained, unless your machine has (significantly) more RAM
> than the ~336 GB of data you copy, the data needs to be written to disk in
> between, making some copy into's "slower" than others. There's not much to
> do about that other than (a) getting more RAM, or (b) improving I/O
> bandwidth by using either a high performance RAID or SSDs.
> 
> Stefan
> 
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> http://mail.monetdb.org/mailman/listinfo/users-list
> 

-- 
| Stefan.Manegold at CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |
