[Monetdb-developers] Performance of COPY INTO: data source & constraint checking

Stefan Manegold Stefan.Manegold at cwi.nl
Mon May 18 11:40:25 CEST 2009


On Sun, May 17, 2009 at 08:26:07PM +0200, Stefan de Konink wrote:
> Roberto Cornacchia wrote:
> > 2. The restore is almost unusable for large databases, as it is
> > extremely slow. I suspect the main reasons being:
> > - COPY INTO .. FROM STDIN (used in the dump) is much much slower than
> > copying from a file, regardless of whether the number or record to copy
> > is given or not.
> 
> This is my conclusion as well.

In case you came to your conclusion from experimental/emperical study, I
would be very interested to know about the experiemntes performed and the
results measured.
In that case, I would be verythankful if you could share the informative
details with us, i.e. detailed descriptions of
- experimental setup (HW, SW, datasets, workload/queries)
- measured results (what was measured how, and what are the results?)
- (your) discussion/analysis/interpretation of the results

> > - Column constraints, as primary key, not null, etc, are issued upon
> > table creation, before the COPY INTO, rather than being postponed with
> > an ALTER TABLE after the copy (FOREIGN KEYS are instead postponed to the
> > end of the script).
> 
> I have interesting results; it seems to be actually ~5% faster to 
> *insert* with a primary key, opposed to postpone it (!) This even 
> excludes the time to enforce the primary key afterwards.
> 
> I have verified in different runs that it is not related to the 'NOT 
> NULL' constraints. So this is a pretty surprising results.

Same as above: If you consider your results surprising, I'd be curious to
learn about them and would hence be thankful if you could share them with us
(see above).

> The data was highly structural and had a multicolumn primary key.

Could you please reveal what you refer to as "highly structural", and how
many columns of which type the primary key contained?

Could you please elaborate on whether you consider you setup/schema
"representaive" and/or whether/to which extend you think that your results
for one specific case could be generalized to other cases?

You might want to study (e.g.) any or all of

http://old-www.cwi.nl/htbin/ins1/publications?request=abstract&key=MaMa:EDBT:09
http://old-www.cwi.nl/htbin/ins1/publications?request=abstract&key=Ma:IS:08
http://old-www.cwi.nl/htbin/ins1/publications?request=abstract&key=SiGoKeNeMa:VLDB:08

on how to design, perform, and in particular report on repeatable
experiemental studies.

Please don't hesitate to ask for specific advice/help in case of doubt.

Looking to learning about your detailed experiences!

Stefan

> 
> 
> Stefan
> 
> 
> ------------------------------------------------------------------------------
> Crystal Reports - New Free Runtime and 30 Day Trial
> Check out the new simplified licensing option that enables 
> unlimited royalty-free distribution of the report engine 
> for externally facing server and web deployment. 
> http://p.sf.net/sfu/businessobjects
> _______________________________________________
> Monetdb-developers mailing list
> Monetdb-developers at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/monetdb-developers
> 

-- 
| Dr. Stefan Manegold | mailto:Stefan.Manegold at cwi.nl |
| CWI,  P.O.Box 94079 | http://www.cwi.nl/~manegold/  |
| 1090 GB Amsterdam   | Tel.: +31 (20) 592-4212       |
| The Netherlands     | Fax : +31 (20) 592-4312       |




More information about the developers-list mailing list