[MonetDB-users] Profiling queries

Niels Nes Niels.Nes at cwi.nl
Wed Dec 5 08:46:57 CET 2007


On Wed, Dec 05, 2007 at 12:36:36AM -0700, m h wrote:
> On Dec 5, 2007 12:15 AM, Niels Nes <Niels.Nes at cwi.nl> wrote:
> > On Tue, Dec 04, 2007 at 11:58:55PM -0700, m h wrote:
> > > On Dec 4, 2007 12:56 PM, m h <sesquile at gmail.com> wrote:
> > > >
> > >
> > > In the meantime I've done the following in an attempt to make the
> > > query faster all to no avail (ie the speed stays the same):
> > >
> > >   * Remove the calculation doing summing and coerce'ing
> >         (this was in the selection part were I guess you do not have much
> >         data left)
> > >   * Change IN statement to =
> > >   * Remove primary KEY for date table and join date table with between
> > > clause (actually surprised this didn't make it slower)
> >         Only if the otherside has a foreign key it would slow down.
> >
> >         Single column primary key's don't help, as these are handled
> >         by in memory hash tables which are created when needed (ie
> >         without influence of the user).
> > >   * Added indexes for non-PK columns I'm joining on.  No difference!
> >         So your query involves many join expressions ?
> 
> I don't think it's that many, 3-4...
> 
> >
> > >
> > > Does this have something to do with the tables in the "tmp" schema?
> > > Does it make queries slower?
> > We never use the 'tmp' schema for columns other then really temporaries
> > ourselfs, so I have know real data on this, but known how the queries
> > are translated I would expect any impact.
> >
> 
> I'll work on using non tmp schema.  I thought I read something about
> indexes not working there, but I'm probably just imagining it....
> 
> ok, names have been changed to protect the innocent, here's an
> anonymized query (the column types are in the comments to the right):
> 
> SELECT
>   item_dim.deptnm AS "DEPARTMENT", --varchar(30)
>   item_dim.upcdesc AS "DESCRIPTION", --varchar(30)
>   item_dim.freeformsz AS "SIZE", --varchar(30)
>   loc_dim.orglevel1 AS "STORE CHAIN",  --varchar(30)
>   loc_dim.store AS "STORE NAME", --varchar(30)
>   item_dim.upc AS "UPC",  -- bigint
>   sum(sale_fact.units * (coalesce(sale_fact.dealamt, 0) +
> coalesce(sale_fact.purbillback, 0) + coalesce(sale_fact.scanbillback,
> 0))) AS "current allowances",
>   sum(sale_fact.sales - coalesce(sale_fact.cost, 0)) AS margin,
>   sum(sale_fact.sales) AS sales,
>   sum(sale_fact.units) AS units
> FROM
>   item_dim, loc_dim, sale_fact
> WHERE
>   sale_fact.adweekid = 16 AND --int
>   item_dim.upc = sale_fact.upc AND --bigint
>   loc_dim.stornum = sale_fact.stornum AND --int
>   loc_dim.store IN ('FOO STORE') AND  --varchar
>   item_dim.groupnumnm IN ('BAR ITEMS') --varchar
> GROUP BY
>   item_dim.deptnm,
>   item_dim.upcdesc,
>   item_dim.freeformsz,
>   loc_dim.orglevel1,
>   loc_dim.store,
>   item_dim.upc
All of the groupby columns are varchar (except the last bigint). I'm just
guessing but do you see high numbers in your trace for the
group.* calls? (The numbers are listed a before the statement being executed).

Niels
> ORDER BY item_dim.depnm ASC
> 
> thanks,
> 
> -matt
> 
> -------------------------------------------------------------------------
> SF.Net email is sponsored by: The Future of Linux Business White Paper
> from Novell.  From the desktop to the data center, Linux is going
> mainstream.  Let it simplify your IT future.
> http://altfarm.mediaplex.com/ad/ck/8857-50307-18918-4
> _______________________________________________
> MonetDB-users mailing list
> MonetDB-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/monetdb-users

-- 

Niels Nes, Centre for Mathematics and Computer Science (CWI)
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
room C0.02,  phone ++31 20 592-4098, fax ++31 20 592-4312
url: http://www.cwi.nl/~niels   e-mail: Niels.Nes at cwi.nl




More information about the users-list mailing list