[MonetDB-users] Profiling queries
Niels.Nes at cwi.nl
Wed Dec 5 08:46:57 CET 2007
On Wed, Dec 05, 2007 at 12:36:36AM -0700, m h wrote:
> On Dec 5, 2007 12:15 AM, Niels Nes <Niels.Nes at cwi.nl> wrote:
> > On Tue, Dec 04, 2007 at 11:58:55PM -0700, m h wrote:
> > > On Dec 4, 2007 12:56 PM, m h <sesquile at gmail.com> wrote:
> > > >
> > >
> > > In the meantime I've done the following in an attempt to make the
> > > query faster all to no avail (ie the speed stays the same):
> > >
> > > * Remove the calculation doing summing and coerce'ing
> > (this was in the selection part where, I guess, you do not have much
> > data left)
> > > * Change IN statement to =
> > > * Remove primary KEY for date table and join date table with between
> > > clause (actually surprised this didn't make it slower)
> > Only if the other side has a foreign key would it slow down.
> > Single-column primary keys don't help, as these are handled
> > by in-memory hash tables which are created when needed (i.e.
> > without influence of the user).
> > > * Added indexes for non-PK columns I'm joining on. No difference!
> > So your query involves many join expressions ?
> I don't think it's that many, 3-4...
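(If it helps: the plan MonetDB generates for such a join can be inspected by
prefixing the query with EXPLAIN in mclient. A sketch; the table and column
names below are placeholders, not your actual schema:)

```sql
-- Sketch: show the MAL plan for a simple join (placeholder names)
EXPLAIN SELECT f.units
FROM fact f, dim d
WHERE f.upc = d.upc;
```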
> > >
> > > Does this have something to do with the tables in the "tmp" schema?
> > > Does it make queries slower?
> > We never use the 'tmp' schema for anything other than real temporaries
> > ourselves, so I have no real data on this, but knowing how the queries
> > are translated I would not expect any impact.
> I'll work on using non tmp schema. I thought I read something about
> indexes not working there, but I'm probably just imagining it....
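For what it's worth, we declare temporaries explicitly rather than creating
ordinary tables in the tmp schema; a sketch, with made-up names:

```sql
-- Sketch: a session-local temporary table that survives commits
-- (table and column names are made up)
CREATE LOCAL TEMPORARY TABLE scratch (upc BIGINT, units INT)
    ON COMMIT PRESERVE ROWS;
```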
> ok, names have been changed to protect the innocent, here's an
> anonymized query (the column types are in the comments to the right):
> SELECT
> item_dim.deptnm AS "DEPARTMENT", --varchar(30)
> item_dim.upcdesc AS "DESCRIPTION", --varchar(30)
> item_dim.freeformsz AS "SIZE", --varchar(30)
> loc_dim.orglevel1 AS "STORE CHAIN", --varchar(30)
> loc_dim.store AS "STORE NAME", --varchar(30)
> item_dim.upc AS "UPC", -- bigint
> sum(sale_fact.units * (coalesce(sale_fact.dealamt, 0) +
> coalesce(sale_fact.purbillback, 0) + coalesce(sale_fact.scanbillback,
> 0))) AS "current allowances",
> sum(sale_fact.sales - coalesce(sale_fact.cost, 0)) AS margin,
> sum(sale_fact.sales) AS sales,
> sum(sale_fact.units) AS units
> FROM
> item_dim, loc_dim, sale_fact
> WHERE
> sale_fact.adweekid = 16 AND --int
> item_dim.upc = sale_fact.upc AND --bigint
> loc_dim.stornum = sale_fact.stornum AND --int
> loc_dim.store IN ('FOO STORE') AND --varchar
> item_dim.groupnumnm IN ('BAR ITEMS') --varchar
> GROUP BY
All of the group-by columns are varchar (except the last, a bigint). I'm just
guessing, but do you see high numbers in your trace for the
group.* calls? (The numbers are listed before the statement being executed.)
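A sketch of how to get such a trace, using the anonymized names from your
query (trimmed to one dimension for brevity):

```sql
-- Sketch: TRACE prints per-MAL-statement timings; large values next
-- to the group.* calls would point at the GROUP BY as the bottleneck
TRACE SELECT item_dim.deptnm, sum(sale_fact.units)
FROM item_dim, sale_fact
WHERE item_dim.upc = sale_fact.upc
GROUP BY item_dim.deptnm;
```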
> ORDER BY item_dim.deptnm ASC
Niels Nes, Centre for Mathematics and Computer Science (CWI)
Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312
url: http://www.cwi.nl/~niels e-mail: Niels.Nes at cwi.nl