I have created a traditional star-schema in MonetDB, using both Linux and Windows.  We are using Mondrian OLAP on top so it generates the standard OLAP-y type of query using group-bys of dimensional values.  What I’m finding is that doing joins in MonetDB is extremely expensive, for example:

 

The following query against a fact table with 261 million rows returns in 9-11seconds consistently

select date_hour_id/10000 as date_month_id, avg(seconds) from fact_utilization_hourly group by date_month_id;

returning 41 rows, the query includes a “Trick” to get to a month_year value because the  time_ids are smart keys… yyyyMMddhh.

 

With the following query, using standard OLAP-y syntax, the query takes over 5 mins consistently.

select year_month_value, avg(seconds) from fact_utilization_hourly f join dim_date_hour d on f.date_hour_id = d.date_hour_id group by year_month_value;

returning the same 41 rows.

 

I’m at a loss to figure out why this join causes a 3200% percent increase in response time.  And this seems to have gotten worse in SP5 from anecdotal evidence.

 

Bryan



This transmission is confidential and intended solely for the use of the recipient named above. It may contain confidential, proprietary, or legally privileged information. If you are not the intended recipient, you are hereby notified that any unauthorized review, use, disclosure or distribution is strictly prohibited. If you have received this transmission in error, please contact the sender by reply e-mail and delete the original transmission and all copies from your system.