I have created a traditional star-schema in MonetDB, using both Linux and Windows. We are using Mondrian OLAP on top so it generates the standard OLAP-y type of query using group-bys of dimensional values. What I’m finding is that doing joins in MonetDB is extremely expensive, for example:
The following query against a fact table with 261 million rows returns in 9-11seconds consistently
select date_hour_id/10000 as date_month_id, avg(seconds) from fact_utilization_hourly group by date_month_id;
returning 41 rows, the query includes a “Trick” to get to a month_year value because the time_ids are smart keys… yyyyMMddhh.
With the following query, using standard OLAP-y syntax, the query takes over 5 mins consistently.
select year_month_value, avg(seconds) from fact_utilization_hourly f join dim_date_hour d on f.date_hour_id = d.date_hour_id group by year_month_value;
returning the same 41 rows.
I’m at a loss to figure out why this join causes a 3200% percent increase in response time. And this seems to have gotten worse in SP5 from anecdotal evidence.
Bryan