[Monetdb-developers] Partitioning

moredata at fastmail.net moredata at fastmail.net
Mon Jul 31 17:07:20 CEST 2006


Hello,

I've been following the development of MonetDB off and on. It seems to
me that there is a real emphasis on developing XML-related
functionality, and less on BI-related functionality.

I wanted to ask if there are any plans to allow tables to be partitioned
by some condition. This would have a couple of benefits. A table could
be broken out into several BAT files, one for each condition met, which
could help overcome the 2 GB BAT file limitation on 32-bit systems. And
with some added intelligence in the optimizer, it could examine the
conditions of a query and potentially skip entire BAT files rather than
scanning them.

For example, assume you have a table with a column named sometype, and
you choose sometype for partitioning. Sometype has 10 possible values.
Physically, MonetDB would create 10 BAT files for each column, one per
possible value. When a query is executed with a condition like
"sometype = 1", the optimizer would know it need only use 1 set of the
BAT files, and not all 10, significantly reducing the amount of data
that needs to be examined.
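
To make the idea concrete, here is a minimal Python sketch of the
pruning described above. It is purely illustrative: PartitionedTable,
insert and select_eq are hypothetical names, plain in-memory lists stand
in for BAT files, and nothing here reflects how MonetDB actually stores
data or plans queries.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy table partitioned on one column (hypothetical, not MonetDB code).

    Each partition keeps one list per column, standing in for one BAT
    file per column per partition value.
    """

    def __init__(self, columns, partition_column):
        self.columns = list(columns)
        self.partition_column = partition_column
        # partition value -> {column name -> list of values}
        self.partitions = defaultdict(lambda: {c: [] for c in self.columns})

    def insert(self, row):
        # Route the row to the partition selected by its partition value.
        part = self.partitions[row[self.partition_column]]
        for c in self.columns:
            part[c].append(row[c])

    def select_eq(self, column, value):
        """Return (matching rows, number of partitions scanned)."""
        if column == self.partition_column:
            # Pruning: only the partition for this value can hold matches.
            candidates = [self.partitions[value]] if value in self.partitions else []
        else:
            # No pruning possible: every partition has to be scanned.
            candidates = list(self.partitions.values())
        rows = []
        for part in candidates:
            for i, v in enumerate(part[column]):
                if v == value:
                    rows.append({c: part[c][i] for c in self.columns})
        return rows, len(candidates)


# 100 rows spread over 10 values of sometype (0..9).
table = PartitionedTable(["id", "sometype"], "sometype")
for i in range(100):
    table.insert({"id": i, "sometype": i % 10})

rows, scanned = table.select_eq("sometype", 1)    # touches 1 partition
rows_id, scanned_id = table.select_eq("id", 42)   # touches all 10
```

A predicate on the partitioning column touches a single partition, while
a predicate on any other column still has to visit all of them.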

If you are familiar with PostgreSQL, it introduced something like this
called Constraint Exclusion Partitioning (also, MySQL has some
partitioning functionality in its new beta). Each partition is treated
as a subtable of a main parent table. You are required to insert
directly into the proper subtable, but a query on the main parent table
will determine which subtables need to be examined in processing the
query. The scheme is not entirely convenient for loading data, but it is
quite flexible in setting up arbitrary conditions on the subtables.
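
The parent/subtable scheme can be sketched in the same spirit. This is
a rough Python model under stated assumptions, not PostgreSQL's
implementation: SubTable, ParentTable and select_range are hypothetical
names, and each subtable's condition is simplified to a half-open key
range, whereas the real feature accepts more general conditions.

```python
from dataclasses import dataclass, field


@dataclass
class SubTable:
    """One partition with a range condition: low <= key < high (assumed form)."""
    name: str
    low: int
    high: int
    rows: list = field(default_factory=list)

    def insert(self, row):
        # Rows are inserted directly into the subtable, which rejects
        # anything that violates its own condition.
        if not (self.low <= row["key"] < self.high):
            raise ValueError(f"key {row['key']} violates condition on {self.name}")
        self.rows.append(row)


class ParentTable:
    """Queries go through the parent, which decides which subtables to scan."""

    def __init__(self, children):
        self.children = list(children)

    def select_range(self, lo, hi):
        """Return (rows with lo <= key < hi, names of subtables scanned)."""
        # Exclusion step: skip subtables whose range cannot overlap [lo, hi).
        scanned = [c for c in self.children if c.low < hi and lo < c.high]
        rows = [r for c in scanned for r in c.rows if lo <= r["key"] < hi]
        return rows, [c.name for c in scanned]


q1 = SubTable("q1", 0, 100)
q2 = SubTable("q2", 100, 200)
q3 = SubTable("q3", 200, 300)
q1.insert({"key": 50})
q2.insert({"key": 150})
q3.insert({"key": 250})

parent = ParentTable([q1, q2, q3])
found, touched = parent.select_range(120, 180)   # only q2 is scanned
```

The loading inconvenience shows up in insert: the caller has to pick the
right subtable, and a row sent to the wrong one is rejected.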

Are there any plans to do something similar? What state is this in? I
believe I saw something about Partitioning on the roadmap several
months ago. The MonetDB home page mentions OLAP, and it would seem to me
that a feature like this is critical if MonetDB really wants to handle
large-volume Business Intelligence queries.

Thanks.
-- 
  
  moredata at fastmail.net
