Skip to main content

Data Partitioning

The prime method to enable finer control of locality of data access during query evaluation is to horizontally partition the tables in a database. This is made possibly by introducing the concept of MERGE TABLE, which allows a table to be defined as the union of its partitions. Running the ANALYZE command over a MERGE TABLE aids the optimizer to weed out all partitions that can not contribute to a query result using the min/max values available.

Creating MERGE TABLE-s

The SQL statements below show how to create a MERGE TABLE mt containing two partitions t1 and t2. Then the partition tables can be queried both individually and jointly.

    -- Create partition tables with identical column types, and add some values.
    -- The column names are not significant.
    CREATE TABLE t1 (i int);
    CREATE TABLE t2 (j int);
    INSERT INTO t1 VALUES (11), (13);

    INSERT INTO t2 VALUES (23), (27);

    -- Create a MERGE TABLE
    CREATE MERGE TABLE mt1 (t int);

    -- Add the partition tables into the MERGE TABLE
    ALTER TABLE mt1 ADD TABLE t1;
    ALTER TABLE mt1 ADD TABLE t2;

    sql> -- sanity check
    sql> SELECT count(*) FROM t1;
    +------+
    | L1   |
    +======+
    |    2 |
    +------+
    1 tuple (2.231ms)
    sql> SELECT count(*) FROM t2;
    +------+
    | L1   |
    +======+
    |    2 |

    +------+
    1 tuple (2.226ms)
    sql> SELECT count(*) FROM mt1;
    +------+
    | L1   |
    +======+
    |    4 |
    +------+
    1 tuple (3.455ms)

Two MERGE TABLE can contain overlapping partition tables. A MERGE TABLE can also contain other MERGE TABLE-s:

    CREATE TABLE t3 (k int);
    INSERT INTO t3 VALUES (31), (37);

    CREATE TABLE t4 (l int);
    INSERT INTO t4 VALUES (41), (47);

    -- An overlapping MERGE TABLE (with mt1)
    CREATE MERGE TABLE mt2 (t int);

    ALTER TABLE mt2 ADD TABLE t1;
    ALTER TABLE mt2 ADD TABLE t3;

    -- A MERGE TABLE of MERGE TABLE
    CREATE MERGE TABLE mt3 (t int);
    ALTER TABLE mt3 ADD TABLE mt1;

    ALTER TABLE mt3 ADD TABLE t4;

    sql> -- sanity check
    sql> SELECT * FROM mt2;

    +------+
    | t    |
    +======+
    |   11 |
    |   13 |
    |   31 |
    |   37 |
    +------+
    4 tuples (2.677ms)
    sql> SELECT * FROM mt3;
    +------+
    | t    |
    +======+
    |   11 |
    |   13 |
    |   23 |
    |   27 |
    |   41 |
    |   47 |
    +------+
    6 tuples (2.496ms)

Updating MERGE TABLE-s

Currently, a MERGE TABLE cannot be directly updated. All updates must be done on the individual partition tables:

    sql> -- All updates must go directly into the individual partition table
    sql> INSERT INTO mt1 VALUES(1);
    INSERT INTO: cannot insert into merge table 'mt1'
    sql> UPDATE mt1 SET t=10 where t=11;
    UPDATE: cannot update merge table 'mt1'
    sql> UPDATE t1 SET i=10 where i=11;
    1 affected row (4.081ms)

One can remove a partition table from a MERGE TABLE using an ALTER TABLE statement. This also enables dropping a partition table, while keep the merge table.

    -- Remove a partition table from a MERGE TABLE
    ALTER TABLE mt1 DROP TABLE t1;

    sql> -- sanity check
    sql> SELECT count(*) FROM mt1;
    +------+
    | L1   |
    +======+
    |    2 |
    +------+
    1 tuple (1.455ms)

When dropping a partition table, all MERGE TABLE-s that depend on it must be dropped first:

    sql> -- Drop table must happen in the correct order of dependency
    sql> DROP TABLE t2;
    DROP TABLE: unable to drop table t2 (there are database objects which depend on it)
    sql> DROP TABLE mt1;
    DROP TABLE: unable to drop table mt1 (there are database objects which depend on it)
    sql> DROP TABLE mt3;
    operation successful (3.048ms)
    sql> DROP TABLE mt1;
    operation successful (1.948ms)
    sql> DROP TABLE t2;
    operation successful (2.329ms)