Storage Model

Storage Model mk Fri, 10/26/2012 - 13:00

The storage footprint for any given database schema can be obtained by inspecting the table producing function storage().  To illustrate below we see the storage characteristics of the lineitem table in TPCH SF-10. The column width for variable length strings, e.g. l_shipinstruct and l_comment, represents the average length found using a sample over the underlying column storage.

To estimate the storage footprint for a variation of this scheme, we first construct a relation with the model input. Thereafter it can be updated to reflect the expected database size and varwidth properties. The footprint then becomes available as table producing function storagemodel(). Also be aware that dictionary encoding may have taken place, which leads to a much less storage footprint (see l_shipinstruct). Dictionary encoding is currently a runtime feature,  it depends on insertion sequence and the dictionary size. It may lead to an overshoot in the estimated size. Therefore, the size of varchar columns should be taken with a grain of salt.

sql>call storagemodelinit();
sql>update storagemodelinput set  count = 1000000 where "table"='lineitem';
sql>update storagemodelinput set  "distinct" = 1000 where "table"='lineitem' and "type"='varchar';
sql>update storagemodelinput set  "distinct" = 330000 where "table"='lineitem' and "column"='l_comment';
sql>select * from storagemodel() where "table" = 'lineitem';

 

When data updates are done on columns, the changes are first stored in so-called delta structures and later merged (in bulk) into the column storages. For analysis, monitoring and debugging purposes we provide as of release Nov2019 (11.35.3) three new table producing functions:
    sys.deltas("schema" string, "table" string, "column" string)
    sys.deltas("schema" string, "table" string)
    sys.deltas("schema" string)

Each return a table with 7 output columns:

name type description
id int reference to the sys.columns.id
cleared boolean whether the column's upper table is cleared (true) or not (false)
immutable bigint number of the RDONLY deltas of the column
inserted bigint number of the RD_INS deltas of the column
updates bigint number of the RD_UPD_ID deltas of the column
deletes bigint number of deleted values of the column's table
level int the level of the current transaction in the transaction level tree

Some examples to query these table producing functions:
    SELECT * FROM sys.deltas('sys','statistics','minval');
    SELECT id, cleared, immutable, inserted, updates, deletes, level
      FROM sys.deltas('sys','statistics');
    SELECT * FROM sys.deltas('tmp');
    SELECT s.name as "schemanm", t.name as "tablenm", c.name as "columnnm", d.*
      FROM sys.schemas s
      JOIN sys.tables t ON s.id = t.schema_id
      JOIN sys.columns c ON t.id = c.table_id
      JOIN sys.deltas('sys') d ON c.id = d.id;