Skip to main content

Storage Model

The storage footprint for any given database schema can be obtained by inspecting the table producing function storage().  To illustrate below we see the storage characteristics of the lineitem table in TPCH SF-10. The column width for variable length strings, e.g. l_shipinstruct and l_comment, represents the average length found using a sample over the underlying column storage.

To estimate the storage footprint for a variation of this scheme, we first construct a relation with the model input. Thereafter it can be updated to reflect the expected database size and varwidth properties. The footprint then becomes available as table producing function storagemodel(). Also be aware that dictionary encoding may have taken place, which leads to a much less storage footprint (see l_shipinstruct). Dictionary encoding is currently a runtime feature,  it depends on insertion sequence and the dictionary size. It may lead to an overshoot in the estimated size. Therefore, the size of varchar columns should be taken with a grain of salt.

sql>call storagemodelinit();
sql>update storagemodelinput set  count = 1000000 where "table"='lineitem';
sql>update storagemodelinput set  "distinct" = 1000 where "table"='lineitem' and "type"='varchar';
sql>update storagemodelinput set  "distinct" = 330000 where "table"='lineitem' and "column"='l_comment';
sql>select * from storagemodel() where "table" = 'lineitem';