ALTER TABLE ALTER COLUMN SET STORAGE

Daniel Zvinca daniel.zvinca at logbis.com
Tue Sep 10 11:40:22 CEST 2019


Hello,

I would like to find out more about the ALTER TABLE ALTER COLUMN SET STORAGE
feature and how it relates to compression.

As far as I understand, this is related to an active development branch
called MOSAIC, which was never merged into any of the previous MonetDB
versions. Obviously, compression is an important feature that columnar
databases provide for data storage and manipulation. A module like MOSAIC,
which seems to allow several compression techniques, would be an
interesting option.

The first question I have: can the MOSAIC extension be used successfully
(sources added and custom compiled), for any of its proposed codecs, with
any of the newest versions (Apr2019 and later)? I mean without affecting
any of the embedded, capi, rapi and pyapi modules, which all exchange data
with external libraries.

A quick read of the MOSAIC code made me understand that this compression
can be applied only to read-only PERSISTENT columns. That means that I would
lose the major benefit of compression, which I mostly need during the
import stage. Sure, I can imagine a controlled batch import scheme that
appends data to a table and, when it reaches a certain threshold, makes the
table read-only, compresses it, and then adds it to a merge table, but this
looks like quite an elaborate scenario. Am I wrong? Can MOSAIC be used in a
different scenario?
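
To make that batching scheme concrete, here is a rough sketch of what I
have in mind. All table and column names and the file path are hypothetical,
and the 'runlength' storage string is only my assumption about how a MOSAIC
codec would be selected; I have not run this against a MOSAIC build.

```sql
-- Sketch only: names, path and codec string are assumptions, not tested.

-- The merge table that queries will go through.
CREATE MERGE TABLE events (id INT, payload VARCHAR(64));

-- 1. Load a batch into a plain, writable table.
CREATE TABLE events_batch_001 (id INT, payload VARCHAR(64));
COPY INTO events_batch_001 FROM '/path/to/batch_001.csv' USING DELIMITERS ',';

-- 2. Once the batch has reached the chosen row threshold, freeze it.
ALTER TABLE events_batch_001 SET READ ONLY;

-- 3. Compress a column of the now read-only table
--    (the codec name 'runlength' is an assumption).
ALTER TABLE events_batch_001 ALTER COLUMN id SET STORAGE 'runlength';

-- 4. Expose the compressed batch through the merge table.
ALTER TABLE events ADD TABLE events_batch_001;
```

Steps 1 to 4 would have to be repeated for every batch, which is why the
scheme feels heavy for something that is needed mainly at import time.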

I can understand the reasons behind compressing only PERSISTENT bats, yet I
am wondering whether TRANSIENT bats could also benefit from it, especially
1. during the result building stage (server-client or embedded version) or
2. for remote connections, when data is transferred for merging operations.

Related to the above question, is there any chance that you would consider
keeping compressed results in memory? Sure, I can use temporary tables on
disk instead for subsequent manipulation, but for performance reasons
in-memory compressed results would be way faster. Actually, when the
embedded version provides a result set, it stays valid until the user
releases it, so why not also be able to use it for subsequent SQL
operations that do not fit into a CTE scenario? That would provide superior
flexibility and memory management compared to the CTE mechanism. Temporary
results could be developed in steps and accessed directly at any time, as
conveniently as temporary views in a CTE, but without the burden of
temporary bats that are not released until the CTE ends.


Thank you,
Dan