MonetDB type system

This page is mainly about C types.

For MonetDB's MAL type system, see also https://www.monetdb.org/Documentation/Manuals/MonetDB/MAL/Types .

For the SQL types supported in MonetDB, see also https://www.monetdb.org/Documentation/Manuals/SQLreference/Datatypes .


  • Type Rules
    • C types long & unsigned long are evil (i.e., NOT portable) and must NOT be used!
      Under Unix, they are 32 bit on 32-bit systems and 64 bit on 64-bit systems; under Windows, they are always 32 bit, even on 64-bit systems.
      In other words, long & unsigned long are 64 bit only on 64-bit non-Windows systems and 32 bit everywhere else.
      If you need a type that scales from 32 bit on 32-bit systems to 64 bit on 64-bit systems, use one of the portable alternatives detailed below: size_t, ssize_t, ptrdiff_t, BUN, or oid.
    • #include "monetdb_config.h" must be the first (non-comment) line in each .c file; it must not be included in any .h file.
    • In C, a tuple-/object-ID (OID) is of type oid, NOT of type int, lng, size_t, BUN, etc.
      Type oid is always 32 bit (4 byte) on 32-bit systems, but can be either 32 bit (explicit choice during configure) or 64 bit (8 byte; default) on 64-bit systems.
    • In C, the number of tuples in a BAT (BATcount) is of type BUN, NOT of type int, lng, size_t, oid, etc.
      Type BUN has the same size (width) as type oid.
    • In C, the length of a string and the size of an array or memory region are of type size_t, NOT of type int, lng, oid, BUN, etc.
      Type size_t is 32 bit (4 byte) on 32-bit systems and 64 bit (8 byte) on 64-bit systems (as are types ssize_t and ptrdiff_t).
    • In C, a BAT-ID is of type bat, NOT of type int.
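
The rules above for lengths and pointer differences can be sketched in standard C. This is a minimal illustration, not MonetDB code: the helper names (name_length, element_distance, print_widths) are made up for this example, and it uses the C99 "%zu" conversion so that it compiles without the MonetDB headers, whereas MonetDB code would use the SZFMT and PDFMT format macros listed in the table below.

```c
#include <assert.h>
#include <stddef.h>   /* size_t, ptrdiff_t */
#include <stdio.h>
#include <string.h>

/* A string length is a size_t, never an int or a long. */
static size_t name_length(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}

/* The distance between two pointers into the same array is a ptrdiff_t. */
static ptrdiff_t element_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}

/* Print the widths of long, size_t, and ptrdiff_t on the current platform.
 * The width of long depends on the platform's data model, which is why
 * the rules above forbid it. */
static void print_widths(void)
{
    printf("long: %zu bytes, size_t: %zu bytes, ptrdiff_t: %zu bytes\n",
           sizeof(long), sizeof(size_t), sizeof(ptrdiff_t));
}
```

Calling print_widths() on different platforms shows where long and size_t differ in width; the two helper functions keep their results in the types the rules prescribe, so no value is truncated on 64-bit systems.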


MonetDB type system (excerpt!)

In the width, NIL value, and value range columns, "a/b" gives the values for the 32-bit/64-bit variant of a type. "-" marks an empty cell.

| Semantics | SQL | MAL | C type | Width | Signed? | NIL value | Value range (excluding NIL value!) | Format string | Availability | C example |
|---|---|---|---|---|---|---|---|---|---|---|
| string length, array size, memory size [byte] | - | - | size_t | 4/8-byte (32/64-bit) | no | - | [0:2^32-1/2^64-1] | SZFMT | always | size_t x = 0; printf(SZFMT, x); |
| string length difference, array size difference, memory size difference [byte] | - | - | ssize_t | 4/8-byte (32/64-bit) | yes | - | [-2^31/-2^63:2^31-1/2^63-1] | SSZFMT | always | ssize_t x = 0; printf(SSZFMT, x); |
| pointer difference | - | - | ptrdiff_t | 4/8-byte (32/64-bit) | yes | - | [-2^31/-2^63:2^31-1/2^63-1] | PDFMT | always | ptrdiff_t x = 0; printf(PDFMT, x); |
| BAT-ID | - | - | bat | 4-byte (32-bit) | yes | bat_nil == (bat) int_nil | (-2^31:2^31-1] | "%d" | always | bat x = 0; printf("%d", x); |
| number of tuples in a BAT (count) | - | - | BUN | 4/8-byte (32/64-bit) | no | BUN_NONE == (BUN) GDK_oid_max | [0:BUN_MAX == (BUN_NONE - 1)] = [0:2^31-2/2^63-2] | BUNFMT | always | BUN x = 0; printf(BUNFMT, x); |
| object-ID / tuple-ID | - | :oid | oid | 4/8-byte (32/64-bit) | no | oid_nil == 2^31/2^63 | [GDK_oid_min:GDK_oid_max] = [0:2^31-1/2^63-1] | OIDFMT | always | oid x = 0; printf(OIDFMT, x); |
| bit / boolean (0/1 / false/true) | BOOLEAN | :bit | bit | 1-byte (8-bit) | (yes) | bit_nil == (bit) bte_nil | {GDK_bit_min,GDK_bit_max} = {FALSE,TRUE} = {0,1} | "%hhd" | always | bit x = 0; printf("%hhd", x); |
| 1-byte (8-bit) signed integer | TINYINT | :bte | bte | 1-byte (8-bit) | yes | bte_nil == GDK_bte_min | (GDK_bte_min:GDK_bte_max] = (-2^7:2^7-1] | "%hhd" | always | bte x = 0; printf("%hhd", x); |
| 1-byte (8-bit) unsigned integer | - | - | unsigned char | 1-byte (8-bit) | no | - | [0:2^8-1] | "%hhu" | always | unsigned char x = 0; printf("%hhu", x); |
| 2-byte (16-bit) signed integer | SMALLINT | :sht | sht | 2-byte (16-bit) | yes | sht_nil == GDK_sht_min | (GDK_sht_min:GDK_sht_max] = (-2^15:2^15-1] | "%hd" | always | sht x = 0; printf("%hd", x); |
| 2-byte (16-bit) unsigned integer | - | - | unsigned short | 2-byte (16-bit) | no | - | [0:2^16-1] | "%hu" | always | unsigned short x = 0; printf("%hu", x); |
| 4-byte (32-bit) signed integer | INT, INTEGER | :int | int | 4-byte (32-bit) | yes | int_nil == GDK_int_min | (GDK_int_min:GDK_int_max] = (-2^31:2^31-1] | "%d" | always | int x = 0; printf("%d", x); |
| 4-byte (32-bit) unsigned integer | - | - | unsigned int | 4-byte (32-bit) | no | - | [0:2^32-1] | "%u" | always | unsigned int x = 0; printf("%u", x); |
| machine-word-size signed integer (32/64 bit on 32/64-bit systems). Deprecated as there is no such type in SQL. Still used in MAL for counts, as MAL lacks :BUN. In C, use BUN for counts, otherwise ssize_t. | - | :wrd | wrd | 4/8-byte (32/64-bit) | yes | wrd_nil == GDK_wrd_min | (GDK_wrd_min:GDK_wrd_max] = (-2^31/-2^63:2^31-1/2^63-1] | SSZFMT | always | wrd x = 0; printf(SSZFMT, x); |
| 8-byte (64-bit) signed integer | BIGINT | :lng | lng | 8-byte (64-bit) | yes | lng_nil == GDK_lng_min | (GDK_lng_min:GDK_lng_max] = (-2^63:2^63-1] | LLFMT | always | lng x = 0; printf(LLFMT, x); |
| 8-byte (64-bit) unsigned integer | - | - | ulng | 8-byte (64-bit) | no | - | [0:2^64-1] | ULLFMT | always | ulng x = 0; printf(ULLFMT, x); |
| 16-byte (128-bit) signed integer | HUGEINT | :hge | hge | 16-byte (128-bit) | yes | hge_nil == GDK_hge_min | (GDK_hge_min:GDK_hge_max] = (-2^127:2^127-1] | (none provided by compilers) | if supported by compiler (configure then defines HAVE_HGE) | hge x = 0; printf("%.40g", (dbl) x); (inside #ifdef HAVE_HGE ... #endif) |
| 16-byte (128-bit) unsigned integer | - | - | uhge | 16-byte (128-bit) | no | - | [0:2^128-1] | (none provided by compilers) | if supported by compiler (configure then defines HAVE_HGE) | uhge x = 0; printf("%.40g", (dbl) x); (inside #ifdef HAVE_HGE ... #endif) |
| 4-byte (32-bit) floating-point number | REAL | :flt | flt | 4-byte (32-bit) | yes | flt_nil == GDK_flt_min | (GDK_flt_min:GDK_flt_max] = (-FLT_MAX:FLT_MAX] | "%e", "%f", "%g" | always | flt x = 0; printf("%f", x); |
| 8-byte (64-bit) floating-point number | FLOAT, DOUBLE | :dbl | dbl | 8-byte (64-bit) | yes | dbl_nil == GDK_dbl_min | (GDK_dbl_min:GDK_dbl_max] = (-DBL_MAX:DBL_MAX] | "%e", "%f", "%g" | always | dbl x = 0; printf("%f", x); |
| strings. Internally, only valid UTF-8 encoded strings are supported; conversion from/to other encodings has to be performed before/during base data import and during/after query result export. | CHAR, CHARACTER, VARCHAR, CHARACTER VARYING, TEXT, STRING, CLOB, CHARACTER LARGE OBJECT | :str | str | - | - | str_nil; const char str_nil[2] = { '\200', 0 }; | - | "%s" | always | str x = ""; printf("%s", x); |
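
The NIL convention in the table (the smallest value of a signed type is reserved as NIL, and str_nil is the two-byte string { '\200', 0 }) can be illustrated with a small standalone sketch. Everything prefixed with demo_ or DEMO_ is a hypothetical stand-in defined here so that the example compiles without the MonetDB headers; these are not the real definitions of int, int_nil, or str_nil.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins (assumptions, not MonetDB's real definitions). */
typedef int32_t demo_int;
#define DEMO_INT_NIL INT32_MIN   /* mirrors int_nil == GDK_int_min */

/* Mirrors the str_nil definition shown in the table. */
static const char demo_str_nil[2] = { '\200', 0 };

/* A value is NIL exactly when it equals the reserved minimum. */
static int demo_int_is_nil(demo_int v)
{
    return v == DEMO_INT_NIL;
}

/* A string is NIL exactly when it equals the reserved byte sequence. */
static int demo_str_is_nil(const char *s)
{
    assert(s != NULL);
    return strcmp(s, demo_str_nil) == 0;
}

/* Sum all non-NIL values into *sum and return how many values were used.
 * The count is a size_t and the accumulator is 64 bit wide, so adding
 * 32-bit values does not overflow for realistic array sizes. */
static size_t demo_sum_non_nil(const demo_int *vals, size_t n, int64_t *sum)
{
    size_t used = 0;

    assert(sum != NULL);
    assert(vals != NULL || n == 0);
    *sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (demo_int_is_nil(vals[i]))
            continue;           /* NIL is not part of the value range */
        *sum += vals[i];
        used++;
    }
    return used;
}
```

Because NIL occupies the minimum of the signed range, the valid range is half-open at the lower end, which is what the "(min:max]" notation in the value range column expresses.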

In case you have any questions about correct, proper, and portable type usage in MonetDB, please do not hesitate to ask Sjoerd or Stefan.