Skip to main content

Lexical structure

Comments

Comments can be added to query scripts for documentation purposes. MonetDB/SQL supports two forms of comments. Any text beginning with '- -' and up to the end of line is ignored. Furthermore, C-style comments ( /* this is ignored */ ) can be injected in a SQL query where a language token is expected. A language token can be a keyword, identifier, a literal constant or a special character. Tokens should be  separated by 'white space' (tab, space, newlines) unless there juxtaposition does not create an ambiguous interpretation. Comments are considered equivalent to white spaces.

Identifiers and Keywords

SQL comes with a large collection of keywords, i.e. names reserved by the committee to designate language constructs. Keywords in MonetDB/SQL are case in-sensitive and adhere to the general convention for identifier denotation in programming languages.

Users can overrule the interpretation of an identifier as a keyword by encapsulation with double quotes, e.g. select denotes a keyword, while "select" denotes a user defined name. This scheme also permits inclusion of white space in the names. However, it is general advisable to limit the number of the escaped keywords.

Names are used to designate database objects. In that role, they are case in-sensitive. The terminal identifier classes distinguished are denoted with the suffix _name.

Literal Constants

Numeric constants follow the convention of most programming languages. A numeric constant that contains neither a decimal point or exponential is considered of type integer. The underlying machine determines whether it will be interpreted as a 32- or 64-bit value, although most likely it will be that latter case these days.

String constants in SQL are embraced with single quotes ('\”). They may be split over multiple lines with blanc space in between.

Often strings can be cast to other types using the CAST( expr AS type) expression, provided the typed coercion routine is available. For example:

CAST ( '#ccddee' AS integer)
CAST ( '0.3' AS double)

illustrates conversion of a color value in hexadecimal string notation into an integer, and a value into a double precision floating point number.

Special Characters

String literals may contain the traditional escape characters: '\n' for new lines, '\t' for tab, '\r' for return, and '\\' for backslash. The conventions '\ddd` where d a digit stands for a number denoted in octal.

Operator Precedences

Most operators in SQL have the same precedence and are left-associative. Parenthesis can be used to disambiguate the precedence order. The operator definitions can not be overloaded, nor can you define new operators.

 

. left table/column name separator
- right unary minus
^ left exponentiation
* / % left multiplication, division, modulo
+ - left unary addition, subtraction
IS, IS TRUE, IS FALSE, IS UNKNOWN    
IS NULL, NOT NULL   test for (not) null
IN   set membership
BETWEEN   range containment
OVERLAPS   time interval overlap
LIKE, ILIKE   string pattern matching
< >   less than, greater than
= right equality, assignment
NOT right logical negation
AND left logical conjunction
OR left logical disjunction`

Value Expressions

The language is built around value- and table- expressions. Value expressions encompass denotation of literal constants, type casts, column references, operator and function invocation, and scalar returning subqueries.

The column references take the form [[database_name '.'] correlation_name '.']column_name. The default database_name is the one identified at the command line while seeking database access. Omission of the database name and separating dot denotes the session default database. The database name names the data stored under a schema, possibly residing at a remote location. (See Distribution.) The correlation_name is either a table name or an alias introduced in a from clause. The correlation name may be omitted if the column name uniquely identifies a column within the scope of current query.

Table expressions produce a relational table. It is internally referenced through a correlation name, which supports attribute selection using the '.' denotation.