Documentation > User Guide > SQL Functions > Aggregate Functions

Aggregate Functions

Aggregate functions operate on a column or column expression of a table or a group (as created via GROUP BY clause).

We support all the well-known standard SQL aggregate functions: COUNT(*|...), COUNT(DISTINCT ...), SUM(...), AVG(...), MIN(...), MAX(...) and even PROD(...) over scalar types/expressions and groupings.

In addition, many important statistical aggregate functions are supported.

Also character string aggregate functions LISTAGG and GROUP_CONCAT and json.tojsonarray() are available.

For the examples below to create the shown results, we use a simple table t with data:

CREATE TABLE t(i int, c varchar(8));
INSERT INTO t(i,c) VALUES (1,'one'), (2,'two'), (3,'tree'), (4,'four'), (NULL,NULL);

Standard aggregate functions

Function	Return type	Description	Example	Result
avg(numeric_expr)	double	compute the average of the non-NULL integer/float/decimal/month_interval values in the column or group	avg(i)	2.5
count(*)	bigint	count the number of rows in the column or group including those who may have NULL values	count(*)	5
count(col_expr)	bigint	count the number of non-NULL values in the column or group	count(i)	4
count(distinct col_expr)	bigint	count the number of distinct non-NULL values in the column or group	count(distinct i%2)	2
max(col_expr)	type of col_expr	return the maximum non-NULL value in the column or group	max(c)	two
min(col_expr)	type of col_expr	return the minimum non-NULL value in the column or group	min(c)	four
prod(numeric_expr)	bigint or hugeint or double	compute the product of the non-NULL numeric values in the column or group	prod(i)	24
prod(distinct numeric_expr)	bigint or hugeint or double	compute the product of the distinct non-NULL numeric values in the column or group	prod(distinct 1 + i%2)	2
sum(integer_expr)	bigint or hugeint	compute the sum of all non-NULL integer values in the column or group	sum(i)	10
sum(distinct integer_expr)	bigint or hugeint	compute the sum of all distinct non-NULL integer values in the column or group	sum(distinct i/2)	3
sum(decimal_expr)	decimal	compute the sum of all non-NULL decimal values in the column or group	sum(cast(i as decimal))	10.000
sum(float_expr)	real or double	compute the sum of all non-NULL floating point values in the column or group	sum(cast(i as real))	10.0
sum(month_interval_expr)	month_interval	compute the sum of all non-NULL interval month values in the column or group	sum(cast(i as interval month))	10
sum(sec_interval_expr)	sec_interval	compute the sum of all non-NULL interval second values in the column or group	sum(cast(i as interval second))	10.000

Aggregate functions returning a concatenated CLOB string

The SQL:2016 standard LISTAGG is supported. It is equivalent to sys.group_concat aggregate function. In the below examples you can simply replace sys.group_concat with LISTAGG and get the same results. LISTAGG does not require the sys. prefix.

For the examples below we add some data to the example table t:

INSERT INTO t(i,c) VALUES
(1, ''), (1, 'kwik'),
(2, NULL), (2, 'kwek'),
(3, ' '), (3, 'kwak'),
(4, 'kwak'), (4, ' ');

to create the shown Results.

Function	Return type	Description	Example	Result
sys.group_concat(col_expr)	varchar	concatenate all non-NULL values in a column or group into one string with values separated by a comma (the default separator)	sys.group_concat(c)	kwek, ,kwak,kwak, ,one,two,tree,four,,kwik
sys.group_concat(ALL col_expr)	varchar	same as sys.group_concat(col_expr), i.e. ALL is the default behaviour	sys.group_concat(ALL c)	one,two,tree,four,,kwik,kwek, ,kwak,kwak,
sys.group_concat(DISTINCT col_expr)	varchar	concatenate all non-NULL and unique values in a column or group into one string with values separated by a comma (the default separator)	sys.group_concat(DISTINCT c)	one,two,tree,four,,kwik,kwek, ,kwak
sys.group_concat(DISTINCT (col_expr))	varchar	same as sys.group_concat(DISTINCT col_expr)	sys.group_concat(DISTINCT (c))	one,two,tree,four,,kwik,kwek, ,kwak
sys.group_concat(col_expr, separator_str)	varchar	concatenate non-NULL values in a column or group into one string with values separated by separator_str	sys.group_concat(i, '\|')	1\|2\|3\|4\|1\|1\|2\|2\|3\|3\|4\|4
sys.group_concat(col_expr, separator_col_expr)	varchar	concatenate non-NULL values in a column or group into one string with values separated by separator_col_expr	sys.group_concat(i, c)	1two2tree3four41kwik12kwek2 3kwak3kwak4 4
sys.group_concat(DISTINCT col_expr, separator_str)	varchar	concatenate non-NULL and unique values in a column or group into one string with values separated by separator_str	sys.group_concat(DISTINCT i, '\|')	1\|2\|3\|4
sys.group_concat(DISTINCT col_expr, separator_col_expr)	varchar	concatenate non-NULL values in a column or group into one string with values separated by separator_col_expr. DISTINCT is computed over (col_expr, separator_col_expr).	sys.group_concat(DISTINCT i, c)	1two2tree3four41kwik12kwek2 3kwak3kwak4 4
sys.group_concat(col_expr1 ORDER BY col_expr2)	varchar	concatenate all non-NULL values in a column or group into one string with values separated by a comma (the default separator). Order the values of col_expr1 according to the ascending order (default) of the values of col_expr2	sys.group_concat(c ORDER BY c)	, , ,four,kwak,kwak,kwek,kwik,one,tree,two
sys.group_concat(col_expr1 ORDER BY col_expr2 DESC)	varchar	concatenate all non-NULL values in a column or group into one string with values separated by a comma (the default separator). Order the values of col_expr1 according to the descending order of the values of col_expr2	sys.group_concat(c ORDER BY c DESC)	two,tree,one,kwik,kwek,kwak,kwak,four, , ,
sys.group_concat(col_expr1) WITHIN GROUP (ORDER BY col_expr2)	varchar	same as sys.group_concat(col_expr1 ORDER BY col_expr2)	sys.group_concat(c) WITHIN GROUP (ORDER BY c)	, , ,four,kwak,kwak,kwek,kwik,one,tree,two
sys.group_concat(col_expr1, separator_str ORDER BY col_expr2)	varchar	concatenate all non-NULL values in a column or group into one string with values separated by separator_str. Order the values of col_expr1 according to the ascending order of the values of col_expr2	sys.group_concat(c, '\|' ORDER BY c)	\| \| \|four\|kwak\|kwak\|kwek\|kwik\|one\|tree\|two
sys.group_concat(DISTINCT col_expr1, separator_str ORDER BY col_expr2)	varchar	concatenate all non-NULL and unique values in a column or group into one string with values separated by separator_str. Order the values of col_expr1 according to the ascending order of the values of col_expr2	sys.group_concat(DISTINCT c, '\|' ORDER BY c)	\| \|four\|kwak\|kwek\|kwik\|one\|tree\|two

Note: for the GROUP_CONCAT examples above without an ORDER BY condition, we have shown only one of their correct result. Without an ordering, the input values can be concatenated in any order. In addition, depending on the implementation, every reexecution of the same function can produce a different but correct result. To demonstrate this, we show different results for sys.group_concat(c) and sys.group_concat(ALL c), even though they are essentially the same function.

Function	Return type	Description	Example	Result
json.tojsonarray(clob)	varchar	convert string values in the column or group into a json array string	json.tojsonarray(c)	[ "one", "two", "tree", "four" ]
json.tojsonarray(double)	varchar	convert numeric values in the column or group into a json array string	json.tojsonarray(i)	[ 1.000000, 2.000000, 3.000000, 4.000000 ]

Note: You must include the json. prefix to these aggregate functions for them to work properly.

Aggregate functions for statistics

Note: You must include the sys. prefix for below aggregate functions (see examples) in order to work properly.

Statistic Function	Return type	Description	Example	Result
sys.corr(numeric_a, numeric_b)	double	return the correlation coefficient of the non-NULL values in the numeric (integer/decimal/real/double) column or group numeric_a versus numeric_b	sys.corr(i, i+2)	0.7500000000000001
sys.covar_pop(numeric_a, numeric_b)	double	return the population covariance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group numeric_a versus numeric_b. Supported from release Jun2020 (11.37.7)	sys.covar_pop(i, i*2)	2.5
sys.covar_samp(numeric_a, numeric_b)	double	return the sample covariance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group numeric_a versus numeric_b. Supported from release Jun2020 (11.37.7)	sys.covar_samp(i, i*2)	3.3333333333333335
sys.median(col_expr)	type of col_expr	return the statistic median of the non-NULL values in the numeric (integer/decimal/real/double) or temporal (date/time/timestamp/interval) column or group	sys.median(i)	2
sys.median_avg(numeric_expr)	double	return the statistic median average of the non-NULL values in the numeric (integer/decimal/real/double) column or group	sys.median_avg(i)	2.5
sys.quantile(col_expr, percentile)	type of col_expr	return the statistic quantile of the non-NULL values in the numeric (integer/decimal/real/double) or temporal (date/time/timestamp/interval) column or group. The percentile argument must be between 0.0 and 1.0	sys.quantile(i, 0.7)	3
sys.quantile_avg(numeric_expr, percentile)	double	return the statistic quantile average of the non-NULL values in the numeric (integer/decimal/real/double) column or group. The percentile argument must be between 0.0 and 1.0	sys.quantile_avg(i, 0.6)	2.8
sys.stddev_pop(numeric_expr)	double	return the population standard deviation of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square root of var_pop().	sys.stddev_pop(i)	1.118033988749895
sys.stddev_samp(numeric_expr)	double	return the sample standard deviation of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square root of var_samp().	sys.stddev_samp(i)	1.2909944487358056
sys.var_pop(numeric_expr)	double	return the population standard variance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square of stddev_pop().	sys.var_pop(i)	1.25
sys.var_samp(numeric_expr)	double	return the sample standard variance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square of stddev_samp().	sys.var_samp(i)	1.6666666666666667

For sys.quantile the percentile argument is a float value between 0.0 and 1.0.
sys.median(expr) is equivalent to sys.quantile(expr, 0.5).

sys.stddev_samp computes the cumulative sample standard deviation and returns the square root of the sample variance. sys.stddev_pop computes the population standard deviation and returns the square root of the population variance. Both functions take as an argument any numeric datatype. Likewise, sys.var_samp and sys.var_pop functions return the sample variance (/n-1) of a set of numbers and the biased variance (/n) of a set of numbers, respectively.

The aggregate functions sys.median_avg and sys.quantile_avg return the interpolated value if the median/quantile doesn't fall exactly on a particular row. These functions always return a value of type DOUBLE and only work for numeric types (various width integers, decimal and floating point).

Checksum aggregate functions

Cryptographic hash function aggregates to calculate a checksum (digest) over a column or group of strings.

Note It only makes sense to use these with an ordering, as SELECT sha256(name ORDER BY id) FROM table, since the ordering in which the values are fed into the digest algorithm is important and cannot otherwise be guaranteed.

SHA-1, SHA-224, SHA-256, SHA-384, SHA-512 and RIPEMD-160 are cryptographic hash algorithms.

Function	Return type	Description	Example	Result
ripemd160(string)	string	calculate a RIPEMD-160 checksum of the string values in the column or group	ripemd160(name ORDER BY id)	e9582c6db306aae2a9fbd061c45dfebf9c2d8840
sha1(string)	string	calculate a SHA-1 checksum of the string values in the column or group	sha1(name ORDER BY id)	2b2de8e0a6f8f1e2fb1538eb4111716f05d986b1
sha224(string)	string	calculate a SHA-224 checksum of the string values in the column or group	sha224(name ORDER BY id)	fa9144213e6e7ae1599306f71b32d5178b71c992def23d3377005b40
sha256(string)	string	calculate a SHA-256 checksum of the string values in the column or group	sha256(name ORDER BY id)	8f3ebe3b8ed93368f60fe63781dffbe15d3286d5c234767a321a648d4d0012c2
sha384(string)	string	calculate a SHA-384 checksum of the string values in the column or group	sha384(name ORDER BY id)	e408241d3cc2c76edc60e7eaa15425c67654c356b9a7b0291301746733395e0c b06eb0445cde8f8ea93dcbe0e0fb7ca2
sha512(string)	string	calculate a SHA-512 checksum of the string values in the column or group	sha512(name ORDER BY id)	c18daaa531683d018988a956c1c35072ca0e02b4a4caf261cd63f4080bb26fd8 44a51c80bd96f56bb820ff10878a335224031e1cd641e28d6614b9654c4ac080

Examples

create schema aggr_tst;
set schema aggr_tst;

create table aggr_tst.tc (c real);
insert into aggr_tst.tc values (1), (2), (3), (4), (5), (9);
select * from aggr_tst.tc;

select
  count(*) as countstar
, count(c) as count
, count(distinct c) as countdistinct
, sum(c) as sum
, avg(c) as average
, prod(c) as product
, min(c) as minimum
, max(c) as maximum
, group_concat(c,',') as group_concat
, listagg(c,',' order by c desc) as listagg
, median(c) as median
, median_avg(c) as median_avg
, quantile(c, 0.5) as quantile
, quantile_avg(c, 0.5) as quantile_avg
, stddev_samp(c) as stddev_samp
, stddev_pop(c) as stddev_pop
, var_samp(c) as var_samp
, var_pop(c) as var_pop
, corr(c, c+1) as corr
, covar_samp(c, c*2) as covar_samp
, covar_pop(c, c*2) as covar_pop
from aggr_tst.tc;

drop table aggr_tst.tc;

set schema sys;
drop schema aggr_tst;

select type, sha256(name ORDER BY id) from sys.tables group by type;

Tip: To list all the available aggregate functions in your MonetDB server run query:

SELECT DISTINCT schema_id, name, func, mod, language, system FROM sys.functions WHERE type = 3 ORDER BY name;

You can also create your own aggregate functions, see Create Aggregate function command.