Aggregate functions

Aggregate functions dinther Wed, 04/15/2020 - 20:06

Aggregate functions operate on a column or column expression of a table or group (as created via GROUP BY clause).

For the examples below we use a simple table t(i int, c varchar(8)) with contents:
INSERT into t(i,c) VALUES
(1, 'one'),
(2, 'two'),
(3, 'tree'),
(4, 'four'),
(NULL, NULL);

to create the shown Results.

Standard aggregate functions

Function Return type Description Example Result
avg(numeric_expr) double compute the average of the non-NULL integer/float/decimal/month_interval values in the column or group avg(i) 2.5
count(*) bigint count the number of rows in the column or group including those who may have NULL values count(*) 5
count(col_expr) bigint count the number of non-NULL values in the column or group count(i) 4
count(distinct col_expr) bigint count the number of distinct non-NULL values in the column or group count(distinct i%2) 2
max(col_expr) type of col_expr return the maximum non-NULL value in the column or group max(c) two
min(col_expr) type of col_expr return the minimum non-NULL value in the column or group min(c) four
prod(numeric_expr) bigint or hugeint or double compute the product of the non-NULL numeric values in the column or group prod(i) 24
prod(distinct numeric_expr) bigint or hugeint or double compute the product of the distinct non-NULL numeric values in the column or group prod(distinct 1 + i%2) 2
sum(integer_expr) bigint or hugeint compute the sum of all non-NULL integer values in the column or group sum(i) 10
sum(distinct integer_expr) bigint or hugeint compute the sum of all distinct non-NULL integer values in the column or group sum(distinct i/2) 3
sum(decimal_expr) decimal compute the sum of all non-NULL decimal values in the column or group sum(cast(i as decimal)) 10.000
sum(float_expr) real or double compute the sum of all non-NULL floating point values in the column or group sum(cast(i as real)) 10.0
sum(month_interval_expr) month_interval compute the sum of all non-NULL interval month values in the column or group sum(cast(i as interval month)) 10
sum(sec_interval_expr) sec_interval compute the sum of all non-NULL interval second values in the column or group sum(cast(i as interval second)) 10.000

 

Aggregate functions returning a concatenated CLOB string

Note: You must include the sys. or json. prefix for these aggregate functions (see examples) in order to work properly.

Function Return type Description Example Result
listagg(col_expr) varchar concatenate non-NULL values in a column or group into one large varchar string with values separated by a comma (the default separator)
Supported from release Jun2020 (11.37.7)
listagg(c) one,two,tree,four
listagg(col_expr, separator_str) varchar concatenate non-NULL values in a column or group into one large varchar string with values separated by separator_str
Supported from release Jun2020 (11.37.7)
listagg(i, ' | ') 1 | 2 | 3 | 4
sys.group_concat(col_expr) clob concatenate non-NULL values in a column or group into one large string with values separated by a comma (the default separator) sys.group_concat(c) one,two,tree,four
sys.group_concat(col_expr, separator_str) clob concatenate non-NULL values in a column or group into one large string with values separated by separator_str sys.group_concat(i, ' | ') 1 | 2 | 3 | 4
json.tojsonarray(clob) clob convert string values in the column or group into a json array string json.tojsonarray(c) [ "one", "two", "tree", "four" ]
json.tojsonarray(double) clob convert numeric values in the column or group into a json array string json.tojsonarray(i) [ "1", "2", "3", "4" ]

 

Aggregate functions for statistics

Note: You must include the sys. prefix for below aggregate functions (see examples) in order to work properly.

Statistic Function Return type Description Example Result
sys.corr(numeric_a, numeric_b) double return the correlation coefficient of the non-NULL values in the numeric (integer/decimal/real/double) column or group numeric_a versus numeric_b sys.corr(i, i+2) 0.7500000000000001
sys.covar_pop(numeric_a, numeric_b) double return the population covariance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group numeric_a versus numeric_b.
Supported from release Jun2020 (11.37.7)
sys.covar_pop(i, i*2) 2.5
sys.covar_samp(numeric_a, numeric_b) double return the sample covariance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group numeric_a versus numeric_b.
Supported from release Jun2020 (11.37.7)
sys.covar_samp(i, i*2) 3.3333333333333335
sys.median(col_expr) type of col_expr return the statistic median of the non-NULL values in the numeric (integer/decimal/real/double) or temporal (date/time/timestamp/interval) column or group sys.median(i) 2
sys.median_avg(numeric_expr) double return the statistic median average of the non-NULL values in the numeric (integer/decimal/real/double) column or group sys.median_avg(i) 2.5
sys.quantile(col_expr, percentile) type of col_expr return the statistic quantile of the non-NULL values in the numeric (integer/decimal/real/double) or temporal (date/time/timestamp/interval) column or group. The percentile argument must be between 0.0 and 1.0 sys.quantile(i, 0.7) 3
sys.quantile_avg(numeric_expr, percentile) double return the statistic quantile average of the non-NULL values in the numeric (integer/decimal/real/double) column or group. The percentile argument must be between 0.0 and 1.0 sys.quantile_avg(i, 0.6) 2.8
sys.stddev_pop(numeric_expr) double return the population standard deviation of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square root of var_pop(). sys.stddev_pop(i) 1.118033988749895
sys.stddev_samp(numeric_expr) double return the sample standard deviation of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square root of var_samp(). sys.stddev_samp(i) 1.2909944487358056
sys.var_pop(numeric_expr) double return the population standard variance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square of stddev_pop(). sys.var_pop(i) 1.25
sys.var_samp(numeric_expr) double return the sample standard variance of numeric_expr of the non-NULL values in the numeric (integer/decimal/real/double/interval) column or group. Same as the square of stddev_samp(). sys.var_samp(i) 1.6666666666666667

For more info see: Statistic Functions

 

You can also create your own aggregate functions, see Create Aggregate function command.

To list the available aggregate functions in your MonetDB server run query:
SELECT DISTINCT schema_id, name, func, mod, language, system FROM sys.functions WHERE type = 3 ORDER BY name;