Aggregate Definitions

Syntax

aggregate_func_def:
    CREATE [OR REPLACE] AGGREGATE [FUNCTION] qname '(' [ paramlist ] ')' RETURNS func_data_type EXTERNAL NAME external_function_name
  | CREATE [OR REPLACE] AGGREGATE [FUNCTION] qname '(' [ paramlist ] ')' RETURNS func_data_type LANGUAGE language_keyword '{' function_impl_code '}'

func_data_type:
      TABLE '(' column data_type ',' ... ')'
    | data_type

language_keyword:
    C | CPP | R | PYTHON | PYTHON_MAP | PYTHON3 | PYTHON3_MAP

Introduce a new user-defined aggregation function for which the implementation is specified externally using. The function can be written in any of the supported programming languages, or directly bound to an existing MAL operation.

To allow other users to invoke a user-defined aggregate function, you must grant the other users (or PUBLIC) EXECUTE privilege for the aggregate function.

A number of built-in aggregate functions is available, see: Aggregate Functions

The associated system table is sys.functions where type = 3 and language = 1.

Examples

CREATE AGGREGATE my_quantile(val bigint, q double) RETURNS bigint EXTERNAL NAME "aggr"."quantile";

DROP AGGREGATE my_quantile;
CREATE AGGREGATE python_aggregate(val INTEGER) 
RETURNS INTEGER 
LANGUAGE PYTHON {
    try:
        unique = numpy.unique(aggr_group)
        x = numpy.zeros(shape=(unique.size))
        for i in range(0, unique.size):
            x[i] = numpy.sum(val[aggr_group==unique[i]])
    except NameError:
        # aggr_group doesn't exist. no groups, aggregate on all data
        x = numpy.sum(val)
    return(x)
};

DROP AGGREGATE python_aggregate;

If your Python code can be executed in parallel (using multiple threads) without side effects, use PYTHON_MAP instead of PYTHON. Idem for PYTHON3_MAP.

Python 2 and the language keywords PYTHON2 and PYTHON2_MAP are NO LONGER supported .

For languages C and CPP, a C/C++ compiler must be available on the deployment server and the MonetDB server started with option: --set embedded_c=true.

For language R, the R script interpreter software must be available on the deployment server and the MonetDB server started with option: --set embedded_r=true.

For language PYTHON, the Python script interpreter software must be available on the deployment server and the MonetDB server started with option: --set embedded_py=true or --set embedded_py=3.

For details see: JIT C/C++ UDFs and MonetDB Embedded. See also: COMMENT ON AGGREGATE.

Associated system table: sys.functions where type = 3 and language > 2.