Skip to main content

Window functions

SQL provides a method to aggregate over a series of related tuples.  They are called window functions and always come with an OVER() clause which determines how tuples are split up over the window functions.The PARTITION BY clause  within OVER divides the rows into groups that share the same values of the PARTITION BY expression(s). For each row, the window function is computed overall rows participating in the group. The order within a partition can be used as well.

The ROW_NUMBER() returns the position of the tuple currently in the result set. The RANK() function produces the row number within a partition, starting at 1. The DENSE_RANK() produces the rank of the current row without gaps, it  counts peer groups.

window_function:

{RANK | DENSE_RANK | PERCENT_RANK | CUME_DIST} OVER window_name |
{RANK | DENSE_RANK | PERCENT_RANK | CUME_DIST} OVER '('
    [window_name] [PARTITION BY column_ref ... ]
    [ORDER BY sort_spec]
    {ROWS | RANGE} window_bound_start
    [BETWEEN {window_bound_end | window_bound_start}
    AND {window_bound_end | window_bound_start} ]
    [EXCLUDING {CURRENT ROW | GROUP | TIES | NO OTHERS}

window_bound_start:

    UNBOUNDED PRECEDING | value PRECEDING | CURRENT ROW

window_bound_end:

     UNBOUNDED FOLLOWING | value FOLLOWING

The snippet below (taken from the test suite) illustrate the functionality provided.

 

create table ranktest ( id int, k string);
select ROW_NUMBER() over () as foo from ranktest;
select ROW_NUMBER() over (PARTITION BY id) as foo, id from ranktest;
select ROW_NUMBER() over (PARTITION BY id ORDER BY id) as foo, id from ranktest;
select ROW_NUMBER() over (ORDER BY id) as foo, id from ranktest;

select RANK() over () as foo from ranktest;
select RANK() over (PARTITION BY id) as foo, id from ranktest;
select RANK() over (PARTITION BY id ORDER BY id) as foo, id from ranktest;
select RANK() over (ORDER BY id) as foo, id from ranktest;

select RANK() over () as foo, id, k from ranktest;
select RANK() over (PARTITION BY id) as foo, id, k from ranktest;
select RANK() over (PARTITION BY id ORDER BY id, k) as foo, id, k from ranktest;
select RANK() over (ORDER BY id, k) as foo, id, k from ranktest;

select DENSE_RANK() over () as foo, id, k from ranktest order by k;
select DENSE_RANK() over (PARTITION BY id) as foo, id, k from ranktest order by k;
select DENSE_RANK() over (PARTITION BY id ORDER BY id, k) as foo, id, k from ranktest order by k;
select DENSE_RANK() over (ORDER BY id, k) as foo, id, k from ranktest order by k;
drop table ranktest;