Bug 6959 - "simply" running-sum "sum() over (order by ...)" very slow: quadratic rather than linear!??
Summary: "simply" running-sum "sum() over (order by ...)" very slow: quadratic rather ...
Status: NEW
Alias: None
Product: SQL
Classification: Unclassified
Component: all (show other bugs)
Version: 11.37.11 (Jun2020-SP1)
Hardware: x86_64 (amd64/em64t) Linux
: Normal normal
Assignee: SQL devs
Depends on:
Reported: 2020-08-31 17:00 CEST by Stefan Manegold
Modified: 2020-10-28 17:07 CET (History)
1 user (show)

test script (2.10 KB, application/x-sh)
2020-08-31 17:00 CEST, Stefan Manegold
output of test script (3.32 KB, text/x-log)
2020-08-31 17:00 CEST, Stefan Manegold

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Manegold cwiconfidential 2020-08-31 17:00:22 CEST
Created attachment 692 [details]
test script

A simple running sum
sum() over (order by "key")
even on data that is already sorted on key (and MonetDB should know that)
appear to be very slow with larger data, seemingly quadratic in the number of tuples rather than linear.

See attached script and log for details.
Comment 1 Stefan Manegold cwiconfidential 2020-08-31 17:00:58 CEST
Created attachment 693 [details]
output of test script
Comment 2 Pedro Ferreira 2020-08-31 17:35:52 CEST
This is a known corner case that can be improved. Is on the TODO list to improve the window functions for these scenarios, but I have been very busy with bug fixes and other performance improvements.
Comment 3 Stefan Manegold cwiconfidential 2020-09-30 12:16:28 CEST
Well, with a "global" Window function, i.e., merely an "order by" but no "partition by", I would not expect quadratic complexity/performance in the first place, but at most n log n, i.e., for the sort ...
Comment 4 Pedro Ferreira 2020-09-30 16:30:01 CEST
After I get done with the alloc-str-branch, which should be around this week, I can go to this. I planning to add 2 more execution scenarios for aggregates as windows functions: cumulative aggregate for your case and segment tree lookup as the more general solution. The naive solution should still be the better solution when the number of rows in the partition is very low.
Comment 5 Pedro Ferreira 2020-10-28 17:07:09 CET
On window-tunning branch, I added the cumulative aggregate for these cases. The performance should be around nlogn.