Bug 6959

Summary: "simply" running-sum "sum() over (order by ...)" very slow: quadratic rather than linear!??
Product: SQL Reporter: Stefan Manegold <Stefan.Manegold>
Component: allAssignee: SQL devs <bugs-sql>
Severity: normal CC: pedrotadim
Priority: Normal    
Version: 11.37.11 (Jun2020-SP1)   
Hardware: x86_64 (amd64/em64t)   
OS: Linux   
Attachments: test script
output of test script

Description Stefan Manegold cwiconfidential 2020-08-31 17:00:22 CEST
Created attachment 692 [details]
test script

A simple running sum
sum() over (order by "key")
even on data that is already sorted on key (and MonetDB should know that)
appear to be very slow with larger data, seemingly quadratic in the number of tuples rather than linear.

See attached script and log for details.
Comment 1 Stefan Manegold cwiconfidential 2020-08-31 17:00:58 CEST
Created attachment 693 [details]
output of test script
Comment 2 Pedro Ferreira 2020-08-31 17:35:52 CEST
This is a known corner case that can be improved. Is on the TODO list to improve the window functions for these scenarios, but I have been very busy with bug fixes and other performance improvements.
Comment 3 Stefan Manegold cwiconfidential 2020-09-30 12:16:28 CEST
Well, with a "global" Window function, i.e., merely an "order by" but no "partition by", I would not expect quadratic complexity/performance in the first place, but at most n log n, i.e., for the sort ...
Comment 4 Pedro Ferreira 2020-09-30 16:30:01 CEST
After I get done with the alloc-str-branch, which should be around this week, I can go to this. I planning to add 2 more execution scenarios for aggregates as windows functions: cumulative aggregate for your case and segment tree lookup as the more general solution. The naive solution should still be the better solution when the number of rows in the partition is very low.
Comment 5 Pedro Ferreira 2020-10-28 17:07:09 CET
On window-tunning branch, I added the cumulative aggregate for these cases. The performance should be around nlogn.
Comment 6 Pedro Ferreira 2020-11-09 09:21:54 CET
Today I merged window-tunning branch into default and it will be available in the next feature release. We ran benchmarks and we identified a great performance increase (now the complexity goes around nlogn).  I'm going to close this bug, but if the performance is still not desirable, fell free to reopen it.

The window functions don't run in parallel yet, but I'm planning to do it in the near future.