Bug 3361 - constants as MAL function parameters prevent intermediate reuse
Summary: constants as MAL function parameters prevent intermediate reuse
Status: RESOLVED FIXED
Alias: None
Product: SQL
Classification: Unclassified
Component: all (show other bugs)
Version: -- development
Hardware: x86_64 (amd64/em64t) Linux
: Normal normal
Assignee: SQL devs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-09 17:11 CEST by Roberto Cornacchia
Modified: 2020-09-23 17:31 CEST (History)
3 users (show)



Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Roberto Cornacchia 2013-09-09 17:11:03 CEST
User-Agent:       Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36
Build Identifier: 

Perhaps I missed something in the recent MonetDB developments, but after noticing unexpected MAL plan explosions (see BUG 3294), I came to this simple query:

explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';

Which translates to:
+----------------------------------------------------------------+
| mal                                                            |
+================================================================+
| function user.s2_1(A0:str,A1:str):void;                        |
|     X_4:bat[:oid,:str]  := bat.new(nil:oid,nil:str);           |
|     X_3 := sql.mvc();                                          |
|     X_10 := sql.bind(X_3,"sys","functions","mod",0);           |
|     X_7:bat[:oid,:oid]  := sql.tid(X_3,"sys","functions");     |
|     X_56 := algebra.subselect(X_10,X_7,A0,A0,true,true,false); |
|     (X_13,r1_13) := sql.bind(X_3,"sys","functions","mod",2);   |
|     X_57 := algebra.subselect(r1_13,A0,A0,true,true,false);    |
|     X_16 := sql.bind(X_3,"sys","functions","mod",1);           |
|     X_58 := algebra.subselect(X_16,X_7,A0,A0,true,true,false); |
|     X_18 := sql.subdelta(X_56,X_7,X_13,X_57,X_58);             |
|     X_21 := sql.bind(X_3,"sys","functions","name",0);          |
|     (X_23,r1_28) := sql.bind(X_3,"sys","functions","name",2);  |
|     X_25 := sql.bind(X_3,"sys","functions","name",1);          |
|     X_26 := sql.projectdelta(X_18,X_21,X_23,r1_28,X_25);       |
|     X_27 := bat.append(X_4,X_26,true);                         |
|     X_28:bat[:oid,:oid]  := X_7;                               |
|     X_29 := sql.projectdelta(X_28,X_10,X_13,r1_13,X_16);       |
|     X_30 := algebra.subselect(X_29,A1,A1,true,true,false);     |
|     X_33 := sql.projectdelta(X_28,X_21,X_23,r1_28,X_25);       |
|     X_35 := algebra.leftfetchjoin(X_30,X_33);                  |
|     X_36 := bat.append(X_27,X_35,true);                        |
|     X_38 := sql.resultSet(1,1,X_36);                           |
|     sql.rsColumn(X_38,".L","name","varchar",256,0,X_36);       |
|     X_43 := io.stdout();                                       |
|     sql.exportResult(X_43,X_38);                               |
| end s2_1;                                                      |
| # optimizer.dataflow()                                         |
+----------------------------------------------------------------+

It looks to me that this repeats twice the same select. Is that true?


Reproducible: Always
Comment 1 Martin Kersten cwiconfidential 2013-09-09 17:41:02 CEST
Roberto, 
The select statements are not identical.
They depend on different components

|     X_56 := algebra.subselect(X_10,X_7,A0,A0,true,true,false); | mod,0

|     X_57 := algebra.subselect(r1_13,A0,A0,true,true,false);    | mod,2

|     X_58 := algebra.subselect(X_16,X_7,A0,A0,true,true,false); | mod,1

|     X_30 := algebra.subselect(X_29,A1,A1,true,true,false);     | projection

Optimizer seems to have squeezed it as it should.

regards, Martin
Comment 2 Roberto Cornacchia 2013-09-09 17:49:16 CEST
Hi Martin, 

I was aware of the 0,1,2 binds.

What I don't understand is: if the result consists of the same selection appended twice, why does X_36 (the second append) read as:

X_36 := bat.append(X_27,X_35,true);

and not as

X_36 := bat.append(X_27,X_26,true);

?
Comment 3 Roberto Cornacchia 2013-09-09 18:02:30 CEST
To further clarify, this is an explain from an older MonetDB version (Dec2011), where you can clearly see that the result of the same select is appended twice to the final result.
In the one I posted in the original report, it seems to me that the same select is *computed twice*.


+----------------------------------------------------------------------+
| mal                                                                  |
+======================================================================+
| function user.s2_9():void;                                           |
|     X_30:bat[:oid,:str]  := bat.new(nil:oid,nil:str);                |
|     X_1 := sql.mvc();                                                |
|     X_21:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","mod",0); |
|     X_22 := algebra.uselect(X_21,"str":str);                         |
|     X_19:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","mod",2); |
|     X_23 := algebra.kdifference(X_22,X_19);                          |
|     X_20 := algebra.uselect(X_19,"str":str);                         |
|     X_24 := algebra.kunion(X_23,X_20);                               |
|     X_17:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","mod",1); |
|     X_18 := algebra.uselect(X_17,"str":str);                         |
|     X_25 := algebra.kunion(X_24,X_18);                               |
|     X_15:bat[:oid,:oid]  := sql.bind_dbat(X_1,"sys","functions",1);  |
|     X_16 := bat.reverse(X_15);                                       |
|     X_26 := algebra.kdifference(X_25,X_16);                          |
|     X_27 := algebra.markT(X_26,0@0:oid);                             |
|     X_28 := bat.reverse(X_27);                                       |
|     X_9:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","name",0); |
|     X_7:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","name",2); |
|     X_11 := algebra.kdifference(X_9,X_7);                            |
|     X_12 := algebra.kunion(X_11,X_7);                                |
|     X_4:bat[:oid,:str]  := sql.bind(X_1,"sys","functions","name",1); |
|     X_13 := algebra.kunion(X_12,X_4);                                |
|     X_29 := algebra.leftjoin(X_28,X_13);                             |
|     X_33 := bat.append(X_30,X_29,true);                              |
|     X_35 := bat.append(X_33,X_29,true);                              |
|     X_36 := sql.resultSet(1,1,X_35);                                 |
|     sql.rsColumn(X_36,".functions","name":str,"varchar",256,0,X_35); |
|     X_44 := io.stdout();                                             |
|     sql.exportResult(X_44,X_36);                                     |
| end s2_9;                                                            |
+----------------------------------------------------------------------+
Comment 4 Martin Kersten cwiconfidential 2013-09-09 18:12:35 CEST
A critical test for common term elimination is that instructions should be side-effect free. Appends are not.

Also the equality of the arguments A0, and A1 can not be determined statically.
It would have to be detected at the SQL level leading to a template with just one argument.
Comment 5 Roberto Cornacchia 2013-09-09 18:23:16 CEST
A difference that I notice with the Dec2011 plan that I posted is indeed that the two strings are now parameters of the function.
This justifies your explanation about arguments A0 and A1.

Understanding why this is happening doesn't really satisfy me though. The behaviour of the older version was far more correct, I believe. Notice that in real-life queries I get very long plan portions repeated over and over again, just because of this.

Do you think this is the way it should behave?
Comment 6 Stefan Manegold cwiconfidential 2013-09-09 18:32:44 CEST
Possibly related:

I recently had the problem with Feb2013 (or more precisely SciQL-2, but I don't expect that it was related / limited to the SciQL extensions on to of Feb2013) that when using a UNION ALL of several sub-queries in a function declaration, common sub-expressions --- in my specific case an aggregation that was identical in the select- and the having-clause of a group by sub-query --- were not recognized as such, i.e., the identical aggregation was evaluated twice. In my case, introducing a separate sub-function per sub-query and UNION ALL over the sub-function calls "solved" the problem.

(Given to many other obligations, I did not have/find the time to file my problem as a proper bug report ... :-()

While I assumed the problem was related to "too complicated" function definitions, it might actually be that it is related to UNION ALL queries --- the most prominent commonality between Roborto's and my case.

Would this make this bug report "valid", again?
Comment 7 Stefan Manegold cwiconfidential 2013-09-09 18:36:08 CEST
NB:

While appends are indeed not side-effect free, selections are.
Roberto's point / question is why the identical selects are not shared / re-used, but evaluated twice ("redundantly").

The same holds for the identical aggregations in my group-by-having case.
(I'll try to provide a reproducible example, but cannot promise when ...)
Comment 8 Martin Kersten cwiconfidential 2013-09-09 18:38:22 CEST
Indeed, in this case it all boils down to the handling of constants at the SQL level to derive templates. It is up to the SQL engineer to validate and reassign it.
Comment 9 Stefan Manegold cwiconfidential 2013-09-09 19:23:14 CEST
For what it's worth:
While both share UNION ALL, a notable difference between Roberto's and my case is that the common sub-expression appears "across" the UNION ALL in Roberto's case, while it appears within a sub-query "under" the UNION ALL in my case.
Comment 10 Stefan Manegold cwiconfidential 2013-09-09 19:49:08 CEST
Having said that, with the following simple SQL-only example trying to mimic my (actually more complex SciQL) case, I cannot reproduce the problem, i.e., common subexpression elimination works fine with Feb2013-SP3 & latest Feb2013 from HG (changeset 23a797ab1392):

create table bug_3361_stm ( a int, b int );

explain
select a, sum(b), min(b) from bug_3361_StM group by a having sum(b) > 0 and min(b) > 0
union all
select a, sum(b), max(b) from bug_3361_StM group by a having sum(b) < 0 and max(b) < 0;

+------------------------------------------------------------------------+
| mal                                                                    |
+========================================================================+
| function user.s7_1{autoCommit=true}(A0:lng,A1:int,A2:lng,A3:int):void; |
|     X_7:bat[:oid,:int]  := bat.new(nil:oid,nil:int);                   |
|     X_6 := sql.mvc();                                                  |
|     X_10:bat[:oid,:oid]  := sql.tid(X_6,"sys","bug_3361_stm");         |
|     X_13 := sql.bind(X_6,"sys","bug_3361_stm","b",0);                  |
|     (X_16,r1_16) := sql.bind(X_6,"sys","bug_3361_stm","b",2);          |
|     X_19 := sql.bind(X_6,"sys","bug_3361_stm","b",1);                  |
|     X_21 := sql.delta(X_13,X_16,r1_16,X_19);                           |
|     X_22 := algebra.leftfetchjoin(X_10,X_21);                          |
|     X_23 := sql.bind(X_6,"sys","bug_3361_stm","a",0);                  |
|     (X_25,r1_25) := sql.bind(X_6,"sys","bug_3361_stm","a",2);          |
|     X_27 := sql.bind(X_6,"sys","bug_3361_stm","a",1);                  |
|     X_28 := sql.delta(X_23,X_25,r1_25,X_27);                           |
|     X_29 := algebra.leftfetchjoin(X_10,X_28);                          |
|     (X_30,r1_30,r2_30) := group.subgroupdone(X_29);                    |
|     X_33:bat[:oid,:lng]  := aggr.subsum(X_22,X_30,r1_30,true,true);    |
      ^^^^                            ^^^
|     X_34:bat[:oid,:int]  := aggr.submin(X_22,X_30,r1_30,true);         |
      ^^^^                            ^^^
|     X_38 := algebra.leftfetchjoin(r1_30,X_29);                         |
|     X_42:bat[:oid,:int]  := aggr.submax(X_22,X_30,r1_30,true);         |
      ^^^^                            ^^^
|     X_35 := algebra.thetasubselect(X_34,A1,">");                       |
                                     ^^^^
|     X_37 := algebra.thetasubselect(X_33,X_35,A0,">");                  |
                                     ^^^^
|     X_39 := algebra.leftfetchjoin(X_37,X_38);                          |
|     X_40 := bat.append(X_7,X_39,true);                                 |
|     X_43 := algebra.thetasubselect(X_42,A3,"<");                       |
                                     ^^^^
|     X_45 := algebra.thetasubselect(X_33,X_43,A2,"<");                  |
                                     ^^^^
|     X_46 := algebra.leftfetchjoin(X_45,X_38);                          |
|     X_47 := bat.append(X_40,X_46,true);                                |
|     X_56:bat[:oid,:int]  := bat.new(nil:oid,nil:int);                  |
|     X_58 := algebra.leftfetchjoin(X_37,X_34);                          |
                                         ^^^^
|     X_59 := bat.append(X_56,X_58,true);                                |
|     X_60 := algebra.leftfetchjoin(X_45,X_42);                          |
                                         ^^^^
|     X_61 := bat.append(X_59,X_60,true);                                |
|     X_49:bat[:oid,:lng]  := bat.new(nil:oid,nil:lng);                  |
|     X_52 := algebra.leftfetchjoin(X_37,X_33);                          |
                                         ^^^^
|     X_53 := bat.append(X_49,X_52,true);                                |
|     X_54 := algebra.leftfetchjoin(X_45,X_33);                          |
                                         ^^^^
|     X_55 := bat.append(X_53,X_54,true);                                |
|     X_62 := sql.resultSet(3,1,X_47);                                   |
|     sql.rsColumn(X_62,".L10","a","int",32,0,X_47);                     |
|     sql.rsColumn(X_62,".L10","L1","bigint",32,0,X_55);                 |
|     sql.rsColumn(X_62,".L10","L2","int",32,0,X_61);                    |
|     X_76 := io.stdout();                                               |
|     sql.exportResult(X_76,X_62);                                       |
| end s7_1;                                                              |
+------------------------------------------------------------------------+
44 tuples (3.083ms)
Comment 11 Stefan Manegold cwiconfidential 2013-09-09 20:29:27 CEST
And another difference between Roberto's and my case is that in Roberto's case the common sub-expression contains a literal, while mine does not.
As Martin pointed out, the current (Feb2013) query translation creates a MAL plan (function) that uses separate function arguments for each literal, even if in the given instance some literals happen to have the same values. Thus, while there is a common sub-expression in the SQL query, this is no longer "visible"/present in the generated MAL plan.
Comment 12 Roberto Cornacchia 2013-09-11 11:51:48 CEST
How literals are passed as function parameters seems indeed the be the main cause here.

If I rewrite my original query as 

explain
with sel as (select name from sys.functions where mod='str')
select name from sel
union all
select name from sel;


Then I get the MAL plan that I would expect:

+----------------------------------------------------------------+
| mal                                                            |
+================================================================+
| function user.s5_1(A0:str):void;                               |
|     X_3:bat[:oid,:str]  := bat.new(nil:oid,nil:str);           |
|     X_2 := sql.mvc();                                          |
|     X_9 := sql.bind(X_2,"sys","functions","mod",0);            |
|     X_6:bat[:oid,:oid]  := sql.tid(X_2,"sys","functions");     |
|     X_48 := algebra.subselect(X_9,X_6,A0,A0,true,true,false);  |
|     (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2);   |
|     X_49 := algebra.subselect(r1_12,A0,A0,true,true,false);    |
|     X_15 := sql.bind(X_2,"sys","functions","mod",1);           |
|     X_50 := algebra.subselect(X_15,X_6,A0,A0,true,true,false); |
|     X_17 := sql.subdelta(X_48,X_6,X_12,X_49,X_50);             |
|     X_20 := sql.bind(X_2,"sys","functions","name",0);          |
|     (X_22,r1_27) := sql.bind(X_2,"sys","functions","name",2);  |
|     X_24 := sql.bind(X_2,"sys","functions","name",1);          |
|     X_25 := sql.projectdelta(X_17,X_20,X_22,r1_27,X_24);       |
|     X_26 := bat.append(X_3,X_25,true);                         |
|     X_27 := bat.append(X_26,X_25,true);                        |
|     X_28 := sql.resultSet(1,1,X_27);                           |
|     sql.rsColumn(X_28,".L","name","varchar",256,0,X_27);       |
|     X_35 := io.stdout();                                       |
|     sql.exportResult(X_35,X_28);                               |
| end s5_1;                                                      |
| # optimizer.dataflow()                                         |
+----------------------------------------------------------------+


I would put it this way: rewriting the sql query helps the translator produce better MAL code. Ideally, that should not be needed, as SQL is supposed to be declarative. Especially considering that in many real-life applications query are generated - thus more likely to contain verbose patterns, I would consider this something to try and improve. A good reason is also that it came as an unexpected side effect of other developments, reducing the effectiveness of previously working code optimizations.

Will this report be reconsidered?
Comment 13 Roberto Cornacchia 2014-08-29 10:29:45 CEST
A year later, I'd like to reopen this bug report.

It was marked as "INVALID", but as the discussion shows, it was not invalid at all.

Perhaps you want to mark it as "WON'T FIX", but I hope you reconsider, as I find it serious for every non-trivial query scenario.
Comment 14 Niels Nes cwiconfidential 2014-12-17 16:55:21 CET
On the sql side the 2 variables are created because of the query cache. For now you could switch off the query cache (set cache=false). This should push the problem back to the commonterms optimizer. Currently the optimizer fails at finding the duplicates.
Comment 15 MonetDB Mercurial Repository cwiconfidential 2014-12-17 16:57:10 CET
Changeset 9a7952fd5a0c, made by Niels Nes <niels@cwi.nl> in the MonetDB repo, refers to this bug.

For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=9a7952fd5a0c

Changeset description:

	- improved usage for filter functions
	- improved push down of selects (with multiple tid() statements)
	- made enabling/disabling cache function again (needed for bug 3361)
Comment 16 Roberto Cornacchia 2015-06-19 18:23:12 CEST
Setting cache=false does indeed inline the parameters.

However commonTerms cannot find the duplicates yet. I don't really understand why.

Notice that the problem I am describing here is the root of MAL plans longer than 30K lines (where I would expect max 2-3K lines).



On Oct2014-SP4, with cache=false (Jul2015 behaves the same):

sql>set cache=false;
operation successful (0.641ms)
sql>explain
more>select name from sys.functions where mod='str'
more>union all
more>select name from sys.functions where mod='str';
+-------------------------------------------------------------------------------+
| mal                                                                           |
+===============================================================================+
| function user.main{autoCommit=true}():void;                                   |
| barrier X_63 := language.dataflow();                                          |
|     X_3:bat[:oid,:str]  := bat.new(nil:oid,nil:str);                          |
|     X_2 := sql.mvc();                                                         |
|     X_9:bat[:oid,:str]  := sql.bind(X_2,"sys","functions","mod",0);           |
|     X_6:bat[:oid,:oid]  := sql.tid(X_2,"sys","functions");                    |
|     X_48 := algebra.subselect(X_9,X_6,"str":str,"str":str,true,true,false);   |
|     (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2);                  |
|     X_49 := algebra.subselect(r1_12,"str":str,"str":str,true,true,false);     |
|     X_15:bat[:oid,:str]  := sql.bind(X_2,"sys","functions","mod",1);          |
|     X_50 := algebra.subselect(X_15,X_6,"str":str,"str":str,true,true,false);  |
|     X_17 := sql.subdelta(X_48,X_6,X_12,X_49,X_50);                            |
|     X_19:bat[:oid,:str]  := sql.bind(X_2,"sys","functions","name",0);         |
|     (X_21,r1_28) := sql.bind(X_2,"sys","functions","name",2);                 |
|     X_23:bat[:oid,:str]  := sql.bind(X_2,"sys","functions","name",1);         |
|     X_24 := sql.projectdelta(X_17,X_19,X_21,r1_28,X_23);                      |
|     X_25 := bat.append(X_3,X_24,true);                                        |
|     X_27:bat[:oid,:oid]  := X_6;                                              |
|     X_51 := algebra.subselect(X_9,X_27,"str":str,"str":str,true,true,false);  |
|     X_52 := algebra.subselect(X_15,X_27,"str":str,"str":str,true,true,false); |
|     X_29 := sql.subdelta(X_51,X_27,X_12,X_49,X_52);                           |
|     X_31 := sql.projectdelta(X_29,X_19,X_21,r1_28,X_23);                      |
|     X_32 := bat.append(X_25,X_31,true);                                       |
|     language.pass(X_6);                                                       |
|     language.pass(X_9);                                                       |
|     language.pass(X_15);                                                      |
|     language.pass(X_27);                                                      |
|     language.pass(X_12);                                                      |
|     language.pass(X_49);                                                      |
|     language.pass(X_19);                                                      |
|     language.pass(X_21);                                                      |
|     language.pass(r1_28);                                                     |
|     language.pass(X_23);                                                      |
| exit X_63;                                                                    |
|     X_33 := sql.resultSet(1,1,X_32);                                          |
|     sql.rsColumn(X_33,".L","name","varchar",256,0,X_32);                      |
|     X_39 := io.stdout();                                                      |
|     sql.exportResult(X_39,X_33);                                              |
| end main;                                                                     |
+-------------------------------------------------------------------------------+
39 tuples (3.649ms)
Comment 17 Martin Kersten cwiconfidential 2015-06-19 18:37:22 CEST
Indeed, the window for searching for constants in a MAL plan is limited
and also not handled at the SQL layer as such.
The underlying reason is that constants are not separately administered
from ordinary variables.
Similarly, the window for searching identical instructions is limited.
Otherwise, the common term optimizer would become way too expensive O(n^2)

In this case we have
X_48 := algebra.subselect(X_9,X_6,"str":str,"str":str,true,true,false)
X_49 := algebra.subselect(r1_12,"str":str,"str":str,true,true,false);   
X_50 := algebra.subselect(X_15,X_6,"str":str,"str":str,true,true,false);
X_51 := algebra.subselect(X_9,X_27,"str":str,"str":str,true,true,false) 
X_52 := algebra.subselect(X_15,X_27,"str":str,"str":str,true,true,false);

None of the arguments pairs are identical and in this simple common term
matching fails. The fact that X_27 is an alias for X_6 is not (yet) taken 
into account. This could be handled by an extra alias removal before the
commonterms optimizer starts. [TODO]
Comment 18 Roberto Cornacchia 2015-06-19 18:49:47 CEST
Martin, do you mean that simply adding the existing aliasRemoval before commonTerms in the current pipeline would do the trick? Or would they need changes anyway?

I am aware of the limited window. We discussed this a few times and I still don't agree with it. ;-)

That's why there is no limit in my code (that's the only change), so the window cannot be a reason in this case. 

Yes, that optimizer is expensive, but with the current limitations it's hardly ever useful and its failed application can be way worse than its quadratic execution on long plans (that's my opinion of course).
Comment 19 Martin Kersten cwiconfidential 2015-06-19 20:38:53 CEST
If you add the alias optimizer:

....
     "optimizer.deadcode();"
     "optimizer.aliases();"
     "optimizer.commonTerms();"
....
You will get the following more interesting result.

sql>set cache=false;
operation successful (0.629ms)
sql>explain select name from sys.functions where mod='str' 
more>union
more>select name from sys.functions where mod='str' ;
+--------------------------------------------------------------------------------------+
| mal                                                                                  |
+======================================================================================+
| function user.main{autoCommit=true}():void;                                          |
|     X_38 := bat.new(nil:oid,nil:str);                                                |
|     X_46 := bat.append(X_38,".L");                                                   |
|     X_41 := bat.new(nil:oid,nil:str);                                                |
|     X_48 := bat.append(X_41,"name");                                                 |
|     X_42 := bat.new(nil:oid,nil:str);                                                |
|     X_50 := bat.append(X_42,"varchar");                                              |
|     X_43 := bat.new(nil:oid,nil:int);                                                |
|     X_52 := bat.append(X_43,256);                                                    |
|     X_45 := bat.new(nil:oid,nil:int);                                                |
|     X_54 := bat.append(X_45,0);                                                      |
|     X_3:bat[:oid,:str] := bat.new(nil:oid,nil:str);                                  |
|     X_2 := sql.mvc();                                                                |
|     X_9:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",0);                   |
|     X_6:bat[:oid,:oid] := sql.tid(X_2,"sys","functions");                            |
|     X_69 := algebra.subselect(X_9,X_6,"str","str",true,true,false);                  |
|     (X_12,r1_12) := sql.bind(X_2,"sys","functions","mod",2);                         |
|     X_70 := algebra.subselect(r1_12,nil:bat[:oid,:oid],"str","str",true,true,false); |
|     X_15:bat[:oid,:str] := sql.bind(X_2,"sys","functions","mod",1);                  |
|     X_72 := algebra.subselect(X_15,X_6,"str","str",true,true,false);                 |
|     X_18 := sql.subdelta(X_69,X_6,X_12,X_70,X_72);                                   |
|     X_20:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",0);                 |
|     (X_22,r1_25) := sql.bind(X_2,"sys","functions","name",2);                        |
|     X_24:bat[:oid,:str] := sql.bind(X_2,"sys","functions","name",1);                 |
|     X_25 := sql.projectdelta(X_18,X_20,X_22,r1_25,X_24);                             |
|     (X_26,r1_31,r2_31) := group.subgroupdone(X_25);                                  |
|     X_29 := algebra.leftfetchjoin(r1_31,X_25);                                       |
|     X_30 := bat.append(X_3,X_29,true);                                               |
|     X_32 := bat.append(X_30,X_29,true);                                              |
|     (X_33,r1_64,r2_64) := group.subgroupdone(X_32);                                  |
|     X_36 := algebra.leftfetchjoin(r1_64,X_32);                                       |
|     sql.resultSet(X_46,X_48,X_50,X_52,X_54,X_36);                                    |
| end user.main;                                                                       |
| # optimizer.mitosis()                                                                |
| # optimizer.dataflow()                                                               |
+--------------------------------------------------------------------------------------+
35 tuples (4.034ms)

Happy? ;)
Comment 20 Roberto Cornacchia 2015-06-22 10:46:43 CEST
Thanks, that looks much better indeed :)

From my point of view this bug report can be closed as fixed.

However.... ;)  I hope you don't mind if I submit others on the same topic. Re-use of intermediate results remains a nasty issue with generated SQL queries.
Comment 21 Sjoerd Mullender cwiconfidential 2015-08-28 13:42:22 CEST
Jul2015 has been released.
Comment 22 Roberto Cornacchia 2016-04-14 12:07:04 CEST
The conclusion for this issues seemed to be, from the last comments, that that aliases optimizer should have been placed before the commonTerms optimizer, in order to catch instructions that are not identical but equivalent.

The bug has been closed as fixed in Jul2015, but I see the default pipe hasn't changed.

Is there a reason? Was that not a good fix?
Comment 23 Roberto Cornacchia 2016-04-14 12:57:05 CEST
Actually: the test case 

explain
select name from sys.functions where mod='str'
union all
select name from sys.functions where mod='str';

eventually worked correctly by 
- setting cache to false
- calling aliases optimizer before commonTerms optimizer

Now this seems not to be enough any more.

To me this remains a serious issue. Plans are exploding and repeating 40-50 times the same computations.
Comment 24 Martin Kersten cwiconfidential 2016-04-14 13:08:01 CEST
Yes, the issue is known and we are considering to make caching default false.  Calling the optimizer for each query instead. That would remove this issue.
Comment 25 MonetDB Mercurial Repository cwiconfidential 2016-04-17 11:41:10 CEST
Changeset 6db31f60efeb, made by Niels Nes <niels@cwi.nl> in the MonetDB repo, refers to this bug.

For complete details, see http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=6db31f60efeb

Changeset description:

	apply the alias optimizer after the pushselect as the later introduces
	new aliases. Partial solution to bug 3361.
Comment 26 Niels Nes cwiconfidential 2016-04-17 11:42:47 CEST
reordered the optimizers such that the alias gets removed. This solve the commonterms problem. For now still the cache needs to be disabled. We are still thinking/testing if the agressive statement cache could be removed (or only used for a smaller set of statements (is updates)).
Comment 27 Roberto Cornacchia 2016-05-06 19:02:04 CEST
Just a reminder that the commit above modifies only default_pipe. The others still have the pushselect optimizer after the alias optimizer.
Comment 28 Niels Nes cwiconfidential 2020-09-23 17:31:10 CEST
the oct2020 will no longer have the query cache, ie should run these issues without problems.