It seems in a query like:
explain select kvk from kvk where length('hello') = length(bedrijfsnaam) and bedrijfsnaam like '%hello%';
The like operation is before the length, I have attached the mal explain of it. This is the plan:
| select (
| | table(sys.kvk) [ kvk.kvk, kvk.bedrijfsnaam, kvk.%TID% NOT NULL ]
| ) [ length(kvk.bedrijfsnaam) = length('hello'), convert(kvk.bedrijfsnaam) like convert('%hello%') ]
) [ kvk.kvk ]
If like is indeed before the length equality function (it seems to be finished after the 2nd barrier), shouldn't the cost for the like operation be increased?
Optimizers generally do not have any knowlegde about the cost of (user-defined) functions. MonetDB does not use a cost-based reordering of the plan either to benefit from this specific case.
It currently does do a cost assessment in exp_count(int *cnt, int seqnr, sql_exp *e), sql/src/server/rel_optimizer.mx. Is it there where a preference is set for the like operation?
Never the less, I do see your point regarding (user-defined) functions, what I do not understand is that the actual operations are also (user defined) function, for them a cost is set. It seems that the long term goal could be the dynamic cost evaluation of each function with respect to the dataset starting with the heuristic as it is set now. Regarding to this specific case what prevents adding a specific cost to the length function?
This bug was previously known as tracker item 2926187 at http://sourceforge.net/support/tracker.php?aid=2926187
The original assignee of this bug does not have
an account here. Reassigning to the default assignee
for the component, firstname.lastname@example.org.
Previous assignee was email@example.com.