case performance

Roberto Cornacchia roberto.cornacchia at
Mon Oct 7 17:30:32 CEST 2019

Just to mention, you can also force a tuple-at-the-time evaluation in pure

declare cs boolean;
set cs=true;


CREATE OR REPLACE FUNCTION mycontains(s STRING, p STRING, casesensitive
  IF casesensitive
  THEN RETURN SELECT s like '%'||p||'%';
  ELSE RETURN SELECT s ilike '%'||p||'%';

INSERT INTO t VALUES ('apple'),('pear'),('banana'),('orange');

explain select * from t where mycontains(s,'an',cs);;
-- END

If you look at this explain, you'll see it makes a loop that calls the
mycontains() function. This does avoid evaluating both branches. However
the explicit interpreted loop won't make it very fast, perhaps see on your
data. (Note: when the function is simpler, the loop is actually done in C)

On Mon, 7 Oct 2019 at 16:53, Daniel Zvinca <daniel.zvinca at> wrote:

> I wasn't sure if this question should be posted here or on regular user
> list, considering the way statements are generated and possible connections
> with internal functionality.
> I understand the explanations. I should have checked the explain feature
> myself. I think that I expected the optimizer to notice a constant
> expression and evaluate it before choosing which real columnar operation
> needs to be performed (obviously only one). Instead all members are
> evaluated, the whole explain statement is self-explanatory, indeed. (a
> delayed bat evaluation mechanism would have helped, I guess, for this case)
> I use capi intensively, but for string columns the capi memory management
> can be a serious limitation for large sets.
> Anyway, your answer is very helpful for me, it makes me consider adjusting
> my approach from a straight translation into one SQL statement into using
> an intermediary language that would build the optimal statement instead of
> stretching the SQL to get what I need.
> Thank you,
> On Mon, Oct 7, 2019 at 4:44 PM Sjoerd Mullender <sjoerd at>
> wrote:
>> On 07/10/2019 14.33, Roberto Cornacchia wrote:
>> > Hi Daniel,
>> [...]
>> > In this case, you would get better performance by implementing a BAT
>> > function (not a function that works on a single value, but on a column
>> > of values).
>> > You would write that in C using the GDK api
>> > (
>> ),
>> > and evaluate the condition i the loop, per value.
>> > However, this requires you to recompile MonetDB to include your new
>> > function.
>> Recompilation of MonetDB should not be necessary.  See [1].
>> [1]
>> --
>> Sjoerd Mullender
>> _______________________________________________
>> developers-list mailing list
>> developers-list at
> _______________________________________________
> developers-list mailing list
> developers-list at
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <>

More information about the developers-list mailing list