case performance

Daniel Zvinca daniel.zvinca at logbis.com
Tue Oct 8 08:04:38 CEST 2019


Your last reply, Roberto, makes me wonder if it would be an idea to
implement a scalar capi as well (for now is bat only). Then for sure the
loop will be entirely done in C.

BAT operations are faster for most of the operations, but when formulas
include multiple conditional evaluations, the parallelism and vectorization
advantage is clearly lost and loop × scalar might be the winner, don't you
think?




On Mon, Oct 7, 2019, 18:39 Roberto Cornacchia <roberto.cornacchia at gmail.com>
wrote:

> Just to mention, you can also force a tuple-at-the-time evaluation in pure
> SQL:
>
> -- BEGIN
> declare cs boolean;
> set cs=true;
>
> START TRANSACTION;
>
> CREATE OR REPLACE FUNCTION mycontains(s STRING, p STRING, casesensitive
> boolean)
> RETURNS BOOLEAN
> BEGIN
>   IF casesensitive
>   THEN RETURN SELECT s like '%'||p||'%';
>   ELSE RETURN SELECT s ilike '%'||p||'%';
>   END IF;
> END;
>
> CREATE TABLE t(s STRING);
> INSERT INTO t VALUES ('apple'),('pear'),('banana'),('orange');
>
> explain select * from t where mycontains(s,'an',cs);;
> -- END
>
> If you look at this explain, you'll see it makes a loop that calls the
> mycontains() function. This does avoid evaluating both branches. However
> the explicit interpreted loop won't make it very fast, perhaps see on your
> data. (Note: when the function is simpler, the loop is actually done in C)
>
>
> On Mon, 7 Oct 2019 at 16:53, Daniel Zvinca <daniel.zvinca at logbis.com>
> wrote:
>
>> I wasn't sure if this question should be posted here or on regular user
>> list, considering the way statements are generated and possible connections
>> with internal functionality.
>>
>> I understand the explanations. I should have checked the explain feature
>> myself. I think that I expected the optimizer to notice a constant
>> expression and evaluate it before choosing which real columnar operation
>> needs to be performed (obviously only one). Instead all members are
>> evaluated, the whole explain statement is self-explanatory, indeed. (a
>> delayed bat evaluation mechanism would have helped, I guess, for this case)
>>
>> I use capi intensively, but for string columns the capi memory management
>> can be a serious limitation for large sets.
>>
>> Anyway, your answer is very helpful for me, it makes me consider
>> adjusting my approach from a straight translation into one SQL statement
>> into using an intermediary language that would build the optimal statement
>> instead of stretching the SQL to get what I need.
>>
>> Thank you,
>>
>> On Mon, Oct 7, 2019 at 4:44 PM Sjoerd Mullender <sjoerd at monetdb.org>
>> wrote:
>>
>>> On 07/10/2019 14.33, Roberto Cornacchia wrote:
>>> > Hi Daniel,
>>>
>>> [...]
>>>
>>> > In this case, you would get better performance by implementing a BAT
>>> > function (not a function that works on a single value, but on a column
>>> > of values).
>>> > You would write that in C using the GDK api
>>> > (
>>> https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/UserDefinedFunction
>>> ),
>>> > and evaluate the condition i the loop, per value.
>>> > However, this requires you to recompile MonetDB to include your new
>>> > function.
>>>
>>> Recompilation of MonetDB should not be necessary.  See [1].
>>>
>>> [1] https://www.monetdb.org/hg/MonetDB-extend/
>>>
>>> --
>>> Sjoerd Mullender
>>>
>>> _______________________________________________
>>> developers-list mailing list
>>> developers-list at monetdb.org
>>> https://www.monetdb.org/mailman/listinfo/developers-list
>>>
>> _______________________________________________
>> developers-list mailing list
>> developers-list at monetdb.org
>> https://www.monetdb.org/mailman/listinfo/developers-list
>>
> _______________________________________________
> developers-list mailing list
> developers-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/developers-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.monetdb.org/pipermail/developers-list/attachments/20191008/37679a1e/attachment-0001.htm>


More information about the developers-list mailing list