case performance

Daniel Zvinca daniel.zvinca at logbis.com
Tue Oct 8 14:05:31 CEST 2019


I totally agree with you, Roberto. Every argument you came up with works
for simple cases and conditions like the mentioned one.

In a more general context, the reason I was looking for a scalar capi
solution is the real life conditions that I face on daily basis for ERP
data, especially for large sets. My example "Contains" it is just as one of
the many possible "lines" from a complex "where" clause that are in fact a
collection of conditions glued together by AND and OR operators. That can
lead to possible memory overhead if the default optimizer functionality is
used. That happens when many temporary bats are built during computational
stage.

A scalar capi solution should not exclude the parallelism IMO (a parallel
loop is often possible), and SIMD vectorization is usually broken due
checks against NULLs, for instance.

Thank you,
Dan










For instance a matrix oriented calculation engine as Eigen


On Tue, Oct 8, 2019 at 12:45 PM Roberto Cornacchia <
roberto.cornacchia at gmail.com> wrote:

> Daniel,
>
> Actually I wasn't considering that you can also keep the casesensitive
> parameter as scalar (and the pattern as well), so have mycontains(bat *ret,
> const bat *s, const str pattern, const bit casesensitive).
> So that optimization I was mentioning would still be there, but simpler:
>
> This is essentially what manifold would provide you with:
>
> loop(string)
>   if (cs)
>     res[i] = strstr(string[i],pattern);
>   else
>     res[i] = strcasestr(string[i],pattern);
>
> This is what you could do with a BAT implementation:
>
> if (cs)
>   loop(string)
>     res[i] = strstr(string[i],pattern);
> else
>   loop(string)
>     res[i] = strcasestr(string[i],pattern);
>
> Roberto
>
>
> On Tue, 8 Oct 2019 at 10:07, Roberto Cornacchia <
> roberto.cornacchia at gmail.com> wrote:
>
>> Daniel,
>>
>> I personally would put it the other way around. The scalar version should
>> in principle always be there, as it guarantees that every query works.
>>
>> Then you look at optimizing performance.
>> 1) leave it as is and let it wrap in a MAL loop
>> 2) If it is a C function, it will likely be wrapped in a C loop by
>> manifold.
>> 3) For best performance, you may want to implement the loop yourself with
>> a BAT implementation.
>>
>> When doing what depends on the nature of the function.
>> If every scalar call requires costly initialisations or allocations, then
>> the first two options can be very slow. In that case you want to choose
>> option 3 and do all initialisations and allocations outside the loop.
>> Your function is essentially a tight loop around a strstr() call, so in
>> principle I don't expect much difference between 2) and 3).
>>
>> There is a catch though. You have a boolean parameter to control
>> casessensitive.
>> So in case of a BAT implementation, you have 3 columns in input (*also
>> for the possibly constant boolean column*) and your loop looks like
>> (pseudocode):
>>
>> loop(string,pattern,cs)
>>   if (cs)
>>     strstr(sring,pattern);
>>   else
>>     strcasestr(string,pattern)
>>
>> Conditions inside loops are not your best friends.
>> Given that in most use cases your boolean column will be constant (either
>> all true or all false), you can optimize this by doing:
>>
>> if (cs.sorted && cs.revsorted) /* cs is constant */ {
>>   if (cs[0]) {
>>     loop(string,pattern,cs)
>>       strstr(sring,pattern);
>>   } else {
>>     loop(string,pattern,cs)
>>       strcasestr(string,pattern)
>>   }
>> } else { /* cs not constant, check at every tuple */
>>   loop(string,pattern,cs)
>>     if (cs)
>>       strstr(sring,pattern);
>>     else
>>       strcasestr(string,pattern)
>> }
>>
>> The point here is that option 3) is the only one that allows you to play
>> this way.
>> I normally use option 3) whenever I see a chance to get some work out of
>> the loop. In all other cases there is no real need.
>>
>> > BAT operations are faster for most of the operations, but when formulas
>> include multiple conditional evaluations, the parallelism and vectorization
>> advantage is clearly lost and loop × scalar might be the winner, don't you
>> think?
>>
>> Notice that when you manage to go back to a very tight loop, you got your
>> vectorization opportunities back.
>>
>> As for parallelism opportunities, I doubt that option 2) would allow any.
>> Parallelism in MonetDB is handled at MAL level. So option 1), if any, would
>> be your best shot for this. Especially if your function can be written in
>> SQL or MAL (these are rare), and they are simple enough to be inlined.
>>
>> Best regards,
>> Roberto
>>
>> On Tue, 8 Oct 2019 at 08:06, Daniel Zvinca <daniel.zvinca at logbis.com>
>> wrote:
>>
>>> Your last reply, Roberto, makes me wonder if it would be an idea to
>>> implement a scalar capi as well (for now is bat only). Then for sure the
>>> loop will be entirely done in C.
>>>
>>> BAT operations are faster for most of the operations, but when formulas
>>> include multiple conditional evaluations, the parallelism and vectorization
>>> advantage is clearly lost and loop × scalar might be the winner, don't you
>>> think?
>>>
>>>
>>>
>>>
>>> On Mon, Oct 7, 2019, 18:39 Roberto Cornacchia <
>>> roberto.cornacchia at gmail.com> wrote:
>>>
>>>> Just to mention, you can also force a tuple-at-the-time evaluation in
>>>> pure SQL:
>>>>
>>>> -- BEGIN
>>>> declare cs boolean;
>>>> set cs=true;
>>>>
>>>> START TRANSACTION;
>>>>
>>>> CREATE OR REPLACE FUNCTION mycontains(s STRING, p STRING, casesensitive
>>>> boolean)
>>>> RETURNS BOOLEAN
>>>> BEGIN
>>>>   IF casesensitive
>>>>   THEN RETURN SELECT s like '%'||p||'%';
>>>>   ELSE RETURN SELECT s ilike '%'||p||'%';
>>>>   END IF;
>>>> END;
>>>>
>>>> CREATE TABLE t(s STRING);
>>>> INSERT INTO t VALUES ('apple'),('pear'),('banana'),('orange');
>>>>
>>>> explain select * from t where mycontains(s,'an',cs);;
>>>> -- END
>>>>
>>>> If you look at this explain, you'll see it makes a loop that calls the
>>>> mycontains() function. This does avoid evaluating both branches. However
>>>> the explicit interpreted loop won't make it very fast, perhaps see on your
>>>> data. (Note: when the function is simpler, the loop is actually done in C)
>>>>
>>>>
>>>> On Mon, 7 Oct 2019 at 16:53, Daniel Zvinca <daniel.zvinca at logbis.com>
>>>> wrote:
>>>>
>>>>> I wasn't sure if this question should be posted here or on regular
>>>>> user list, considering the way statements are generated and possible
>>>>> connections with internal functionality.
>>>>>
>>>>> I understand the explanations. I should have checked the explain
>>>>> feature myself. I think that I expected the optimizer to notice a constant
>>>>> expression and evaluate it before choosing which real columnar operation
>>>>> needs to be performed (obviously only one). Instead all members are
>>>>> evaluated, the whole explain statement is self-explanatory, indeed. (a
>>>>> delayed bat evaluation mechanism would have helped, I guess, for this case)
>>>>>
>>>>> I use capi intensively, but for string columns the capi memory
>>>>> management can be a serious limitation for large sets.
>>>>>
>>>>> Anyway, your answer is very helpful for me, it makes me consider
>>>>> adjusting my approach from a straight translation into one SQL statement
>>>>> into using an intermediary language that would build the optimal statement
>>>>> instead of stretching the SQL to get what I need.
>>>>>
>>>>> Thank you,
>>>>>
>>>>> On Mon, Oct 7, 2019 at 4:44 PM Sjoerd Mullender <sjoerd at monetdb.org>
>>>>> wrote:
>>>>>
>>>>>> On 07/10/2019 14.33, Roberto Cornacchia wrote:
>>>>>> > Hi Daniel,
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> > In this case, you would get better performance by implementing a BAT
>>>>>> > function (not a function that works on a single value, but on a
>>>>>> column
>>>>>> > of values).
>>>>>> > You would write that in C using the GDK api
>>>>>> > (
>>>>>> https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/UserDefinedFunction
>>>>>> ),
>>>>>> > and evaluate the condition i the loop, per value.
>>>>>> > However, this requires you to recompile MonetDB to include your new
>>>>>> > function.
>>>>>>
>>>>>> Recompilation of MonetDB should not be necessary.  See [1].
>>>>>>
>>>>>> [1] https://www.monetdb.org/hg/MonetDB-extend/
>>>>>>
>>>>>> --
>>>>>> Sjoerd Mullender
>>>>>>
>>>>>> _______________________________________________
>>>>>> developers-list mailing list
>>>>>> developers-list at monetdb.org
>>>>>> https://www.monetdb.org/mailman/listinfo/developers-list
>>>>>>
>>>>> _______________________________________________
>>>>> developers-list mailing list
>>>>> developers-list at monetdb.org
>>>>> https://www.monetdb.org/mailman/listinfo/developers-list
>>>>>
>>>> _______________________________________________
>>>> developers-list mailing list
>>>> developers-list at monetdb.org
>>>> https://www.monetdb.org/mailman/listinfo/developers-list
>>>>
>>> _______________________________________________
>>> developers-list mailing list
>>> developers-list at monetdb.org
>>> https://www.monetdb.org/mailman/listinfo/developers-list
>>>
>> _______________________________________________
> developers-list mailing list
> developers-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/developers-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.monetdb.org/pipermail/developers-list/attachments/20191008/c38b38da/attachment.htm>


More information about the developers-list mailing list