Why aren't hashes preserved by joins?

Roberto Cornacchia roberto.cornacchia at gmail.com
Fri Apr 22 12:53:36 CEST 2016


Actually, the selection can reuse the hash only when mitosis is not active.
Perhaps this makes sense.

sql>set optimizer='sequential_pipe';
operation successful (0.830ms)
sql>select value from obj_string where value = 'apple' limit 1;
+-------+
| value |
+=======+
| apple |
+-------+
1 tuple (570.217ms)
sql>select value from obj_string where value = 'apple' limit 1;
+-------+
| value |
+=======+
| apple |
+-------+
1 tuple (1.991ms)


Still, joins do not reuse the hash, even with sequential_pipe.
One difference I noticed is that subselect takes both the tid and the
persistent bat as direct inputs, while subjoin takes as input the result of
a leftfetchjoin between the same tid and persistent bat. Can that be the reason?
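
To make the question concrete, here is a toy Python model of that hypothesis. It is not MonetDB code: Bat, get_hash, select_eq, project and join_eq are hypothetical names invented for this sketch, and project only stands in for leftfetchjoin. The assumption being illustrated is that a hash stays attached to the column object it was built on, so a freshly materialised intermediate always starts without one. Whether MonetDB really behaves this way is exactly what is being asked.

```python
# Toy model of the hypothesis; NOT MonetDB internals. All names are hypothetical.

stats = {"hash_builds": 0}  # counts how often a hash is built from scratch


class Bat:
    """A column that can carry a lazily built hash index."""

    def __init__(self, values):
        self.values = list(values)
        self.hash = None  # assumed to live exactly as long as this object


def get_hash(bat):
    """Return the hash attached to bat, building it only if it is missing."""
    if bat.hash is None:
        stats["hash_builds"] += 1
        index = {}
        for pos, value in enumerate(bat.values):
            index.setdefault(value, []).append(pos)
        bat.hash = index
    return bat.hash


def select_eq(col, tid, value):
    """Selection: probes the hash of the persistent column directly."""
    candidates = set(tid)
    return [pos for pos in get_hash(col).get(value, []) if pos in candidates]


def project(tid, col):
    """Stand-in for leftfetchjoin: materialises a new temporary column."""
    return Bat(col.values[pos] for pos in tid)


def join_eq(left, right):
    """Join: hashes whatever column it is handed as its right input."""
    index = get_hash(right)
    return [(i, j) for i, v in enumerate(left.values) for j in index.get(v, [])]


persistent = Bat(["apple", "pear", "apple", "fig"])
tid = [0, 1, 2, 3]

# Two selections on the persistent column: the hash is built once.
select_eq(persistent, tid, "apple")
select_eq(persistent, tid, "apple")
builds_after_selects = stats["hash_builds"]

# Two joins, each on a fresh projection: the hash is rebuilt every time.
probe = Bat(["apple"])
join_eq(probe, project(tid, persistent))
join_eq(probe, project(tid, persistent))
builds_after_joins = stats["hash_builds"]
```

In this model the two selections trigger one hash build in total, while each join triggers a new one, because the hash ends up on the temporary column and is lost with it. That matches the timings observed above, but it only shows that the hypothesis is consistent with them, not that it is the actual cause.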


On 21 April 2016 at 18:07, Roberto Cornacchia <roberto.cornacchia at gmail.com>
wrote:

> I just noticed that PERSISTENTHASH (checked by BAThash) in gdk_private is
> not defined (commented out). But this wouldn't explain the difference between
> selection and join.
>
>
>
> On 21 April 2016 at 18:02, Roberto Cornacchia <
> roberto.cornacchia at gmail.com> wrote:
>
>> I noticed a difference between
>> - a hash-based string selection from a persistent, read-only table
>> - a hash-based join on the same table and same column
>>
>> They both build a hash on the same string column (verified with gdb), but
>> the select can reuse the hash (second call is almost free), while the join
>> keeps rebuilding the hash.
>>
>> Is this expected?
>>
>> Roberto
>>
>
>