Hash join implementation

Xu,Wenjian zeroxwj at gmail.com
Sat Jul 11 09:23:40 CEST 2015


Hi Stefan,

Thank you very much for your suggestions. They are very helpful.

I am also trying to understand the hashjoin() implementation in
gdk/gdk_join.c. I am aware of the classical *multi-pass partitioned
hash join* as described in the literature:

[1] Cagri Balkesen et al. Main-memory hash joins on multi-core CPUs:
Tuning to the underlying hardware. ICDE 2013.

[2] Stefan Manegold et al. Optimizing main-memory join on modern
hardware. TKDE, 2002.

I tried to follow the ideas in these papers to understand the
implementation of hash join in MonetDB, but I could not map them onto
the code. Do you have any suggestions? Are there any other papers or
documents I can refer to? Thanks.
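
A minimal sketch of the multi-pass partitioned (radix) hash join idea
from [1] and [2], assuming integer join keys and a single thread. It is
not MonetDB's hashjoin() and says nothing about how gdk/gdk_join.c
works; all names (radix_partition, multi_pass_partition,
partitioned_hash_join, bits_per_pass, passes) are hypothetical and
chosen only for illustration.

```python
from collections import defaultdict


def radix_partition(rows, bits, shift):
    """Split (key, payload) rows into 2**bits buckets.

    The bucket is chosen from `bits` bits of the integer key,
    starting at bit position `shift`.
    """
    mask = (1 << bits) - 1
    parts = [[] for _ in range(1 << bits)]
    for key, payload in rows:
        parts[(key >> shift) & mask].append((key, payload))
    return parts


def multi_pass_partition(rows, bits_per_pass, passes):
    """Refine the partitioning over several passes.

    Each pass looks at the next group of key bits, so the fan-out per
    pass stays small (2**bits_per_pass) while the final number of
    partitions is 2**(bits_per_pass * passes).
    """
    parts = [rows]
    for p in range(passes):
        shift = p * bits_per_pass
        parts = [
            sub
            for part in parts
            for sub in radix_partition(part, bits_per_pass, shift)
        ]
    return parts


def partitioned_hash_join(left, right, bits_per_pass=2, passes=2):
    """Equi-join two lists of (key, payload) rows.

    Both inputs are partitioned the same way, so matching keys always
    land in partitions with the same index. Each partition pair is then
    joined with an ordinary build/probe hash join.
    """
    left_parts = multi_pass_partition(left, bits_per_pass, passes)
    right_parts = multi_pass_partition(right, bits_per_pass, passes)

    result = []
    for lpart, rpart in zip(left_parts, right_parts):
        # Build phase: hash table over the left partition.
        table = defaultdict(list)
        for key, payload in lpart:
            table[key].append(payload)
        # Probe phase: look up every row of the right partition.
        for key, payload in rpart:
            for lpayload in table.get(key, ()):
                result.append((key, lpayload, payload))
    return result


if __name__ == "__main__":
    left = [(1, "a"), (2, "b"), (17, "c"), (2, "d")]
    right = [(2, "x"), (17, "y"), (5, "z")]
    for row in sorted(partitioned_hash_join(left, right)):
        print(row)
```

The sketch uses the key bits directly as the partitioning function; the
papers apply the same idea to hashed keys and size the partitions to
fit the CPU caches and TLB, which is omitted here.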


Best regards,

XU Wenjian


> Hi Wenjian,
>
> you might want to consider studying the logical (relational) and
> physical (MAL) plans, or debug your SQL queries; for details see
> https://www.monetdb.org/Documentation/Manuals/SQLreference/Runtime
>
> You might also want to consider using the MAL debugger directly; cf.,
> https://www.monetdb.org/Documentation/Manuals/MonetDB/debugger
>
> Also, a combination of mserver5's --trace and
> gdk_debug=2097152 ("ALGOMASK") might be useful.
>
> Best,
> Stefan

----- On Jul 8, 2015, at 11:42 AM, Xu,Wenjian zeroxwj at gmail.com
<https://www.monetdb.org/mailman/listinfo/developers-list> wrote:

>> Hi,
>>
>> My objective is to investigate that, given an SPJ query, how are BATs
>> *physically* processed (changed) among relational algebra operators
>> (e.g., select, join). I do not care about how query is translated
>> into execution plan in MAL, what I care is the interaction of
>> operators in the kernel (aka., gdk).
>>
>> My current plan is to add some debug information in those operators
>> (e.g., functions in src/gdk_join.c, src/gdk_select.c) and re-compile
>> the system. Then, I will pose an SPJ query (through console) and
>> check the system log to see how those operators are invoked in
>> sequence.
>>
>> But this method seems inefficient and I would like to ask if there
>> are any alternatives, for example, is it possible to debug the system
>> *step-by-step* given an input query (or its logical query plan).
>>
>> Thank you very much for any kind help.
>>
>> Best regards,
>> Wenjian