Bug 7001 - crossproduct generated for a simple (semi-)join
Summary: crossproduct generated for a simple (semi-)join
Alias: None
Product: SQL
Classification: Unclassified
Component: all (show other bugs)
Version: 11.39.5 (Oct2020)
Hardware: x86_64 (amd64/em64t) Linux
: Normal normal
Assignee: SQL devs
Depends on:
Reported: 2020-10-22 12:10 CEST by Roberto Cornacchia
Modified: 2020-10-28 16:38 CET (History)
2 users (show)


Note You need to log in before you can comment on or make changes to this bug.
Description Roberto Cornacchia 2020-10-22 12:10:23 CEST
User-Agent:       Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.125 Safari/537.36
Build Identifier: 

-- generate 5 integer ids and join with a row_number() on a table
 ids as (select * from sys.generate_series(1,5)),
 t as (select row_number() over() as r, * from sys.functions)
select name from t where r in (select value from ids);

The semijoin between t and ids is translated to a selection over crossproduct (the same with an innerjoin).

function user.main():void;
"    X_1:void := querylog.define(""explain with\\nids as (select * from sys.generate_series(1,5)),\\nt as (select row_number() over() as r, * from sys.functions)\\nselect name from t where r in (select value from ids);"":str, ""sequential_pipe"":str, 37:int);"
"    X_55:bat[:str] := bat.pack(""sys.t"":str);"
"    X_56:bat[:str] := bat.pack(""name"":str);"
"    X_57:bat[:str] := bat.pack(""varchar"":str);"
    X_58:bat[:int] := bat.pack(256:int);
    X_59:bat[:int] := bat.pack(0:int);
    X_4:int := sql.mvc();
"    C_5:bat[:oid] := sql.tid(X_4:int, ""sys"":str, ""functions"":str);"
"    X_17:bat[:str] := sql.bind(X_4:int, ""sys"":str, ""functions"":str, ""name"":str, 0:int);"
"    (X_19:bat[:oid], X_20:bat[:str]) := sql.bind(X_4:int, ""sys"":str, ""functions"":str, ""name"":str, 2:int);"
"    X_18:bat[:str] := sql.bind(X_4:int, ""sys"":str, ""functions"":str, ""name"":str, 1:int);"
    X_21:bat[:str] := sql.delta(X_17:bat[:str], X_19:bat[:oid], X_20:bat[:str], X_18:bat[:str]);
    X_22:bat[:str] := algebra.projection(C_5:bat[:oid], X_21:bat[:str]);
    X_26:bat[:int] := batsql.row_number(X_22:bat[:str], false:bit, false:bit);
    C_47:bat[:oid] := bat.mirror(X_26:bat[:int]);
    X_112:int := calc.int(1:bte);
    X_113:int := calc.int(5:bte);
    X_34:bat[:int] := generator.series(X_112:int, X_113:int);
    (X_35:bat[:oid], X_36:bat[:oid]) := algebra.crossproduct(X_26:bat[:int], X_34:bat[:int], false:bit);
    X_37:bat[:int] := algebra.projection(X_35:bat[:oid], X_26:bat[:int]);
    X_39:bat[:int] := generator.projection(X_36:bat[:oid], X_34:bat[:int]);
    X_40:bat[:bit] := batcalc.==(X_37:bat[:int], X_39:bat[:int]);
    C_43:bat[:oid] := algebra.select(X_40:bat[:bit], true:bit, true:bit, true:bit, true:bit, false:bit);
    X_46:bat[:oid] := algebra.projection(C_43:bat[:oid], X_35:bat[:oid]);
    C_48:bat[:oid] := algebra.intersect(C_47:bat[:oid], X_46:bat[:oid], nil:BAT, nil:BAT, false:bit, false:bit, nil:lng);
    X_53:bat[:str] := algebra.projection(C_48:bat[:oid], X_22:bat[:str]);
    sql.resultSet(X_55:bat[:str], X_56:bat[:str], X_57:bat[:str], X_58:bat[:int], X_59:bat[:int], X_53:bat[:str]);
end user.main;

Reproducible: Always

$ mserver5 --version
MonetDB 5 server 11.39.4 (hg id: e82bd9b8a27) (64-bit, 128-bit integers)
This is an unreleased version
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2020 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 15.5GiB available memory, 4 available cpu cores
  libpcre: 8.44 2020-02-12
  openssl: OpenSSL 1.1.1g FIPS  21 Apr 2020
  libxml2: 2.9.10
Compiled by: roberto@tardis.spinque.com (x86_64-pc-linux-gnu)
Compilation: /usr/bin/cc  -Werror -Wall -Wextra -Werror-implicit-function-declaration -Wpointer-arith -Wundef -Wformat=2 -Wformat-overflow=1 -Wno-format-truncation -Wno-format-nonliteral -Wno-cast-function-type -Winit-self -Winvalid-pch -Wmissing-declarations -Wmissing-format-attribute -Wmissing-prototypes -Wno-missing-field-initializers -Wold-style-definition -Wpacked -Wunknown-pragmas -Wvariadic-macros -Wstack-protector -fstack-protector-all -Wpacked-bitfield-compat -Wsync-nand -Wjump-misses-init -Wmissing-include-dirs -Wlogical-op -Wduplicated-cond -Wduplicated-branches -Wrestrict -Wnested-externs -Wmissing-noreturn -Wuninitialized -Wno-char-subscripts -Wunreachable-code
Linking    : /usr/bin/ld
Comment 1 Roberto Cornacchia 2020-10-26 11:41:40 CET
A somehow simpler test, by materializing one of the two subqueries into a table. It still uses a cartesian product.

start transaction;

create table i as select * from (VALUES (1),(2),(3)) as x(n);

with t as (select row_number() over() as r, * from sys.functions)
select name from t where r in (select n from i);
Comment 2 Roberto Cornacchia 2020-10-26 11:49:10 CET
Even simpler, no need for row_number():

create table i as select * from (VALUES (1),(2),(3)) as x(n);

t as (select 1 as r, * from sys.functions)
select name from t where r in (select n from i);
Comment 3 Pedro Ferreira 2020-10-26 13:03:35 CET
I'm just finishing something, then I will look into this.
Comment 4 MonetDB Mercurial Repository cwiconfidential 2020-10-28 16:36:58 CET
Changeset ca9824e7f56a, made by Niels Nes <niels@cwi.nl> in the MonetDB repo, refers to this bug.

For complete details, see https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=ca9824e7f56a

Changeset description:

	fixed bug 7001
	(improved checks for when to run a binary join)
Comment 5 Niels Nes cwiconfidential 2020-10-28 16:38:18 CET
The join was indeed incorrectly handled (ie split in cross and selection). Fixed the check for this.