Hi,

I have been studying on multiple join performances in MonetDB. I am running the latest version of MonetDB Oct2014 - SP1 (MonetDB-11.19.3) on a CentOS machine with 128 GB RAM.  

All my tables have 100 to 300 columns and around 35 lakh rows. 

I have the following query with 5 joins and a few criteria. 

SELECT * FROM table1 
INNER JOIN table2 on table1.col1=table2.col1 
LEFT JOIN table3 ON table1.col2=table3.col1 
LEFT JOIN table3 ON table1.col3=table3.col1 
LEFT JOIN table4 ON table2.col2=table4.col1 
LEFT JOIN table5 ON table2.col3=table5.col1  
WHERE  (((table2.col10 like '%a%' OR table5.col11 like '%s%') and ((table2.col12 like '%w%' and table4.col13 like '%k') OR table2.col14 like '%h%') and (table1.col21 = 1) AND (table1.col22 IN (1,2,3))) AND (((table1.col23 >= 15842000000000000) AND (table1.col23 <= 15842999999999999)) OR ((table1.col23 >= 0) AND (table1.col23 <= 999999999999)))) ORDER BY 1 DESC LIMIT 10 OFFSET 0;

This query has only 2000 matching rows and it returns the result in 2.9 seconds

In the same query, if I remove one criteria alone (the bolded part), the result would still match only 2000 rows, but the result would come in 300 milliseconds even. 

I used the PLAN statement and found that the query plan that the executor takes for the first query is a much complex path and it performs 3 joins for all the 35 lakh rows (i.e. before executing the criteria even) and hence the time has spiked to 3 seconds. May be the hot data of 35 lakh tuples would have not fit into the memory?? Not sure. 

But if I remove the highlighted criteria, the executor evaluates the criteria first and it performs the joins for only 7000 rows and hence it is faster 1000 times. 

Is this a bug? 
Or is there any way to optimise my query to match the executor to choose the right plan? 

I am looking on to install MonetDB in our production setup, but finding this as a showstopper. 

Any help much appreciated. I could even demo this. 


Thanks & Regards,

Vijayakrishna.P.
Mobile : (+91) 9500402305.