The MonetDB team at CWI/MonetDB BV is pleased to announce the
Oct2014-SP2 bugfix release of the MonetDB suite of programs.
More information about MonetDB can be found on our website at
<http://www.monetdb.org/>.
For details on this release, please see the release notes at
<http://www.monetdb.org/Downloads/ReleaseNotes>.
As usual, the download location is <http://dev.monetdb.org/downloads/>.
Oct2014-SP2 bugfix release
Client Package
* Changes to the Perl interface, thanks to Stefan O'Rear:
  1. removes "use sigtrap", because this has global effects and should
     not be used by modules, only by the application.
  2. allows Perl 5.8.1+ Unicode strings to be passed to quote() and
     included in statements (UTF-8 encoded, as expected by Monet's str
     module)
  3. quote and unquote now use the same quoting rules as the MonetDB
     server, allowing for all characters except NUL to be round-tripped
  4. several character loops have been reimplemented in regex for much
     greater performance
  5. micro-optimizations to the result fetch loop
  6. block boundaries are preserved in piggyback data so that Xclose
     is not appended or prepended to a SQL command
  7. diagnostic messages #foo before a result header are ignored, this
     is necessary to use recycler_pipe
  8. fail quickly and loudly if we receive a continuation prompt (or
     any other response that starts with a non-ASCII character)
  9. header lines must start with %, not merely contain %, fixing a
     bug when querying a table where string values contain %
  10. after closing a large resultset, account for the fact that a
      reply will come and do not lose sync
  11. allow a MAPI_TRACE environment variable to dump wire protocol
      frames to standard output
  12. fixes maximum MAPI block size to match the server limit of 16k.
      previously would crash on blocks larger than 16k
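Items 7 and 9 above both come down to classifying a response line by its first character only. The following is a minimal Python sketch of that idea, not the Perl driver's code; the function names and the "other" category are hypothetical, and only the % and # prefixes are taken from the notes above.

```python
def classify_line(line):
    """Classify a response line by its FIRST character only.

    Hypothetical helper for illustration; only the '%' (header) and
    '#' (diagnostic) prefixes come from the release notes.
    """
    if line.startswith("%"):
        return "header"
    if line.startswith("#"):
        return "diagnostic"  # e.g. "#foo" lines, which are skipped
    return "other"


def contains_percent(line):
    """The old, buggy style of test: '%' anywhere in the line."""
    return "%" in line


# A data row whose string value contains '%' must not count as a header.
row = '[ "100% done" ]'
print(classify_line(row), contains_percent(row))
```

With the prefix test, a data row containing % is no longer mistaken for a header line, which is the failure mode item 9 describes.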
SQL
* Fixed a typo in a column name of the sys.tablestoragemodel view
(auxillary changed to auxiliary).
Bug Fixes
* 3467: Field aliases with '#' character excise field names in result
set.
* 3605: relational query without result
* 3619: Missing dll on MonetDB Start
* 3622: Type resolution error
* 3624: insert of incomplete or invalid ip address values in inet
column is silently accepted but the values are not stored (they
become/show nil)
* 3626: casting a type without alias results in program contains
errors
* 3628: mclient and ODBC driver report 'type mismatch' when
stddev_pop used in a select which returns 0 rows
* 3629: IF THEN ELSEIF always evaluates the first test as true
* 3630: segv on rel_order_by_column_exp
* 3632: running make clean twice gives an error in
clients/ruby/adapter
* 3633: Wrong result for HAVING with floating-point constant
* 3640: Missing implementation of scalar function: sql_sub(<date>,
<month interval>)
* 3641: SQL lexer fails to detect string end if the last character
is U+FEFF ZERO WIDTH NO-BREAK SPACE
* 3642: Combined WHERE conditions less-than plus equals-to produce
incorrect results
* 3643: Missing implementations of scalar function: sql_sub(<timetz>,
arg2)
* 3644: COPY INTO fails to import "inet" data type when value has
prefix length in CIDR notation
* 3646: ORDER BY clause does not produce proper results on 'inet'
datatype
* 3649: recycler crashes with concurrent transactions
Hi Sjoerd,
This is very welcome!
If I read this correctly, this will use, in order of preference:
1. binary search (if l is sorted)
2. imprints if available
3. nested loop otherwise
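
For illustration, option 1 could look roughly like the Python sketch below. It is not the code in gdk_join.c; the function name and the (l position, range position) output format are assumptions made for the example. A value v matches range j when lo[j] <= v <= hi[j].

```python
from bisect import bisect_left, bisect_right


def rangejoin_bsearch(sorted_l, lo, hi):
    """Range join when l is already sorted (hypothetical helper).

    For each range j, two binary searches find the slice of sorted_l
    that falls inside [lo[j], hi[j]]; every position in that slice
    is a match.
    """
    out = []
    for j, (a, b) in enumerate(zip(lo, hi)):
        start = bisect_left(sorted_l, a)   # first position with value >= a
        stop = bisect_right(sorted_l, b)   # first position with value > b
        out.extend((i, j) for i in range(start, stop))
    return out


print(rangejoin_bsearch([1, 3, 5, 9], [0, 4, 8], [4, 6, 8]))
```

Each range costs two binary searches plus the size of its output, instead of a full pass over l.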
We use rangejoin extensively within Spinque, and the previous implementation
(just a nested loop) was never an option.
So far we have been using our own version which is perhaps naive but proved
to be effective:
- sort all inputs if not sorted already
- perform a mergejoin-like fast scan
Though simple, I have not yet found a case where this strategy does not
outperform nested loops by far. The cost of sorting was always far less
than the cost of a full nested loop.
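
One way to picture the sort-then-scan idea is the Python sketch below. It is an illustration under assumptions, not the Spinque implementation: the function names are hypothetical, and a heap-based sweep stands in for whatever the real merge-like scan does. A nested-loop version is included only as a reference for checking results.

```python
import heapq


def rangejoin_nested(l, lo, hi):
    """Reference nested loop: test every (value, range) pair."""
    return sorted((i, j) for i, v in enumerate(l)
                  for j in range(len(lo)) if lo[j] <= v <= hi[j])


def rangejoin_sorted(l, lo, hi):
    """Sort both inputs, then sweep once over them (hypothetical sketch)."""
    l_order = sorted(range(len(l)), key=l.__getitem__)    # l by value
    r_order = sorted(range(len(lo)), key=lo.__getitem__)  # ranges by low bound
    active = []  # min-heap of (high bound, range index) for opened ranges
    out = []
    k = 0
    for i in l_order:
        v = l[i]
        # Open every range whose low bound is <= v.
        while k < len(r_order) and lo[r_order[k]] <= v:
            j = r_order[k]
            heapq.heappush(active, (hi[j], j))
            k += 1
        # Drop ranges that ended before v; later values are >= v,
        # so these can never match again.
        while active and active[0][0] < v:
            heapq.heappop(active)
        # Every remaining range satisfies lo <= v <= hi.
        out.extend((i, j) for _, j in active)
    return sorted(out)


l, lo, hi = [5, 1, 9, 3], [0, 4, 8], [4, 6, 8]
print(rangejoin_sorted(l, lo, hi) == rangejoin_nested(l, lo, hi))
```

Apart from the two sorts and the heap operations, the work in the sweep is proportional to the number of matches produced, whereas the nested loop always tests every pair.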
I was wondering what your thoughts about this would be. Could this strategy
replace the number 3 above? Does it make sense to keep the vanilla nested
loop?
Best, Roberto
On 27 January 2015 at 14:09, Sjoerd Mullender <commits(a)monetdb.org> wrote:
> Changeset: 5147add3bb38 for MonetDB
> URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=5147add3bb38
> Modified Files:
> gdk/ChangeLog.Oct2014
> gdk/gdk_join.c
> gdk/gdk_private.h
> gdk/gdk_select.c
> Branch: Oct2014
> Log Message:
>
> Reimplemented rangejoin, now using imprints or binary search if possible.
>
>
I have been integrating MonetDB into a Perl system for the last few
months, and have made a few changes to the Perl driver to improve
performance and reliability.
I hereby release these changes under the same license as MonetDB
itself. I have permission from my employer to do this.
Is this the correct place to send and/or discuss patches?
In order of patch hunks; more details will be provided for any part if
requested:
1. removes "use sigtrap", because this has global effects and should
not be used by modules, only by the application.
2. allows Perl 5.8.1+ Unicode strings to be passed to quote() and
included in statements (UTF-8 encoded, as expected by Monet's str
module)
3. quote and unquote now use the same quoting rules as the MonetDB
server, allowing for all characters except NUL to be round-tripped
4. several character loops have been reimplemented in regex for much
greater performance
5. micro-optimizations to the result fetch loop
6. block boundaries are preserved in piggyback data so that Xclose is
not appended or prepended to a SQL command
7. diagnostic messages #foo before a result header are ignored, this
is necessary to use recycler_pipe
8. fail quickly and loudly if we receive a continuation prompt (or any
other response that starts with a non-ASCII character)
9. header lines must start with %, not merely contain %, fixing a bug
when querying a table where string values contain %
10. after closing a large resultset, account for the fact that a reply
will come and do not lose sync
11. allow a MAPI_TRACE environment variable to dump wire protocol
frames to standard output
12. fixes maximum MAPI block size to match the server limit of 16k.
previously would crash on blocks larger than 16k
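
To illustrate the round-trip property in item 3, here is a minimal Python sketch. The escape table is an assumption chosen for the example (backslash escapes for backslash, single quote, newline and tab); it is not a statement of the MonetDB server's actual quoting rules, and it is not the Perl driver's code.

```python
# ASSUMED escape rules, for illustration only.
_ESCAPES = {"\\": "\\\\", "'": "\\'", "\n": "\\n", "\t": "\\t"}
_UNESCAPES = {"\\": "\\", "'": "'", "n": "\n", "t": "\t"}


def quote(s):
    """Wrap s in single quotes, escaping special characters."""
    if "\x00" in s:
        raise ValueError("NUL cannot be represented")
    return "'" + "".join(_ESCAPES.get(c, c) for c in s) + "'"


def unquote(q):
    """Invert quote(); rejects input that quote() could not produce."""
    if len(q) < 2 or q[0] != "'" or q[-1] != "'":
        raise ValueError("not a quoted literal")
    body, out, i = q[1:-1], [], 0
    while i < len(body):
        c = body[i]
        if c == "\\":
            i += 1
            if i >= len(body) or body[i] not in _UNESCAPES:
                raise ValueError("bad escape sequence")
            out.append(_UNESCAPES[body[i]])
        else:
            out.append(c)
        i += 1
    return "".join(out)


s = "it's 100% \\ \u00fc\n\t"
print(unquote(quote(s)) == s)
# Item 2: the statement goes over the wire as UTF-8 bytes.
wire = quote(s).encode("utf-8")
```

The point of sharing one escape table between quote and unquote is that any string without NUL survives the round trip unchanged.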
Hi,
I have been studying multi-join performance in MonetDB. I am running
the latest version of MonetDB, Oct2014-SP1 (MonetDB-11.19.3), on a CentOS
machine with 128 GB RAM.
All my tables have 100 to 300 columns and around 35 lakh (3.5 million) rows.
I have the following query with 5 joins and a few criteria.
SELECT * FROM table1
INNER JOIN table2 on table1.col1=table2.col1
LEFT JOIN table3 ON table1.col2=table3.col1
LEFT JOIN table3 ON table1.col3=table3.col1
LEFT JOIN table4 ON table2.col2=table4.col1
LEFT JOIN table5 ON table2.col3=table5.col1
WHERE (((table2.col10 like '%a%' OR table5.col11 like '%s%') and
((table2.col12 like '%w%' *and table4.col13 like '%k'*) OR table2.col14
like '%h%') and (table1.col21 = 1) AND (table1.col22 IN (1,2,3))) AND
(((table1.col23 >= 15842000000000000) AND (table1.col23 <=
15842999999999999)) OR ((table1.col23 >= 0) AND (table1.col23 <=
999999999999)))) ORDER BY 1 DESC LIMIT 10 OFFSET 0;
This query has only 2000 matching rows and returns the result in *2.9
seconds*.
If I remove just one criterion from the same query (the bolded part), the
result still matches only 2000 rows, but it comes back in *300
milliseconds*.
I used the PLAN statement and found that the query plan the executor
chooses for the first query is a much more complex path: *it performs 3 joins
on all 35 lakh rows* (i.e. before even evaluating the criteria), and
hence the time spikes to 3 seconds. Maybe the hot data of 35 lakh
tuples did not fit into memory? I am not sure.
But if I remove the highlighted criterion, the executor evaluates the
criteria first and *performs the joins on only 7000 rows*, and hence it
is roughly 10 times faster.
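
The difference between the two plans can be pictured with a toy Python model. This is an illustration under assumptions, not how MonetDB executes queries: the function names, the dictionary standing in for the joined table, and the probe counter are all hypothetical.

```python
def join_then_filter(left, right_by_key, keep):
    """Probe the join for every left row, then apply the predicate."""
    probes, result = 0, []
    for row in left:
        probes += 1
        match = right_by_key.get(row["key"])
        if match is not None and keep(row):
            result.append((row["key"], match))
    return result, probes


def filter_then_join(left, right_by_key, keep):
    """Apply the predicate first, then probe only the surviving rows."""
    probes, result = 0, []
    for row in left:
        if not keep(row):
            continue
        probes += 1
        match = right_by_key.get(row["key"])
        if match is not None:
            result.append((row["key"], match))
    return result, probes


# Toy data: 1000 left rows, of which 1 in 100 passes the predicate.
left = [{"key": i, "flag": i % 100} for i in range(1000)]
right_by_key = {i: "r%d" % i for i in range(1000)}
keep = lambda row: row["flag"] == 0

slow, slow_probes = join_then_filter(left, right_by_key, keep)
fast, fast_probes = filter_then_join(left, right_by_key, keep)
print(slow == fast, slow_probes, fast_probes)
```

Both orders return the same rows; only the number of join probes differs, which is the effect visible in the two PLAN outputs.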
Is this a bug?
Or is there any way to rewrite my query so that the executor chooses
the right plan?
*I am looking to install MonetDB in our production setup, but I find
this to be a showstopper.*
Any help is much appreciated. I could even demo this.
Thanks & Regards,
Vijayakrishna.P.