Hi all,
we're testing MonetDB's performance to understand whether it could serve as the data mart in our BI software.
We're considering a set of queries executed in our test environment (CentOS 7, 32 GB RAM, 4 CPUs).
All the queries perform quite well (between 10 and 45 seconds) except the following one, which takes several minutes to execute:
SELECT id, count(distinct value)
FROM table
GROUP BY id;
The table contains 100 million rows. It has about 130 columns (varchar, int, bigint, smallint, tinyint, timestamp, datetime). We have no BLOB/CLOB/TEXT types.
The fields used in the query have these data types:
id: int
value: bigint
Can you suggest any optimizations to make this query run faster? Do you know any possible reasons why it is so slow?
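One rewrite that is often worth benchmarking for a slow COUNT(DISTINCT ...) aggregate is to deduplicate the (id, value) pairs first and then count plain rows per group. The sketch below uses SQLite purely to illustrate that the two forms are equivalent; whether the rewrite is actually faster in MonetDB is something to measure, not a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, value BIGINT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 10), (1, 10), (1, 20), (2, 30)])

# original form: per-group distinct counting
direct = conn.execute(
    "SELECT id, COUNT(DISTINCT value) FROM t GROUP BY id ORDER BY id"
).fetchall()

# rewrite: deduplicate (id, value) pairs first, then a cheap COUNT(*)
rewritten = conn.execute(
    "SELECT id, COUNT(*) FROM (SELECT DISTINCT id, value FROM t) "
    "GROUP BY id ORDER BY id"
).fetchall()

print(direct)     # [(1, 2), (2, 1)]
print(rewritten)  # [(1, 2), (2, 1)]
```

Depending on the engine and plan, the subquery form lets the grouped count run without per-group distinct bookkeeping, which is sometimes the bottleneck.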
Kind Regards,
Alfio Bettini
The information contained in this message, together with any attachments, may be privileged and confidential and is intended for the use of the addressee(s) only. Any review, copying, use or distribution of this email by others (including any of its attachments) is strictly prohibited. If you are not the intended recipient, please destroy this email and its attachments and contact the sender immediately. This email was sent by EidosMedia SpA<http://www.eidosmedia.com>
Hi all –
I have always used the COPY BINARY INTO … commands to load my 2.0 billion-row genetic data into a MonetDB table. With 135 columns, it has been blindingly fast.
Last week I moved from the Jun2016-SP2 release to Dec2016-SP2. My binary loads are taking WAY longer. I killed one after 3 hours (via "call sys.stop(pid)" so it could clean up properly). I then started the load again, thinking perhaps the problem was related to the new columns I was adding.
I have since dropped the table and remade it using the same data and scripts that ran in just over 3 minutes in February on the Jun2016-SP2 release. It is really chugging along – I'm up to 30 minutes and counting. I don't have access to the SQL log files, and Merovingian.log shows nothing.
I do notice that previously the binary files, once loaded, were removed from the loading directory. This does not happen now. Were these files previously "moved", and are they now copied?
Has anyone seen this performance issue with the Dec2016-SP2 COPY BINARY INTO … commands?
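For readers unfamiliar with the feature: a fixed-width column file for COPY BINARY INTO is just a raw C-style array of values, one file per column. A minimal sketch of producing such a file (the file name and the 32-bit INT column are illustrative assumptions, and it presumes a little-endian server):

```python
import os
import struct
import tempfile

# Write 1000 values of a 32-bit INT column as a raw little-endian
# array -- the per-column file layout COPY BINARY INTO consumes.
values = list(range(1000))
path = os.path.join(tempfile.mkdtemp(), "col_id.bin")
with open(path, "wb") as f:
    f.write(struct.pack("<%di" % len(values), *values))

print(os.path.getsize(path))  # 4000 bytes: 1000 values * 4 bytes each

# Server side (illustrative):
#   COPY BINARY INTO t FROM ('/path/to/col_id.bin');
```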
Thanks - Lynn
Hi all,
I am generating MAL queries in Java and I am using the JDBC interface to
execute them. This works fine for small tables. But if the result set has
more than 100 entries, MonetDB prints the first 100 rows and then
gives me the following error:
...
[ 19, "undergraduateDegreeFrom", 1509 ]
[ 19, "memberOf", 613 ]
[ 19, "takesCourse", 556 ]
[ 19, "advisor", 600 ]
[ 20, "undergraduateDegreeFrom", 1493 ]
[ 20, "memberOf", 613 ]
[ 20, "takesCourse", 529 ]
[ 20, "takesCourse", 578 ]
[ 20, "takesCourse", 557 ]
[ 20, "advisor", 1315 ]
MAPI = (monetdb) /tmp/.s.monetdb.50000
QUERY = sql.resultSet(table_names, table_attrs, table_type, table_lens,
table_scales, start_id, type, end_id);
ERROR = !SyntaxException:parseError:Xclose 0
!SyntaxException:parseError: ^';' expected
Could you help me to avoid this error?
This is the full query:
sql.init();
tbd := sql.mvc();
start_id:bat[:int] := sql.bind(tbd,"sys","relations","start_id",0:int);
type:bat[:str] := sql.bind(tbd,"sys","relations","type",0:int);
end_id:bat[:int] := sql.bind(tbd,"sys","relations","end_id",0:int);
table_names := bat.new(nil:str);
table_names := bat.append(table_names, "start_id");
table_names := bat.append(table_names, "type");
table_names := bat.append(table_names, "end_id");
table_attrs := bat.new(nil:str);
table_attrs := bat.append(table_attrs, "str");
table_attrs := bat.append(table_attrs, "str");
table_attrs := bat.append(table_attrs, "str");
table_type := bat.new(nil:str);
table_type := bat.append(table_type, "int");
table_type := bat.append(table_type, "varchar");
table_type := bat.append(table_type, "int");
table_lens := bat.new(nil:int);
table_lens := bat.append(table_lens, 1);
table_lens := bat.append(table_lens, 1);
table_lens := bat.append(table_lens, 1);
table_scales := bat.new(nil:int);
table_scales := bat.append(table_scales, 0);
table_scales := bat.append(table_scales, 0);
table_scales := bat.append(table_scales, 0);
sql.resultSet(table_names, table_attrs, table_type, table_lens,
table_scales, start_id, type, end_id);
I am using "MonetDB Database Server v1.7 (Dec2016-SP5)" on Ubuntu 14.04.
What I am trying to do:
The goal of the project is to use Cypher as an input language for
MonetDB. I have a Cypher optimizer written in Java. The output of this
optimizer should be used as input for MonetDB.
In the end we want to compare the runtime of a query in MonetDB and
in Neo4J.
Thanks and regards,
Dennis
All,
I would like to know what the MonetDB result-header data type is when no alias or column name is provided. In my BI application, I am getting an error while using SELECT SUM(DISTINCT COLUMN_NAME) FROM TABLE_NAME. I see it works in the 11.27.9 build when installed on CentOS. However, when I download the source tarball and recompile the code, this fix seems to be lost. Is there anywhere this can be identified, or fixed?
Regards,
Sreejith
Hi,
I am using MonetDB v11.23.7 (Jun2016-SP1). When I reboot my Ubuntu
machine, the date columns in all my tables become NULL.
Is there any known issue related to this version?
Warm Regards,
Amit Suryakant Ambekar
Technology Associate
Office: 1st floor, Poddar Chambers, Mathruradas Mill Compound,
Lower Parel, Mumbai 400013 | Landline: +91 22 43470408
Cell: +91 7741823310 | Email: amit(a)g-square.in | Website: www.g-square.in
All,
I have a valid scenario where our application generates a SQL statement of around 4.4 MB containing a combination of about 75K AND and OR operators. This SQL fails with a "SELECT: too many nested operators" error when executed.
While going through the code, I noticed that anything above THREAD_STACK_SIZE - 16 * 1024 gets aborted. Is there any possibility to change this? Is there a reason why it is set to 16 MB? Our server has enough memory and storage, and we need to understand whether this can be handled in code or via a parameter, as this is a valid scenario. Also, we cannot split the SQL.
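Not an answer to the stack-size question itself, but when many of the 75K predicates are equality tests on the same column, collapsing OR chains into IN lists shrinks both the SQL text and the parser's nesting depth, since IN parses as one operator rather than thousands of nested ORs. A small sketch (the column name `id` is hypothetical):

```python
ids = range(5)

# chained-OR form: each OR adds one level of nesting in the parse tree
or_form = " OR ".join("id = %d" % i for i in ids)

# IN form: a single operator regardless of how many values are listed
in_form = "id IN (%s)" % ", ".join(str(i) for i in ids)

print(or_form)  # id = 0 OR id = 1 OR id = 2 OR id = 3 OR id = 4
print(in_form)  # id IN (0, 1, 2, 3, 4)
```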
Regards,
Sreejith
All,
Have a quick question-
I have been testing bigger SQL statements, and statements above 4 MB are failing with the error "too many nested operators". There are many AND and OR operators. I am not sure whether there is a limitation.
Regards,
Sreejith
The MonetDB team at CWI/MonetDB BV is pleased to announce the
Jul2017-SP3 bugfix release of the MonetDB suite of programs.
More information about MonetDB can be found on our website at
<http://www.monetdb.org/>.
For details on this release, please see the release notes at
<http://www.monetdb.org/Downloads/ReleaseNotes>.
As usual, the download location is <http://dev.monetdb.org/downloads/>.
Jul2017-SP3 bugfix release (11.27.11)
MonetDB Common
* Reimplemented summing of a column of floating point (flt and dbl)
values. The old code could give wildly inaccurate results when
adding up lots and lots of values due to lack of precision. Try
SELECT sum(c) FROM t; where t is 100,000,000 rows, c is of type
REAL and all values are equal to 1.1. (The old code returned
33554432 instead of 1.1e8.)
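The saturation described above is ordinary single-precision behaviour: once the running sum reaches 2^24, increments smaller than the representable spacing round away to nothing. A sketch that emulates 32-bit accumulation and contrasts it with compensated (Kahan) summation, which recovers the lost bits (the struct-based emulation is for illustration only, not MonetDB's actual implementation):

```python
import struct

def f32(x):
    """Round a Python double to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]

def naive_sum(start, values):
    s = f32(start)
    for v in values:
        s = f32(s + f32(v))  # every partial sum is rounded to 32 bits
    return s

def kahan_sum(start, values):
    s, c = f32(start), 0.0  # c carries the low-order error term
    for v in values:
        y = f32(f32(v) - c)
        t = f32(s + y)
        c = f32(f32(t - s) - y)  # what was lost in s + y
        s = t
    return s

ones = [1.0] * 1000
print(naive_sum(2.0**24, ones))  # 16777216.0 -- the 1000 ones vanish
print(kahan_sum(2.0**24, ones))  # 16778216.0 -- correct
```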
Bug Fixes
* 3898: Deadlock on insertion
* 6429: ROUND produces wrong data type
* 6436: Query sequence with 2x ifthenelse() and next nullif() causes
mserver5 Segmentation fault
* 6437: System schemas "profiler" and "json" shouldn't be allowed to
be dropped.
* 6439: Invalid references to sys.columns.id from
sys.statistics.column_id
* 6442: SEGFAULT with COPY INTO BEST EFFORT and skipping input
columns
* 6443: complex(?) query forgets(?) column name
* 6444: Using 'with' keyword with table returning function crashes
monetdb
* 6445: Sqlitelogictest crash in MySQL query
* 6446: sql_parser.y bug?
* 6448: 'insert into' with multiple rows containing subqueries
crashes
* 6449: Assertion error in rel_dce_refs (sqlsmith)
* 6450: Assertion error in exp_bin (sqlsmith)
* 6451: Assertion error in sql_ref_dec (sqlsmith)
* 6453: Assertion error in rel_rename_exps (sqlsmith)
* 6454: SQL lexical error
* 6455: Assertion error in rel_apply_rewrite (sqlsmith)
* 6456: NULL becomes 0 in outer join
* 6459: Assertion error in exp_bin (sqlsmith)
* 6462: large virtual memory spike on BLOB column select
* 6465: appending to variables sized atom bats other than str bats
with force flag may result in corrupted heap
* 6467: date_to_str formatter is wrong
* 6470: mitosis gets in the way of simple select
* 6471: calls to sys.generate_series should auto-convert arguments
* 6472: Assertion failure in rel_rename (Sqlsmith)
* 6477: assertion error rel_push_project_up (sqlsmith)
* 6478: Crash with nested order by/ limit offset
* 6479: Mserver receives an assertion error on a procedure call
* 6480: Segfault in mvc_find_subexp (sqlsmith)