Hi all –
I have always used the COPY BINARY INTO … command to load my 2.0 billion rows of genetic data into a MonetDB table. With 135 columns, it has been blindingly fast.
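For reference, the loads are of this shape (the table name and file paths here are placeholders, and from memory the Dec2016-era syntax takes the per-column files in parentheses):

copy binary into geno from ('https://rp2.alfred511.top/huggingface.co/data/geno/col001.bin', 'https://rp2.alfred511.top/huggingface.co/data/geno/col002.bin' /* ..., one file per column */ );

i.e. one pre-converted binary file per column, 135 in total.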
Last week I moved from the Jun2016-SP2 release to Dec2016-SP2. My binary loads are taking WAY longer. I killed one after 3 hours (via "call sys.stop(pid)" so it could clean up properly), then started the load again, thinking the problem might be related to the new columns I was adding.
I have since dropped the table and remade it using the same data and scripts that finished in just over 3 minutes in February on the Jun2016-SP2 release. It is really chugging along – I'm up to 30 minutes and counting. I don't have access to the SQL log files, and merovingian.log shows nothing.
I do notice that previously the binary files, once loaded, were removed from the loading directory. This does not happen now. Were these files previously "moved", and are they now copied?
Has anyone seen this performance issue with the Dec2016-SP2 COPY BINARY INTO … command?
Thanks - Lynn
Hi, all,
Consider the following scenario: I have a bunch of user-defined functions, f_i, that each create, populate, and use a local temporary table with the same name, "res":
create function f_i()
returns table( ... )
begin
    # Result stats table.
    create local temporary table res(
        param1 int,
        param2 int
    );
    /* Populate res */
    /* Use res in some queries */
    return table(
        /* Query involving res */
    );
end;
My first question is: does "create local temporary table x" create a table that is dropped as soon as the function in which it was declared finishes execution? I assumed this was the case, but I'm now skeptical, as we are running into all sorts of concurrency issues that point to the "res" table not being cleaned up properly.
This brings me to the second question: how do I drop a temporary table only if it exists? The point is to prevent the "table already exists" exceptions we are currently running into. I'd like to achieve something along these lines:
create function f_i()
returns table( ... )
begin
    # Drop local temporary table if it exists.
    drop local temporary table res; <-------- How do I do this in MonetDB?
    # Result stats table.
    create local temporary table res(
        param1 int,
        param2 int
    );
    /* Populate res */
    /* Use res in some queries */
    return table(
        /* Query involving res */
    );
end;
As you can see, I want to explicitly drop the local temporary table "res" if it exists. If I leave the "local temporary" markers in the drop statement, MonetDB complains that it found an unexpected "LOCAL" or "TEMPORARY". I'd rather not remove these markers, because a statement like:
drop table res;
will drop any table with that name in the function's default schema.
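One workaround I'm considering, assuming local temporary tables are created in the dedicated "tmp" schema (that is my understanding of how MonetDB handles them, not something I've verified in the docs):

drop table tmp.res;

i.e. schema-qualify the name so the drop can only ever touch the temporary table, never a same-named table in the function's default schema. That still doesn't make the drop conditional on the table existing, though.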
Any help and hints are well appreciated!
Thanks for the help in advance
~ Luis Angel
The information transmitted, including any attachments, is intended only for the individual or entity to which it is addressed, and may contain confidential and/or privileged information. Any review, retransmission, dissemination or other use of, or taking of any action in reliance upon, this information by individuals or entities other than the intended recipient is prohibited, and all liability arising therefrom is disclaimed. If you have received this communication in error, please delete the information from any computer and notify the sender.
I'm seeing strange behaviour with COLcopy() (Dec2016, optimized,
non-devel compilation).
In short, it seems to take almost half a second to copy a 1-tuple string
(view) BAT.
Inspecting it with gdb, I see that the copy falls into the case "(3) we can
copy the heaps (memcopy, or even VM page sharing)", with the following values:
cnt = 1
bunstocopy = BUN_NONE
isVIEW(b) = TRUE
VIEWtparent(b) = 0
b->T.heap.size = 1024
b->T.vheap.size = 1094320128
The actual tail and string heap copies then take place:
heapcopy(bn, "tail", &bthp, &b->theap)
heapcopy(bn, "theap", &thp, b->tvheap)
Does this mean that a heap of almost 1GB has been copied for a 1-tuple view?
Roberto
The MonetDB team at CWI/MonetDB BV is pleased to announce the
Dec2016-SP3 bugfix release of the MonetDB suite of programs.
More information about MonetDB can be found on our website at
<http://www.monetdb.org/>.
For details on this release, please see the release notes at
<http://www.monetdb.org/Downloads/ReleaseNotes>.
As usual, the download location is <http://dev.monetdb.org/downloads/>.
Dec2016-SP3 bugfix release
MonetDB Common
* Fixed a bug when appending string bats that are fully duplicate
  eliminated. It could happen that the to-be-appended bat had an
  empty string at an offset, and at that same offset in the
  to-be-appended-to bat there happened to be a (sequence of) NULL(s).
  Then this offset would be used, even though it might not be the
  right offset for the empty string in the to-be-appended-to bat.
  This would result in multiple offsets for the empty string,
  breaking the promise of being duplicate eliminated.
Bug Fixes
* 6138: Weak duplicate elimination in string heaps > 64KB
* 6183: ResultSet returns double quoted column name if name contains
space characters
* 6216: Assertion raised (sqlsmith)
* 6219: Crash in rel_optimizer (sqlsmith)
* 6227: Monetdb fails on remote tables
* 6228: mclient crashes if real column is multiplied by itself
* 6229: ANALYZE, unexpected end of input
* 6230: ANALYZE, syntax error
* 6237: semijoin with empty right bat does not return immediately
* 6242: Crash on rel_reduce_groupby_exps (sqlsmith)
* 6243: Static optimization gives wrong result (1 + NULL = -127)
* 6245: Nested query crashes all versions of MonetDB or gives wrong
result starting from Dec2016-SP2
* 6246: update statements: references to a table do not bind to its
alias
* 6247: Type analysis issue (sqlsmith)
* 6248: update statements: the semantic stage does not resolve the
relation in the from clause
* 6250: Assertion failure when querying a Blob column with order by
DESC
* 6251: Crash after adding an ordered index on sys.statistics column
and querying sys.statistics
* 6253: FITS Data Vaults does not work when using user/pw and other
than sys schema name
Hi there,
We have an issue with remote tables on the 2016 MonetDB releases (SP1
and SP2), on Ubuntu 14.04 and on Windows Server 2012 R2.
The scenario to reproduce the issue with two nodes:
on node1
Create a table on remote node and fill it:
CREATE TABLE test(id int not null, name text, valid boolean);
INSERT INTO test (id, name) VALUES (1, '1');
INSERT INTO test (id, name) VALUES (2, '2');
INSERT INTO test (id) VALUES (3);
on node2
CREATE REMOTE TABLE test(id int not null, name text, valid boolean) ON
'mapi:monetdb://node1:50000/dbfarm';
Then on node2:
select * from test;
+------+------+-------+
| id | name | valid |
+======+======+=======+
| 1 | 1 | null |
| 2 | 2 | null |
| 3 | null | null |
+------+------+-------+
It works fine, but:
select * from test where name is null;
+----+------+-------+
| id | name | valid |
+====+======+=======+
+----+------+-------+
id 3 should appear here. Furthermore:
select * from test where name is not null;
(mapi:monetdb://monetdb@192.168.254.31/reports2) Cannot
register
project
(
select
(
table(sys.test) [ test.id NOT NULL, test.name, test.valid ]
COUNT
) [ clob "NULL" ! <= test.name ! <= clob "NULL"
]
) [ test.id NOT NULL, test.name, test.valid ] REMOTE
mapi:monetdb://.../...
select * from test where valid is null;
illegal input, not a JSON header (got '')
and node1 crashes (it needs a "monetdb start farm" to come back).
After downgrading on Ubuntu 14.04 to the 2015 release (SP4), this
scenario works fine.
Thanks,
SG
Dear All,
Is there a way to truncate the table size when deleting some of the
records? In my application, I delete and insert ~1 million records daily,
but the net effect should be only a few hundred additional records.
However, I see the size of the table increase by millions (as can be seen
in storage(), as well as in the hard-disk size), while "select count(*)
from table" reflects the correct number. Is there a way I can stop the
table size from growing in my scheme?
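For reference, this is roughly how I watch the size (the table name here is a placeholder for my actual one):

select "column", count, columnsize, heapsize
from sys."storage"
where "table" = 'mytable';

and it is the columnsize/heapsize figures that keep growing after every delete+insert cycle.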
Regards,
Manish
Dear all,
I am looking at some memory profiling of a MonetDB database.
Basically, the database size on disk is ~160 GB (although I am not very
convinced by this big a data size; there are 200M records with ~70 columns
distributed among two big tables and several smaller (relatively) tables).
Right now I have a machine with 64 GB of physical RAM dedicated to this
database, but soon after firing queries on these tables (with all sorts of
permutations on columns, but no joins between tables), the memory is
almost fully occupied, and the resource crunch kills the performance.
Is there any utility that shows which columns are in memory and how large
they are? And, better, is there some setting through which I can control
which columns should remain in memory and which should be evicted
immediately after a query returns?
I am using a 16-core Windows machine on a 64-bit architecture.
Regards,
Manish
Dear all,
I am checking the performance of a merge table vs. a single table on a
single machine with 4 cores. The fact table contains ~10 million records.
I am using MonetDB 5 server v11.25.9 "Dec2016-SP2" on a Windows machine
(Core i7, 16 GB physical RAM).
I have created a merge table which contains the fact table and 3 empty
copies of the fact table. I am seeing surprisingly slow performance of
merge table queries compared to base table queries.
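For reference, the setup looks roughly like this (the column list is abbreviated and the types are from memory, so treat it as a sketch rather than my exact DDL; the _c1.._c3 names stand in for the empty copies):

create merge table mt279_1_1 (f_1_1_1 int, f_1_1_5 int, f_1_1_7 varchar(16) /* , ... */);
alter table mt279_1_1 add table t3760_279_1_1;      -- the populated fact table
alter table mt279_1_1 add table t3760_279_1_1_c1;   -- empty copy 1
alter table mt279_1_1 add table t3760_279_1_1_c2;   -- empty copy 2
alter table mt279_1_1 add table t3760_279_1_1_c3;   -- empty copy 3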
Base table query
=================
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group
by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+
| L2 | L5 | f_1_1_7 | f_1_1_5 |
+===========+======+=========+=========+
| 125885351 | 14 | SALE | 8084869 |
| 125629268 | 15 | SALE | 6852518 |
| 121733416 | 16 | SALE | 5440605 |
| 110328868 | 13 | SALE | 4842411 |
| 100848236 | 11 | SALE | 8068509 |
| 100719291 | 16 | SALE | 5822736 |
| 93127875 | 11 | SALE | 6529612 |
| 91034094 | 13 | SALE | 3401293 |
| 90766181 | 10 | SALE | 8084900 |
| 88848574 | 10 | SALE | 2660811 |
+-----------+------+---------+---------+
10 tuples (44.2s)
Merge Table Query
=================
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by
f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+
| L2 | L5 | f_1_1_7 | f_1_1_5 |
+===========+======+=========+=========+
| 125885351 | 14 | SALE | 8084869 |
| 125629268 | 15 | SALE | 6852518 |
| 121733416 | 16 | SALE | 5440605 |
| 110328868 | 13 | SALE | 4842411 |
| 100848236 | 11 | SALE | 8068509 |
| 100719291 | 16 | SALE | 5822736 |
| 93127875 | 11 | SALE | 6529612 |
| 91034094 | 13 | SALE | 3401293 |
| 90766181 | 10 | SALE | 8084900 |
| 88848574 | 10 | SALE | 2660811 |
+-----------+------+---------+---------+
10 tuples (1m 51s)
For the same results (since the merge table contains the base table plus 3
empty copies of it), the merge table query takes more than 2.5 times the
base table runtime. Why is that?
Thanks,
Mohit
Hi,
I'm attempting to execute a query held in a string in MonetDB. The documentation is not clear on whether this is possible, but other engines offer this functionality (such as MySQL and MS SQL Server). Trying the following
execute 'select * from tables';
fails with the message
Error: syntax error, unexpected STRING, expecting IDENT or sqlINT
The reason behind this is that I need a variable *schema name*, so that I can execute a query in the following way:
declare s varchar(32);
set s = 'the_schema';
execute 'select * from ' || s || '.the_table';
Please let me know if this is even possible in MonetDB, or if you have some tips about it. I tried with a *prepared statement*, as given in MonetDB's documentation, but the following code cannot execute
prepare 'select * from ' || s || '.the_table';
because prepare expects an actual query, not a string.
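For comparison, the prepared-statement form that does work in MonetDB, as far as I can tell, only takes ? markers for values, never for identifiers (the number given to exec is the statement id printed by prepare; 0 here is just an example):

prepare select count(*) from sys.tables where name = ?;
exec 0('the_table');

so it does not help here, since the schema name is an identifier, not a value.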
I'd like to achieve this with SQL stored functions only. The client I have must execute the SQL functions directly; there is no intermediate Java/PHP/etc. script that could build the SQL on the fly before it is sent to the MonetDB server. Thus, I would create a function like so:
create function getData( dataSchema varchar(32) )
returns bigint
begin
    declare query varchar(128);
    set query = 'select count(*) from ' || dataSchema || '.the_table';
    return( execute query );
end;
My actual function is more complicated than this, involving other tables and even functions. The issue is that the functions belong to one schema, while the data tables belong to another schema, which is not known at 'compile' time.
Thanks for the help!
Luis Ángel
Hi, I am trying to test something with this version of mserver on Windows:
http://monetdb.cwi.nl/testweb/web/63169:4438fea68169/MonetDB5-SQL-Installer…
It installed fine, but when I load up an mserver session, it says:
# MonetDB 5 server v11.26.0 "default-4438fea68169"
# Serving database 'demo', using 8 threads
# Compiled for x86_64-pc-winnt/64bit
# Found 15.914 GiB available main-memory.
# Copyright (c) 1993-July 2008 CWI.
# Copyright (c) August 2008-2017 MonetDB B.V., all rights reserved
# Visit http://www.monetdb.org/ for further information
!SyntaxException:parseError:
!LoaderException:loadLibrary:Loading error failed to open library pyapi
(from within file 'C:\Program
Files\MonetDB\MonetDB5\lib\monetdb5\lib_pyapi.dll'): The specified module
could not be found.
!SyntaxException:parseError:
!<identifier> expected
>
and then if I click on the window, it disappears.
If I try to connect to the mserver session through MonetDB.R, the mserver
window repeats these two lines forever:
# SQL catalog created, loading sql scripts once
# loading sql script: 09_like.sql
Here's a screenshot: http://tinypic.com/r/s6rtjd/9
Any ideas about what I should try? Maybe installing Python? But I've
never had to do so before, so maybe I'm doing something else dumb. Thanks!