Hi all –
I have always used the COPY BINARY INTO … commands to load my 2.0-billion-row genetic data into a MonetDB table. With 135 columns, it has been blindingly fast.
Last week I moved from the Jun2016-SP2 release to Dec2016-SP2. My binary loads are now taking WAY longer. I killed one after 3 hours (via “call sys.stop(pid)” so it could clean up properly). I then started the load again, thinking perhaps the problem was related to the new columns I was adding.
I have since dropped the table and remade it using the same data and scripts that finished in just over 3 minutes in February on Jun2016-SP2. It is really chugging along; I’m up to 30 minutes and counting. I don’t have access to the SQL log files, but the merovingian.log shows nothing.
I do notice that previously the binary files, once loaded, were removed from the loading directory. This does not happen now. Were these files previously “moved”, and are they now copied?
Has anyone seen this performance issue with Dec2016-SP2 COPY BINARY INTO … commands?
Thanks - Lynn
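For readers unfamiliar with the load path discussed above, here is a minimal Python sketch of how one binary column file might be produced. It assumes the documented layout of one file per column holding a raw array of fixed-width values in the server's native byte order. The helper names, the file name "col0.bin" and the table name passed in are hypothetical, and the statement text only reflects my understanding of the COPY BINARY INTO form. It does not reproduce or explain the slowdown reported above.

```python
import os
import tempfile
from array import array


def write_int_column(path, values):
    """Dump values as a raw array of C ints in native byte order."""
    column = array("i", values)
    with open(path, "wb") as handle:
        column.tofile(handle)
    # Number of bytes that should now be on disk.
    return column.itemsize * len(column)


def copy_binary_statement(table, paths):
    """Build the SQL text: one file per column, listed in column order."""
    quoted = ", ".join("'" + p.replace("'", "''") + "'" for p in paths)
    return f"COPY BINARY INTO {table} FROM ({quoted});"


def demo():
    """Write a three-value column and check the file size matches."""
    with tempfile.TemporaryDirectory() as load_dir:
        path = os.path.join(load_dir, "col0.bin")
        written = write_int_column(path, [1, 2, 3])
        return written == os.path.getsize(path)
```

The file paths handed to the statement would need to be absolute paths readable by the server process; that requirement is part of the assumption here, not something the sketch checks.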
Hi, all,
Consider the following scenario. I have a bunch of user-defined functions, f_i, that each create, populate, and use a local temporary table with the same name, "res":
create function f_i()
returns table( ... )
begin
    # Result stats table.
    create local temporary table res(
        param1 int,
        param2 int
    );
    /* Populate res */
    /* Use res in some queries */
    return table(
        /* Query involving res */
    );
end;
My first question is: Does "create local temporary table x" create a table that is disposed/dropped as soon as the function where it was declared finishes execution? I assumed this was the case, but I'm now skeptical as we are running into all sorts of concurrency issues that point to the "res" table not being cleared out properly from memory.
This takes me to the second question: How do I drop a temporary table if such table exists? The reason for this is to prevent "table create" exceptions (which we are currently running into). I'd like to achieve something along these lines:
create function f_i()
returns table( ... )
begin
    # Drop local temporary table if such table exists.
    drop local temporary table res; <-------- How do I do this in MonetDB?
    # Result stats table.
    create local temporary table res(
        param1 int,
        param2 int
    );
    /* Populate res */
    /* Use res in some queries */
    return table(
        /* Query involving res */
    );
end;
As you can see, I want to explicitly drop the local temporary table "res" if it exists. If I leave the "local temporary" keywords in the drop statement, MonetDB complains that it found an unexpected "LOCAL" or "TEMPORARY". I would rather not remove these keywords, because a statement like:
drop table res;
will drop any table in the default function schema with that name.
Any help and hints are much appreciated!
Thanks for the help in advance
~ Luis Angel
Greetings Eyal,
Thank you a lot for your quick response.
>Are you interested in that for purposes of academic research?
Yes. If you could suggest similar work that is open source,
I would very much appreciate it.
Best Regards,
Mahmoud Mohsen
>1. Column-at-a-time.
>2. I'm almost certain the answer is "No", but even if it had been "Yes",
>the changes in MonetDB since then would make it unrealistic to use it
>with today's code without rewriting a lot of it if not all of it. And
>the changes since those days in the closed-source Vectorwise/Actian
>Vector have also been significant.
>Are you interested in that for purposes of academic research?
>Eyal
On 04/14/2017 07:39 PM, Mahmoud Mohsen wrote:
>Greetings,
>
>The current query engine of the MonetDB...what model of processing does
>it follow....(column-at-a-time, tuple-at-a-time, vector-at-a-time)?
>
>one last question....
>
>Has the X100 query engine ever been published as opensource in the older
>releases of MonetDB?
>
>Thanks a lot in advance,
>
>Best Regards,
>
>Mahmoud Mohsen
Greetings,
What processing model does the current MonetDB query engine follow
(column-at-a-time, tuple-at-a-time, or vector-at-a-time)?
One last question:
Has the X100 query engine ever been published as open source in the older
releases of MonetDB?
Thanks a lot in advance,
Best Regards,
Mahmoud Mohsen
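To make the three processing models named in the question concrete, here is a toy Python sketch that computes the same filtered sum in each style. It only illustrates the terms; it is not how MonetDB or X100 are implemented, and the function names and the vector size of 4 are arbitrary choices for illustration.

```python
def tuple_at_a_time(a, b):
    """Each row passes through filter, multiply and sum before the next row."""
    total = 0
    for x, y in zip(a, b):
        if x > 0:
            total += x * y
    return total


def column_at_a_time(a, b):
    """Each operator processes a whole column and materializes its result."""
    mask = [x > 0 for x in a]             # full-length intermediate
    prod = [x * y for x, y in zip(a, b)]  # full-length intermediate
    return sum(p for p, keep in zip(prod, mask) if keep)


def vector_at_a_time(a, b, size=4):
    """Operators process small fixed-size slices, so intermediates stay small."""
    total = 0
    for start in range(0, len(a), size):
        va = a[start:start + size]
        vb = b[start:start + size]
        mask = [x > 0 for x in va]
        prod = [x * y for x, y in zip(va, vb)]
        total += sum(p for p, keep in zip(prod, mask) if keep)
    return total
```

All three return the same answer for the same input; they differ only in how much intermediate data exists at once and how often control moves between operators.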
I'm curious to know whether optimistic concurrency control applies across
remote tables.
For example:
db1 contains tableA
db2 defines remote tableA -> db1
Say that db1 modifies tableA while db2 already has an open transaction on
the same table. Will the transaction on db2 fail?
Roberto
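As background for the question above, this is a toy Python model of optimistic concurrency control on a single table: writers do not lock, and a commit is rejected if the table changed after the transaction started. The class names and the single version counter are assumptions made for illustration. The sketch says nothing about whether MonetDB applies such a check across remote tables, which is exactly what is being asked.

```python
class VersionedTable:
    """A table with a version counter bumped on every committed write."""

    def __init__(self):
        self.version = 0
        self.rows = []


class Transaction:
    """Buffers writes and validates against the table version at commit."""

    def __init__(self, table):
        self.table = table
        self.start_version = table.version
        self.pending = []

    def insert(self, row):
        self.pending.append(row)

    def commit(self):
        # Read-only transactions have nothing to validate in this toy model.
        if not self.pending:
            return True
        # Conflict: someone else committed a write since we started.
        if self.table.version != self.start_version:
            return False
        self.table.rows.extend(self.pending)
        self.table.version += 1
        return True
```

In this model, two transactions that start on the same version and both write cannot both commit: the first one succeeds and the second one is told to abort and retry.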
I'm seeing a strange behaviour with COLcopy(). (Dec2016, optimized,
non-devel compilation)
In short, it seems to take almost half a second to copy a 1-tuple string
(view) bat.
Inspected with gdb, I see that the copy falls in "(3) we can copy the heaps
(memcopy, or even VM page sharing)", with the following values:
cnt = 1
bunstocopy = BUN_NONE
isVIEW(b) = TRUE
VIEWtparent(b) = 0
b->T.heap.size = 1024
b->T.vheap.size = 1094320128
The actual tail and heap copy then takes place:
heapcopy(bn, "tail", &bthp, &b->theap)
heapcopy(bn, "theap", &thp, b->tvheap)
Does this mean that a heap of almost 1GB has been copied for a 1-tuple view?
Roberto
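A quick back-of-the-envelope check of the numbers in the message above, written as a small Python sketch. The 0.5 second figure is an assumption taken from "almost half a second"; the helper names are made up for illustration.

```python
MIB = 1024 * 1024


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


def implied_throughput_mib_s(num_bytes, seconds):
    """Throughput needed to copy num_bytes in the given time."""
    return to_mib(num_bytes) / seconds


tail_bytes = 1024          # b->T.heap.size from the gdb session
vheap_bytes = 1094320128   # b->T.vheap.size from the gdb session
assumed_seconds = 0.5      # assumption: "almost half a second"

vheap_mib = to_mib(vheap_bytes)
throughput = implied_throughput_mib_s(vheap_bytes, assumed_seconds)
```

The string heap comes to about 1044 MiB, slightly over 1 GiB, and copying it in half a second would need roughly 2 GiB/s. That is a plausible rate for a plain memory copy, so the timing is at least consistent with the whole string heap being copied, though it does not prove it.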
Could you give me feedback on whether this is a possible bug?
In the following excerpt:
X_377=<tmp_2516>[9]:bat[:str] :=
mat.packIncrement(X_376=<tmp_2516>[9]:bat[:str],X_335=<tmp_20775>[0]:bat[:str]);"]
...
...
X_100=<tmp_2516>[10]:bat[:str] :=
mat.packIncrement(X_377=<tmp_2516>[10]:bat[:str],X_336=<tmp_24175>[1]:bat[:str]);"]
X_377 has 9 rows.
But when it's used as a parameter for X_100, it is reported to have 10 rows.
In between the two instructions above, X_377 is never used.
Roberto
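One thing visible in the excerpt is that X_376, X_377 and X_100 all print the same BAT name, tmp_2516. The toy Python sketch below shows how that kind of aliasing could produce the reported counts if the append happens in place and the trace line is printed after the instruction has run. Both of those are assumptions, not confirmed behaviour of mat.packIncrement, and the Bat class and helper names are hypothetical.

```python
class Bat:
    """Stand-in for a BAT: a named container of rows."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)

    def __len__(self):
        return len(self.rows)


def pack_increment(accumulator, part):
    """Append part to accumulator in place and return the same object."""
    accumulator.rows.extend(part.rows)
    return accumulator


def describe(variable, bat):
    """Format a variable the way the trace excerpt does."""
    return f"{variable}=<{bat.name}>[{len(bat)}]"


x_376 = Bat("tmp_2516", range(9))
x_335 = Bat("tmp_20775", [])
x_377 = pack_increment(x_376, x_335)
first_line = describe("X_377", x_377)

x_336 = Bat("tmp_24175", ["row"])
x_100 = pack_increment(x_377, x_336)
# Printed after the call, the argument already shows the grown size.
second_line = describe("X_377", x_377)
```

Under these assumptions, 9 rows plus 0 rows gives the first line and 9 rows plus 1 row gives the second, which matches the counts in the excerpt without X_377 being used in between.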