Hi,
As per MonetDB Solution suggestion, we are asking here about this topic.
We are studying how MonetDB store its database on disk.
So far we were able to understand pretty much all the files (BAT,
journal, BBP.dir, heap, hash, imprints, and so on). We have custom tools
to decode and dump them (experimental at this point).
There is one thing however that is unclear: we thought that "delta BAT",
which are the BAT assigned to an SQL tables to list the OIDs of rows
that were deleted (named "D_<schema>_<table>" internaly), would contains
only *uniques* values (since you can never delete twice the same row).
But on several databases that we are running, we found that some of
theses BATs contain duplicates. Lot of duplicates actually. Especially
on D_sys__columns and D_sys__tables BATs (respectively for the
sys._columns and sys._tables system tables). Some databases do not have
this "issue" (while they all have the same schema, and process the same
kind of data as the other ones).
Can someone explain if duplicates are expected in theses BATs ?
Here is an excerpt of D_sys__tables:
# hexdump -C 02/210.tail
00000000 2e 00 00 00 00 00 00 00 2f 00 00 00 00 00 00 00 |......../.......|
00000010 30 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 |0.......1.......|
00000020 32 00 00 00 00 00 00 00 33 00 00 00 00 00 00 00 |2.......3.......|
00000030 34 00 00 00 00 00 00 00 35 00 00 00 00 00 00 00 |4.......5.......|
00000040 36 00 00 00 00 00 00 00 37 00 00 00 00 00 00 00 |6.......7.......|
00000050 42 00 00 00 00 00 00 00 43 00 00 00 00 00 00 00 |B.......C.......|
00000060 44 00 00 00 00 00 00 00 45 00 00 00 00 00 00 00 |D.......E.......|
00000070 46 00 00 00 00 00 00 00 47 00 00 00 00 00 00 00 |F.......G.......|
00000080 48 00 00 00 00 00 00 00 49 00 00 00 00 00 00 00 |H.......I.......|
00000090 4a 00 00 00 00 00 00 00 4b 00 00 00 00 00 00 00 |J.......K.......|
000000a0 4c 00 00 00 00 00 00 00 4d 00 00 00 00 00 00 00 |L.......M.......|
000000b0 4e 00 00 00 00 00 00 00 4f 00 00 00 00 00 00 00 |N.......O.......|
000000c0 50 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |P.......P.......|
000000d0 51 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |Q.......P.......|
000000e0 51 00 00 00 00 00 00 00 52 00 00 00 00 00 00 00 |Q.......R.......|
000000f0 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000100 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
00000110 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000120 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
...
You can clearly see duplicates OID (the first one being 0x50 at 0xc8).
Its entry in BBP.dir (split into sections):
136 32 tmp_210 tmpr_210 02/210 610523782 2 171807 0 0 171807 172032 0 0 0 0
void 0 1 1793 0 0 0 0 0 1000651 0 0 0
oid 8 0 1024 24 25 27 1 46 773603235 1374456 1376256 1
We are using: MonetDB 5 server v11.21.14 (64-bit, 64-bit oids, 128-bit integers).
--
Frédéric Jolliton
Sécuractive
Hi,
Last year, I was looking for the MAL definition of group_concat and
Nik Schuiling gave me the MAL definition which is below.
We compile MonetDB from source and this definition was working on
MonetDB-11.19.9 (Oct2014-SP2). Since this version, there has been a
number of newer versions of MonetDB. So, I was trying to upgrade our
system to the latest version of MonetDB (11.21.13). After the
installation of the newer version, I ran:
select "DataSetId", group_concat("FieldName") from datasetmarkermeta
group by "DataSetId";
from mclient interface and got the following error message:
TypeException:kddart.subgroup_concat[5]:'algebra.uselect' undefined
in: TIDs:any := algebra.uselect(g:bat[:oid,:oid],grpid:oid);
MALException:kddart.subgroup_concat[0]:Error in cloned function
TypeException:user.s2_1[15]:'kddart.subgroup_concat' undefined in:
X_27:bat[:oid,:str] :=
kddart.subgroup_concat(X_26:bat[:oid,:str],X_16:bat[:oid,:oid],r1_16:bat[:oid,:oid],true:bit);
program contains errors
It says algebra.uselect function is not defined. How do I fix this problem?
group_concat MAL definition start
-------------------------------------------
module kddart;
command group_concat(b:bat[:oid,:str]):str
address KDDARTgroup_concat
comment "Mal kddart";
function group_concat(b:bat[:oid, :str]):str;
value := "";
barrier (h, t) := iterator.new(b);
value := value + t;
value := value + "";
redo (h, t) := iterator.next(b);
exit (h, t);
return value;
end group_concat;
function subgroup_concat(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_nils:bit)
:bat[:oid,:str];
nw := aggr.count(e);
nl := calc.lng(nw);
bn := bat.new(:oid, :str, nl);
# check 'e' has some values - this is the list of group IDs in the head
# 'g' is the group to the data BAT head ID mapping
# 'b' is the data column BAT that we are aggregating over
barrier (grpid, t) := iterator.new(e);
# select GID from TID->GID map to get matching TIDs
TIDs := algebra.uselect(g,grpid);
# get DATA for matching TIDs
b_data := algebra.kintersect(b,TIDs);
# aggregate
grpval := group_concat(b_data);
# Store the result for this group
bat.insert(bn, grpid, grpval);
redo (grpid, t) := iterator.next(e);
exit (grpid, t);
return bn;
end subgroup_concat;
How to find all running processes and their statuses in MonetDB??
I found this...
https://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/monitor
But I couldn't get this work... I ran a query (which took some 20 seconds
to complete) in a window and connected the same database in another
terminal and ran "select * from sys.queue" (several times during those 20
seconds) but I could get only this...
sql>select * from sys.queue;
+-------+---------+----------------------------+----------------------------+----------+---------+------------+--------------------------+
| qtag | user | started | estimate
| progress | status | tag | query |
+=======+=========+============================+============================+==========+=========+============+==========================+
| 27339 | monetdb | 2015-08-13 05:35:40.000000 | null
| null | running | 12532134@0 | select * from sys.queue; |
+-------+---------+----------------------------+----------------------------+----------+---------+------------+--------------------------+
I am using Oct-2014 version of monetdb on CentOS...
Is there a way to find all running processes and their statuses in
MonetDB??
Thanks & Regards,
Vijayakrishna.P.
Mobile : (+91) 9500402305.
Hello List,
I'm running monetdb on a Debian server and the software is managed via
Debian packages. A few days ago I installed an update for the
monetdb5-server package. After that I noticed a problem with queries
with the function lower() in the WHERE clause:
>>WHERE LOWER(“fieldname”) LIKE '%lower_string%'<<
without any change to the data, the requests became much, much slower
then before (from some hundred microseconds to some minutes). Not only
in my application but also in the monetdb client...
I could solve the problem via a switch from using LIKE to ILIKE and no
lower()-function, but I'm curious what could be the cause for this
dramatic performance breakdown? I searched the release history but I
found nothing...
Has anybody experienced the same effect? Anny ideas on this phenomenon?
Greetings from Germany,
Andreas (theafh)
Hi,
I am looking for an Epoch (bigint) to date/timestamp converter in MonetDB -
something similar to unix_timestamp() in MySQL.
I referred this page -
https://www.monetdb.org/Documentation/SQLreference/Temporal and tried out a
few functions, but it seems to work only with the existing Timestamp
columns.
Am I missing something here?
I am using MonetDB Jan2014 release on a Mac OS-X.
Any help much appreciated.
Thanks & Regards,
Vijayakrishna.P.
Mobile : (+91) 9500402305.
Hi all,
I am having a strange problem regarding prepared queries with Umlaute (e.g., ä). It seems that the rule for type casting strings correspond to the length of the string. The problem is
that an Umlaut seems to count for two "normal" characters. Here is an example. Consider the following simple table:
create table A (a varchar(4));
Now if we create a prepared statement (ultimately I want to do this from Java) such as
prepare select * from A where a = ?;
we can execute the statement and supply 4 characters as the argument:
exec 3('1234');
This works fine. However, if I am using an Umlaut within the parameter, then things don't work out. For example
exec 4('ä123');
fails with the following error message
EXEC: wrong type for argument 1 of prepared statement: char, expected varchar
However, if I now shorten the parameter by "1" character, it works again. That is,
exec 5('ä12');
is ok.
Can anybody explain this behavior? Is this a bug? And is there a workaround? The only thing I came up with so far is to double the size of every column.
Best regards,
Arno
Hi,
Do you think that many ( very many ) inserts / updates on this table ,
inserts that were aborted by network error could corrupt data and cause
it? Remember , table have 100M rows. I have table with 30M rows, and run
well work. Att,
--
Luciano Sasso Vieira
Data Scientist & Solutions Architect
luciano(a)gsgroup.com.br <http://www.gsgroup.com.br> | tel: 17 3353-0833
| cel: 17 99706-9335
www.gsgroup.com.br <http://www.gsgroup.com.br>
---
Este email foi escaneado pelo Avast antivírus.
https://www.avast.com/antivirus
Hi all,
Is it possible to create a file system level backup of a running MonetDB
database? What are the downsides of doing this?
For what it's worth, our databases are generally never updated outside of
pre-fixed windows so only SELECT queries will be executed during any file
system level backup.
Best regards,
Dennis Pallett
they do work in the latest release but they crash monetdblite, making me
wonder about the root culprit.. thanks all
create table my_table (my_id clob, my_double double);
SELECT COUNT( DISTINCT my_id ) AS unique_enrollees , QUANTILE( my_double ,
0.25 ) AS some_quantile FROM my_table;