Hi, MonetDB does not support SUM() OVER window functions. Does anyone have a
smart alternative to implement this? Thanks
SELECT
    batch_id,
    job_count,
    SUM(job_count) OVER (ORDER BY duration) AS cumjobs,
    SUM(job_count) OVER () AS totjobs,
    duration
FROM
    test_data;
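One classic rewrite avoids window functions entirely: compute the running total with a correlated subquery (or self-join) over the ORDER BY column, and the grand total with a scalar subquery. A minimal sketch follows, run through SQLite here only because it is easy to execute; whether MonetDB accepts exactly this correlated form is an assumption, and the sample data is invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test_data (batch_id INT, job_count INT, duration INT)")
con.executemany("INSERT INTO test_data VALUES (?, ?, ?)",
                [(1, 10, 5), (2, 20, 3), (3, 30, 8)])

# Running total via a correlated subquery over 'duration', grand total
# via a scalar subquery -- no window functions needed.
rows = con.execute("""
    SELECT t.batch_id,
           t.job_count,
           (SELECT SUM(u.job_count) FROM test_data u
             WHERE u.duration <= t.duration) AS cumjobs,
           (SELECT SUM(job_count) FROM test_data) AS totjobs,
           t.duration
      FROM test_data t
     ORDER BY t.duration
""").fetchall()
for r in rows:
    print(r)
```

Note that rows with equal duration receive the same cumulative value; if that is not wanted, the inequality needs a unique tie-breaker column.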
Hi,
As per MonetDB Solutions' suggestion, we are asking about this topic here.
We are studying how MonetDB stores its database on disk.
So far we have been able to understand pretty much all the files (BAT,
journal, BBP.dir, heap, hash, imprints, and so on). We have custom tools
to decode and dump them (experimental at this point).
One thing, however, is unclear: we thought that "delta BATs", the BATs
assigned to SQL tables to list the OIDs of deleted rows (named
"D_<schema>_<table>" internally), would contain only *unique* values
(since you can never delete the same row twice).
But on several databases that we are running, we found that some of
these BATs contain duplicates. Lots of duplicates, actually. Especially
the D_sys__columns and D_sys__tables BATs (for the sys._columns and
sys._tables system tables, respectively). Some databases do not have
this "issue", even though they all have the same schema and process the
same kind of data as the other ones.
Can someone explain whether duplicates are expected in these BATs?
Here is an excerpt of D_sys__tables:
# hexdump -C 02/210.tail
00000000 2e 00 00 00 00 00 00 00 2f 00 00 00 00 00 00 00 |......../.......|
00000010 30 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 |0.......1.......|
00000020 32 00 00 00 00 00 00 00 33 00 00 00 00 00 00 00 |2.......3.......|
00000030 34 00 00 00 00 00 00 00 35 00 00 00 00 00 00 00 |4.......5.......|
00000040 36 00 00 00 00 00 00 00 37 00 00 00 00 00 00 00 |6.......7.......|
00000050 42 00 00 00 00 00 00 00 43 00 00 00 00 00 00 00 |B.......C.......|
00000060 44 00 00 00 00 00 00 00 45 00 00 00 00 00 00 00 |D.......E.......|
00000070 46 00 00 00 00 00 00 00 47 00 00 00 00 00 00 00 |F.......G.......|
00000080 48 00 00 00 00 00 00 00 49 00 00 00 00 00 00 00 |H.......I.......|
00000090 4a 00 00 00 00 00 00 00 4b 00 00 00 00 00 00 00 |J.......K.......|
000000a0 4c 00 00 00 00 00 00 00 4d 00 00 00 00 00 00 00 |L.......M.......|
000000b0 4e 00 00 00 00 00 00 00 4f 00 00 00 00 00 00 00 |N.......O.......|
000000c0 50 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |P.......P.......|
000000d0 51 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |Q.......P.......|
000000e0 51 00 00 00 00 00 00 00 52 00 00 00 00 00 00 00 |Q.......R.......|
000000f0 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000100 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
00000110 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000120 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
...
You can clearly see duplicate OIDs (the first being 0x50 at offset 0xc8).
Its entry in BBP.dir (split into sections):
136 32 tmp_210 tmpr_210 02/210 610523782 2 171807 0 0 171807 172032 0 0 0 0
void 0 1 1793 0 0 0 0 0 1000651 0 0 0
oid 8 0 1024 24 25 27 1 46 773603235 1374456 1376256 1
We are using: MonetDB 5 server v11.21.14 (64-bit, 64-bit oids, 128-bit integers).
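For what it's worth, under the assumption (suggested by the dump above) that the tail heap of a 64-bit-oid BAT is simply an array of little-endian 8-byte integers, counting duplicates can be scripted. The helper names below are ours, not MonetDB's:

```python
import struct
from collections import Counter

def read_oids(path):
    """Read a 64-bit-oid tail heap as a list of integers."""
    with open(path, "rb") as f:
        data = f.read()
    return [v for (v,) in struct.iter_unpack("<Q", data)]

def duplicates(oids):
    """Return {oid: count} for every OID that appears more than once."""
    return {oid: n for oid, n in Counter(oids).items() if n > 1}

# Example with values taken from the hexdump excerpt above:
sample = struct.pack("<4Q", 0x50, 0x50, 0x51, 0x50)
oids = [v for (v,) in struct.iter_unpack("<Q", sample)]
print(duplicates(oids))  # -> {80: 3}  (0x50 appears three times here)
```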
--
Frédéric Jolliton
Sécuractive
I noticed a difference between
- a hash-based string selection from a persistent, read-only table
- a hash-based join on the same table and same column
They both build a hash on the same string column (verified with gdb), but
the select can reuse the hash (second call is almost free), while the join
keeps rebuilding the hash.
Is this expected?
Roberto
Hi,
Last year, I was looking for the MAL definition of group_concat, and
Nik Schuiling gave me the definition which is below.
We compile MonetDB from source, and this definition was working on
MonetDB 11.19.9 (Oct2014-SP2). Since that version, there have been a
number of newer releases of MonetDB, so I was trying to upgrade our
system to the latest version (11.21.13). After installing the newer
version, I ran:
select "DataSetId", group_concat("FieldName") from datasetmarkermeta
group by "DataSetId";
from the mclient interface and got the following error message:
TypeException:kddart.subgroup_concat[5]:'algebra.uselect' undefined
in: TIDs:any := algebra.uselect(g:bat[:oid,:oid],grpid:oid);
MALException:kddart.subgroup_concat[0]:Error in cloned function
TypeException:user.s2_1[15]:'kddart.subgroup_concat' undefined in:
X_27:bat[:oid,:str] :=
kddart.subgroup_concat(X_26:bat[:oid,:str],X_16:bat[:oid,:oid],r1_16:bat[:oid,:oid],true:bit);
program contains errors
It says the algebra.uselect function is not defined. How do I fix this problem?
group_concat MAL definition start
-------------------------------------------
module kddart;
command group_concat(b:bat[:oid,:str]):str
address KDDARTgroup_concat
comment "Mal kddart";
function group_concat(b:bat[:oid, :str]):str;
value := "";
barrier (h, t) := iterator.new(b);
value := value + t;
value := value + "";
redo (h, t) := iterator.next(b);
exit (h, t);
return value;
end group_concat;
function subgroup_concat(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_nils:bit)
:bat[:oid,:str];
nw := aggr.count(e);
nl := calc.lng(nw);
bn := bat.new(:oid, :str, nl);
# check 'e' has some values - this is the list of group IDs in the head
# 'g' is the group to the data BAT head ID mapping
# 'b' is the data column BAT that we are aggregating over
barrier (grpid, t) := iterator.new(e);
# select GID from TID->GID map to get matching TIDs
TIDs := algebra.uselect(g,grpid);
# get DATA for matching TIDs
b_data := algebra.kintersect(b,TIDs);
# aggregate
grpval := group_concat(b_data);
# Store the result for this group
bat.insert(bn, grpid, grpval);
redo (grpid, t) := iterator.next(e);
exit (grpid, t);
return bn;
end subgroup_concat;
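As a side note, the intent of the subgroup_concat loop above (for each group id in e, select the matching TIDs from g, fetch their values from b, and concatenate) can be modelled in a few lines of ordinary code. This is only a behavioural sketch of the MAL loop, not MonetDB code, and it uses the same empty separator as the MAL version:

```python
from collections import defaultdict

def subgroup_concat(b, g, sep=""):
    """b: {tid: value}, g: {tid: group_id}.
    Returns {group_id: concatenation of the values of b whose TIDs
    map to that group}, mirroring the MAL loop over e/g/b above."""
    out = defaultdict(str)
    for tid, grpid in g.items():
        if tid in b:
            out[grpid] += b[tid] + sep
    return dict(out)

b = {0: "a", 1: "b", 2: "c"}   # data column: TID -> value
g = {0: 10, 1: 10, 2: 20}      # grouping:    TID -> group id
print(subgroup_concat(b, g))   # -> {10: 'ab', 20: 'c'}
```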
After declaring a wrapper for pcre.replace:
sql>create function pcre_replace(s string, pattern string, repl string,
flags string) returns string external name pcre."replace";
operation successful (6.187ms)
I cannot get replacements with references to captures to work (though
the code in pcre.c seems to support it):
sql>select pcre_replace('Apple Pear', '^([a-zA-Z]+)\\s*([a-zA-Z]+)$', '\\1
\\2', '');
+---------------------------+
| pcre_replace_single_value |
+===========================+
| \1 \2 |
+---------------------------+
1 tuple (0.309ms)
The pattern is matched, but the backreferences are inserted literally.
Am I not getting the syntax right?
From http://www.pcre.org/original/doc/html/pcresyntax.html I see I could
use any of \1, \g1, or \g{1} for backreferences, but none seem to work.
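For comparison, the same pattern and backreference idea works as expected in other regex engines; e.g. Python's re accepts \1 (and \g<1>, its spelling of PCRE's \g{1}) in the replacement, which suggests the pattern itself is fine and the literal output comes from MonetDB's side:

```python
import re

# Same pattern as in the query above; \1 and \2 refer to the two
# capture groups in the replacement string.
result = re.sub(r'^([a-zA-Z]+)\s*([a-zA-Z]+)$', r'\1-\2', 'Apple Pear')
print(result)   # Apple-Pear

# Python's named form of a numeric backreference is \g<n>:
result2 = re.sub(r'^([a-zA-Z]+)\s*([a-zA-Z]+)$', r'\g<2> \g<1>', 'Apple Pear')
print(result2)  # Pear Apple
```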
Roberto
In Postgres I could use EXPLAIN ANALYZE to see the actions performed by
the database and their costs. I was not able to run EXPLAIN ANALYZE in
MonetDB. I ran EXPLAIN, but cannot see any plan or cost. Any ideas how
to get costs and a query plan like in Postgres?
Hi Guys,
I have got my Crystal Language bindings for MonetDB now working quite well.
https://github.com/puppetpies/crystal-monetdb-libmapi
I've seen options in the mapi code for a variable called mode that
allows you to format the response data as XML, but it doesn't seem to
be fully implemented in the mapi code.
Are there any guidelines for processing this raw query output?
% threatmonitor.fruits, threatmonitor.fruits, threatmonitor.fruits,
threatmonitor.fruits, threatmonitor.fruits # table_name
% name, price, weight, comments, id # name
% varchar, varchar, int, clob, int # type
% 9, 4, 2, 11, 1 # length
[ "Apple", "9.99", 50, NULL, 1 ]
[ "Bananna", "3.99", 30, NULL, 2 ]
[ "Orange", "7.99", 60, NULL, 3 ]
[ "Peach", "5.00", 80, NULL, 4 ]
[ "Kiwi", "9.00", 20, NULL, 5 ]
[ "Tomato", "2.00", 20, "Yes a fruit", 6 ]
[ "Pear", "4.00", 30, "Juicy", 7 ]
[ "Nectarine", "6.00", 50, "Juicy", 8 ]
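For what it's worth, the raw block above is regular enough to parse directly: header lines start with '%' and end in '# <field-name>', and tuple lines look like '[ v1, v2, ... ]'. A rough sketch of such a parser, in Python rather than Crystal just to show the idea, using a naive comma split that assumes values contain no embedded commas:

```python
def parse_mapi(block):
    """Parse a raw MAPI result block into (headers, rows).
    headers maps each '#'-suffixed field name (e.g. 'name', 'type')
    to its list of values; rows is a list of lists of strings."""
    headers, rows = {}, []
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("%"):
            body, _, field = line[1:].rpartition("#")
            headers[field.strip()] = [v.strip() for v in body.strip().split(",")]
        elif line.startswith("[") and line.endswith("]"):
            rows.append([v.strip().strip('"') for v in line[1:-1].split(",")])
    return headers, rows

sample = '''% name, price, weight, comments, id # name
% varchar, varchar, int, clob, int # type
[ "Apple", "9.99", 50, NULL, 1 ]
[ "Tomato", "2.00", 20, "Yes a fruit", 6 ]'''
headers, rows = parse_mapi(sample)
print(headers["name"])  # ['name', 'price', 'weight', 'comments', 'id']
print(rows[0])          # ['Apple', '9.99', '50', 'NULL', '1']
```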
Regards,
Brian Hood
Hi All!
We are facing a problem: we used to work with MySQL, which matched
strings case-insensitively. Now we have moved to MonetDB and are
running into case sensitivity.
Some of our queries do not return any data because comparisons are
case-sensitive:
SELECT * FROM Country = 'jamaica' ... 0 Records
SELECT * FROM Country = 'Jamaica' ... 10 Records
In some cases we solved it using ILIKE, but we still have code in our
app that cannot be modified.
Does anybody know if there is any flag, setting, or configuration for
the database in MonetDB to make it case-insensitive?
Thanks in advance for the help!
All,
Does the current version of MonetDB support multi-column aggregate functions? I need to create a function that returns the "quantiles" of one column using another column as a weighting. It looks like PostgreSQL has supported multi-column aggregates since 8.2.
I could write an embedded R function to compute this, but want to know if it is even possible to pass in multiple columns.
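Independently of where it runs (embedded R, Python, or a UDF), the weighted quantile itself is a small computation over the two columns together. A plain-Python sketch of one common definition (the smallest value whose cumulative weight reaches the requested fraction), mainly to show what the aggregate would need to receive:

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    acc = 0.0
    for v, w in pairs:
        acc += w
        if acc >= q * total:
            return v
    return pairs[-1][0]

# With equal weights this reduces to the ordinary quantile:
print(weighted_quantile([1, 2, 3], [1, 1, 1], 0.5))      # 2
# A heavy weight pulls the quantile toward that value:
print(weighted_quantile([1, 2, 3, 4], [1, 1, 1, 97], 0.5))  # 4
```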
Thanks,
Dave
An alternative is to use a temp table to get the same result:
select c1, sum(c2) over(partition by c3) from t1; ==>
create temp table tmp1 (sum_c2 int, c3 int);
insert into tmp1 select sum(c2), c3 from t1 group by c3;
select c1, sum_c2, c3 from t1 join tmp1 on t1.c3=tmp1.c3;
----- Original Message -----
From: Roberto Cornacchia <roberto.cornacchia(a)gmail.com>
To: Communication channel for MonetDB users <users-list(a)monetdb.org>
Subject: Re: SUM() OVER (ORDER BY..)
Date: 2016-4-29 15:46:57
No, indeed, this syntax is not supported.
On 29 April 2016 at 09:41, Anthony Damico <ajdamico(a)gmail.com> wrote:
> hmm, that gives me the same error..
>
> SELECT mpg, wt, SUM(wt) OVER (PARTITION BY hp) AS cum_wgt, SUM(wt) over ()
> AS tot_wgt, hp FROM mtcars;
>
>
> Error in .local(conn, statement, ...) :
> Unable to execute statement 'SELECT
> mpg,
> wt,
> SUM(wt) OVER (PARTITION BY hp) AS cum_wgt,
> SUM(wt) over () AS tot_wgt,
> hp
> FROM mtc...'.
> Server says 'ParseException:SQLparser:SELECT: function 'sum' not found'.
>
>
> sorry if i'm doing something silly..
>
>
>
>
> On Fri, Apr 29, 2016 at 3:37 AM, Roberto Cornacchia <
> roberto.cornacchia(a)gmail.com> wrote:
>
>> I think what you are looking for is SUM(..) OVER(PARTITION BY ..)
>>
>> Otherwise where would the groups for the sum come from?
>>
>> Roberto
>>
>> On 29 April 2016 at 09:28, Anthony Damico <ajdamico(a)gmail.com> wrote:
>>
>>> hi, monetdb does not support SUM() OVER commands.. does anyone have a
>>> smart alternative to implement this? thanks
>>>
>>>
>>> SELECT
>>> batch_id,
>>> job_count,
>>> SUM(job_count) OVER (ORDER BY duration) as cumjobs,
>>> SUM(job_count) over () as totjobs,
>>> duration
>>> FROM
>>> test_data ;
>>>
>>>
>>> _______________________________________________
>>> users-list mailing list
>>> users-list(a)monetdb.org
>>> https://www.monetdb.org/mailman/listinfo/users-list