Hi there,
We have an issue about the remote table on monetdb version of 2016 (sp1
and sp2)
on Ubuntu 14.04 and on Windows Server2012 R2
The scenario for reproducing the issue with two nodes
on node1
Create a table on remote node and fill it:
CREATE TABLE test(id int not null, name text, valid boolean);
INSERT INTO test (id, name) VALUES (1, '1');
INSERT INTO test (id, name) VALUES (2, '2');
INSERT INTO test (id) VALUES (3);
on node2
CREATE REMOTE TABLE test(id int not null, name text, valid boolean) ON
'mapi:monetdb://node1:50000/dbfarm';
Then on node2:
select * from test;
+------+------+-------+
| id | name | valid |
+======+======+=======+
| 1 | 1 | null |
| 2 | 2 | null |
| 3 | null | null |
+------+------+-------+
It works fine, but:
select * from test where name is null;
+----+------+-------+
| id | name | valid |
+====+======+=======+
+----+------+-------+
id 3 should appear here. Furthermore:
select * from test where name is not
null;
(mapi:monetdb://monetdb@192.168.254.31/reports2) Cannot
register
project
(
select
(
table(sys.test) [ test.id NOT NULL, test.name, test.valid ]
COUNT
) [ clob "NULL" ! <= test.name ! <= clob "NULL"
]
) [ test.id NOT NULL, test.name, test.valid ] REMOTE
mapi:monetdb://.../...
select * from test where valid is null;
illegal input, not a JSON header (got '')
and node1 is crashed (need : monetdb start farm).
After downgrading on ubuntu 14.04 to version of 2015 (SP4) this
scenraio works fine.
Thanks,
SG
Mark,
I think there is a bug in the implementation of the PYTHON_MAP aggregate functions.
The return value of a PYTHON_MAP aggregate function can change if there are other functions in the query.
The return value of my function weighted_percentile_0 (definition below) changes if I include the function median() in my query. The results are incorrect if median() isn't in the query, but are correct if median() is in the query. In this test case, the weights (v2) are all 1 and the values (v1) are uniformly sampled integers from [1,10]. Number of rows for each fctr is ~1000, and my function should reduce to min(v1) in this case.
Is some information bleeding between threads? Should I submit a bug report? This is on the default branch.
Thanks,
Dave
sql>select fctr,weighted_percentile_0(v1,v2),min(v1) from mini where fctr < 10 group by fctr order by fctr;
+------+--------------------------+------+
| fctr | L1 | L2 |
+======+==========================+======+
| 1 | 3 | 1 |
| 2 | 2 | 1 |
| 3 | 2 | 1 |
| 4 | 1 | 1 |
| 5 | 3 | 1 |
| 6 | 3 | 1 |
| 7 | 2 | 1 |
| 8 | 1 | 1 |
| 9 | 2 | 1 |
+------+--------------------------+------+
9 tuples (126.928ms)
sql>select fctr,weighted_percentile_0(v1,v2),min(v1),median(v1) from mini where fctr < 10 group by fctr order by fctr;
+------+--------------------------+------+------+
| fctr | L1 | L2 | L3 |
+======+==========================+======+======+
| 1 | 1 | 1 | 5 |
| 2 | 1 | 1 | 5 |
| 3 | 1 | 1 | 5 |
| 4 | 1 | 1 | 5 |
| 5 | 1 | 1 | 5 |
| 6 | 1 | 1 | 5 |
| 7 | 1 | 1 | 5 |
| 8 | 1 | 1 | 5 |
| 9 | 1 | 1 | 5 |
+------+--------------------------+------+------+
9 tuples (519.195ms)
sql>
CREATE AGGREGATE weighted_percentile_0(a DOUBLE, w DOUBLE)
RETURNS DOUBLE
LANGUAGE PYTHON_MAP {
import numpy as np
# Standardize and sort based on values in a
q = np.array([0]) / 100.0
idx = np.argsort(a)
a_sort = a[idx]
w_sort = w[idx]
# Get the cumulative sum of weights
ecdf = np.cumsum(w_sort)
# Find the percentile index positions associated with the percentiles
p = q * (w_sort.sum() - 1)
# Find the bounding indices (both low and high)
idx_low = np.searchsorted(ecdf, p, side='right')
idx_high = np.searchsorted(ecdf, p + 1, side='right')
idx_high[idx_high > ecdf.size - 1] = ecdf.size - 1
# Calculate the weights
weights_high = p - np.floor(p)
weights_low = 1.0 - weights_high
# Extract the low/high indexes and multiply by the corresponding weights
x1 = np.take(a_sort, idx_low) * weights_low
x2 = np.take(a_sort, idx_high) * weights_high
wp = np.add(x1,x2)
return(wp[0])
};
hi, monetdb does not support SUM() OVER commands.. does anyone have a
smart alternative to implement this? thanks
SELECT
batch_id,
job_count,
SUM(job_count) OVER (ORDER BY duration) as cumjobs,
SUM(job_count) over () as totjobs,
duration
FROM
test_data ;
Hi!
We are using the
COPY *subquery* INTO *file_name *[ [USING] DELIMITERS *field_separator*
[',' *record_separator* [ ',' *string_quot*e ]]] [ NULL AS *null_string* ]
BUT we need to have in the file the name of the fields on the first row.
Im looking on the documentation, but I cannot find anything related to
this, can someone help me with this?
Thks in advance!
I have a column defined as follows:
"div5tailnum" VARCHAR(10) NOT NULL DEFAULT ''
(that's the output of \d mytable), or if you like:
Div5TailNum VARCHAR(10) NOT NULL DEFAULT ''
(that's from the CREATE TABLE command). I'm trying to load data from a
CSV file using the following SQL:
COPY OFFSET 2 INTO mytable FROM file.csv USING DELIMITERS ',','\n','"'
NULL AS '';
but I get:
INSERT INTO: NOT NULL constraint violated for column ontime.div5tailnum
shouldn't that be impossible, seeing how I've provided a default?
[Sorry if this question was already asked and answered before - I'm new to
this mailing list.]
Recently servers with up to 6TB RAM became relatively affordable to be used
with MonetDB
(considering total cost of server hw + DB sw license) - but will MonetDB
scale well to use
all that hardware?
I know it can be tried on the cloud using (for example) x1.32xlarge EC2
spot instances but
before doing that I'd like to ask people with more knowledge/experience
with MonetDB if it
at least looks feasible, if there are any important tuning parameters for
such hw etc.
If you've tired similar hardware and/or know some relevant implementaton
details maybe
you can share your experience?
Thanks.
Igor
Hi,
I am delete rows from table, with "delete from mytable where id=x", and
rows deleted no return in select "select * from mytable where id=x",
its OK !
Version "Jul2015-SP1"
But, when restart database, deleted rows return in select "select * from
mytable where id=x".
Its bug ?, or i have execute vacuum command, how to execute vacuum
command. ?
--
Luciano Sasso Vieira
Data Scientist & Solutions Architect
luciano(a)gsgroup.com.br <http://www.gsgroup.com.br> | tel: 17 3353-0833
| 17 3202-4040 | cel: 17 99706-9335
www.gsgroup.com.br <http://www.gsgroup.com.br>
---
Este email foi escaneado pelo Avast antivírus.
https://www.avast.com/antivirus