Hi all,
Not really a bug report because I did not manage to figure out the cause.
However, after upgrading from FC31 to FC32 I could not login any more, due
to SELinux problems. Auto-relabeling did not work, nothing really...
... until I did dnf uninstall MonetDB-selinux.
I came to this point because trying to give systemd services the correct
labels with restorecon failed with an error referencing a monetdb specific
file.
I do not have the details unfortunately, but if you get problems, beware
that MonetDB SELinux package and systemd may interfere in some way beyond
my knowledge of these services.
Best regards,
Arjen
PS: Some output from logs:
sudo ausearch -c monetdb -m AVC,SELINUX_ERR
[..]
----
time->Sat May 2 20:57:01 2020
type=AVC msg=audit(1588445821.693:203): avc: denied { open } for
pid=1232 comm="monetdbd" path="/etc/resolv.conf" dev="dm-0" ino=3409775
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:default_t:s0 tclass=file permissive=1
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.043:1194): avc: denied { execute } for
pid=2861 comm="(monetdbd)" name="monetdbd" dev="dm-0" ino=2147256
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1
trawcon="unconfined_u:object_r:monetdbd_exec_t:s0"
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.043:1195): avc: denied { execute_no_trans }
for pid=2861 comm="(monetdbd)" path="/usr/bin/monetdbd" dev="dm-0"
ino=2147256 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1
trawcon="unconfined_u:object_r:monetdbd_exec_t:s0"
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.044:1196): avc: denied { map } for
pid=2861 comm="monetdbd" path="/usr/bin/monetdbd" dev="dm-0" ino=2147256
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1
trawcon="unconfined_u:object_r:monetdbd_exec_t:s0"
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.714:1197): avc: denied { remove_name } for
pid=1232 comm="monetdbd" name="merovingian.pid" dev="tmpfs" ino=34369
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=dir permissive=1
trawcon="system_u:object_r:monetdbd_var_run_t:s0"
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.714:1198): avc: denied { unlink } for
pid=1232 comm="monetdbd" name="merovingian.pid" dev="tmpfs" ino=34369
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1
----
time->Sat May 2 21:12:56 2020
type=AVC msg=audit(1588446776.714:1199): avc: denied { write } for
pid=1232 comm="monetdbd" name=".merovingian_lock" dev="dm-0" ino=5899443
scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:unlabeled_t:s0 tclass=file permissive=1
trawcon="system_u:object_r:monetdbd_lock_t:s0"
----
time->Sat May 2 21:13:15 2020
type=AVC msg=audit(1588446795.214:1209): avc: denied { read } for
pid=2925 comm="(monetdbd)" name="passwd" dev="dm-0" ino=524514
scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:var_t:s0
tclass=file permissive=1
----
time->Sat May 2 21:13:15 2020
type=AVC msg=audit(1588446795.214:1210): avc: denied { open } for
pid=2925 comm="(monetdbd)" path="/var/lib/sss/mc/passwd" dev="dm-0"
ino=524514 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:var_t:s0 tclass=file permissive=1
----
time->Sat May 2 21:13:15 2020
type=AVC msg=audit(1588446795.214:1211): avc: denied { map } for
pid=2925 comm="(monetdbd)" path="/var/lib/sss/mc/passwd" dev="dm-0"
ino=524514 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:var_t:s0 tclass=file permissive=1
----
time->Sat May 2 21:14:24 2020
type=AVC msg=audit(1588446864.487:1281): avc: denied { read } for
pid=3072 comm="(monetdbd)" name="passwd" dev="dm-0" ino=524514
scontext=system_u:system_r:init_t:s0 tcontext=system_u:object_r:var_t:s0
tclass=file permissive=1
----
time->Sat May 2 21:14:24 2020
type=AVC msg=audit(1588446864.487:1282): avc: denied { open } for
pid=3072 comm="(monetdbd)" path="/var/lib/sss/mc/passwd" dev="dm-0"
ino=524514 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:var_t:s0 tclass=file permissive=1
----
time->Sat May 2 21:14:24 2020
type=AVC msg=audit(1588446864.487:1283): avc: denied { map } for
pid=3072 comm="(monetdbd)" path="/var/lib/sss/mc/passwd" dev="dm-0"
ino=524514 scontext=system_u:system_r:init_t:s0
tcontext=system_u:object_r:var_t:s0 tclass=file permissive=1
--
====================================================================
ICIS, office M1.00.05 Radboud University
Mercator 1 Faculty of Science
Toernooiveld 212 arjen(a)cs.ru.nl
NL-6525 EC Nijmegen, The Netherlands +31-(0)24-365 2354
===================== http://www.informagus.nl/ ====================
--
====================================================================
ICIS, office M1.00.05 Radboud University
Mercator 1 Faculty of Science
Toernooiveld 212 arjen(a)cs.ru.nl
NL-6525 EC Nijmegen, The Netherlands +31-(0)24-365 2354
===================== http://www.informagus.nl/ ====================
Hi,
select count(*) from table where field = 'some value';
--> a few ms
update table set field = 'value';
--> update 2 million rows out of 1 billion rows
select count(*) from table where field = 'some value';
--> almost a minute :(
Analyze did not help and sys.vacuum could not be used as the table has
indexes (foreign key).
Re-starting the DB helped ;)
What's the recommended way to make MonetDB perform well after lots of
updates?
Kind regards,
Daniel
This is a question about join order. In general, about how it changed from
Nov2019 to Jun2020 releases.
In particular, with respect to custom joins (filter functions).
With a schema:
CREATE TABLE t1(s string);
CREATE TABLE t2(s string);
Consider the following 2 queries, which only differ for having swapped
conditions:
Q1:
SELECT t1.s, t2.s
FROM t1, t2
WHERE t1.s <> t2.s
AND [t1.s] maxlevenshtein [t2.s, 1];
Q2:
SELECT t1.s, t2.s
FROM t1, t2
WHERE [t1.s] maxlevenshtein [t2.s, 1]
AND t1.s <> t2.s;
[t1.s] maxlevenshtein [t2.s, 1] is equivalent to levenshtein(t1.s, t2.s) <=1
(i.e. the two strings have a levenshtein distance at most 1)
This is a relatively expensive and selective function.
In Nov2019, both Q1 and Q2 are translated to:
- "maxlevenshtein" custom join
- "!=" selection on the result
In Jun2020, the two queries happen to be evaluated in the same order as
they are written. Which means that Q2 is evaluated as:
- "!=" join
- "maxlevenshtein" selection
This last evaluation plan is unfortunately not viable at all. The first
join is not very selective, and the "maxlevenshtein" selection is run on
way too many pairs, without the optimizations that can be exploited in a
join (in a join, it is possible to skip the actual levenshtein computation
for most combinations). Q2 in Jun2020 is 2 orders of magnitude slower than
Q1, which quickly leads to unreasonably long query times.
Of course, this is just one specific case. A very unfortunate one, due to
the combination of a couple of factors:
- The "!=" join is not selective enough
- The custom function is an expensive one
I guess my real questions are:
- Is it by chance (or, as a by-product of more generic join ordering rules)
that Nov2019 executes custom joins first in both Q1 and Q2, or was it an
intentional choice to first execute custom joins?
- What would a reasonable approach be?
Is it reasonable to assume that if one writes a custom join, this is
expected to use an expensive comparison function and that the join
implementation can be much more efficient than the selection implementation
(by skipping unnecessary comparisons)?
If no assumptions can be made, can there be a way to annotate custom
implementations with information on selectivity and cost?
Thanks for you input,
Roberto
Hi,
there seems to be a limitation for Python-based loader functions wrt
timestamps.
Looking at the implementation in
sql/backends/monet5/UDF/pyapi3/convert_loops.h suggests that timestamps are
indeed not covered.
Both of the the 2 following examples fail with "Failed conversion:
MALException:pyapi3.eval:PY000!Unrecognized type. Could not convert to
NPY_UNICODE.".
CREATE LOADER array_loader() LANGUAGE PYTHON {
from datetime import datetime
_emit.emit( { 'a': [1,2,3], 'b': [datetime.utcnow(), datetime.utcnow(),
datetime.utcnow()], 'c': ['1', '2', '3']})
};
CREATE LOADER array_loader() LANGUAGE PYTHON {
import numpy as np
from datetime import datetime
dt = datetime.utcnow()
_emit.emit( { 'a': [1,2,3], 'b': np.array([dt, dt, dt],
dtype=np.datetime64), 'c': ['1', '2', '3']})
};
Is anyone aware of a workaround? An obvious workaround is to insert the
timestamps as strings and convert them via SQL but that's much less
efficient.
Kind regards,
Daniel
Hi, I was looking at pcre.c for some inspiration and found something
suspicious.
At the end of both pcre_likeselect() and re_likeselect():
if (bn && !msg) {
BATsetcount(bn, BATcount(bn)); /* set some properties */
bn->tsorted = true;
bn->trevsorted = bn->batCount <= 1;
bn->tkey = true;
bn->tseqbase = bn->batCount == 0 ? 0 : bn->batCount == 1 ? * (oid *)
Tloc(bn, 0) : oid_nil;
}
Which I read as: if everything went well, then the result is sorted and
key. But I miss why it should be sorted and key.
Roberto
Hi,
This looks very promising so far. Building on Power 8 was easy, just following the official guide from the wiki.
https://www.monetdb.org/wiki/MonetDB:Building_from_sources
So far I’ve only loaded larger amounts of data and the timings look good.
Will perform more tests over the next days but overall this looks like absolutely usable for production.
Kind regards,
Daniel
Hi,
Thanks a lot for confirming this.
We’ll perform some basic tests on our Power 8 system to see if we’re able to build and run it.
Kind regards,
Daniel
Hi folks,
I was wondering if one could run MonetDB on IBM Power architecture (Power 8
and later).
I guess there's no official build for this? Did anyone try this out?
Kind regards,
Daniel
*Without addBatch / executeBatch my program runs without problems:*
[main] Start
[main] tickerSymbolExtern='monetdb'
[main] Start MonetDB
[main] database table COUNTRY - 200 rows in total
[main] database table TIMEZONE - 11 rows in total
[main] database table COUNTRY_STATE - 500 rows so far
[main] database table COUNTRY_STATE - 600 rows in total
[main] database table CITY - 500 rows so far
[main] database table CITY - 1000 rows so far
[main] database table CITY - 1500 rows so far
[main] database table CITY - 1800 rows in total
[main] database table COMPANY - 500 rows so far
[main] database table COMPANY - 1000 rows so far
[main] database table COMPANY - 1500 rows so far
[main] database table COMPANY - 2000 rows so far
[main] database table COMPANY - 2500 rows so far
[main] database table COMPANY - 3000 rows so far
[main] database table COMPANY - 3500 rows so far
[main] database table COMPANY - 4000 rows so far
[main] database table COMPANY - 4500 rows so far
[main] database table COMPANY - 5000 rows so far
[main] database table COMPANY - 5400 rows in total
[main] duration in seconds: 52
[main] End MonetDB
[main] End
*With addBatch / executeBatch and a batch size of 1000 I get the following
error:*
[main] Start
[main] tickerSymbolExtern='monetdb'
[main] Start MonetDB
[main] database table COUNTRY - 200 rows in total
[main] database table TIMEZONE - 11 rows in total
[main] database table COUNTRY_STATE - 500 rows so far
[main] database table COUNTRY_STATE - 600 rows in total
[main] database table CITY - 500 rows so far
[main] database table CITY - 1000 rows so far
[main] database table CITY - 1500 rows so far
java.sql.BatchUpdateException: Error(s) occurred while executing the batch,
see next SQLExceptions for details
at nl.cwi.monetdb.jdbc.MonetStatement.executeBatch(Unknown Source)
at
ch.konnexions.db_seeder.jdbc.AbstractJdbcSeeder.createDataInsert(AbstractJdbcSeeder.java:501)
at
ch.konnexions.db_seeder.jdbc.AbstractJdbcSeeder.createData(AbstractJdbcSeeder.java:404)
at
ch.konnexions.db_seeder.jdbc.AbstractJdbcSeeder.createData(AbstractJdbcSeeder.java:362)
at
ch.konnexions.db_seeder.DatabaseSeeder.main(DatabaseSeeder.java:141)
Processing of the script was aborted, error code=1
*Both times the connection runs with autoCommit.*