Hi Stefan,
Thanks for your reply!
We have run the query a few times with different sizes of data. We used
16 GB of RAM (actually 13.5 GB was used), and found that 10 GB of data
is the critical point up to which the query can still run. The sizes of
all the data files are listed below; each file is named after a table
(only a few tables are referred to -- store_sales, date_dim, item,
customer, catalog_sales, web_sales):
7.4K call_center.dat
1.6M catalog_page.dat
212M catalog_returns.dat
2.9G catalog_sales.dat
27M customer_address.dat
64M customer.dat
77M customer_demographics.dat
9.9M date_dim.dat
77B dbgen_version.dat
149K household_demographics.dat
328B income_band.dat
2.6G inventory.dat
28M item.dat
61K promotion.dat
1.7K reason.dat
1.1K ship_mode.dat
27K store.dat
323M store_returns.dat
3.8G store_sales.dat
4.9M time_dim.dat
1.2K warehouse.dat
19K web_page.dat
98M web_returns.dat
1.5G web_sales.dat
12K web_site.dat
So we guess that MonetDB does no memory management here -- is that correct?
The output of `mserver5 --version` is:
MonetDB 5 server v11.27.13 "Jul2017-SP4" (64-bit, 128-bit integers)
Copyright (c) 1993 - July 2008 CWI
Copyright (c) August 2008 - 2018 MonetDB B.V., all rights reserved
Visit https://www.monetdb.org/ for further information
Found 17.0GiB available memory, 40 available cpu cores
Libraries:
libpcre: 8.38 2015-11-23 (compiled with 8.38)
openssl: OpenSSL 1.0.2g 1 Mar 2016 (compiled with OpenSSL 1.0.2g 1
Mar 2016)
libxml2: 2.9.3 (compiled with 2.9.3)
Compiled by: monetdb(a)MonetDB-0.0 (x86_64-pc-linux-gnu)
Compilation: gcc -g -O2
Linking : /usr/bin/ld -m elf_x86_64
And the size of the processes is not limited.
To let you reproduce the problem conveniently, I'll provide more details
here:
You can get TPC-DS from its website (we use version 2.6.0).
After installing TPC-DS, go to the directory v2.6.0/tools and run
`./dsdgen -scale 10 -dir /home/monetdb/tpc-ds_test_data10G` to generate
the data. Once the data has been generated, use the script expe.sh to
create the tables and load the data. The query script is
123.tpcds.23.sql. (The syntax of the other queries that TPC-DS generates
is not all compatible with MonetDB; we had not modified them all when
the problem occurred.)
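In case it helps, the per-table load that expe.sh performs is roughly the
following (a sketch only; the table name, file path, and delimiters are
assumptions based on the standard dsdgen pipe-delimited output):

```sql
-- Hypothetical example for one table; the real DDL comes from the
-- TPC-DS schema file shipped with the toolkit.
COPY INTO date_dim
FROM '/home/monetdb/tpc-ds_test_data10G/date_dim.dat'
USING DELIMITERS '|', '\n'
NULL AS '';
```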
One more question: I can't receive your reply by email, so I don't know
how to reply to you; for now, I can only send a new mail each time.
Thanks!
Regards,
Rancho
Hello all,
Here is a simple issue I have with performance:
1/ I create a table A and populate it with a bulk load.
2/ I create a table B and populate it with a bulk load.
3/ I create a table C with a CTAS statement, analyze the statistics, and set the primary key.
4/ I try a "count distinct"-type query (over the whole table) on the PK field. It is pretty slow, with heavy solicitation of my hard drive for several dozen seconds.
A plain "select count", however, is fast.
So I have the following questions:
1/ Why, despite the statistics, is the count distinct query slow? I thought the idea of statistics was to keep some metrics about the table.
2/ If the field I run the count distinct on is the PK (or the set of fields is the PK), it is exactly the same as a count of non-null values. So why are the times not the same?
3/ If these issues are design issues or bugs, how can I ask to have them solved?
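To make the timings concrete, the queries I am comparing look like this (the table and column names are illustrative):

```sql
-- fast: plain row count
SELECT COUNT(*) FROM c;
-- should be equivalent on a PK column (PK values are NOT NULL)
SELECT COUNT(pk_id) FROM c;
-- slow: distinct count, even though on a PK it must return the same number
SELECT COUNT(DISTINCT pk_id) FROM c;
```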
For information: MonetDB 11.31.13 on Windows 10.
Best regards,
Simon AUBERT
Bourse Maritime - 1, place Lainé - 33 000 Bordeaux
Mob : +33(0)6 66 28 52 04
Hi,
I'm upgrading from an old MonetDB (11.26.0 source snapshot) to the latest proper release. However, my Python aggregate functions appear to have stopped working (Program contains errors.:(NONE).multiplex):
Running mserver5 with:
/usr/local/bin/mserver5 --daemon=yes --set embedded_py=true --dbpath=/data/monetdb-farm/ --set gdk_nr_threads=1 --set max_clients=6
Reproduce with:
CREATE FUNCTION test_udf("conum" int) RETURNS STRING LANGUAGE PYTHON3 {{
}};
Create table test("id" uuid, "date" DATE, "number" INT);
select test_udf("number") from test;
Gives:
Program contains errors.:(NONE).multiplex
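For comparison, a UDF of this shape worked for me on 11.26.0 (a minimal sketch with single braces and a non-empty body; the function name and logic are illustrative, not my real code):

```sql
CREATE FUNCTION int_to_str("conum" INT) RETURNS STRING LANGUAGE PYTHON3 {
    # the column arrives as a numpy array; return one string per input value
    return [str(x) for x in conum]
};
```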
I'm running it via Docker (stock Debian) with only MonetDB installed via apt-get.
Not sure if this is connected: https://www.monetdb.org/bugzilla/show_bug.cgi?id=6378 <https://www.monetdb.org/bugzilla/show_bug.cgi?id=6378>
Anyone having any similar issues?
I tried compiling from source (the April 2019 branch) as well; no luck.
Regards,
Niklas
Hello,
I have a large table (~700 million rows, ~150 columns). Initially after
the import, the database is very fast on primary key lookups; however,
after a few thousand creates/updates/deletes on the database, "sorted"
(as reported by sys.storage) flips from true to false. This results in
major performance degradation.
Under what conditions can a database lose its sortedness on the
primary key column?
eg:
sql>select schema, table, column, sorted from sys.storage where schema =
'test1' and table = 'cmdb' and column = 'pk_id';
+---------+-------+--------+--------+
| schema | table | column | sorted |
+=========+=======+========+========+
| test1 | cmdb | pk_id | false |
+---------+-------+--------+--------+
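For what it's worth, here is a hypothetical sequence that I could imagine breaking the physical order of the column (assuming a later insert does not land at the end of the column; whether MonetDB actually behaves this way is exactly my question):

```sql
CREATE TABLE t (pk_id INT PRIMARY KEY);
INSERT INTO t VALUES (1), (2), (3);  -- appended in order: sorted = true
DELETE FROM t WHERE pk_id = 2;
INSERT INTO t VALUES (4);            -- if 4 reuses the slot left by 2,
                                     -- the physical order becomes 1, 4, 3
```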
Thanks,
-Jeremy Norris
Hello,
I get a crash after about 5 minutes when running in transaction
replication mode. After the crash I cannot access the db.
I am running:
mserver5 --trace --dbpath=/data/monetdb/dbfarm/test_db/ --set
gdk_keep_persisted_log_files=1000 --set gdk_logdir=/mnt/g/monet/wal --set
sql_debug=4
I am using GlusterFS under /mnt/g to manage the replication.
I am running on Amazon Linux 2, and I have free space in both the data
and WAL directories (I assume this is enough space):
Filesystem Size Used Avail Use% Mounted on
10.0.1.68:/gv0 8.0G 7.4G 669M 92% /mnt/g
bash-4.2$
the db only has one test table:
sql>create table t (n float, t text) ;
operation successful
sql>select * from t ;
+--------------------------+------+
| n | t |
+==========================+======+
| 1.1 | blah |
| 1.1 | blah |
| 1 | blah |
+--------------------------+------+
3 tuples
any help is appreciated!
-mj