Hello,
I'm performing some bulk inserts into some tables (COPY INTO). Meanwhile,
some JOINs between those tables are executed, as well as some DELETEs.
During these JOINs, the disk begins to fill up and only stops once there
is no more space available (60 GB in my case); the same happens with RAM.
After a (long) while the space gets freed up.
The COPY INTOs complete without any significant delay.
The files that use that space are some .tail and .head files in the
database's bat folder.
If the queries are not run concurrently there are no problems. The JOIN
does not use any PK or FK.
Has anyone noticed this before? Should I open a bug report?
Ruben Silva
Hello,
I want to start using the GeoSpatial features of MonetDB and according
to http://www.monetdb.org/Documentation/Extensions/GIS I need to rebuild
MonetDB with the --enable-geom argument to do so. Does this mean that
none of the published binaries include the GeoSpatial functionality?
If so, I need to follow the guide at
http://www.monetdb.org/Developers/SourceCompile to build my own version
of MonetDB, correct?
Best regards,
Dennis Pallett
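For what it's worth, the source build described on the SourceCompile page boils down to the usual autotools sequence; a sketch only, adding the --enable-geom flag from the GIS page (the exact prerequisite list, e.g. the geos library, and the exact steps are on the linked pages):

```
# sketch of a source build with the geom module enabled;
# prerequisites and details are on the SourceCompile page
./configure --enable-geom --prefix=$HOME/monetdb
make
make install
```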
Forwarding to user-list as this seems to have gotten lost
Actually, I'm running a few different versions of MonetDB in different
environments: July2012SP2, Feb2013SP4 and SP5; July2012SP2 on Windows and
Linux (Ubuntu), Feb2013SP5 on Windows and Linux, and Feb2013SP4 on Linux.
All were installed using pre-built packages/installers. The trace files
themselves are from a Linux Feb2013SP4 installation. The Feb2013SP5
install was an in-place install on top of July2012SP2, and I wasn't sure
whether different "data" versions would cloud the trace.
To Martin's questions (which I never got in my inbox... or junk): yes,
date_hour_id is a unique key for the dim_date_hour table, but it is
obviously duplicated within the fact_utilization_hourly table.
My question was more general, but if a specific solution can be found,
fantastic. In general I am seeing joins drastically reduce overall
performance.
All are running on EC2; m1.xlarge with High I/O
Thanx,
Bryan
On Mon, Oct 28, 2013 at 9:10 AM, Bryan Senseman <monetdb(a)openbi.com> wrote:
>
> ---------- Forwarded message ----------
> From: Bryan Senseman <monetdb(a)openbi.com>
> Date: Sat, Oct 19, 2013 at 8:51 PM
> Subject: Re: Are joins really this expensive?
> To: Stefan Manegold <Stefan.Manegold(a)cwi.nl>
>
>
> Ask, and ye shall receive; let me know if you need anything else. I
> can't share the data, nor would you want a many-GB data bomb.
>
> Bryan
>
>
> On Sat, Oct 19, 2013 at 3:17 PM, Stefan Manegold <Stefan.Manegold(a)cwi.nl>wrote:
>
>> Dear Bryan,
>>
>> unless you could share your data such that we could reproduce the problem,
>> could you please profile both queries using TRACE, stethoscope or
>> tomograph;
>> cf.,
>> http://www.monetdb.org/Documentation/Cookbooks/SQLrecipes/QueryTiming
>> and share the respective output?
>>
>> Then we might be able to analyze what exactly causes the performance
>> difference between both queries.
>>
>> Thanks,
>> Stefan
>>
>>
>> Bryan Senseman <monetdb(a)openbi.com> wrote:
>>>
>>> I have created a traditional star-schema in MonetDB, using both Linux
>>> and Windows. We are using Mondrian OLAP on top so it generates the
>>> standard OLAP-y type of query using group-bys of dimensional values. What
>>> I’m finding is that doing joins in MonetDB is extremely expensive, for
>>> example:
>>>
>>>
>>>
>>> The following query against a fact table with 261 million rows returns
>>> in 9-11 seconds consistently
>>>
>>> select date_hour_id/10000 as date_month_id, avg(seconds) from
>>> fact_utilization_hourly group by date_month_id;
>>>
>>> returning 41 rows, the query includes a “Trick” to get to a month_year
>>> value because the time_ids are smart keys… yyyyMMddhh.
>>>
>>>
>>>
>>> With the following query, using standard OLAP-y syntax, the query takes
>>> over 5 mins consistently.
>>>
>>> select year_month_value, avg(seconds) from fact_utilization_hourly f
>>> join dim_date_hour d on f.date_hour_id = d.date_hour_id group by
>>> year_month_value;
>>>
>>> returning the same 41 rows.
>>>
>>>
>>>
>>> I’m at a loss to figure out why this join causes a 3200%
>>> increase in response time. And this seems to have gotten worse in SP5 from
>>> anecdotal evidence.
>>>
>>>
>>>
>>> Bryan
>>>
>>> ------------------------------
>>>
>>> This transmission is confidential and intended solely for the use of the
>>> recipient named above. It may contain confidential, proprietary, or legally
>>> privileged information. If you are not the intended recipient, you are
>>> hereby notified that any unauthorized review, use, disclosure or
>>> distribution is strictly prohibited. If you have received this transmission
>>> in error, please contact the sender by reply e-mail and delete the original
>>> transmission and all copies from your system.
>>>
>>> ------------------------------
>>>
>>> users-list mailing list
>>> users-list(a)monetdb.org
>>>
>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>
>>>
>> --
>> | Stefan.Manegold(a)CWI.nl | Database Architectures (DA) |
>> | www.CWI.nl/~manegold | Science Park 123 (L321) |
>> | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
>>
>
>
>
Hello,
we will run MonetDB on two servers, and while the installation went fine, I am struggling with the backup solution. For this we normally add a separate user with full rights. The dump seems to fail on sys.db_user_info:
mclient --format=sql --user=backup > /dump-file
MAPI = backup@localhost:50000
QUERY = SELECT "name" FROM "sys"."auths" WHERE "name" NOT IN (SELECT "name" FROM "sys"."db_user_info") AND "grantor" <> 0 ORDER BY "name"
ERROR = !SELECT: access denied for backup to table 'sys.db_user_info'
Ok, let's give it a try:
GRANT ALL ON sys.db_user_info TO "backup" ;
operation successful (1.874ms)
MAPI = backup@localhost:50000
QUERY = SELECT "ui"."name", "ui"."fullname", "password_hash"("ui"."name"), "s"."name" FROM "sys"."db_user_info" "ui", "sys"."schemas" "s" WHERE "ui"."default_schema" = "s"."id" AND "ui"."name" <> 'monetdb' ORDER BY "ui"."name"
ERROR = !access denied for user 'backup'
So, my question is: how do I give a user other than 'monetdb' full rights to dump the whole database?
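For what it's worth, the failing dump queries shown above read from sys.auths, sys.db_user_info and sys.schemas, which suggests grants along these lines (a sketch only; whether the password_hash() call is permitted for users other than the administrator is a separate question):

```sql
-- grants suggested by the tables named in the failing queries above
GRANT SELECT ON sys.auths TO "backup";
GRANT SELECT ON sys.db_user_info TO "backup";
GRANT SELECT ON sys.schemas TO "backup";
```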
Thanks in advance and best regards
Klaus Kruse
--
Corpex Internet GmbH * Schauenburgerstrasse 6 * D-20095 Hamburg
Tel: +49 40 822268-0 * Fax: +49 40 822268-100 * http://www.corpex.de/ HRB 78752,
Amtsgericht Hamburg, Sitz: Hamburg, USt.-Id: DE 213001412
Geschäftsführer: Daniel Hanelt
Dear MonetDB users,
I have a question about accessing MonetDB from Oracle via ODBC. I have
Oracle on machine1 (Ubuntu 64-bit) and MonetDB on machine2 (Ubuntu
64-bit). I installed the MonetDB ODBC driver using apt-get install
libmonetdb-client-odbc.
I have done all the needed configuration for the listener, tnsnames,
sqlnet and initservice.ora, and I used unixODBC.
After creating the database link, I get this error when I try to connect MonetDB:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[unixODBC][MonetDB][ODBC Driver 11.11.11]Client unable to establish connection{08001}
ORA-02063: preceding 2 lines from MONETDB
I don't know whether the 64-bit setup needs some special configuration,
or whether it is due to something else.
So I need your help to figure out what the problem is.
Best regards,
Baraa
Hi,
I've been trying to understand if there is any performance benefit in using
primary and foreign key constraints when performing join queries.
I've tested joining a table with 1 million rows with one with 60k. I tried
the same queries with and without constraints and couldn't see a
significant change in execution time.
The tables are not that big, but they are a sample of what we'll be working
with on average.
Should I be seeing any difference? How do PK and FK constraints affect
query performance in MonetDB? Will it only be noticeable with larger
datasets?
Thank you
Hilario Fernandes
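A sketch of the kind of comparison described above, with hypothetical table and column names; the same join can then be run against copies of the tables created without the constraint declarations:

```sql
-- dimension/fact pair with declared constraints (names are made up)
create table dim  (id int primary key, label varchar(50));
create table fact (dim_id int references dim (id), val int);

select d.label, avg(f.val)
from fact f
join dim d on f.dim_id = d.id
group by d.label;
```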
I have created a traditional star-schema in MonetDB, using both Linux and
Windows. We are using Mondrian OLAP on top so it generates the standard
OLAP-y type of query using group-bys of dimensional values. What I’m
finding is that doing joins in MonetDB is extremely expensive, for example:
The following query against a fact table with 261 million rows returns in
9-11 seconds consistently
select date_hour_id/10000 as date_month_id, avg(seconds) from
fact_utilization_hourly group by date_month_id;
returning 41 rows, the query includes a “Trick” to get to a month_year
value because the time_ids are smart keys… yyyyMMddhh.
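The smart-key arithmetic can be sanity-checked outside the database; a minimal Python sketch (the sample key below is hypothetical):

```python
# A smart key of the form yyyyMMddhh packs year, month, day and hour into
# decimal digit positions. Integer division by 10000 strips the trailing
# ddhh digits, leaving yyyyMM -- the same thing date_hour_id/10000 does
# in the SQL above.
def month_of(date_hour_id):
    return date_hour_id // 10000

# hypothetical key: 2013-10-19, hour 12
print(month_of(2013101912))  # -> 201310
```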
With the following query, using standard OLAP-y syntax, the query takes
over 5 mins consistently.
select year_month_value, avg(seconds) from fact_utilization_hourly f join
dim_date_hour d on f.date_hour_id = d.date_hour_id group by
year_month_value;
returning the same 41 rows.
I’m at a loss to figure out why this join causes a 3200% increase
in response time. And this seems to have gotten worse in SP5 from
anecdotal evidence.
Bryan
I am using python-monetdb 11.16.0.7. I created my database farm and
database according to the instructions given below (source:
http://www.monetdb.org/Documentation/monetdbd):
% monetdbd start /home/my-dbfarm
% monetdb create my-first-db
Then I tried to connect to the database using the command below in
Python (https://pypi.python.org/pypi/python-monetdb/). Upon doing so I
get the following error:
>>> import monetdb.sql
>>> connection = monetdb.sql.connect(username="monetdb", password="monetdb", hostname="localhost", database="my-first-db")
  File "/usr/local/lib/python2.7/dist-packages/monetdb/sql/__init__.py", line 28, in connect
    return Connection(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/monetdb/sql/connections.py", line 58, in __init__
    unix_socket=unix_socket)
  File "/usr/local/lib/python2.7/dist-packages/monetdb/mapi.py", line 93, in connect
    self.socket.connect((hostname, port))
  File "/usr/lib/python2.7/socket.py", line 224, in meth
    return getattr(self._sock, name)(*args)
socket.error: [Errno 111] Connection refused
Can someone please help me with this?
Also, is it correct that my my-first-db resides inside my db-farm? I am
not sure what I should do.
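If I read the monetdbd documentation correctly, a freshly created database is put in maintenance mode and has to be released before clients can connect; it is also worth verifying that monetdbd is actually listening on port 50000, the default that python-monetdb connects to. A sketch, reusing the farm path from above:

```
monetdbd start /home/my-dbfarm
monetdb create my-first-db
monetdb release my-first-db   # a new database starts in maintenance mode
monetdb status                # should show my-first-db as available
```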
On 18.10.2013 12:05, Hilario Fernandes wrote:
> I made the same test in SP4 and the results were the same: sum(null) kept
> returning 0 and no results for the case. My tests in SP5 were done with a
> clean database with no data imported, so according to your tests I should
> at least have the correct value for the sum...
>
> These are the queries i've done and what i believe would be the expected
> results:
>
> create table test (number int, name varchar(50));
>
> insert into test values(1,'jonh');
> insert into test values(2,'mike');
> insert into test values(3,'mary');
> insert into test values(null,'austin');
> insert into test values(null,'jonh');
> insert into test values(null,'mary');
>
> select name, sum(number)
> from test
> group by name;
>
> Result:
> austin | 0
> jonh | 1
> mary | 3
> mike | 2
>
> Expected:
> austin | null
> jonh | 1
> mary | 3
> mike | 2
>
Hi Hilario,
I've tested your query using the default branch (11.16.0) and I can confirm
that this bug also exists there. So it probably depends on the type of the
value by which you group. If you try my example from
http://bugs.monetdb.org/show_bug.cgi?id=3388 on a fresh database, the
results should be as expected. So GROUP BY on an int works, but GROUP BY
on a varchar doesn't :-(. I'm not sure whether this is the same bug as
3388 or a different one; it is rather a different one.
Klaudiusz
Hello,
I'm trying to build a query with ORDER BY that always puts nulls last.
I tried to use the "NULLS LAST" expression, but apparently it's not
supported by MonetDB.
So i came up with something like this:
select name, sum(number)
from test
group by name
order by case when sum(number) is null then 1 else 0 end, sum(number) ASC
The thing is, this works when the ordering column (number) is not
aggregated; as soon as I throw the sum in there it returns no results.
The sum of null values seems to be returning 0. Is it possible to change
that behaviour through configuration? Is there a better way to achieve this?
thanks