Hi,
I am getting some strange exceptions using MonetDB JDBC 1.8/1.9, and I
decided to look into the code. While inspecting it, I came across these
strange lines in the MonetConnection.executeQuery(String[] templ,
String query) method
(http://dev.monetdb.org/hg/MonetDB/file/718da8ca0a1a/java/src/nl/cwi/monetdb…).
In my case, these lines are executed for sure when doing a batch
insert. So, suppose that the batch contains the following commands (as
constructed by the MonetStatement.executeBatch() method):
exec 1(1, 957339737330229055);
exec 1(2, 278262503670654331);
exec 1(805306369, 3763943296910752235)
The lines in question take the above commands as a single string, prepend
an 's' character, and append a ';' at the end. The resulting commands,
which are written in the server's socket, are the following:
sexec 1(1, 957339737330229055);
exec 1(2, 278262503670654331);
exec 1(805306369, 3763943296910752235);
First of all, I am not familiar with the internals of JDBC drivers.
Taking this into account, is this how it should be? From a symmetry
point of view, I would have assumed that the correct output is the
following:
sexec 1(1, 957339737330229055);
sexec 1(2, 278262503670654331);
sexec 1(805306369, 3763943296910752235);
That is, it should prepend an 's' character before every exec command.
Finally, are these exec commands (with or without the prepended 's')
specific to MonetDB? Either way, is there any documentation that
explains their meaning?
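To make the asymmetry concrete, here is a toy sketch of the two
behaviours (plain string manipulation, not the actual driver code):
prefixing the concatenated batch once versus prefixing every command:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixSketch {
    // Current behaviour as described above: the 's' is prepended once,
    // to the whole concatenated batch, and a single ';' is appended.
    static String prefixWholeBatch(List<String> commands) {
        return "s" + String.join(";\n", commands) + ";";
    }

    // Expected (symmetric) behaviour: the 's' is prepended to every command.
    static String prefixEachCommand(List<String> commands) {
        return commands.stream()
                       .map(c -> "s" + c + ";")
                       .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList(
                "exec 1(1, 957339737330229055)",
                "exec 1(2, 278262503670654331)",
                "exec 1(805306369, 3763943296910752235)");
        System.out.println(prefixWholeBatch(batch));   // first variant above
        System.out.println("---");
        System.out.println(prefixEachCommand(batch));  // second variant above
    }
}
```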
Thanks a lot,
Babis
Hi,
I'm observing disk I/O that I am unable to explain when evaluating some
queries. I have two datasets: one relatively small, with its BATs taking
up 1.6 MB on disk, and one fairly large, with 283 MB on disk.
According to iostat, when I evaluate the attached query on the small
dataset, 100 MB are written to disk, but none are read. This is on a
Linux machine with 4 GB RAM running the Aug2011-SP3 release.
I don't understand why data is written to disk that
(a) is never read again and
(b) would easily fit into main memory.
I am unable to evaluate the queries on the large dataset because MonetDB
writes so much data that the disk fills up after a few minutes (more
than 22 GB).
Please let me know if I can provide more information.
Cheers,
Viktor
I'm running some experiments with a 1B row fact table (10 columns).
- How can I tell if a column is sorted?
- If I generate files in sorted order and use binary bulk loading, will
MonetDB consider the column sorted? If not, can I force it?
I've been reading the code, but understanding all the assignments to
bn->tsorted is going to take a while. I get the general theme of setting
tsorted during BAT operations, but I've been unable to find where it is
initialized when a column is first accessed.
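Independent of the internal tsorted flag, what I mean by "sorted" is
simply non-decreasing order over the whole column, which can be checked
externally like this (my own helper, nothing MonetDB-specific):

```java
public class SortedCheck {
    // Returns true if the values are in non-decreasing order, i.e. the
    // property a "sorted" flag on a column would assert.
    static boolean isSorted(long[] values) {
        for (int i = 1; i < values.length; i++) {
            if (values[i - 1] > values[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSorted(new long[] {1, 2, 2, 5})); // true
        System.out.println(isSorted(new long[] {1, 3, 2}));    // false
    }
}
```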
TIA,
-david
Hi,
MonetDB 'deduplicates' strings; is there any benchmark that outlines
the performance difference between an ENUM-style lookup table with a low
number of (string) items and the use of a VARCHAR field?
The technical question would be: is it more efficient to do
select * from my_table, my_enum where my_table.my_enum_id = my_enum.id
and my_enum.name = 'Something';
or
select * from my_table where my_enum_varchar = 'something';
I can imagine that the following would outperform a string comparison,
but it implies hardcoded values.
select * from my_table, my_enum where my_table.my_enum_id = hardcoded;
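As a client-side analogy only (a made-up micro-benchmark, not a MonetDB
measurement), this is the kind of difference I am asking about: filtering
by an integer dictionary code versus by string equality over the same
data:

```java
import java.util.Random;

public class EnumFilterSketch {
    // Count rows whose dictionary code equals the target.
    static int countIntHits(int[] codes, int target) {
        int hits = 0;
        for (int c : codes) if (c == target) hits++;
        return hits;
    }

    // Count rows whose string value equals the target.
    static int countStringHits(String[] values, String target) {
        int hits = 0;
        for (String s : values) if (s.equals(target)) hits++;
        return hits;
    }

    public static void main(String[] args) {
        final int n = 5_000_000;
        String[] names = {"Something", "Other", "Third"};
        int[] codes = new int[n];
        String[] strings = new String[n];
        Random rnd = new Random(42);
        for (int i = 0; i < n; i++) {
            codes[i] = rnd.nextInt(names.length);
            strings[i] = names[codes[i]];
        }
        long t0 = System.nanoTime();
        int intHits = countIntHits(codes, 0);
        long t1 = System.nanoTime();
        int strHits = countStringHits(strings, "Something");
        long t2 = System.nanoTime();
        System.out.printf("int filter:    %d hits, %.1f ms%n",
                intHits, (t1 - t0) / 1e6);
        System.out.printf("string filter: %d hits, %.1f ms%n",
                strHits, (t2 - t1) / 1e6);
    }
}
```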
Stefan
Hi,
Just to be very sure;
I know that the DELETE/INSERT strategy for updating the database will
result in a database that keeps growing, because of the lack of a
VACUUM. Is a normal UPDATE statement on fixed-width datatypes performed
in place, or is it also a form of copy-on-write?
Stefan
Hi,
is there a maximum size for an SQL command (e.g., an INSERT INTO
statement) imposed by mclient? The connection is terminated when I
execute several commands from an SQL file, one of which is quite long.
The point where I get the error is at the 8192nd byte (i.e., the last
byte of 8 KB).
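For reference, this is roughly how I reproduce it: a throwaway sketch
(hypothetical table and column names) that builds a single INSERT
statement just past the 8192-byte boundary:

```java
public class LongStatementSketch {
    // Builds an INSERT statement whose length just exceeds minLength
    // bytes, to probe the 8192-byte boundary described above.
    static String buildLongInsert(int minLength) {
        StringBuilder sb = new StringBuilder("INSERT INTO t (v) VALUES ");
        int i = 0;
        while (sb.length() < minLength) {
            if (i++ > 0) sb.append(", ");
            sb.append("(").append(i).append(")");
        }
        return sb.append(";").toString();
    }

    public static void main(String[] args) {
        String stmt = buildLongInsert(8192);
        System.out.println("statement length: " + stmt.length());
    }
}
```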
Babis
Hi,
I get an error message when I retrieve a MAL plan over JDBC after the 250th
call to ResultSet.next(). The relevant part of the stack trace is:
java.lang.AssertionError: block 0 should have been fetched by now :(
at nl.cwi.monetdb.jdbc.MonetConnection$ResultSetResponse.getLine(MonetConnection.java:1585)
at nl.cwi.monetdb.jdbc.MonetResultSet.absolute(MonetResultSet.java:194)
at nl.cwi.monetdb.jdbc.MonetResultSet.relative(MonetResultSet.java:2149)
at nl.cwi.monetdb.jdbc.MonetResultSet.next(MonetResultSet.java:2116)
The call to EXPLAIN works in mclient and returns a MAL plan that is 402 lines
long. I can reproduce the error when I try to create the plan via jdbcclient,
except that it says that block 1 should have been fetched and the exporter
stops after line 160 or so.
However, other SQL queries that also produce more than 250 results work
over JDBC. I'm not sure what the difference is, but those queries only
return columns of integers.
I'm using the Aug2011-SP3 release on OS X 10.6.8. Please let me know if I
should provide other information like the SQL query or the MAL plan.
Cheers,
Viktor
In a large-scale time-series row-store DB (Oracle), I'm accustomed to
partitioning fact tables on time. This has a couple of benefits:
- since queries always include predicates on the time field, partition
elimination can help a lot
- when the time-based partitions map one-to-one to tablespaces, removing
old data is extremely fast (when you've reached the maximum data
retention period, you need to remove data as fast as you can load it).
Are there any features in MonetDB that would facilitate these needs (or
alleviate the problem)?
TIA,
-david
Hi,
I know that when a table is sorted on a column, value look-ups on that
column use a binary search instead of a linear scan. Is there a similar
benefit for joins? E.g., if I have two tables that both contain a sorted
column C and I do an equality join on this column (or a self-join using
two aliases of the same table), I would expect a traditional DBMS to do
a merge join.
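The merge join I have in mind would look roughly like this sketch over
two sorted columns of longs (my own toy code, not MonetDB's
implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class MergeJoinSketch {
    // Equality merge join over two sorted columns: advance the cursor
    // holding the smaller value; on equality, emit the cross product of
    // the runs of equal values. Runs in O(n + m + output size).
    static List<long[]> mergeJoin(long[] left, long[] right) {
        List<long[]> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.length && j < right.length) {
            if (left[i] < right[j]) {
                i++;
            } else if (left[i] > right[j]) {
                j++;
            } else {
                long v = left[i];
                int iEnd = i, jEnd = j;
                while (iEnd < left.length && left[iEnd] == v) iEnd++;
                while (jEnd < right.length && right[jEnd] == v) jEnd++;
                for (int a = i; a < iEnd; a++)
                    for (int b = j; b < jEnd; b++)
                        out.add(new long[] {v, v});
                i = iEnd;
                j = jEnd;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<long[]> result =
                mergeJoin(new long[] {1, 2, 2, 5}, new long[] {2, 2, 3, 5});
        System.out.println(result.size()); // 2*2 matches on 2, one on 5 -> 5
    }
}
```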
Cheers,
Viktor