I am trying to bulk-load a list of values into a single-column (primary key)
table. The only reason for the primary key is to remove duplicates: I can't
deduplicate the list in memory, because the file is far larger than my memory.
I use MonetDB's COPY INTO command
(http://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto),
but I don't want it to fail when there is a duplicate. I want it
to insert everything that is not a duplicate and skip the duplicates.
Is there any way to do that with MonetDB? Any other way?
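The workaround I'm considering, in case COPY INTO itself can't skip duplicates, is to load into an unconstrained staging table first and deduplicate on insert. A sketch (table and column names are made up):

```sql
-- staging table without constraints, so COPY INTO cannot fail on duplicates
CREATE TABLE staging (k VARCHAR(255));
CREATE TABLE target (k VARCHAR(255) PRIMARY KEY);

COPY INTO staging FROM '/path/to/file.csv';

-- keep one copy of each value not already present in target
INSERT INTO target
SELECT DISTINCT s.k
FROM staging s
WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.k = s.k);

DROP TABLE staging;
```

I don't know whether the DISTINCT pass is feasible at this data size, though, which is why I'm asking.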
Thank you in advance!
Hi,
I am trying to implement a function that splits a varchar and returns a
table with the split values.
This is what I have so far, but I don't know how to complete the
implementation.
CREATE FUNCTION EV_SPLIT(stringtosplit varchar(255), chartospliton char)
RETURNS TABLE (id int, val varchar(255), hashval BIGINT)
BEGIN
--if (EV_TABLEEXISTS('tmp_evsplitres') = 0)
--then
--  create local temporary table tmp_evsplitres (id int, val varchar(255), hashval BIGINT);
--end if;
declare idx int, tmpVal varchar(255);
set idx = 0;
--start transaction;
while (position(chartospliton in stringtosplit) > 0)
do
  -- substring is 1-based: take everything before the separator
  set tmpVal = substring(stringtosplit, 1, position(chartospliton in stringtosplit) - 1);
  insert into tmp_evsplitres values (idx, tmpVal, hash(tmpVal));
  -- drop the consumed prefix, including the separator itself
  set stringtosplit = substring(stringtosplit, position(chartospliton in stringtosplit) + 1, length(stringtosplit));
  set idx = idx + 1;
end while;
insert into tmp_evsplitres values (idx, stringtosplit, hash(stringtosplit));
RETURN TABLE (
  select * from tmp_evsplitres
);
--commit;
END;
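For reference, assuming the temporary-table issue above is solved, I intend to call it like this:

```sql
-- split a comma-separated string into rows
SELECT * FROM EV_SPLIT('a,b,c', ',');
-- should yield rows (0, 'a', ...), (1, 'b', ...), (2, 'c', ...)
```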
Hi everyone,
I have only been using MonetDB for a few weeks now and I have come across a very unexpected result when using User Defined Functions in C.
I have a query which does essentially this:
CREATE TABLE small_table (
  col1 INT,
  p FLOAT
);
CREATE TABLE big_table (
  col1 INT,
  col2 INT,
  p FLOAT
);

SELECT big.col1, 1 + SUM(MY_FUNCTION(big.p, small.p))
FROM small_table small
INNER JOIN big_table big ON big.col2 = small.col1
GROUP BY big.col1;
My function is f(x,y) = 0.7213475*(x*ln(x/(x+y)) + y*ln(y/(x+y)))
I wrote an optimised version of this function using sse intrinsics in C and compiled it into MonetDB.
To my surprise, I discovered that writing this function inline in the query like so:
1+SUM(0.7213475*(big.p*LOG(big.p/(big.p+small.p))+small.p*LOG(small.p/(big.p+small.p))))
is a lot faster!
The query takes approx 2 seconds with the C version and less than 0.5 seconds with the inline version in SQL!
Can anyone explain this strange result?
Best regards,
Alastair
Hello,
I posted about my issues with dumping a database from MonetDB a while ago.
The issue is still ongoing. It reduces to the fact that no user other than
the default monetdb user has enough rights to dump a database.
I played around with the various tables regarding users and schemas
(sys.auths, sys.user_role), but it still fails on 'sys.db_user_info':
msqldump -u dump test
password:
-- msqldump dump database Tue Dec 10 17:19:43 2013
-- MonetDB v11.15.7 (Feb2013-SP2), 'mapi:monetdb://dev01.picalike.corpex-kunden.de:50000/test'
START TRANSACTION;
MAPI = (dump) /tmp/.s.monetdb.50000
QUERY = SELECT "name" FROM "sys"."auths" WHERE "name" NOT IN (SELECT "name" FROM "sys"."db_user_info") AND "grantor" <> 0 ORDER BY "name"
ERROR = !SELECT: access denied for dump to table 'sys.db_user_info'
So, my question to all subscribers: has anybody managed to create a user
other than the default one with full rights to dump a database, and if so,
how? Any advice is greatly appreciated.
Best regards
Klaus Kruse
--
Corpex Internet GmbH * Schauenburgerstrasse 6 * D-20095 Hamburg
Tel: +49 40 822268-0 * Fax: +49 40 822268-100 * http://www.corpex.de/ HRB 78752,
Amtsgericht Hamburg, Sitz: Hamburg, USt.-Id: DE 213001412
Geschäftsführer: Daniel Hanelt
I need to update multiple columns (~250) in one table from another based on
a two-column key. In Postgres I would write this using the FROM clause of
UPDATE, but since MonetDB doesn't support a FROM clause there, I've
successfully updated one column with:
UPDATE census.acs2010_5yr
SET b07401001 = (
SELECT CAST(b07401001 AS int)
FROM acs2010_5yr.tmp_seq0001
WHERE acs2010_5yr.stusab = tmp_seq0001.stusab AND acs2010_5yr.logrecno =
tmp_seq0001.logrecno
);
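For comparison, the UPDATE ... FROM form I would have used in Postgres (shown only to illustrate what I mean; MonetDB rejects it) looks roughly like:

```sql
-- Postgres syntax, not valid in MonetDB
UPDATE census.acs2010_5yr AS a
SET b07401001 = CAST(t.b07401001 AS int)
FROM acs2010_5yr.tmp_seq0001 AS t
WHERE a.stusab = t.stusab
  AND a.logrecno = t.logrecno;
```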
My question is whether there is any benefit to writing a multiple column
update into a single UPDATE statement:
UPDATE target_table
SET column1 = (SELECT column1 FROM source_table
               WHERE target_table.key = source_table.key),
    column2 = (SELECT column2 FROM source_table
               WHERE target_table.key = source_table.key),
    ...,
    columnN = (SELECT columnN FROM source_table
               WHERE target_table.key = source_table.key);
On Postgres I believe the multi-column update using a table join would only
have to scan each table once, but given the necessity of sub-selects in
MonetDB, what is the backend doing? Is it already doing a lookup once per
row in the single-column case, and then once per row * column in the
multi-column case?
Best,
--Lee
--
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://freecity.commons.gc.cuny.edu/
I am trying to change the default data directory of MonetDB. I installed
MonetDB using the Ubuntu package; by default the data is stored in:
/var/lib/monetdb
I am running out of space and would like to migrate the data to another
folder. Does anyone know how to do that? I would welcome a solution that
doesn't involve compiling MonetDB from source...
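The approach I'm considering (untested, and the service name is a guess for the Ubuntu package) is to stop the daemon, move the dbfarm, and leave a symlink at the old location:

```shell
# stop the MonetDB daemon first (the service name may differ per package)
sudo service monetdb5-sql stop

# move the dbfarm to a bigger disk and symlink the old path to it
sudo mv /var/lib/monetdb /data/monetdb
sudo ln -s /data/monetdb /var/lib/monetdb

sudo service monetdb5-sql start
```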
Thanks in advance
Remi
Is there a way to list the tables in a particular schema, or even in all
schemas? \d and \dt seem to return only user tables in the sys schema.
Also, what does \ds return?
I have
\dS system tables
\dt user tables
\dv views
\ds ?????
\df functions
\dn schemas
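A catalog query might sidestep the issue; something like this (assuming I'm reading the sys.tables and sys.schemas columns correctly) should list user tables across all schemas:

```sql
-- list non-system tables in every schema
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables t
JOIN sys.schemas s ON t.schema_id = s.id
WHERE NOT t.system
ORDER BY s.name, t.name;
```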
Best,
--Lee
--
Lee Hachadoorian
Asst Professor of Geography, Dartmouth College
http://geospatial.commons.gc.cuny.edu/
I need to bulk-load some files with characters like ñ and é, which I would
do using the LATIN1 encoding. I read that bulk loading from a file assumes
UTF-8, but that bulk loading from STDIN uses the encoding of mclient.
I would like to know:
(a) How do I determine and change the client encoding?
(b) What is an example of piping STDIN to mclient?
Regarding the latter, I tried:
cat /my/file.csv | mclient -d mydb -u monetdb -s "COPY INTO mytable
FROM (STDIN) USING DELIMITERS ',', '\n' , '"' NULL AS '';"
I wasn't sure how the embedded " would be handled, but the terminal
basically became unresponsive until I Ctrl-C'ed out of it. Note that the
file was similar in size to several others (without non-UTF-8 characters)
that loaded in seconds within mclient.
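If mclient really does assume the terminal's encoding, then converting on the fly might work. This is only a sketch of what I have in mind (untested; the file and table names are mine), with the quote character escaped as \" so the shell doesn't close the -s string early:

```shell
# convert LATIN1 to UTF-8 before the data reaches mclient
iconv -f LATIN1 -t UTF-8 /my/file.csv | \
  mclient -d mydb -u monetdb \
    -s "COPY INTO mytable FROM STDIN USING DELIMITERS ',','\n','\"' NULL AS '';"
```

The unescaped " in my original attempt may be why the terminal hung: the shell probably swallowed it and kept waiting for more input.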
Best,
--Lee
--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://geospatial.commons.gc.cuny.edu/
http://freecity.commons.gc.cuny.edu/