If, for example, I have 300 million rows, do I need to use something like
Hadoop? Is it possible to integrate MonetDB with Hadoop?
If I use only MonetDB and I have 300 million rows, how much memory and what
processor are recommended as a minimum?
Thanks,
EM
I am using the latest version. MonetDB 5 server v11.15.17 "Feb2013-SP5".
The server and client are running under my account. The files and folders were also created with my account.
Here is a simpler version which also encounters the "failed to import table" message.
CREATE TABLE staging_simple_results(
lookup varchar(20) NULL,
extra_symbol varchar(20) NULL,
measure1 tinyint NULL,
base_temp float NULL
)
;
Here are the contents of the input file, D:\junk\simple_import.csv. The file is also attached.
one,two,3,4.023456787
green,brown,2,8.3456837
This is the statement I issued for the bulk load.
copy into staging_simple_results from ('D:\junk\simple_import.csv')
delimiters ',', '\\r\\n'
;
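One thing worth checking (an assumption on my part, not a confirmed diagnosis: MonetDB normally interprets backslash escapes inside string literals, so '\\r\\n' would denote the four literal characters backslash, r, backslash, n rather than CR+LF, and the record delimiter would never match): try the single-escaped form of the record delimiter:

```sql
-- Sketch: single-escaped record delimiter, so '\r\n' is read as CR+LF.
copy into staging_simple_results from ('D:\junk\simple_import.csv')
delimiters ',', '\r\n'
;
```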
On 2013-11-27 09:03, Colin Foss wrote:
>
> I created a table with 43 columns. The table contains char,
> varchar, and tinyint datatypes. I am running on Windows Server
> 2008 R2 64bit. The file contains two rows and is comma separated.
> The file is BCP output from MSSQL. When I tried to bulk load into
> the table from a file with a COPY INTO command I got an unhelpful
> message.
>
> "failed to import table"
>
> How can I find out why the import failed?
Which version of MonetDB are you using? I seem to remember fixing a
problem where the actual failure message wasn't propagated to the
level where the error was reported.
I created a nontemporary table, but named it "temp" (because, after
messing around with it, I was going to drop it). The table name on
create was changed to "temporary". But the name "temp" still works in
some contexts (apparently all SQL) but not others (\d). See session
history below.
Is this by design?
--Lee
sql>CREATE TABLE geo_boundaries.temp (id int PRIMARY KEY, name text);
operation successful (55.041ms)
sql>SELECT * FROM geo_boundaries.temp;
+----+------+
| id | name |
+====+======+
+----+------+
0 tuples (1.818ms)
sql>INSERT INTO geo_boundaries.temp VALUES (1, 'name');
1 affected row (16.073ms)
sql>SELECT * FROM geo_boundaries.temp;
+------+------+
| id | name |
+======+======+
| 1 | name |
+------+------+
1 tuple (0.977ms)
sql>\d geo_boundaries.temp
table geo_boundaries.temp does not exist
sql>\d
TABLE geo_boundaries.states
TABLE geo_boundaries.temporary
sql>\d geo_boundaries.temporary
CREATE TABLE "geo_boundaries"."temporary" (
"id" INTEGER NOT NULL,
"name" CHARACTER LARGE OBJECT,
CONSTRAINT "temporary_id_pkey" PRIMARY KEY ("id")
);
sql>DROP TABLE geo_boundaries.temp;
operation successful (21.538ms)
sql>\d
TABLE geo_boundaries.states
sql>
--
Lee Hachadoorian
Assistant Professor in Geography, Dartmouth College
http://geospatial.commons.gc.cuny.edu
http://freecity.commons.gc.cuny.edu
Hi, I have a table with 15 million rows covering all of 2013. Every day I
delete the last 2 months and then reload them again. Today the table is very
slow: a single SELECT COUNT takes more than 2 minutes, when it would normally
take a few seconds. I migrated the data to a new table, and that table works
fine and is very fast. What do I need to do to fix this, and how can I repair
the slow table?
Thanks,
Edgar M
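If the slowdown comes from storage that the repeated deletes have not reclaimed (an assumption on my part), rebuilding the table is one way out — essentially what was done manually above by copying into a new table. A sketch, with hypothetical table names:

```sql
-- Rebuild into a fresh table to shed the deleted rows;
-- "measurements" and "measurements_new" are hypothetical names.
CREATE TABLE measurements_new AS SELECT * FROM measurements WITH DATA;
DROP TABLE measurements;
-- Older MonetDB releases may lack ALTER TABLE ... RENAME TO; if so,
-- copy back into a table created under the original name instead.
ALTER TABLE measurements_new RENAME TO measurements;
```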
I created a table with 43 columns. The table contains char, varchar, and tinyint datatypes. I am running on Windows Server 2008 R2 64bit.
The file contains two rows and is comma separated. The file is BCP output from MSSQL.
When I tried to bulk load into the table from a file with a COPY INTO command I got an unhelpful message.
"failed to import table"
How can I find out why the import failed?
Hi,
A few weeks ago we encountered an issue with MonetDB when summing NULLs
grouped by some column. We managed to create a simple test case which
reproduces the scenario:
create table test (number int, name varchar(50));
>
> insert into test values(1,'jonh');
> insert into test values(2,'mike');
> insert into test values(3,'mary');
> insert into test values(null,'austin');
> insert into test values(null,'jonh');
> insert into test values(null,'mary');
>
> select name, sum(number)
> from test
> group by name;
The result of this query is:
> austin | 0
> jonh | 1
> mary | 3
> mike | 2
But the expected result is:
> austin | null
> jonh | 1
> mary | 3
> mike | 2
The version that we are using is Feb2013-SP5.
We found a bug report that could be related to this one, but we are not
sure: http://bugs.monetdb.org/show_bug.cgi?id=3388
Should we open a new issue for this one?
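Until a fix is available, one possible workaround (a sketch, not a confirmed recommendation: it relies on COUNT(number) ignoring NULLs, so a group whose values are all NULL counts 0 and can be mapped back to NULL) would be:

```sql
-- Workaround sketch: restore NULL for groups that contain only NULLs.
SELECT name,
       CASE WHEN COUNT(number) = 0 THEN NULL
            ELSE SUM(number)
       END AS total
FROM test
GROUP BY name;
```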
Thank you,
Pedro Salgueiro
I'm running into a problem with Ocelot, the OpenCL extension of MonetDB.
I apologize if this is not the right forum to post to.
I'm seeing a query failure caused by MonetDB-OpenCL trying to cache an
empty BAT, which ends up calling clCreateBuffer with a zero-size buffer and
failing. Any select with a where clause triggers it.
Has anyone seen this too? It fails whether I use 32-bit or 64-bit OIDs,
Nvidia or AMD. I'm trying to figure out whether it's a real issue or whether
I'm missing something. Any input is greatly appreciated.
The database was generated by TPC-H dbgen (details below). The query I used
was "select c_custkey, c_name from customer where c_custkey < 2", and
here's a trace of the sequence of calls leading to the failure starting
from the MAL statement:
1) X_52 := ocelot.thetasubselect(r1_10,A0,"<",0); (see log below)
2) ocl_select.c::ocl_subselect() line 142: ocl_cacheBAT
3) ocl_memmanager.c::ocl_cacheBAT() line 771: allocateBuffer
4) ocl_memmanager.c::allocateBuffer() line 432: clCreateBuffer
===== Source version and environment =====
1) Ocelot src installed from zip downloaded from monetdb-opencl:
msaecker-monetdb-opencl-b513b35eb84a.zip
2) Ocelot configure options:
./configure --prefix=/usr/local --enable-oid32 --enable-optimize
--disable-testing --disable-developer --disable-debug --disable-assert
--enable-sql --enable-monetdb5 --with-opencl=/usr
3) Operating System: Fedora 17 kernel 3.9.10-100
4) Hardware: NVIDIA GTX570 on Intel i7-3820 w/ 16GB mem
5) OpenCL: Nvidia OpenCL that came with CUDA 5.5 SDK
6) Database: generated w/ TPC-H dbgen using size factor s=0.01
===== operation log follows =====
[root@gen3-2 ocelot]# mclient -ddb001 -lmal
mal>ocelot.init();
mal>ocelot.listDevices();
# Device 0: NVIDIA Corporation - GeForce GTX 570 (default device)
mal>\q
[root@gen3-2 ocelot]# mclient -ddb001 -lsql
sql>\d
TABLE sys.customer
TABLE sys.lineitem
TABLE sys.nation
TABLE sys.orders
TABLE sys.part
TABLE sys.partsupp
TABLE sys.region
TABLE sys.supplier
sql>\d customer
CREATE TABLE "sys"."customer" (
"c_custkey" INTEGER NOT NULL,
"c_name" VARCHAR(25) NOT NULL,
"c_address" VARCHAR(40) NOT NULL,
"c_nationkey" INTEGER NOT NULL,
"c_phone" VARCHAR(15) NOT NULL,
"c_acctbal" DECIMAL(15,2) NOT NULL,
"c_mktsegment" VARCHAR(10) NOT NULL,
"c_comment" VARCHAR(117) NOT NULL,
CONSTRAINT "customer_c_custkey_pkey" PRIMARY KEY ("c_custkey"),
CONSTRAINT "customer_c_nationkey_fkey" FOREIGN KEY ("c_nationkey")
REFERENCES "sys"."nation" ("n_nationkey")
);
sql>select count(*) from customer;
+------+
| L1 |
+======+
| 1500 |
+------+
1 tuple (1.311ms)
sql>select c_custkey, c_name from customer where c_custkey < 2;
+-----------+--------------------+
| c_custkey | c_name |
+===========+====================+
| 1 | Customer#000000001 |
+-----------+--------------------+
1 tuple (1.776ms)
sql>set optimizer='cl_pipe';
operation successful (0.981ms)
sql>select c_custkey, c_name from customer where c_custkey < 2;
invalid/unknown response from server, ignoring output
? #Could not allocate OpenCL buffer of size: 0.
Error caching input BAT.
sql>
sql>explain select c_custkey, c_name from customer where c_custkey < 2;
+------------------------------------------------------------------------------+
| mal
+==============================================================================+
| function user.s9_1{autoCommit=true}(A0:int):void; |
| ocelot.beginQuery(); |
| X_3 := sql.mvc(); |
| X_7 := sql.bind(X_3,"sys","customer","c_custkey",0); |
| X_4:bat[:oid,:oid] := sql.tid(X_3,"sys","customer"); |
| X_51 := ocelot.thetasubselect(X_7,X_4,A0,"<",0); |
| (X_10,r1_10) := sql.bind(X_3,"sys","customer","c_custkey",2); |
| X_52 := ocelot.thetasubselect(r1_10,A0,"<",0); |
| X_13 := sql.bind(X_3,"sys","customer","c_custkey",1); |
| X_53 := ocelot.thetasubselect(X_13,X_4,A0,"<",0); |
| ocelot.sync(X_51); |
| ocelot.sync(X_52); |
| ocelot.sync(X_53); |
| X_15 := sql.subdelta(X_51,X_4,X_10,X_52,X_53); |
| X_17 := sql.projectdelta(X_15,X_7,X_10,r1_10,X_13); |
| X_18 := sql.bind(X_3,"sys","customer","c_name",0); |
| (X_20,r1_23) := sql.bind(X_3,"sys","customer","c_name",2); |
| X_22 := sql.bind(X_3,"sys","customer","c_name",1); |
| X_23 := sql.projectdelta(X_15,X_18,X_20,r1_23,X_22); |
| X_24 := sql.resultSet(2,1,X_17); |
| sql.rsColumn(X_24,"sys.customer","c_custkey","int",32,0,X_17); |
| sql.rsColumn(X_24,"sys.customer","c_name","varchar",25,0,X_23); |
| X_32 := io.stdout(); |
| sql.exportResult(X_32,X_24); |
| ocelot.endQuery(); |
| end s9_1; |
| # querylog.define("explain select c_custkey, c_name from customer where c_custkey < 2;","cl_pipe") |
+------------------------------------------------------------------------------+
27 tuples (1.408ms)
sql>
=======================================
Hi,
I am writing some automation to control monetdbd programmatically from
Python. While the rest of the monetdbd commands work fine, the command:
monetdbd start <farm>
never returns control to Python. I have tried this on both CPython 2.7 and
PyPy 2.0.4.
cnelson@hpvm:~/workspace/cql-db-py/cql $ python store.py
DEBUG:root:'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd
set passphrase=letmein /tmp/test_warehouse'
DEBUG:root:'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd
set control=yes /tmp/test_warehouse'
DEBUG:root:'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd
set discovery=yes /tmp/test_warehouse'
DEBUG:root:'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd
set port=60000 /tmp/test_warehouse'
INFO:root:Starting warehouse at '/tmp/test_warehouse'
cnelson@hpvm:~/workspace/cql-db-py/cql $ ps a
PID TTY STAT TIME COMMAND
31608 pts/3 S 0:00 python store.py
31613 pts/3 Z 0:00 [monetdbd] <defunct>
You can see that the process appears to have finished, but never returns
control to Python. If I run the 'stop' command from the prompt, everything
clears up and finishes:
cnelson@hpvm:~/workspace/cql-db-py/cql $
/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd stop
/tmp/test_warehouse
cnelson@hpvm:~/workspace/cql-db-py/cql $
DEBUG:root:'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd
start /tmp/test_warehouse'
INFO:root:Stopping warehouse at '/tmp/test_warehouse'
ERROR:root:Unable to run command:
'/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd stop
/tmp/test_warehouse'
DEBUG:root:unable to open merovingian.pid: No such file or directory
ERROR:root:None
Traceback (most recent call last):
File "store.py", line 198, in <module>
w.stop()
File "store.py", line 192, in stop
self._control_warehouse("stop")
File "store.py", line 98, in _control_warehouse
raise subprocess.CalledProcessError(p.returncode, cmd, out)
subprocess.CalledProcessError: Command
'['/home/cnelson/workspace/cql-db-py/cql/storage_engine/bin/monetdbd',
'stop', '/tmp/test_warehouse']' returned non-zero exit status 1
The error here occurs because the next thing the script tries to do is
'stop' the monetdbd instance, which of course I had already done by hand. The
point is that once I stop the monetdbd instance out of band, the original
'monetdbd start' call finally returns control to Python, which indicates
that it is probably not a Python problem.
Any ideas?
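This looks like ordinary pipe-inheritance behavior rather than a MonetDB bug (an assumption on my part): a daemonizing command forks a child that inherits the pipe Python opened for its stdout, and Python's read on that pipe only sees EOF once every holder of the write end exits. The sketch below reproduces the effect with a stand-in shell command (hypothetical, standing in for `monetdbd start <farm>`) and shows a workaround: don't hand the daemon a pipe it can inherit.

```python
import os
import subprocess
import time

# Stand-in for a daemonizing command such as `monetdbd start <farm>`:
# the shell exits immediately but leaves a background child that
# inherits its stdout.
cmd = ["sh", "-c", "sleep 2 & echo started"]

# Naive approach: communicate() reads the pipe until EOF, and EOF only
# arrives once *every* process holding the write end exits -- including
# the daemonized child -- so Python appears to hang.
t0 = time.time()
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out, _ = p.communicate()
blocked_for = time.time() - t0  # roughly 2s, although sh exited at once

# Workaround: send output to /dev/null (or a log file) instead of a
# pipe, and wait only for the immediate child.
t1 = time.time()
with open(os.devnull, "wb") as devnull:
    rc = subprocess.call(cmd, stdout=devnull)
returned_after = time.time() - t1  # returns almost immediately
```

The same pattern applied to the real `monetdbd start` call should let it return control promptly, since monetdbd's daemon process no longer keeps Python's pipe open.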
Hello all,
I'm trying to use a LIMIT clause in a sub-query but MonetDB doesn't seem
to be able to handle it. A very simple example query would be:
SELECT COUNT(*) AS gt_result FROM (
SELECT * FROM nyc_buildings LIMIT 10
) AS temp;
The result of this query should be 10 and works fine in PostgreSQL.
However MonetDB gives the following error:
syntax error, unexpected LIMIT, expecting INTERSECT or EXCEPT or UNION
or ')' in: "<original query>"
Is this a bug in MonetDB? Any way to work around it?
Best regards,
Dennis Pallett
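A possible workaround (untested by me on Feb2013, so treat it as a sketch: it assumes the parser accepts LIMIT inside a common table expression even though it rejects it inside a derived table) is to lift the subquery into a WITH clause:

```sql
-- Workaround sketch: move the LIMIT subquery into a CTE.
WITH temp AS (
    SELECT * FROM nyc_buildings LIMIT 10
)
SELECT COUNT(*) AS gt_result FROM temp;
```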