Hi,
Please verify this: when you export a table to a file with 'COPY ... INTO ...'
and then terminate the client, the output file is still being
created even though the client was terminated. Thanks. Dariusz.
Hi,
I'm trying to build a database with around 1B rows. I'm testing on a 64-bit
AMD Opteron 2200 dual-core x 2 CPU system with 16 GB memory and 16 GB swap,
connected to external fiber storage.
I've got a set of data files, around 8 GB each, with 100M records per file.
I'm running the following commands.
copy 300000000 records into test_table from
'datafile-1.csv','datafile-2.csv','datafile-3.csv';
copy 300000000 records into test_table from
'datafile-4.csv','datafile-5.csv','datafile-6.csv';
copy 300000000 records into test_table from
'datafile-7.csv','datafile-8.csv','datafile-9.csv';
The first command completes in around 15 minutes. The second also completes in
around 15 minutes. However, the third command forces the system to swap and at
some point causes the system to crash.
I've also tried NOT giving the record count, i.e. COPY INTO test_table
FROM 'datafile-X.csv', which was not only slower but also resulted in the
same swap-outage crash.
I need to load around 3 billion records. Any thoughts on what might be the
problem here?
Thanks in advance,
Regards,
Ukyo
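[Editor's note: a back-of-envelope check, using only the figures quoted in the mail above; this is my own arithmetic, not a diagnosis from the thread. The nine CSV files together are far larger than RAM plus swap, which is at least consistent with the third COPY pushing the machine into swapping.]

```python
# Rough sizing for the load described above (figures taken from the mail).
file_size_gb = 8     # each data file is ~8 GB
files_total = 9      # datafile-1.csv .. datafile-9.csv
ram_gb = 16
swap_gb = 16

data_gb = file_size_gb * files_total
print(f"total raw CSV data: {data_gb} GB")              # 72 GB
print(f"RAM + swap available: {ram_gb + swap_gb} GB")   # 32 GB
print(f"data exceeds RAM + swap by: {data_gb - (ram_gb + swap_gb)} GB")
```

The loaded table will of course be smaller than the raw CSV, so this is only an upper bound on the working set.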
Hi,
Can somebody help me with this one? I was running a very long query and
decided to terminate that job. Now when I try to start the client, all I
see is a blinking cursor. How do I get the SQL prompt back? Thanks. Dariusz.
Hi,
Is there a limit on how many distinct values you can have in a column?
I've created a table with 100,000,000 records in which one column (VARCHAR(16))
has 2,600,000 distinct values, and when I run a count on that
column it runs forever. When I watch memory usage, it climbs to a few
gigabytes, then drops to almost nothing, then climbs to a few
gigabytes again and drops. Thanks. Dariusz.
Thanks Stefan,
> with loading ~1.5 GB on a 32 bit machine, there is a reasonable chance that
> the loading process (that needs to be able to address all data concurrently)
> comes close to or even hits the 32-bit (i.e., max. 4 GB, possibly only 2 GB)
> address space limit.
Does this limitation refer to the size of the data that I am adding per COPY INTO (which was 0.2 GB) or to the final size of the table (which was 1.9 GB)?
Assuming the latter, this would give a hard limit for table size under Win32. If this is the case, a workaround could be partitioning the data into several tables. What is more efficient in MonetDB: chunks of rows at full column width, or chunks of columns at full row length?
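[Editor's note: a quick per-row estimate, using the column widths of the databasetest table defined later in this thread (1 INTEGER, 30 SMALLINT, 20 DOUBLE; no string heaps, compression or overhead assumed). It suggests the ~9 Mio row crash point sits right at the 2 GB user address space of a 32-bit Windows process, and matches the reported 1.9 GB dbfarm size.]

```python
# Per-row size of databasetest: 1 INTEGER + 30 SMALLINT + 20 DOUBLE columns.
row_bytes = 1 * 4 + 30 * 2 + 20 * 8   # = 224 bytes per row
rows = 9_000_000                      # approximate row count at which the crash occurs
total_gb = row_bytes * rows / 2**30
print(f"{row_bytes} bytes/row, ~{total_gb:.2f} GB at {rows:,} rows")
# prints: 224 bytes/row, ~1.88 GB at 9,000,000 rows
```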
> Would it be possible that you could provide us with your data?
> That would help us a lot to analyze, locate and hopefully fix the crash...
I'll give you access by separate mail; you can also create the test data file using the R script snippet in my previous mail.
Have a nice weekend
Jens
Hi,
The MonetDB-SQL server does not release the memory after
- loading data into a table (loaded 40 GB)
- terminating the client session (I closed the window while waiting for a count to complete)
Also, after restarting the server it takes a very long time for the client
to get the prompt.
Why is that? Thanks. Dariusz.
Hi,
I experience reproducible crashes of the MonetDB5 SQL server under 32-bit Windows when COPY INTO, in batches of 1 million rows, exceeds ~9 million rows in total, which is ~1.5 GB DB size.
Are there any settings to avoid that? Is this a bug? Is MonetDB size limited to available RAM? Or is this a known limit just under Win32?
Logs of SQL client error messages and details for replication are below.
Regards
Jens Oehlschlägel
Lenovo Thinkcentre with 3 GB RAM under Windows XP Professional SP2
RAM consumption of m5server.exe during load ~300MB according to task manager
Free RAM 2.1 GB
Size of dbfarm finally 1.9 GB
DiskFreeSpace 1.8 TB
[System Summary]
Item Value
OS Name Microsoft Windows XP Professional
Version 5.1.2600 Service Pack 2 Build 2600
OS Manufacturer Microsoft Corporation
System Name GEMUN-38396-10-
System Manufacturer LENOVO
System Model 6073AA7
System Type X86-based PC
Processor x86 Family 6 Model 15 Stepping 11 GenuineIntel ~2992 Mhz
Processor x86 Family 6 Model 15 Stepping 11 GenuineIntel ~2992 Mhz
BIOS Version/Date LENOVO 2RKT41AUS, 3/20/2008
SMBIOS Version 2.5
Windows Directory C:\WINDOWS
System Directory C:\WINDOWS\system32
Boot Device \Device\HarddiskVolume1
Locale Germany
Hardware Abstraction Layer Version = "5.1.2600.3023 (xpsp_sp2_qfe.061030-0020)"
Time Zone W. Europe Standard Time
Total Physical Memory 3,072.00 MB
Available Physical Memory 1.70 GB
Total Virtual Memory 2.00 GB
Available Virtual Memory 1.96 GB
Page File Space 5.79 GB
Page File D:\pagefile.sys
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 67515.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 1255.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 79141.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 6041.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 74227.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 6943.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 73084.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 6599.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 67158.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 6520.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 18529.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 65106.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 5184.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 71136.000 msec 0 rows
sql>COMMIT;
0 tuples
Timer 9104.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 21021.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 69761.000 msec 0 rows
sql>COMMIT;
0 tuples
!MALException: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: cannot mmap(): name=15\1511, ext=tail
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: failed name=15\1511, ext=tail
!ERROR: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: cannot mmap(): name=15\1511, ext=tail
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: failed name=15\1511, ext=tail
!ERROR: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 14680064, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: cannot mmap(): name=15\1511, ext=tail
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: failed name=15\1511, ext=tail
!ERROR: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 34603008, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 34603008, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: cannot mmap(): name=11\1117, ext=tail
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: failed name=11\1117, ext=tail
!ERROR: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 69206016, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 70254592, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1558, 2, 0, 0, 70254592, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: MT_mmap: MapViewOfFileEx(1554, 2, 0, 0, 70254592, 0) failed
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: cannot mmap(): name=15\1534, ext=tail
!OS: Not enough storage is available to process this command.
!ERROR: GDKload: failed name=15\1534, ext=tail
0 tuples
Timer 9276.000 msec 0 rows
sql>START TRANSACTION;
0 tuples
Timer 0.000 msec 0 rows
sql>COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/d
atabasetest1mio.csv' USING DELIMITERS ',';
Rows affected 1000000
Timer 96373.000 msec 0 rows
sql>COMMIT;
MAPI = voc@localhost:50000
ACTION= read_line
QUERY = COMMIT;
ERROR = Connection terminated
Timer 44089.000 msec 0 rows
-- restarting the crashed server looks good, but
# MonetDB server v5.8.2, based on kernel v1.26.2
# Serving database 'demo', using 1 thread
# Compiled for i686-pc-win32/32bit with 32bit OIDs dynamically linked
# Copyright (c) 1993-July 2008 CWI.
# Copyright (c) August 2008- MonetDB B.V., all rights reserved
# Visit http://monetdb.cwi.nl/ for further information
#warning: please don't forget to set your vault key!
#(see D:\MonetDB\MonetDB5\etc\monetdb5.conf)
# Listening for connection requests on mapi:monetdb://127.0.0.1:50000/
# MonetDB/SQL module v2.26.2 loaded
-- when starting mclient the server says
>!ERROR: MT_mmap: MapViewOfFileEx(1528, 2, 0, 0, 69206016, 0) failed
!OS: Not enough storage is available to process this command.
#GDKmmap(69206016) fail => BBPtrim(enter) usage[mem=165285248,vm=1495793664]
#GDKmmap(69206016) fail => BBPtrim(ready) usage[mem=165284840,vm=574619648]
#GDKmmap: recovery ok. Continuing..
-- replication instructions
-- log in to freshly installed MonetDB and create user
CREATE USER "voc" WITH PASSWORD 'voc' NAME 'VOC Explorer' SCHEMA "sys";
CREATE SCHEMA "voc" AUTHORIZATION "voc";
ALTER USER "voc" SET SCHEMA "voc";
-- log in as voc and execute the following statements
CREATE TABLE databasetest (
"id" INTEGER
, "ubyte1" SMALLINT
, "ubyte2" SMALLINT
, "ubyte3" SMALLINT
, "ubyte4" SMALLINT
, "ubyte5" SMALLINT
, "ubyte6" SMALLINT
, "ubyte7" SMALLINT
, "ubyte8" SMALLINT
, "ubyte9" SMALLINT
, "ubyte10" SMALLINT
, "ubyte11" SMALLINT
, "ubyte12" SMALLINT
, "ubyte13" SMALLINT
, "ubyte14" SMALLINT
, "ubyte15" SMALLINT
, "smallint1" SMALLINT
, "smallint2" SMALLINT
, "smallint3" SMALLINT
, "smallint4" SMALLINT
, "smallint5" SMALLINT
, "smallint6" SMALLINT
, "smallint7" SMALLINT
, "smallint8" SMALLINT
, "smallint9" SMALLINT
, "smallint10" SMALLINT
, "smallint11" SMALLINT
, "smallint12" SMALLINT
, "smallint13" SMALLINT
, "smallint14" SMALLINT
, "smallint15" SMALLINT
, "float1" DOUBLE
, "float2" DOUBLE
, "float3" DOUBLE
, "float4" DOUBLE
, "float5" DOUBLE
, "float6" DOUBLE
, "float7" DOUBLE
, "float8" DOUBLE
, "float9" DOUBLE
, "float10" DOUBLE
, "float11" DOUBLE
, "float12" DOUBLE
, "float13" DOUBLE
, "float14" DOUBLE
, "float15" DOUBLE
, "float16" DOUBLE
, "float17" DOUBLE
, "float18" DOUBLE
, "float19" DOUBLE
, "float20" DOUBLE
)
;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
START TRANSACTION;
COPY 1000000 OFFSET 2 RECORDS INTO "voc"."databasetest" FROM 'F:/Periscope/databasetest1mio.csv' USING DELIMITERS ',';
COMMIT;
-- the file databasetest1mio.csv can be created with the following R script
# this is an R comment and you find R at CRAN.r-project.org
filename <- "F:/Periscope/databasetest1mio.csv"
# create 1 Mio rows in 1,000 batches of 1,000 rows; total ~174 MB csv file, takes ~2 minutes
system.time({
b <- 1000L # number of batches
n <- 1000L # batch size
csvfile <- file(filename, open="w")
for (i in 1:b){
x <- data.frame(
id = ((i-1L)*n+1L):(i*n) # 4-byte integer record identifier
, ubyte1 = sample(0:35, n, TRUE, prob=NULL) # unsigned byte columns (1 byte intended; stored as SMALLINT in the table)
, ubyte2 = sample(0:1, n, TRUE, prob=NULL)
, ubyte3 = sample(0:3, n, TRUE, prob=NULL)
, ubyte4 = sample(0:9, n, TRUE, prob=NULL)
, ubyte5 = sample(0:24, n, TRUE, prob=NULL)
, ubyte6 = sample(0:99, n, TRUE, prob=NULL)
, ubyte7 = sample(-127:127, n, TRUE, prob=NULL)
, ubyte8 = sample(0:1, n, TRUE, prob=c(0.75, 0.25))
, ubyte9 = sample(0:1, n, TRUE, prob=c(0.9,0.1))
, ubyte10 = sample(0:1, n, TRUE, prob=c(0.96,0.04))
, ubyte11 = sample(0:1, n, TRUE, prob=c(0.99,0.01))
, ubyte12 = sample(0:1, n, TRUE, prob=c(0.996,0.004))
, ubyte13 = sample(0:1, n, TRUE, prob=c(0.999,0.001))
, ubyte14 = sample(0:1, n, TRUE, prob=c(0.9996,0.0004))
, ubyte15 = sample(0:1, n, TRUE, prob=c(0.9999,0.0001))
, smallint1 = sample(1:4000, n, TRUE, prob=NULL) # smallint columns (2 bytes)
, smallint2 = sample(1:2, n, TRUE, prob=NULL)
, smallint3 = sample(1:4, n, TRUE, prob=NULL)
, smallint4 = sample(1:10, n, TRUE, prob=NULL)
, smallint5 = sample(1:25, n, TRUE, prob=NULL)
, smallint6 = sample(1:100, n, TRUE, prob=NULL)
, smallint7 = sample(1:256, n, TRUE, prob=NULL)
, smallint8 = sample(1:1000, n, TRUE, prob=NULL)
, smallint9 = sample(1:32000, n, TRUE, prob=NULL)
, smallint10 = sample(0:1, n, TRUE, prob=c(0.75, 0.25))
, smallint11 = sample(0:1, n, TRUE, prob=c(0.9,0.1))
, smallint12 = sample(0:1, n, TRUE, prob=c(0.96,0.04))
, smallint13 = sample(0:1, n, TRUE, prob=c(0.99,0.01))
, smallint14 = sample(0:1, n, TRUE, prob=c(0.996,0.004))
, smallint15 = sample(0:1, n, TRUE, prob=c(0.999,0.001))
, float1 = round(runif(n, 0, 100), 1) # float columns (4 bytes intended; stored as DOUBLE in the table)
, float2 = round(runif(n, 0, 100), 1)
, float3 = round(runif(n, 0, 100), 1)
, float4 = round(runif(n, 0, 100), 1)
, float5 = round(runif(n, 0, 100), 1)
, float6 = round(runif(n, 0, 100), 1)
, float7 = round(runif(n, 0, 100), 1)
, float8 = round(runif(n, 0, 100), 1)
, float9 = round(runif(n, 0, 100), 1)
, float10 = round(runif(n, 0, 100), 1)
, float11 = round(runif(n, 0, 100), 1)
, float12 = round(runif(n, 0, 100), 1)
, float13 = round(runif(n, 0, 100), 1)
, float14 = round(runif(n, 0, 100), 1)
, float15 = round(runif(n, 0, 100), 1)
, float16 = round(runif(n, 0, 100), 1)
, float17 = round(runif(n, 0, 100), 1)
, float18 = round(runif(n, 0, 100), 1)
, float19 = round(runif(n, 0, 100), 1)
, float20 = round(runif(n, 0, 100), 1)
)
write.table(x, file=csvfile, row.names=FALSE, col.names=(i==1L), dec=".", sep=",") # 'append' is ignored when writing to an open connection
}
close(csvfile)
})
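[Editor's note: for readers without R, a rough Python equivalent of the generator above — my own sketch, not part of the original instructions. It writes a CSV with the same 51-column layout (id, 15 "ubyte", 15 "smallint", 20 "float" columns); the per-column value ranges are crude stand-ins for the R script's individual distributions, and the defaults produce a small file rather than 1 Mio rows.]

```python
import csv
import random

def write_testfile(filename, batches=10, batch_size=100, seed=0):
    """Write a CSV mimicking databasetest1mio.csv:
    id + 15 'ubyte' + 15 'smallint' + 20 'float' columns = 51 columns."""
    random.seed(seed)
    header = (["id"]
              + [f"ubyte{i}" for i in range(1, 16)]
              + [f"smallint{i}" for i in range(1, 16)]
              + [f"float{i}" for i in range(1, 21)])
    row_id = 0
    with open(filename, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(header)
        for _ in range(batches):
            for _ in range(batch_size):
                row_id += 1
                row = ([row_id]
                       + [random.randint(0, 99) for _ in range(15)]      # stand-in for ubyte ranges
                       + [random.randint(1, 32000) for _ in range(15)]   # stand-in for smallint ranges
                       + [round(random.uniform(0, 100), 1) for _ in range(20)])
                w.writerow(row)
    return row_id

# write_testfile("databasetest_small.csv")  # writes 1,000 rows + header
```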