hi, monetdb does not support SUM() OVER commands.. does anyone have a
smart alternative to implement this? thanks
SELECT
batch_id,
job_count,
SUM(job_count) OVER (ORDER BY duration) as cumjobs,
SUM(job_count) over () as totjobs,
duration
FROM
test_data ;
Hi,
As per MonetDB Solution suggestion, we are asking here about this topic.
We are studying how MonetDB store its database on disk.
So far we were able to understand pretty much all the files (BAT,
journal, BBP.dir, heap, hash, imprints, and so on). We have custom tools
to decode and dump them (experimental at this point).
There is one thing however that is unclear: we thought that "delta BAT",
which are the BAT assigned to an SQL tables to list the OIDs of rows
that were deleted (named "D_<schema>_<table>" internaly), would contains
only *uniques* values (since you can never delete twice the same row).
But on several databases that we are running, we found that some of
theses BATs contain duplicates. Lot of duplicates actually. Especially
on D_sys__columns and D_sys__tables BATs (respectively for the
sys._columns and sys._tables system tables). Some databases do not have
this "issue" (while they all have the same schema, and process the same
kind of data as the other ones).
Can someone explain if duplicates are expected in theses BATs ?
Here is an excerpt of D_sys__tables:
# hexdump -C 02/210.tail
00000000 2e 00 00 00 00 00 00 00 2f 00 00 00 00 00 00 00 |......../.......|
00000010 30 00 00 00 00 00 00 00 31 00 00 00 00 00 00 00 |0.......1.......|
00000020 32 00 00 00 00 00 00 00 33 00 00 00 00 00 00 00 |2.......3.......|
00000030 34 00 00 00 00 00 00 00 35 00 00 00 00 00 00 00 |4.......5.......|
00000040 36 00 00 00 00 00 00 00 37 00 00 00 00 00 00 00 |6.......7.......|
00000050 42 00 00 00 00 00 00 00 43 00 00 00 00 00 00 00 |B.......C.......|
00000060 44 00 00 00 00 00 00 00 45 00 00 00 00 00 00 00 |D.......E.......|
00000070 46 00 00 00 00 00 00 00 47 00 00 00 00 00 00 00 |F.......G.......|
00000080 48 00 00 00 00 00 00 00 49 00 00 00 00 00 00 00 |H.......I.......|
00000090 4a 00 00 00 00 00 00 00 4b 00 00 00 00 00 00 00 |J.......K.......|
000000a0 4c 00 00 00 00 00 00 00 4d 00 00 00 00 00 00 00 |L.......M.......|
000000b0 4e 00 00 00 00 00 00 00 4f 00 00 00 00 00 00 00 |N.......O.......|
000000c0 50 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |P.......P.......|
000000d0 51 00 00 00 00 00 00 00 50 00 00 00 00 00 00 00 |Q.......P.......|
000000e0 51 00 00 00 00 00 00 00 52 00 00 00 00 00 00 00 |Q.......R.......|
000000f0 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000100 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
00000110 50 00 00 00 00 00 00 00 51 00 00 00 00 00 00 00 |P.......Q.......|
00000120 52 00 00 00 00 00 00 00 53 00 00 00 00 00 00 00 |R.......S.......|
...
You can clearly see duplicates OID (the first one being 0x50 at 0xc8).
Its entry in BBP.dir (split into sections):
136 32 tmp_210 tmpr_210 02/210 610523782 2 171807 0 0 171807 172032 0 0 0 0
void 0 1 1793 0 0 0 0 0 1000651 0 0 0
oid 8 0 1024 24 25 27 1 46 773603235 1374456 1376256 1
We are using: MonetDB 5 server v11.21.14 (64-bit, 64-bit oids, 128-bit integers).
--
Frédéric Jolliton
Sécuractive
I noticed a difference between
- a hash-based string selection from a persistent, read-only table
- a hash-based join on the same table and same column
They both build a hash on the same string column (verified with gdb), but
the select can reuse the hash (second call is almost free), while the join
keeps rebuilding the hash.
Is this expected?
Roberto
Hi,
Last year, I was looking for the MAL definition of group_concat and
Nik Schuiling gave me the MAL definition which is below.
We compile MonetDB from source and this definition was working on
MonetDB-11.19.9 (Oct2014-SP2). Since this version, there has been a
number of newer versions of MonetDB. So, I was trying to upgrade our
system to the latest version of MonetDB (11.21.13). After the
installation of the newer version, I ran:
select "DataSetId", group_concat("FieldName") from datasetmarkermeta
group by "DataSetId";
from mclient interface and got the following error message:
TypeException:kddart.subgroup_concat[5]:'algebra.uselect' undefined
in: TIDs:any := algebra.uselect(g:bat[:oid,:oid],grpid:oid);
MALException:kddart.subgroup_concat[0]:Error in cloned function
TypeException:user.s2_1[15]:'kddart.subgroup_concat' undefined in:
X_27:bat[:oid,:str] :=
kddart.subgroup_concat(X_26:bat[:oid,:str],X_16:bat[:oid,:oid],r1_16:bat[:oid,:oid],true:bit);
program contains errors
It says algebra.uselect function is not defined. How do I fix this problem?
group_concat MAL definition start
-------------------------------------------
module kddart;
command group_concat(b:bat[:oid,:str]):str
address KDDARTgroup_concat
comment "Mal kddart";
function group_concat(b:bat[:oid, :str]):str;
value := "";
barrier (h, t) := iterator.new(b);
value := value + t;
value := value + "";
redo (h, t) := iterator.next(b);
exit (h, t);
return value;
end group_concat;
function subgroup_concat(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_nils:bit)
:bat[:oid,:str];
nw := aggr.count(e);
nl := calc.lng(nw);
bn := bat.new(:oid, :str, nl);
# check 'e' has some values - this is the list of group IDs in the head
# 'g' is the group to the data BAT head ID mapping
# 'b' is the data column BAT that we are aggregating over
barrier (grpid, t) := iterator.new(e);
# select GID from TID->GID map to get matching TIDs
TIDs := algebra.uselect(g,grpid);
# get DATA for matching TIDs
b_data := algebra.kintersect(b,TIDs);
# aggregate
grpval := group_concat(b_data);
# Store the result for this group
bat.insert(bn, grpid, grpval);
redo (grpid, t) := iterator.next(e);
exit (grpid, t);
return bn;
end subgroup_concat;
Hello,
We are trying to transition from a Postgres database to MonetDB.
We had a few R function calls, so we decided to try embedded R.
One of our R functions is designed to query the postgres database to get some more data to process. The problem, now, is how to integrate this function in Monet.
We are using embeddedR to reduce latency, so using MonetDB.R inside the embedded function (if it is even possible) to query the database would be a total waste.
Clearly the ultimate solution would be to make all the queries outside the function, but it would be really really hard to achieve this result given how the function is designed.
What I am asking is the is some solution to this, like having some way to query the database inside embedded R avoiding the overhead of MonetDB.R.
Thank you,
Stefano
Hello,
We are trying to transition from a Postgres database to MonetDB.
We had a few R function calls, so we decided to try embedded R.
One of our R functions is designed to query the postgres database to get some more data to process. The problem, now, is how to integrate this function in Monet.
We are using embeddedR to reduce latency, so using MonetDB.R inside the embedded function (if it is even possible) to query the database would be a total waste.
Clearly the ultimate solution would be to make all the queries outside the function, but it would be really really hard to achieve this result given how the function is designed.
What I am asking is the is some solution to this, like having some way to query the database inside embedded R avoiding the overhead of MonetDB.R.
Thank you,
Stefano
Hello,
I have the following query:
CREATE TABLE mytable AS
(
SELECT <fields….>
FROM
table1 AS a,
my_R_function(a.field1, ...) AS asd
WHERE <some conditions>
) WITH DATA;
MonetDB says:
SELECT: no such column ‘a.field1’
The issue is in how I call the R function, since when I remove it everything’s just fine.
My aim is to call the function for each row of table1. The new table will have a couple of fields from table1 and a bunch of other fields computed from the R function. How am I supposed to do that?
Thank you,
Stefano
Hello. Maybe someone have experience with this kind of problem:
I’ve downloaded drivers from http://dev.monetdb.org/downloads/Java/Jul2015-SP4/ <http://dev.monetdb.org/downloads/Java/Jul2015-SP4/> added them to KNIME and now, when i’m trying to make something more difficult than just counting rows in table, I see an error:
ERROR Database Reader 0:3 SQL Exception reading Object of type "2005": Operation getCharacterStream() currently not supported - all further errors are suppressed and reported on debug level only
ERROR Database Reader 0:3 Execute failed: Cell at index 1 is null!
Hi all -
I apologize for all the emails, but I am banging on monetdb a lot lately.
Today I've run into a problem storing a -1 in a "tinyint" column. My java code writes bytes to a binary file. This file will be loaded via binary bulk load into a "tinyint" column. The values in this column range from -1 to 3.
I am writing the negative value to the byte file as below:
byte missingData = (byte)-1;
writer.writeByte(missingData);
The data loads correctly into monetdb, and I can print it:
sql>\d minusonetest
CREATE TABLE "sys"."minusonetest" (
"chr" INTEGER,
"pos" INTEGER,
"coelorachis" TINYINT
);
sql>
sql>select * from minusonetest where chr=2 limit 10;
+------+------+-------------+
| chr | pos | coelorachis |
+======+======+=============+
| 2 | 1 | -1 |
| 2 | 2 | -1 |
| 2 | 3 | -1 |
| 2 | 4 | -1 |
| 2 | 5 | -1 |
| 2 | 6 | -1 |
| 2 | 7 | -1 |
| 2 | 8 | -1 |
| 2 | 9 | -1 |
| 2 | 10 | -1 |
+------+------+-------------+
10 tuples (10.7s)
sql>
BUT, when I run the select statement below, it tosses me out of the DB and prints errors to the log file:
select * from minusone test where chr=2 and coelorachis > -1;
The select fails for any value in the "coelorachis > x" clause above (i.e., > -1, > 0, > 1, etc)
If I run the above select and have NOT written any negative numbers to the column, all is good. It only fails if there are negative numbers stored in the tinyint column.
The merovingian.log file has these messages:
2016-05-25 12:47:23 ERR testDB[21730]: *** Error in `/usr/bin/mserver5': free(): invalid next size (fast): 0x00007fb0ac003920 ***
2016-05-25 12:47:23 ERR testDB[21730]: ======= Backtrace: =========
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libc.so.6(+0x7d023)[0x7fb0df3bc023]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(GDKfree+0x13)[0x7fb0e1b21233]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(BATsubunique+0x18a9)[0x7fb0e1d23f69]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(BATsubunique+0xbe7)[0x7fb0e1d232a7]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(BATimprints+0x581)[0x7fb0e1cef791]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(BATsubselect+0x4ecf)[0x7fb0e1ac532f]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libbat.so.12(BATthetasubselect+0x14b)[0x7fb0e1ac78eb]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libmonetdb5.so.19(ALGthetasubselect2+0x1c4)[0x7fb0e210db64]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libmonetdb5.so.19(runMALsequence+0xdeb)[0x7fb0e20ac55b]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libmonetdb5.so.19(+0x45e5b)[0x7fb0e20ade5b]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libpthread.so.0(+0x7dc5)[0x7fb0df707dc5]
2016-05-25 12:47:23 ERR testDB[21730]: /lib64/libc.so.6(clone+0x6d)[0x7fb0df43528d]
2016-05-25 12:47:23 ERR testDB[21730]: ======= Memory map: ========
2016-05-25 12:47:23 ERR testDB[21730]: 00400000-00405000 r-xp 00000000 08:03 22036485 /usr/bin/mserver5
2016-05-25 12:47:23 ERR testDB[21730]: 00604000-00605000 r--p 00004000 08:03 22036485 /usr/bin/mserver5
2016-05-25 12:47:23 ERR testDB[21730]: 00605000-00606000 rw-p 00005000 08:03 22036485 /usr/bin/mserver5
2016-05-25 12:47:23 ERR testDB[21730]: 00606000-00608000 rw-p 00000000 00:00 0
2016-05-25 12:47:23 ERR testDB[21730]: 00892000-01e3e000 rw-p 00000000 00:00 0 [heap]
2016-05-25 12:47:23 ERR testDB[21730]: 7fabab28a000-7fabab29f000 r-xp 00000000 08:03 22020111 /usr/lib64/libgcc_s-4.8.3-20140911.so.1
2016-05-25 12:47:23 ERR testDB[21730]: 7fabab29f000-7fabab49e000 ---p 00015000 08:03 22020111 /usr/lib64/libgcc_s-4.8.3-20140911.so.1
2016-05-25 12:47:23 ERR testDB[21730]: 7fabab49e000-7fabab49f000 r--p 00014000 08:03 22020111 /usr/lib64/libgcc_s-4.8.3-20140911.so.1
2016-05-25 12:47:23 ERR testDB[21730]: 7fabab49f000-7fabab4a0000 rw-p 00015000 08:03 22020111 /usr/lib64/libgcc_s-4.8.3-20140911.so.1
Any ideas? There is no problem if I use Byte.MIN_VALUE instead of -1. This translates to NULL and is selectable.
Lynn
After all this, it seems I was doing something stupid. I had originally
used Float.MIN_VALUE, which is the smallest POSITIVE minimum value. I
also tried hard coding -3.40282e+38, but this had issues as Stephan
pointed out.
What I neglected to try, and what works for float is
public static float flt_nil = -Float.MAX_VALUE;
Sorry for all the confusion, but I learned something. Null values for
float types DO work.
Thanks for your patience and help - Lynn
On 5/24/16, 10:44 AM, "users-list on behalf of Lynn Carol Johnson"
<users-list-bounces+lcj34=cornell.edu(a)monetdb.org on behalf of
lcj34(a)cornell.edu> wrote:
>Yes, we’re using float as 32 bit, storing into a real.
>
>Still no sunshine with your suggestions below. Java doesn’t allow me to
>assign
> public static float flt_nil = (float)
>-340282346638528859811704183484516925440;
>It complains “the literal -340282346638528859811704183484516925440 of type
>int is out of range”. This isn’t surprising as 32 bits will only hold up
>to -2^31 (-2,147,483,648)
>
>Using static float flt_nil = (float) 0xff7fffff; gives me the value
>-8388609.0, which is incorrect.
>
>Does storing null work for you using “float” types? Is this just a
>java-to-C issue? Still looks like we’ll need to pick a default other than
>NULL for our “real” data.
>
>Thanks - Lynn
>
>
>On 5/24/16, 9:53 AM, "users-list on behalf of Stefan Manegold"
><users-list-bounces+lcj34=cornell.edu(a)monetdb.org on behalf of
>Stefan.Manegold(a)cwi.nl> wrote:
>
>>Hi Lynn,
>>
>>public static float flt_nil = (float) -3.40282e+38;
>>
>>does not necessarily create the exact bit pattern
>>as C macro -FLT_MAX, mainly as "-3.40282e+38" is only an
>>approximation of the exact value:
>>"-340282346638528859811704183484516925440"
>>
>>You can either try
>>
>>public static float flt_nil = (float)
>>-340282346638528859811704183484516925440;
>>
>>or use the actual bit pattern in hex notation, e.g.,
>>
>>public static float flt_nil = (float) 0xff7fffff;
>>
>>Please take my Java(?) code snippets with a grain of salt;
>>I'm everything but a Java expert --- or better, I'm a Java
>>illiterate ...
>>
>>Be also aware that
>>C type float, MAL typr :flt and SQL type REAL are 32-bit,
>>while
>>C type double, MAL type :dbl, and SQL types FLOAT & DOUBLE are 64-bit
>>
>>Hence, for 64-bit types you'd need to use one of
>>
>>public static double dbl_nil = (double)
>>-179769313486231570814527423731704356798070567525844996598917476803157260
>>7
>>8002853876058955863276687817154045895351438246423432132688946418276846754
>>6
>>7035375169860499105765512820762454900903893289440758685084551339423045832
>>3
>>6903222948165808559332123348274797826204144723168738177180919299881250404
>>0
>>26184124858368;
>>
>>public static double dbl_nil = (double) 0xffefffffffffffff;
>>
>>
>>Hope this helps ...
>>
>>Best,
>>Stefan
>>
>>----- On May 24, 2016, at 3:42 PM, Lynn Carol Johnson lcj34(a)cornell.edu
>>wrote:
>>
>>> Thanks, Stephan.
>>>
>>> Short response good news: using int_nil = ‹2147483648 (-2^31) is
>>> translated properly to NULL for int values. The IS NULL and IS NOT
>>>NULL
>>> queries are fine.
>>>
>>> Short response bad news: I am unable to get NULL to work for Float
>>> regardless of what value I used. Details below:
>>>
>>> I replaced my Float.MIN_VALUE with flt_nil defined as below :
>>>
>>> public static float flt_nil = (float) -3.40282e+38;
>>>
>>> Ran again with just 50 rows (created as binary files). At this point
>>>my
>>> belief is NULL does not work at all for float with binary files. There
>>>are
>>> 16 of the
>>> 50 values for column amesbyd_weir that are null, but I can only select
>>> those values if I check for the column < -3.40282e+38. Here are the
>>>sql
>>> results:
>>>
>>> sql>select count(amesbyd_weir) from weirtestnull;
>>> +------+
>>>| L1 |
>>> +======+
>>>| 50 |
>>> +------+
>>> 1 tuple (1.933ms)
>>> sql>select chr,pos,amesbyd_weir from weirtestnull where amesbyd_weir IS
>>> NULL;
>>> +-----+-----+--------------+
>>>| chr | pos | amesbyd_weir |
>>> +=====+=====+==============+
>>> +-----+-----+--------------+
>>> 0 tuples (1.291ms)
>>> sql>select chr,pos,amesbyd_weir from weirtestnull where amesbyd_weir =
>>> -3.40282e+38;
>>> +-----+-----+--------------+
>>>| chr | pos | amesbyd_weir |
>>> +=====+=====+==============+
>>> +-----+-----+--------------+
>>> 0 tuples (0.528ms)
>>> sql>
>>> sql>select chr,pos,amesbyd_weir from weirtestnull where amesbyd_weir <
>>> -3.40282e+38;
>>> +------+--------+-----------------+
>>>| chr | pos | amesbyd_weir |
>>> +======+========+=================+
>>>| 10 | 228864 | -3.40282e+38 |
>>>| 10 | 228962 | -3.40282e+38 |
>>>| 10 | 228972 | -3.40282e+38 |
>>>| 10 | 229011 | -3.40282e+38 |
>>>| 10 | 229025 | -3.40282e+38 |
>>>| 10 | 229187 | -3.40282e+38 |
>>>| 10 | 229367 | -3.40282e+38 |
>>>| 10 | 229405 | -3.40282e+38 |
>>>| 10 | 229444 | -3.40282e+38 |
>>>| 10 | 229593 | -3.40282e+38 |
>>>| 10 | 229615 | -3.40282e+38 |
>>>| 10 | 229671 | -3.40282e+38 |
>>>| 10 | 229829 | -3.40282e+38 |
>>>| 10 | 230000 | -3.40282e+38 |
>>>| 10 | 230052 | -3.40282e+38 |
>>>| 10 | 230120 | -3.40282e+38 |
>>> +------+--------+-----------------+
>>> 16 tuples (2.999ms)
>>> sql>
>>>
>>>
>>> NOTE: Dumping the 50 row database to a .txt file, dropping the table
>>>and
>>> reloading into monetdb did NOT fix it as it did when I used Float.NaN
>>> values. That is actually irrelevant for us as our db tables are large
>>> (2.1 Billion rows in 1 table, 80M in another) and as we add additional
>>> columns, binary loading is the only way to go. We love the binary load
>>> option, and have been very happy with the speed of monetdb processing.
>>>
>>> For now, we¹re going to follow PLINK convention and use -99 for our
>>>Float
>>> null values.
>>>
>>> Thanks - Lynn
>>>
>>>
>>> On 5/23/16, 4:06 PM, "users-list on behalf of Stefan Manegold"
>>> <users-list-bounces+lcj34=cornell.edu(a)monetdb.org on behalf of
>>> Stefan.Manegold(a)cwi.nl> wrote:
>>>
>>>>Hi Lynn,
>>>>
>>>>thanks for diving into this!
>>>>
>>>>The information that you loaded your data using from binary files using
>>>>copy *binary* into is crucial for us to understand and investigate the
>>>>problem!
>>>>
>>>>In fact, I recently discovered and fixed a similar bug; cf.,
>>>>https://www.monetdb.org/bugzilla/show_bug.cgi?id=3937
>>>>
>>>>This fix is in the Jul2013-SP3 bug-fix release.
>>>>Hence, I'm wondering why you (still) experience this (or a similar)
>>>>problem.
>>>>
>>>>Of what type are your data / columns?
>>>>Integer? decimal? floating point? string?
>>>>
>>>>If more then one type, do you experience these problems with all types
>>>>or just with some?
>>>>If only some, which?
>>>>
>>>>If the problem occurs only with floating point numbers,
>>>>are you sure you're using the exact NULL representation that MonetDB
>>>>uses?
>>>>
>>>>Otherwise a potential source of the problem --- I'm just speculating
>>>>---
>>>>could be the following:
>>>>
>>>>You might have "non-valid" floating point numbers like NaN or INF in
>>>>your
>>>>data
>>>>that are not known to SQL and hence MonetDB.
>>>>If so,
>>>>When parsed from text, MonetDB does (seems to?) silently turn them into
>>>>(MonetDB's
>>>>internal representation of) NULL.
>>>>However, when loding your binary data, MonetDB stores these values
>>>>"as-is"
>>>>not recognizing them as NULL; then processing does not recognize them
>>>>as
>>>>NULL, either;
>>>>only rendering produces NULL from the "non-valid" internal
>>>>representation.
>>>>
>>>>For the correct internal representation of NULL values in MonetDB per
>>>>type see
>>>>https://www.monetdb.org/wiki/MonetDB_type_system
>>>>
>>>>If my speculation is true, we'd need to consider making our copy binary
>>>>into
>>>>bulk-loading for floating point numbers (more) fail-save by actually
>>>>scanning
>>>>each column and checking for "invalid" floating point number ---
>>>>whatever
>>>>that
>>>>might cost ...
>>>>
>>>>Best,
>>>>Stefan
>>>>
>>>>----- On May 23, 2016, at 8:37 PM, Lynn Carol Johnson lcj34(a)cornell.edu
>>>>wrote:
>>>>
>>>>> Hi Anthony -
>>>>>
>>>>> I¹ve narrowed down the problem. The bug isn¹t related to size, but
>>>>>rather how
>>>>> the table data was created/loaded. Loading from a .txt file gives
>>>>>good
>>>>> behavior, loading from binary does not.
>>>>>
>>>>> Normally I create binary files for each column and use the binary
>>>>>bulk
>>>>>loader to
>>>>> load. That is what was failing. When I first tested with 50 columns,
>>>>>I
>>>>>created
>>>>> the test file from sql via the command
>>>>> copy (select * Š) into Œ/workdir/ŠkellyFirst50.txt using delimiters
>>>>>Œ\t¹,¹\n¹;
>>>>>
>>>>> Then I created a new table called kellyFirst50 and used the ³COPY
>>>>>INTO
>>>>> kellyfirst50 from Œ/workdirŠkellyFirst50.txt¹ using delimiters
>>>>>Œ\t,¹\n¹; ²
>>>>> command. Checking for ³IS NULL² worked, so I thought it was a size
>>>>>problem.
>>>>>
>>>>> But, when trying to figure out at what point the file size mattered,
>>>>>I
>>>>>ran java
>>>>> scripts to create binary files of ever decreasing sizes, then loaded
>>>>>them via
>>>>> the COPY BINARY commands. Checking for IS NULL always failed here,
>>>>>even
>>>>>when I
>>>>> only loaded 50 values.
>>>>>
>>>>> Here are the results when loading from binary files into monetdb.
>>>>>Using
>>>>>IS NOT
>>>>> NULL works, but IS NULL does not:
>>>>>
>>>>>
>>>>>
>>>>> sql>\</workdir/lcj34/monetdbFiles/isNull_test/create_weirFSTTest.sql
>>>>>
>>>>> operation successful
>>>>>
>>>>> sql>\</workdir/lcj34/monetdbFiles/isNull_test/copy_weirFSTTest.sql
>>>>>
>>>>> 50 affected rows
>>>>>
>>>>>
>>>>>
>>>>> sql>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> sql>select count(*) from weirtestnull;
>>>>>
>>>>> +------+
>>>>>
>>>>>| L1 |
>>>>>
>>>>> +======+
>>>>>
>>>>>| 50 |
>>>>>
>>>>> +------+
>>>>>
>>>>> 1 tuple (2.014ms)
>>>>>
>>>>> sql>select count(amesbyd_weir) from weirtestnull;
>>>>>
>>>>> +------+
>>>>>
>>>>>| L1 |
>>>>>
>>>>> +======+
>>>>>
>>>>>| 50 |
>>>>>
>>>>> +------+
>>>>>
>>>>>
>>>>>
>>>>> 1 tuple (0.865ms)
>>>>>
>>>>> sql>select count(amesbyd_weir) from weirtestnull where amesbyd_weir
>>>>>IS
>>>>>NOT NULL;
>>>>>
>>>>> +------+
>>>>>
>>>>>| L1 |
>>>>>
>>>>> +======+
>>>>>
>>>>>| 34 |
>>>>>
>>>>> +------+
>>>>>
>>>>> 1 tuple (3.005ms)
>>>>>
>>>>>
>>>>>
>>>>> sql>
>>>>>
>>>>> sql>select * from weirtestnull where amesbyd_weir IS NULL;
>>>>>
>>>>>
>>>>>+-----+-----+---------------+--------------+--------------+-----------
>>>>>-
>>>>>--
>>>>>--+--------------------+------------------+----------------+
>>>>>
>>>>>| chr | pos | amesbycn_weir | amesbyd_weir | amesbyf_weir |
>>>>>amesbynam_weir |
>>>>>| amesbyturkeyp_weir | dentbyflint_weir | dentbynam_weir |>
>>>>>
>>>>>
>>>>>+=====+=====+===============+==============+==============+===========
>>>>>=
>>>>>==
>>>>>==+====================+==================+================+
>>>>>
>>>>>
>>>>>+-----+-----+---------------+--------------+--------------+-----------
>>>>>-
>>>>>--
>>>>>--+--------------------+------------------+----------------+
>>>>>
>>>>> 0 tuples (4.041ms) !4 columns dropped!
>>>>>
>>>>>
>>>>>
>>>>> sql>
>>>>>
>>>>>
>>>>> Dump the data into a .txt file:
>>>>>
>>>>>
>>>>>
>>>>> sql>copy (select * from weirtestnull) into
>>>>> '/workdir/db_outfiles/weirtestnull.txt' using delimiters '\t','\n';
>>>>>
>>>>> 50 affected rows (3.809ms)
>>>>>
>>>>> sql>
>>>>>
>>>>> Now drop the table, recreate it, but this time load from the txt file
>>>>>created
>>>>> above:
>>>>>
>>>>>
>>>>>
>>>>> sql>drop table weirtestnull;
>>>>>
>>>>> operation successful (1.948ms)
>>>>>
>>>>> sql>\</workdir/lcj34/monetdbFiles/isNull_test/create_weirFSTTest.sql
>>>>>
>>>>> operation successful
>>>>>
>>>>> sql>copy into weirtestnull from
>>>>>'/workdir/db_outfiles/weirtestnull.txt'
>>>>>using
>>>>> delimiters '\t','\n';
>>>>>
>>>>> 50 affected rows (57.643ms)
>>>>>
>>>>> sql>
>>>>>
>>>>>
>>>>> Re-run the commands I now get correct values for IS NULL:
>>>>>
>>>>>
>>>>>
>>>>> sql>select count(*) from weirtestnull;
>>>>>
>>>>> +------+
>>>>>
>>>>>| L1 |
>>>>>
>>>>> +======+
>>>>>
>>>>>| 50 |
>>>>>
>>>>> +------+
>>>>>
>>>>> 1 tuple (1.851ms)
>>>>>
>>>>> sql>select count(amesbyd_weir) from weirtestnull;
>>>>>
>>>>> +------+
>>>>>
>>>>>| L1 |
>>>>>
>>>>> +======+
>>>>>
>>>>>| 34 |
>>>>>
>>>>> +------+
>>>>>
>>>>> 1 tuple (0.983ms)
>>>>>
>>>>> sql>
>>>>>
>>>>> sql>select * from weirtestnull where amesbyd_weir IS NULL;
>>>>>
>>>>>
>>>>>+------+--------+------------+------------+------------+------------+-
>>>>>-
>>>>>--
>>>>>---------+-------------+-------------+-------------+-------------+----
>>>>>-
>>>>>--
>>>>>------+
>>>>>
>>>>>| chr | pos | amesbycn_w | amesbyd_we | amesbyf_we | amesbynam_ |
>>>>>amesbyturke |
>>>>>| dentbyflint | dentbynam_w | dentbyturke | flintbynam_ | flintbyturk
>>>>>|>
>>>>>
>>>>> : : : eir : ir : ir : weir : yp_weir : _weir : eir : ypen_weir : weir
>>>>>:
>>>>> eypen_weir :>
>>>>>
>>>>>
>>>>>+======+========+============+============+============+============+=
>>>>>=
>>>>>==
>>>>>=========+=============+=============+=============+=============+====
>>>>>=
>>>>>==
>>>>>======+
>>>>>
>>>>>| 10 | 228864 | 2.95496e-0 | null | null | -0.0001263 | null | null |
>>>>>-0.00028749
>>>>>| | null | -0.00029885 | null |
>>>>>
>>>>> : : : 5 : : : 25 : : : 4 : : 1 : :
>>>>>
>>>>>| 10 | 228962 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 228972 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229011 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229025 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229187 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229367 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229405 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229444 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229593 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229615 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229671 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 229829 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 230000 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 230052 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>| 10 | 230120 | null | null | null | null | null | null | null | null
>>>>>|
>>>>>null |
>>>>>| null |
>>>>>
>>>>>
>>>>>+------+--------+------------+------------+------------+------------+-
>>>>>-
>>>>>--
>>>>>---------+-------------+-------------+-------------+-------------+----
>>>>>-
>>>>>--
>>>>>------+
>>>>>
>>>>>
>>>>>
>>>>> 16 tuples (5.2
>>>>>
>>>>>
>>>>> My java program uses Float.NaN and Integer.MIN_VALUE, Byte.MIN_VALUE
>>>>>to
>>>>>get
>>>>> ³null² stored in the db.
>>>>>
>>>>> When we load directly from binary, are the null values created
>>>>>differently? Is
>>>>> something not marked that gets marked when loading from text files?
>>>>>
>>>>> With this information, can you reproduce, or do you want me to give
>>>>>you
>>>>>my
>>>>> binary files in a dropbox somewhere?
>>>>>
>>>>> Thanks - Lynn
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> users-list mailing list
>>>>> users-list(a)monetdb.org
>>>>> https://www.monetdb.org/mailman/listinfo/users-list
>>>>
>>>>--
>>>>| Stefan.Manegold(a)CWI.nl | DB Architectures (DA) |
>>>>| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
>>>>| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
>>>>_______________________________________________
>>>>users-list mailing list
>>>>users-list(a)monetdb.org
>>>>https://www.monetdb.org/mailman/listinfo/users-list
>>>
>>> _______________________________________________
>>> users-list mailing list
>>> users-list(a)monetdb.org
>>> https://www.monetdb.org/mailman/listinfo/users-list
>>
>>--
>>| Stefan.Manegold(a)CWI.nl | DB Architectures (DA) |
>>| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
>>| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
>>_______________________________________________
>>users-list mailing list
>>users-list(a)monetdb.org
>>https://www.monetdb.org/mailman/listinfo/users-list
>
>_______________________________________________
>users-list mailing list
>users-list(a)monetdb.org
>https://www.monetdb.org/mailman/listinfo/users-list