BitWise Decomposition

From MonetDB
Jump to: navigation, search

To compile BWD[edit]

Configure options used on riga:

--enable-bwd CFLAGS=-I/ufs/holger/NVIDIA_GPU_Computing_SDK/OpenCL/common/inc --with-opencl=/ufs/holger/NVIDIA_GPU_Computing_SDK/lib64/nvidia

Configure options used on bricks04:

--enable-bwd --with-opencl=/usr/lib64 CFLAGS=-I/usr/local/cuda-5.0/include

To check your BWD installation[edit]

After your code has been compiled/installed with BWD enabled, just start your monetdb server and client as usually. If BWD has been correctly installed, the following BWD functions have been added:

$ monetdb create bwd 
created database in maintenance mode: bwd
$ monetdb release bwd 
taken database out of maintenance mode: bwd
$ mclient bwd
Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2012)
Database: MonetDB v11.11.5 (Jul2012), 'mapi:monetdb://bricks04.scilens:50000/bwd'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>select * from functions where mod = 'bwd';
+------+--------------------+-------------------------------------------------------------------+------+-------+------+----------+---------+
| id   | name               |                                                                   | mod  | sql   | type | side_eff | schema_ |
:      :                    :                                                                   :      :       :      : ect      : id      :
+======+====================+===================================================================+======+=======+======+==========+=========+
| 7017 | bwdecomposedec     | create function bwdecomposedec(col decimal(9,2), bits integer)    | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7022 | bwdecomposevarchar | create function bwdecomposevarchar(col varchar(1), bits integer)  | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7027 | bwdecomposedec9_2  | create function bwdecomposedec9_2(col decimal(9,2), bits integer) | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7032 | bwdecomposedec8_5  | create function bwdecomposedec8_5(col decimal(8,5), bits integer) | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7037 | bwdecomposedec7_5  | create function bwdecomposedec7_5(col decimal(7,5), bits integer) | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7042 | bwdecomposedate    | create function bwdecomposedate(col date, bits integer)           | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7047 | bwdecomposeint     | create function bwdecomposeint(col integer, bits integer)         | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns varchar(4096) external name bwd.decompose;                :      :       :      :          :         :
| 7057 | bwdevices          | create function bwdevices()                                       | bwd  | true  |    1 | false    |    2000 |
:      :                    : returns table(platform int, device int, devicedescription string, :      :       :      :          :         :
:      :                    :    platformdescription string) external name bwd.deviceinfo;      :      :       :      :          :         :
+------+--------------------+-------------------------------------------------------------------+------+-------+------+----------+---------+
8 tuples (4.469ms)
sql>

You can get a list of all devices that can be used by BWD. On 'bricks04', you should get the following output:

sql> select * from bwdevices();
+----------+--------+---------------------------------------------------------+-----------------------+
| platform | device | devicedescription                                       | platformdescription   |
+==========+========+=========================================================+=======================+
|        0 |      0 | 0: Tesla K20c (CL_DEVICE_TYPE_GPU), endianness: little  | OpenCL 1.1 CUDA 4.2.1 |
|        0 |      1 | 1: Tesla K20c (CL_DEVICE_TYPE_GPU), endianness: little  | OpenCL 1.1 CUDA 4.2.1 |
+----------+--------+---------------------------------------------------------+-----------------------+
2 tuples (5.4s)

Use BWD for GeoTIFF image array processing[edit]

NB: you much have the libgeotiff installed on your machine to get the geotiff module of MonetDB enabled.

The SQL statements were run on 'bricks04'.

Step 1. Start a MoentDB server[edit]

With the property 'readonly=false' (which is the default)

$ monetdbd create /[path-to]/dbfarm-clash/
$ monetdbd start /[path-to]/dbfarm-clash/
$ monetdb create bwd
created database in maintenance mode: bwd
$ monetdb release bwd
taken database out of maintenance mode: bwd

Step 2. Load a GeoTIFF image as a SciQL array[edit]

$ mclient -H bwd
Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2012)
Database: MonetDB v11.11.5 (Jul2012), 'mapi:monetdb://bricks04.scilens:50000/bwd'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql> set schema rs;
sql> call rs.attach('/scratch/zhang/TrueMarble.2km.21600x10800_greyscale.tif');
sql> call rs.import(1);
sql> select count(*), 21600 * 10800 from rs.image1; -- to check loading has succeeded
+-----------+----------------------+
| L1        | sql_mul_single_value |
+===========+======================+
| 233280000 |            233280000 |
+-----------+----------------------+
1 tuple (6.834ms)
sql>select * from rs.image1 order by x desc , y desc limit 10; -- takes quite some time to reorder >233 million records
+-------+-------+-----------+
| x     | y     | intensity |
+=======+=======+===========+
| 21599 | 10799 |         1 |
| 21599 | 10798 |         1 |
| 21599 | 10797 |         1 |
| 21599 | 10796 |         1 |
| 21599 | 10795 |         1 |
| 21599 | 10794 |         1 |
| 21599 | 10793 |         1 |
| 21599 | 10792 |         1 |
| 21599 | 10791 |         1 |
| 21599 | 10790 |         1 |
+-------+-------+-----------+
10 tuples (6m 0s)

Step 2.5 This is a simple workaround of the problem that BWD doesn't support decomposing SMALLINT columns yet (the intensity column is of type SMALLINT):

sql> create array img1 (x int dimension[21600], y int dimension[10800], intensity int);
operation successful (64.060ms)
sql> insert into img1 (select * from rs.image1);
233280000 affected rows (4m 29s)

Step 3. Shutdown the MonetDB server, and then restart it with the property 'readonly=true' ![edit]

$ monetdb stop bwd
stopping database 'bwd'... done
$ monetdb get all bwd
     name          prop     source           value
bwd              name      -        bwd
bwd              type      default  database
bwd              shared    default  yes
bwd              nthreads  default  32
bwd              optpipe   default  default_pipe
bwd              readonly  default  no
bwd              nclients  default  64
$ monetdb set readonly=true bwd
$ monetdb get all bwd
     name          prop     source           value
bwd              name      -        bwd
bwd              type      default  database
bwd              shared    default  yes
bwd              nthreads  default  32
bwd              optpipe   default  default_pipe
bwd              readonly  local    yes
bwd              nclients  default  64

Step 4. Decompose all columns[edit]

$ mclient -H bwd
Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2012)
Database: MonetDB v11.11.5 (Jul2012), 'mapi:monetdb://bricks04.scilens:50000/bwd'
Type \q to quit, \? for a list of available commands
auto commit mode: on
sql>set optimizer='sequential_pipe';
operation successful (0.534ms)
sql>select sys.bwdecomposeint(x, 30), sys.bwdecomposeint(y, 30), sys.bwdecomposeint(intensity, 30) from rs.img1;
+---------------------------------+---------------------------------+---------------------------------+
| bwdecomposeint_x                | bwdecomposeint_y                | bwdecomposeint_intensity        |
+=================================+=================================+=================================+
| successfully decomposed bat 848 | successfully decomposed bat 732 | successfully decomposed bat 761 |
+---------------------------------+---------------------------------+---------------------------------+
1 tuple (24.7s)

Double check if all columns have indeed been decomposed by repeating the SELECT query until it says "bat <nnn> already decomposed" for all columns:

sql>select sys.bwdecomposeint(x, 30), sys.bwdecomposeint(y, 30), sys.bwdecomposeint(intensity, 30) from rs.img1;
+----------------------------+----------------------------+----------------------------+
| bwdecomposeint_x           | bwdecomposeint_y           | bwdecomposeint_intensity   |
+============================+============================+============================+
| bat 848 already decomposed | bat 732 already decomposed | bat 761 already decomposed |
+----------------------------+----------------------------+----------------------------+
1 tuple (2.771ms)

NB: the results of the bwdecomposeint() are temporary indices, which might be removed by an "aggressive" cleaning thread at any time. To avoid this, start mserver5 manually (i.e., not through monetdb), with the option --set gdk_vmtrim=no! For more information, see Stefan M.'s email on Feb. 19, 2013: '"BLOG": disable vmtrim thread for "hot" runs with hashes or imprints'

Step 5. Now we can start using the BWD optimiser[edit]

sql>set optimizer='bwd_pipe';
operation successful (0.856ms)

Step 6. Let's try some queries[edit]

sql>explain select count(*) from rs.img1 where intensity between 200 and 255;
+---------------------------------------------------------------------------------------------------------------------+
| mal                                                                                                                 |
+=====================================================================================================================+
| function user.s4_1{autoCommit=true}(A0:int,A1:int):void;                                                            |
|     X_4:int  := sql.mvc();                                                                                          |
|     X_5:bat[:oid,:int]  := sql.bind(X_4:int ,"rs","img1","intensity",0);                                            |
|     X_39:bat[:oid,:void]  := bwd.uselectapproximate(X_5:bat[:oid,:int] ,A0:int ,A1:int ,true,true);                 |
|     X_43:bat[:oid,:int]  := bwd.semijoinapproximate(X_5:bat[:oid,:int] ,X_39:bat[:oid,:void] );                     |
|     X_10:bat[:oid,:void]  := bwd.uselectrefine(X_5:bat[:oid,:int] ,A0:int ,A1:int ,true,true,X_43:bat[:oid,:int] ); |
|     X_40:bat[:oid,:oid]  := bwd.markT(X_43:bat[:oid,:int] ,0@0:oid);                                                |
|     X_11:bat[:oid,:oid]  := algebra.markT(X_10:bat[:oid,:void] ,0@0:oid);                                           |
|     X_41:bat[:oid,:oid]  := bat.reverse(X_40:bat[:oid,:oid] );                                                      |
|     X_12:bat[:oid,:oid]  := bat.reverse(X_11:bat[:oid,:oid] );                                                      |
|     X_13:bat[:oid,:int]  := sql.bind(X_4:int ,"rs","img1","x",0);                                                   |
|     X_42:bat[:oid,:int]  := bwd.leftjoinapproximate(X_41:bat[:oid,:oid] ,X_13:bat[:oid,:int] );                     |
|     X_15:bat[:oid,:int]  := bwd.leftjoinrefine(X_12:bat[:oid,:oid] ,X_13:bat[:oid,:int] ,X_42:bat[:oid,:int] );     |
|     X_16:wrd  := aggr.count(X_15:bat[:oid,:int] );                                                                  |
|     bwd.exportValue(1,"rs.img1","L1","wrd",64,0,6,s4_1:void ,"");                                                   |
|     sql.exportValue(1,"rs.img1","L1","wrd",64,0,6,X_16:wrd ,"");                                                    |
| end s4_1;                                                                                                           |
+---------------------------------------------------------------------------------------------------------------------+
17 tuples (2.491ms)
sql>select count(*) from rs.img1 where intensity between 200 and 255;
+---------+
| L1      |
+=========+
| 8236718 |
+---------+
1 tuple (4.1s)
sql>select count(*) from rs.img1 where intensity between 230 and 255;
+---------+
| L1      |
+=========+
| 4047715 |
+---------+
1 tuple (3.7s)
sql>select count(*) from rs.img1 where intensity between 250 and 255;
+---------+
| L1      |
+=========+
| 1747364 |
+---------+
1 tuple (3.5s)
sql>select count(*) from rs.img1 where intensity between 254 and 255;
+--------+
| L1     |
+========+
| 429842 |
+--------+
1 tuple (3.2s)
sql>select count(*) from rs.img1 where intensity between 255 and 255;
+--------+
| L1     |
+========+
| 172871 |
+--------+
1 tuple (3.4s)


Compare with the time on CPU only:

sql>set optimizer='sequential_pipe';
operation successful (0.559ms)
sql>select count(*) from rs.img1 where intensity between 200 and 255;
+---------+
| L1      |
+=========+
| 8236718 |
+---------+
1 tuple (1.4s)
sql>select count(*) from rs.img1 where intensity between 230 and 255;
+---------+
| L1      |
+=========+
| 4047715 |
+---------+
1 tuple (1.2s)
sql>select count(*) from rs.img1 where intensity between 250 and 255;
+---------+
| L1      |
+=========+
| 1747364 |
+---------+
1 tuple (1.1s)
sql>select count(*) from rs.img1 where intensity between 254 and 255;
+--------+
| L1     |
+========+
| 429842 |
+--------+
1 tuple (1.0s)
sql>select count(*) from rs.img1 where intensity between 255 and 255;
+--------+
| L1     |
+========+
| 172871 |
+--------+
1 tuple (3.3s)
sql>select count(*) from rs.img1 where intensity between 255 and 255;
+--------+
| L1     |
+========+
| 172871 |
+--------+
1 tuple (23.334ms)

Compare with default MonetDB setting:

sql>set optimizer='default_pipe';
sql>select count(*) from rs.img1 where intensity between 200 and 255;
+---------+
| L1      |
+=========+
| 8236718 |
+---------+
1 tuple (129.612ms)
sql>select count(*) from rs.img1 where intensity between 230 and 255;
+---------+
| L1      |
+=========+
| 4047715 |
+---------+
1 tuple (117.408ms)
sql>select count(*) from rs.img1 where intensity between 250 and 255;
+---------+
| L1      |
+=========+
| 1747364 |
+---------+
1 tuple (114.747ms)
sql>select count(*) from rs.img1 where intensity between 254 and 255;
+--------+
| L1     |
+========+
| 429842 |
+--------+
1 tuple (89.848ms)
sql>select count(*) from rs.img1 where intensity between 255 and 255;
+--------+
| L1     |
+========+
| 172871 |
+--------+
1 tuple (85.162ms)

Oeps, need more work...

Notes[edit]

Since this technique is still highly experimental, there are several things you need to keep in mind when use it:

  • BWD can only be used in read-only mode
  • The decompositions are not persistent. So, you need to run bwdecompose...() every time the MonetDB server is restarted
  • Before decomposing the columns, you must set the optimizer to the sequential_pipe
  • When you want to run queries using the BWD technique, you must set the optimizer to the bwd_pipe
  • Sometimes, executing a query using BWD gives the error that a certain BAT is not decomposed, e.g., "while running uselect, I noticed that bat is not decomposed: 761", even if the column is successfully decomposed earlier (according to the output of bwdecompose...(). In this case, repeat Step 4 above and repeat the SELECT query until all columns are reported as "already decomposed".
  • Known feature: the decomposed columns are very in-persisten => they can magically disappear, e.g., after the SELECT query in Step 4 has been called and subsequently no query is executed for some time; or repeat the SELECT query several times and the decompositions will randomly disappear. (Reported as a Bitbucket issue "decompositions of columns randomly disappear")
  • Known feature: the bwdecompose...() functions ONLY work with the sequential_pipe. DO NOT use other optimiser pipelines!

Implementing the BWD variant of your MAL operator[edit]

The BWD optimiser walks through a MAL plan and rewrite the MAL operators it recognises with their BWD variant.

Q: How does the BWD optimiser decide/recognise which MAL operators to rewrite?

So, to start using the BWD optimiser, you need to implement the BWD variant of your MAL operator. Assume your original MAL operator has several input parameters and returns a BAT as output:

myop (p1, p2, p3, ...) => resbat:BAT

Then, you need to implement two new operators:

myopApproximate (p1, p2, p3, ...) => approximation:BAT
myopRefine (p1, p2, p3, ..., approximation:BAT) => resbat:BAT

The BWD optimiser will then replace your original MAL operator with the two new operators.

A: These are static rules that are coded into the MAL-optimizer. If you want to change the rules you can either modify the existing optimizer or, even better, create a new one that encodes your rules.


Q: Is there any registration mechanism to tell the BWD optimiser about the new operators?

A: No, not at runtime. The optimizer code has a section that looks a bit like this:

if(match_function(oldProgram[i], algebraRef, thetauselectRef)
						 || match_function(oldProgram[i], algebraRef, uselectRef)
						 || match_function(oldProgram[i], algebraRef, semijoinRef)
						 || match_function(oldProgram[i], algebraRef, selectNotNilRef)
						 || match_function(oldProgram[i], groupRef, multicolumnsRef)
						 || match_function(oldProgram[i], groupRef, doneRef)
						 || match_function(oldProgram[i], algebraRef, joinRef)
						 || match_function(oldProgram[i], aggrRef, sumRef)
....

. This is where the optimizer detects MAL-statements that can be rewritten into Approximate&Refine statements. If you add your stuff here, it might work. However, you'd better copy the relevant code your own optimizer. I (holger) won't fix your code if it was just mingled with mine.


The function myopApproximate() is a wrapper of the OpenCL kernel program, which will be executed on the GPU. See for examples of how the approximation functions should be implemented the existing approximation functions in operations.c, OpenCL documentations [1,2], and the paper submitted to SIGMOD2013 [??].

The resulting BAT of myopApproximate(), containing an approximation of the final results, is passed to the refinement function myopRefine() as the last parameter. The function myopRefine() will then refine the approximation, using the residuals, to compute the final, exact results. See refineLoop() for example of how the refinement function should be implemented.

Several auxiliary functions are defined in utilities.h, which can be used to, e.g., access the approximation and residuals of a BAT.

[1] A nice introduction: http://developer.amd.com/tools/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/introductory-tutorial-to-opencl/

[2] A nice tutorial: http://developer.amd.com/download/AMD_Accelerated_Parallel_Processing_OpenCL_Programming_Guide.pdf