Can I expect to be able to use BBP code without GDKinit'ing?

Hannes Mühleisen Hannes.Muehleisen at cwi.nl
Thu Nov 24 09:28:19 CET 2016


I suggest you look at the MonetDBLite init code; we spent a long time making it minimal.

https://github.com/hannesmuehleisen/MonetDBLite/blob/master/src/embedded/embedded.c#L73

Hannes


----- Original Message -----
From: "Eyal Rozenberg" <E.Rozenberg at cwi.nl>
To: "MonetDB Developers" <developers-list at monetdb.org>
Sent: Wednesday, November 23, 2016 8:56:47 PM
Subject: Can I expect to be able to use BBP code without GDKinit'ing?

Bottom line question:

BBPinit() is not exposed in gdk_bbp.h, but rather only called from 
GDKinit() (which sees it through gdk_private.h). I just want to load 
data from persisted BAT files - not into MonetDB and not within the 
mserver5 process - so, speaking conceptually, I do not want to 
initialize the GDK, but do want to initialize the BBP. Questions:

* Should BBPinit() work outside the scope of a GDKinit()?
* If not, is it adaptable so as to allow this?
* More generally, how much of GDK do I need to have running just to 
load persisted data into memory, using the code in gdk_bbp (or a slight 
variation thereof)?

Now for the introduction and the motivation:

As I mentioned a few MADADMs ago, I'll need to load persisted MonetDB 
columns into my GPU kernel testbench. This is relatively simple for 
numeric columns, given some scripting work to extract catalog data in a 
parsable format or to build named symlinks to columns (e.g. 
/path/to/dbfarm/tpch-sf-1/named_bats/lineitem/l_shipdate -> 
/path/to/dbfarm/tpch-sf-1/bat/12/34.tail).
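
To illustrate the numeric case, here is a minimal sketch of reading such 
a .tail file directly. It rests on assumptions not confirmed in this 
thread: that the tail of a fixed-width column is a headerless, dense 
array of values in native byte order, and that the value count and 
width are already known (e.g. from the extracted catalog data). The 
function name load_fixed_width_tail is hypothetical and is not part of 
GDK.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of GDK.
 * Reads the first `count` values of `width` bytes each from the file at
 * `path`, ASSUMING the file is a raw, headerless array of fixed-width
 * values in native byte order.
 * Returns a malloc'ed buffer (the caller frees it), or NULL if the file
 * cannot be opened, is too short, or memory cannot be allocated. */
void *load_fixed_width_tail(const char *path, size_t count, size_t width)
{
    assert(path != NULL && width > 0);

    /* Reject empty requests and sizes that would overflow count * width. */
    if (count == 0 || count > SIZE_MAX / width)
        return NULL;

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    void *buf = malloc(count * width);
    if (buf != NULL && fread(buf, width, count, f) != count) {
        /* Fewer than `count` complete values were available. */
        free(buf);
        buf = NULL;
    }

    fclose(f);
    return buf;
}
```

With the symlink scheme above, and assuming the column is stored as 
32-bit values, a call might look like 
load_fixed_width_tail(".../named_bats/lineitem/l_shipdate", n, sizeof(int32_t)), 
where n is the row count taken from the catalog. The file on disk may 
be larger than n * width, which is why the count is passed in rather 
than derived from the file size.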

But it won't do for other kinds of data, most importantly strings - 
which I do want to work on. Plus, writing a wider-scope loader will let 
me avoid depending on MonetDB running for access to persisted columns.

So, I'm writing a (selective) loader of persisted MonetDB columns, or a 
BBP loader if you will. My strategy is the following:

1. Copy the files in the GDK codebase which are necessary for 
building a binary which can call all code in gdk_bbp.h and not have 
unmet dependencies (this is mostly done).
2. Get that leaner slice of code, with a small main(), to actually work, 
i.e. not fail due to weird errors or result in junk data.
3. Peel away functions from the code I've copied from the repository 
which are not actually used.
4. Peel away the parts of the code which are not necessary for the 
actual loading - from within functions. In this initial stage this may 
involve work that becomes unnecessary when you have auxiliary data 
obtained from querying the DB (such as table-column name to BAT filename 
mapping).
5. Expand functionality and/or optimize performance after the peeling, 
and/or C++ify the result for integration with my code.
6. If other people / MDBS are interested, collaborate on making this 
support different BBP versions - so that the result is an inter-version 
export/forensics library.