MonetDB-RDF-SparQL meeting (26.11.2010)

Participant list: Lefteris, Niels, Peter, Sjoerd, Stefan


- SparQL syntax & semantics basics

 + Lefteris:
   Support property paths! --- the code is basically there.
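
For reference, a minimal sketch of what evaluating a one-or-more property path (e.g. knows+) amounts to: a transitive closure over the (subject, object) pairs of one predicate. This is not the existing code mentioned above; the function name one_or_more and the sample triples are assumptions made for illustration only.

```python
from collections import defaultdict


def one_or_more(triples, predicate):
    """Return all (s, o) pairs connected by one or more `predicate` edges."""
    # Adjacency list restricted to the requested predicate.
    succ = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            succ[s].add(o)

    closure = {(s, o) for s, objs in succ.items() for o in objs}
    frontier = set(closure)
    while frontier:
        # Extend only the pairs found in the previous round by one more edge.
        new_pairs = {(s, o2) for s, o in frontier for o2 in succ.get(o, ())}
        new_pairs -= closure
        closure |= new_pairs
        frontier = new_pairs
    return closure


# Hypothetical sample data.
triples = [("a", "knows", "b"), ("b", "knows", "c"), ("c", "likes", "d")]
reachable = one_or_more(triples, "knows")
```

The loop terminates on cyclic data because each round only keeps pairs not seen before.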

- Status report by Lefteris

 + RDF document shredding into the 6 permutations of the SPO triple table is
   done, incl. dictionary & tokenizer --- modulo bug fixes & performance tuning
 + support for multiple documents / incremental shredding is missing
 + RDF SQL schema contains GRAPHS table that lists existing documents and
   their SQL SPO table sets
 * Open: which is the best storage layout (physical clustering/ordering) and
   the most efficient use of it (sorted permutations + merge-joins;
   hash-tables + hash-joins; MLA + positional-join-index)?
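
To make the first two alternatives in the open question concrete, here is a toy sketch: terms are dictionary-encoded, the six sorted permutations are built, and a subject-subject join is computed once as a merge-join over slices of the PSO permutation and once as a hash-join. All names (shred, scan_pso, merge_join, hash_join) and the sample data are assumptions for illustration; this is not the MonetDB implementation, and it does not cover the MLA + positional-join-index variant.

```python
from itertools import permutations


def shred(triples):
    """Dictionary-encode terms and build the six sorted permutations."""
    dictionary = {}

    def encode(term):
        # Assign the next free integer id on first sight of a term.
        return dictionary.setdefault(term, len(dictionary))

    encoded = [tuple(encode(t) for t in triple) for triple in triples]
    tables = {}
    for order in permutations("spo"):
        idx = ["spo".index(c) for c in order]
        tables["".join(order)] = sorted(tuple(t[i] for i in idx) for t in encoded)
    return dictionary, tables


def scan_pso(tables, p):
    """Return (s, o) pairs for predicate id p; sorted by s, as a PSO slice."""
    return [(s, o) for (pp, s, o) in tables["pso"] if pp == p]


def merge_join(left, right):
    """Join two (key, value) lists that are both sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            # Find the run of equal keys on both sides, emit the cross product.
            key = left[i][0]
            i_end, j_end = i, j
            while i_end < len(left) and left[i_end][0] == key:
                i_end += 1
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((key, lv, rv))
            i, j = i_end, j_end
    return out


def hash_join(left, right):
    """Same join without any ordering requirement: build on left, probe with right."""
    table = {}
    for k, v in left:
        table.setdefault(k, []).append(v)
    return [(k, lv, rv) for k, rv in right for lv in table.get(k, [])]


# Hypothetical sample data: ?x name ?n . ?x age ?a
triples = [("alice", "name", "Alice"), ("alice", "age", "30"), ("bob", "name", "Bob")]
dictionary, tables = shred(triples)
names = scan_pso(tables, dictionary["name"])
ages = scan_pso(tables, dictionary["age"])
assert merge_join(names, ages) == sorted(hash_join(names, ages))
```

The trade-off the sketch shows: the merge-join needs no extra structure but relies on the sorted permutation being available, while the hash-join works on any order at the cost of building (and ideally re-using) a hash table.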

- Architecture alternatives

 + start with an existing SparQL parser, e.g., Redland librdf
   => SparQL algebra
      What are the SparQL algebra operators and what is the type system?
 + which relational algebra operators are required?
 + are polymorphic types / runtime typing & casting required?
   or can we make this explicit in the generated SQL query?
   (value constructing function, comparison functions, EBV, typed literals, ...)
 = S P O t s schema:
   S: oid -> tokenizer
   P: oid -> tokenizer
   O: oid/lng
   t: bte: uri: O=oid -> tokenizer
           str: O=oid -> s
           num: O=lng as is
           XML types? -> to be seen ...
   s: str (unique?)
 = exploit shredding order ("natural" MLA)
 = start with hash rather than permutations
 = support/exploit re-using hash tables with slices
 = recycler??
 + NULL semantics?
 + three-valued booleans?
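
The S P O t s schema above can be sketched as follows, under stated assumptions: the tag values T_URI / T_STR / T_NUM, the Interner class standing in for both the tokenizer and the s table, and the sample triples are all hypothetical. The notes only fix that t is a bte and that it decides how O is interpreted. XML types are left out, as they are still "to be seen".

```python
# Hypothetical type tags for the t column.
T_URI, T_STR, T_NUM = 0, 1, 2


class Interner:
    """Append-only value <-> oid map; stands in for the tokenizer or the s table."""

    def __init__(self):
        self.values = []
        self.oids = {}

    def oid(self, value):
        if value not in self.oids:
            self.oids[value] = len(self.values)
            self.values.append(value)
        return self.oids[value]


class TripleStore:
    def __init__(self):
        self.tokenizer = Interner()  # URIs for S, P, and URI-typed O
        self.strings = Interner()    # s: string literals, stored once each
        self.rows = []               # (S, P, O, t)

    def add(self, s, p, o, t):
        if t == T_URI:
            o_val = self.tokenizer.oid(o)  # uri: O = oid -> tokenizer
        elif t == T_STR:
            o_val = self.strings.oid(o)    # str: O = oid -> s
        else:
            o_val = int(o)                 # num: O = lng, stored as is
        self.rows.append((self.tokenizer.oid(s), self.tokenizer.oid(p), o_val, t))

    def decode(self, row):
        s, p, o, t = row
        if t == T_URI:
            obj = self.tokenizer.values[o]
        elif t == T_STR:
            obj = self.strings.values[o]
        else:
            obj = o
        return self.tokenizer.values[s], self.tokenizer.values[p], obj


# Hypothetical sample data.
store = TripleStore()
store.add("ex:alice", "ex:knows", "ex:bob", T_URI)
store.add("ex:alice", "ex:name", "Alice", T_STR)
store.add("ex:alice", "ex:age", 30, T_NUM)
decoded = [store.decode(r) for r in store.rows]
```

Note that the same O value can mean different things depending on t (an oid into the tokenizer, an oid into s, or a plain number), so every comparison or join on O has to take t into account. This is where the question of explicit typing in the generated SQL comes in.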

- SparQL syntax & semantics

 (details, hurdles, etc.)