[Monetdb-developers] [Monetdb-pf-checkins] pathfinder/compiler/semantics heuristic.c

Jan Rittinger rittinge at in.tum.de
Wed Sep 12 09:54:45 CEST 2007


Hi Peter,

I have two small comments on your changes:

1. Is there a reason for including monet_utils.h here? -- In my eyes it 
introduces an unnecessary dependency on MonetDB (as we want the SQL code 
generation to compile even without MonetDB).

Index: milprint_summer.c
===================================================================
RCS file: /cvsroot/monetdb/pathfinder/compiler/mil/milprint_summer.c,v
retrieving revision 1.404
retrieving revision 1.405
diff -u -d -r1.404 -r1.405
--- milprint_summer.c	5 Sep 2007 17:06:54 -0000	1.404
+++ milprint_summer.c	11 Sep 2007 22:15:29 -0000	1.405
@@ -56,6 +56,7 @@
  #define USE_DEPRECATED_ACCESS_TO_TYPE_SYSTEM 1

  #include "pathfinder.h"
+#include "monet_utils.h"

  #include <stdio.h>
  #include <assert.h>

2. I think you should use "#pf" (PFns_pf) as the prefix for variable names 
instead of "pf" (PFns_lib). Namespace PFns_lib can be used in the query, 
which allows a user to create pf:heur* variables that might conflict 
later on (whereas PFns_pf names can only be created inside pathfinder).

--- NEW FILE: heuristic.c ---
...
static PFpnode_t*
var_(PFloc_t loc, int varnum)
{
     PFpnode_t *r = p_leaf(p_varref, loc);
     if (r) {
         r->sem.qname_raw.prefix = "pf";
         r->sem.qname_raw.loc = (char*) PFmalloc(8);
         if (r->sem.qname_raw.loc)
             snprintf(r->sem.qname_raw.loc, 8, "heur%03d", varnum);
     }
     return r;
}
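
To make point 2 concrete, here is a minimal sketch of the naming helper
with the internal prefix. It is not the Pathfinder code: qname_raw_t and
heur_var_name are hypothetical stand-ins, plain malloc replaces PFmalloc,
and the prefix string for PFns_pf is assumed to be "#pf" as stated above.
Sizing the buffer from snprintf's return value is an extra on my side, so
that a varnum above 999 is not cut off by the fixed 8 bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the raw QName kept in a parse-tree node;
 * the real Pathfinder structure is not reproduced here. */
typedef struct {
    const char *prefix;  /* namespace prefix, not owned by the struct */
    char       *loc;     /* local name, heap-allocated */
} qname_raw_t;

/* Assumed prefix of the compiler-internal namespace (PFns_pf).
 * '#' is not allowed in an XQuery prefix, so a query cannot
 * declare a variable under this name. */
#define INTERNAL_PREFIX "#pf"

/* Fill in the name of the varnum-th heuristic variable.
 * Returns 0 on success, -1 on failure. */
static int
heur_var_name(qname_raw_t *q, int varnum)
{
    /* First pass: ask snprintf how many characters are needed. */
    int len = snprintf(NULL, 0, "heur%03d", varnum);
    if (len < 0)
        return -1;

    q->loc = malloc((size_t) len + 1);
    if (!q->loc)
        return -1;

    /* Second pass: write the name, including the terminating NUL. */
    snprintf(q->loc, (size_t) len + 1, "heur%03d", varnum);
    q->prefix = INTERNAL_PREFIX;
    return 0;
}
```

Calling heur_var_name(&q, 7) leaves q.prefix as "#pf" and q.loc as
"heur007"; the caller owns q.loc and has to free it.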

On 09/12/2007 12:15 AM, Peter Boncz wrote with possible deletions:
> Update of /cvsroot/monetdb/pathfinder/compiler/semantics
> In directory sc8-pr-cvs16.sourceforge.net:/tmp/cvs-serv24884/compiler/semantics
> 
> Modified Files:
> 	Makefile.ag xquery_fo.c 
> Added Files:
> 	heuristic.c 
> Log Message:
> The infamous push-selections-down heuristic rewrite
> ===================================================
> 
> the good: equi comparisons on attributes, texts and elements in MXQ can now 
>           be done in sublinear time thanks to indices. This involves the
>           "reversing" of multi-step XPath paths if required.
> 
> the bad:  it is a heuristic, so no performance improvement is assured, though
>           many of the performance pitfalls are mitigated by a loop-lifted run-time 
>           decision whether to use the index plan or the original plan.
> 
> the ugly: well, since I can't (or couldn't) stomach XQuery core plan, it is all
>           implemented on the abssyn level.
> 
> there are quite a few comments/explanations in compiler/semantics/heuristic.c
> 
> runtime/pathfinder.mx
> - major changes to the indexing scheme, supporting the pf:text/pf:attribute functions
> - added run-time behavior to vx_lookup: for those iterations where index lookup is 
>   not beneficial, it returns a bogus node (1@0, recognizable as a docnode) so
>   the original (non index-pushdown) plan is used for those iterations 
> 
> runtime/pf_support.mx
> - new xquery hash function, with fewer collisions and the ability to preserve equality
>   semantics for any XQuery type (currently supported..).
> - added a new dbl(byte1,...byte8) constructor that allows an IEEE double to be 
>   constructed from its 8 binary bytes (per byte to avoid little/big endian issues)
>   without any precision loss.
> - optimization for the ll_htordered_thetajoin in case we have constant join columns 
> 
> compiler/compile.c
> compiler/include/Makefile.ag
> compiler/semantics/Makefile.ag
> - add heuristic.[ch]
> - added heuristic index pushdown opt (currently only in mps, but alg/M4 should follow)
> 
> compiler/semantics/xquery_fo.c
> compiler/mil/milprint_summer.c
> - rewrote pf:attribute and pf:text to give approximate answers
> - new pf:supernode function
> - pass parsed decimals/doubles using the new byte-wise MIL dbl() constructor
>   (to avoid precision loss in the pf -> MonetDB transition)
> - bugfix in empty ipik opt for the for-rule
> 
> compiler/debug/abssynprint.c
> - PFabssyn_stdout for debugging sanity
> compiler/debug/prettyp.c
> - margin=6 for better debug output readability 
> 
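
As an aside on the byte-wise dbl() constructor mentioned in the log: the
idea of passing a double as its 8 bytes can be sketched in C as below.
This is only an illustration, not the MIL implementation. The function
names are hypothetical, the most-significant-byte-first order is an
assumption (the log does not say which order dbl() expects), and the code
assumes a 64-bit IEEE 754 double whose byte order matches that of
uint64_t on the host.

```c
#include <stdint.h>
#include <string.h>

/* Split a double into its 8 bytes, most significant byte first
 * (assumed order). Shifting the integer value instead of reading
 * memory byte by byte keeps the result independent of host endianness. */
static void
dbl_to_bytes(double d, unsigned char out[8])
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);          /* reinterpret, no conversion */
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char) (bits >> (56 - 8 * i));
}

/* Rebuild the double from the 8 bytes produced above. The bit pattern
 * is restored exactly, so no precision is lost on the way. */
static double
dbl_from_bytes(const unsigned char in[8])
{
    uint64_t bits = 0;
    double   d;
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Printing a double as decimal text and parsing it again may round; moving
the 8 bytes across gives back the identical bit pattern.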

-- 
Jan Rittinger
Database Systems
Technische Universität München (Germany)
http://www-db.in.tum.de/~rittinge/



