[Monetdb-developers] Upcoming feature: Recursion in Pathfinder (algebra)

Jens Teubner jens.teubner at in.tum.de
Fri Sep 22 14:11:03 CEST 2006

Hi all,

this is to keep you informed what we are currently working on in Munich.

Pathfinder's purely relational approach so far cannot handle *recursion*
in XQuery expression in a sensible way (recursion in the W3C XQuery
language is possible by means of user-defined recursive function).  The
milprint_summer branch includes some hacky implementation (with help of
MIL PROCs) that provides recursion nevertheless.

Together with people from UVA (Loredana Afanasiev in particular), we are
currently working on support for recursion in an algebraic manner.

A generic handling of recursion seems quite difficult.  The W3C XQuery
language syntax does not pose any restrictions on the style of recursion
that can be described.  The common approaches to deal with recursion in
relational systems, in contrast, cannot cope with all these variants of
recursion.  A second problem is that the kind of recursion used in a
given query (e.g., tail recursion) used in a given query seems quite
tricky to infer (and a consistent treatment of this case in the
algebraic compiler adds another challenge).

Hence, we decided to go for a limited recursion feature in Pathfinder
first.  We may later on try to add a more generic handling.  We are
currently implementing a *syntax extension* to Pathfinder that allows to
describe a limited notion of recursion.  The new construct

  with $var seeded by expr1 recurse expr2

will have the following semantics:

 1. expr1 is evaluated once and then used as a seed to initiate the
    recursion.  $var is bound to the result of expr1 in the first
    recursion step.

 2. Then, expr2 is evaluated over and over.  Variable $var is visible
    in expr2 and for each recursion step will be bound to the union of
    all items (nodes) obtained in the previous recursion steps.  (The
    results of all recursion steps are collected into the overall result
    using set semantics.)

 3. Recursion terminates if a recursion step does not contribute any new
    items (nodes) to the overall result (i.e., we reached a fixpoint).

(Remark: There still is some discussion if (for performance reasons)
each recursion step should only observe the *new* items obtained in the
previous recursion step, or whether the entire result set collected so
far should be observable in expr2.)

The `union' semantics of this new recursion construct implies that we
can do recursion only on *nodes*, not on atomic values.  This quite
directly matches the requirements of the UVA group (and others) that are
looking for a "transitive closure" operator.

Note that the above syntax extension that we are about to introduce does
not interfere with any existing constructs.  So our changes will not
affect any (existing) valid XQuery expression.

Motivation: There is a high demand for transitive closure functionality
in the group at UVA (i.e., Loredana and others), but also in other
research groups (as I was told).  Transitive closure is, e.g., a vital
component of an XPath dialect known as "Regular XPath".

Regards from Garching


Jens Teubner
Technische Universitaet Muenchen, Department of Informatics
D-85748 Garching, Germany
Tel: +49 89 289-17259     Fax: +49 89 289-17263

Software is like sex: It's better when it's free
                               -- Linus B. Torvalds

More information about the developers-list mailing list