[Monetdb-developers] [Monetdb-pf-checkins]pathfinder/compiler/core core.c, , 1.68, 1.69 coreopt.brg, , 1.77, 1.78 fs.brg, , 1.78, 1.79 simplify.brg, , 1.44, 1.45

Jan Rittinger jan.rittinger at uni-tuebingen.de
Fri Nov 6 14:09:47 CET 2009


On Nov 6, 2009, at 11:09, Jan Flokstra wrote:

> On Thursday 05 November 2009 23:45, Jan Rittinger wrote:
>> Hi Anandeshwar,
>>
>> thanks a lot for extending pathfinder with full-text functionality.
>>
>
> Hi Jan,
>
> Thanks for your comments. Anand had a development version on his PC  
> for half a
> year which survived a couple of disk crashes and a couple of nasty
> integration problems because the compiler code changed. Because  
> Anand moves
> on soon we decided to check his stuff in. We did a thorough test of  
> his
> changes and after his changes did not break any test from the entire
> Pathfinder test collection we decided it was time to check his stuff  
> in. We
> are aware we changed the original compiler stuff at a couple of  
> places but
> most of them were inevitable.

Hi Jan & Anand,

In my eyes your checkins should have started half a year ago (in a  
branch). Then the integration and recovery problems wouldn't have  
occurred.
Inevitable does not mean that they should never be communicated  
before. To be honest I was quite surprised by this big checkin.

>
>> I'm however a little bit taken by surprise and your checkin in my
>> opinion leaves some questions open:
>
>> - Why did you change the translation of PFcore_some in core.c?
> Could this be and old re-checkin of something changed long time ago.  
> It could
> have been accidently reintroduced with one of the reintegration or  
> crash
> recoveries. This is indeed no change necessary for a fulltext  
> construction.

Are you removing this code regression?

>
>> - Why did this rewrite touch any file in the algebra folder? -
> There were some score handling issues which could be handled easyer in
> algebra. When the complete score handling and propagation in core to  
> algebra
> is finished these algebra changes can be removed. If scores are not  
> used the
> changes do not do anyting and when used they will not break anything.

Ok, I did not make myself clear enough. I did not mean core2alg.brg.

But the following files are changed:
algebra/prop/prop_unq_names.c
algebra/prop/prop_ocol.c
xmlimport/xml2lalg.c

Why? - Are there any real changes or are only whitespaces and error  
checks 'accidentally' removed?

>
>> Whitespace removal is not ftcontains support as your message  
>> suggests.
> I do not understand which message you mean? It looks like a cryptic  
> way of you
> to say the indent and whitespace layout of files are changed  
> unnecessarily.
>

see above. If a checkin does something else than its message suggests  
then it makes things harder to understand.

>> Additionally disabling consistency checks without an explanation  
>> isn't
>> a nice thing.
> This is indeed an error. A lot of checks have been disabled during  
> development
> but they should all have be switched 'on' again. I suppose you mean  
> the
> PFprop_type_of_() check where the PFoops has been disabled. There  
> should have
> been an 'incomplete' message there.

yes, see also above.

>
>> - Why did you duplicate the translation of a for loop? - Isn't the
>> score just an optional modification like the positional variable? I  
>> at
>> least don't like this code duplication as it will make code
>> maintenance a lot harder.
>> - As far as I can oversee you did break the compositionality of the
>> compilation scheme. Now the different rules provide different result
>> schemas.
>>
>
> You are right about the last two points but the way I understand it  
> it is
> necessary to do it this way because the ftcontains clause changes the
> semantics of the for loop. I heard Stefan Klinger also complain a  
> lot about
> the horrible semantics of the xquery fulltext for loops. So the new  
> rules
> indeed break the compositionality but they are supposed to :-(

The code in core2alg.brg however speaks otherwise. From the 220 lines  
of code necessary to cope with the for-score-loop 200 lines are a  
verbatim copy of the for-rule. Only a single if-block that introduces  
the score variable is added. That's what I call unncessary code  
duplication!

>
>
>> I think (mainly because of the last issue) this checkin should better
>> go into a separate branch instead of the CVS HEAD until the
>> translation is more mature.
>>
>
> We could go into a separate branch but it looks this fulltext  
> project will be
> a long project. We will have a separate branch for year with lots of  
> overhead
> for the CSV maintainers and our mailboxes. I think 'ifdeffing' most  
> of the
> fulltext stuff will give a better overview what has changed because  
> of xquery
> fulltext and what are the real problems.  Also compiler developpers  
> have a
> better overview when they change things where fulltext isssues are  
> involved,
>

Your changes make the core->algebra translation inconsistent.
As long as this is the case I object having this code in the CVS HEAD!


You introduced calls to a function attach_score. This function is  
called whenever you might find an additional score column useful --  
i.e. in an inconsistent way. If you want to introduce scores then  
introduce them everywhere and make the compilation scheme consistent  
again (with an iter|pos|item|score schema).

This problematic behavior is perhaps most prominent in logical.c where  
you try to fix inconsistencies in a constructor instead of the code  
generation. A node constructor should not know anything about the  
translation strategy. Especially using a hard-coded column name in a  
constructor, that needs to cope with arbitrary column names, is bad  
style.

Jan

-- 
Jan Rittinger
Lehrstuhl Datenbanken und Informationssysteme
Wilhelm-Schickard-Institut für Informatik
Eberhard-Karls-Universität Tübingen

http://www-db.informatik.uni-tuebingen.de/team/rittinger





More information about the developers-list mailing list