IMHO, the new coercion optimization introduced in changeset 5786f1be12bb
only works correctly for decimal casts (coercions) that do not change the scale (number of decimals),
i.e., when the first and last arguments (the input and output number of decimals, respectively)
are identical.
This sanity check is currently missing from the code.
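
To illustrate the concern, here is a minimal sketch in C, assuming decimals are
stored as integers scaled by a power of ten. The helper names (pow10_i64,
decimal_upcast, cast_is_removable) are hypothetical and are not taken from
opt_coercion.c; the sketch only shows why an up-cast that changes the scale
rewrites the stored value and therefore cannot simply be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not MonetDB code): 10^n for small non-negative n. */
static int64_t pow10_i64(int n)
{
    int64_t p = 1;
    while (n-- > 0)
        p *= 10;
    return p;
}

/* Up-cast a decimal held as a scaled integer from from_scale to to_scale
 * digits after the decimal point. Overflow is ignored in this sketch. */
static int64_t decimal_upcast(int64_t stored, int from_scale, int to_scale)
{
    assert(from_scale >= 0 && to_scale >= from_scale);
    return stored * pow10_i64(to_scale - from_scale);
}

/* Assumed sanity check: the cast is a no-op on the stored integer, and so
 * safe to remove, only when input and output scale are identical. */
static int cast_is_removable(int from_scale, int to_scale)
{
    return from_scale == to_scale;
}
```

With scale 2, the value 1.23 is stored as 123. Casting to scale 4 must store
12300, so removing that cast would silently turn 1.23 into 0.0123.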
Stefan
----- Original Message -----
> Changeset: 289a5a038774 for MonetDB
> URL: http://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=289a5a038774
> Modified Files:
> monetdb5/optimizer/opt_coercion.c
> Branch: default
> Log Message:
>
> coercionOptimizerCalcStep(): fixing changeset 71bfaf7e9841: also handle TYPE_hge
> to re-activate the intended behavior of changeset 5786f1be12bb
>
> OPEN ISSUE / QUESTION:
>
> This optimizer now (only) handles (i.e., removes) decimal (up-)casts,
> but no plain/pure integer casts at all.
>
> (a) Shouldn't we also handle (remove) plain/pure integer casts?
>
> (b) Can we always safely remove decimal (up-)casts without altering
> semantics?
>
>
> diffs (13 lines):
>
> diff --git a/monetdb5/optimizer/opt_coercion.c b/monetdb5/optimizer/opt_coercion.c
> --- a/monetdb5/optimizer/opt_coercion.c
> +++ b/monetdb5/optimizer/opt_coercion.c
> @@ -65,6 +65,9 @@ coercionOptimizerCalcStep(MalBlkPtr mb,
> case TYPE_sht:
> case TYPE_int:
> case TYPE_lng:
> +#ifdef HAVE_HGE
> + case TYPE_hge:
> +#endif
> break;
> case TYPE_dbl:
> case TYPE_flt:
--
| Stefan.Manegold(a)CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Dear Developers of MonetDB,
I really appreciate your excellent work on MonetDB. I am a student at the University of Wisconsin-Madison, and I am currently doing research on machine learning in column-oriented databases: we have found an approach that could possibly improve the performance of the learning process when the data is stored by column rather than by row. For this research, we need to know the details of the join algorithms in a column-store database in order to analyze the cost of the process. MonetDB, as far as we know, does a very good job in this area, so I hope you can tell us about the details of the join algorithms implemented in MonetDB. That would be really helpful, and we would really appreciate it. Thanks.
Best Regards
Zhiwei Fan
Hello,
I have a function definition
command batstemmer.stem(terms:bat[:oid,:str],
stemmer_name:str):bat[:oid,:str]
address CMDbatstem
comment "Wrapper for snowball stemmer";
which internally uses the snowball stemmer (http://snowball.tartarus.org/).
When the bat to be stemmed is large enough, mitosis will split it into
chunks and call the function "stem" on each chunk, possibly in parallel.
Problem is, the snowball stemmer implementation appears to be
thread-unsafe, which causes a SIGSEGV.
Indeed, using the no_mitosis_pipe solves the issue. However, this solution
is suboptimal.
Another solution I found is to mark the mal signature as {unsafe}. This
works, although it does something a bit silly: it splits the table into
chunks, then repacks everything, and finally runs my function on the
re-packed bat (basically wasting effort on a useless split + repack).
Now, my question is: is there a more focussed property to use? {unsafe}
implies thread-unsafe, but it is actually stronger than that. For example,
it also implies that there might be side-effects. Therefore, the result
cannot be recycled. In my case, instead, the result is perfectly safe to be
reused.
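
For comparison, a workaround that stays inside the C implementation would be
to serialise the calls with a lock, so that the MAL signature need not be
marked {unsafe}. Below is a minimal sketch of that idea using POSIX threads.
Everything in it is an assumption for illustration: unsafe_stem is a toy
stand-in for a thread-unsafe library call (it is not the snowball API), and
stem_locked is a hypothetical wrapper name. Inside MonetDB one would
presumably use the kernel's own locking primitives rather than pthread
directly.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a thread-unsafe call: the result lives in shared
 * static storage, so concurrent callers would overwrite each other. */
static char shared_buf[64];

static const char *unsafe_stem(const char *word)
{
    size_t n = strlen(word);
    if (n >= sizeof(shared_buf))
        n = sizeof(shared_buf) - 1;
    memcpy(shared_buf, word, n);
    shared_buf[n] = '\0';
    /* Toy rule: strip one trailing 's'. */
    if (n > 1 && shared_buf[n - 1] == 's')
        shared_buf[n - 1] = '\0';
    return shared_buf;
}

static pthread_mutex_t stem_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: run the unsafe call under a global lock and copy
 * the result into the caller's buffer before releasing the lock.
 * Returns 1 on success, 0 if the result does not fit into out. */
static int stem_locked(const char *word, char *out, size_t outlen)
{
    assert(word != NULL && out != NULL && outlen > 0);
    pthread_mutex_lock(&stem_lock);
    const char *res = unsafe_stem(word);
    size_t n = strlen(res);
    int ok = n < outlen;
    if (ok)
        memcpy(out, res, n + 1); /* includes the terminating NUL */
    pthread_mutex_unlock(&stem_lock);
    return ok;
}
```

The obvious drawback is that the chunks produced by mitosis would still be
stemmed one call at a time, so this trades the crash for lost parallelism in
that one function. It does not answer the question of a more focussed
property.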
Thanks for any tip.
Roberto