Thanks Martin,
I'm still a bit confused.
I assume that by "order" you mean the order with which the signatures were created. If the first matching is taken, I don't understand the substring() example that I showed. The first declared is the fully bulk version. Shouldn't that one always match (by replicating the constant columns)? I would have expected the "optimized" version with constants to be declared first, and the fully bulk version second.
Unfortunately I can't get my case working as I would expect.
To be more specific:
CREATE OR REPLACE FUNCTION tokenize(id integer, s string, d string, m tinyint, g tinyint, stemmer string, cs boolean, prob double) RETURNS TABLE (id integer, token string, prob double) EXTERNAL NAME spinque."UTF8tokenize";
# The *bold* ones are scalars
command batspinque.UTF8tokenize_c(id:bat[:int], s:bat[:str], *delims:str*, *min_tok_len:bte*, *grams:bte*, *stemmer:str*, *cs:bit*, prob:bat[:dbl]) (:bat[:int], :bat[:str], :bat[:dbl]) address SPQbat_utf8_tokenize;
sql>create table t (id int,value string,prob double); operation successful
explain select * from tokenize( (select id, value, *'SP'*, *2*, *1*, *'snowball-english'*, *false*, prob from t) ); Program contains errors.:(NONE).multiplex
(NONE).multiplex, I think, indicates that: - A fully bulk version was not found - A multiplex with a fully scalar version is attempted, but no fully scalar signature was found (BTW, a more informative error message could help here)
I don't understand why the mixed bulk/scalar MAL signature above wouldn't match the query. Again, from a logical point of view, a fully bulk signature is expected, but we saw already that in other cases (e.g. the substring() example) this shortcut is allowed. Why not here?
Roberto
On Mon, 4 Nov 2019 at 22:29, Martin Kersten martin@monetdb.org wrote:
The MAL signatures are ordered and the first one matching is used.
Such ordering is not enforced at the SQL function layer. In general it would require a type resolution that either works on matching or you have to encode the order of resolution.
On 04/11/2019 15:14, Roberto Cornacchia wrote:
Hi Panos,
Thanks for your input.
As Arjen pointed out, I completely understand why MonetDB would
replicate a constant into a column, at least from a logical point of view or as a default physical point of view.
However, this is not the only route. Take for example function
substring(start,index)
It has two implementations:
command batstr.substring( s:bat[:str], start:bat[:int],
index:bat[:int]) :bat[:str]
address STRbatsubstring;
command batstr.substring( s:bat[:str], start:int, index:int):bat[:str] address STRbatsubstringcst;
When we call it from SQL with select substring(a, 0,5) from myTable
then it can directly use the second implementation above, instead
replicating the constant parameters to fit the first implementation.
Best, Roberto
On Mon, 4 Nov 2019 at 13:15, Panagiotis Koutsourakis <
panagiotis.koutsourakis@monetdbsolutions.com mailto: panagiotis.koutsourakis@monetdbsolutions.com> wrote:
Hi Roberto, I would argue that replicating the constant is the correct behavior.
With
SELECT a, 10 FROM mytable you are actually requesting two columns, it just so happens that the
second column has a constant value. In general a SQL SELECT statement returns a SQL relation. It *has* to replicate the constant, otherwise the result would not make sense.
Would it work if you called your UDF like as follows? SELECT * FROM myfunc((SELECT a FROM mytable), 10); Best regards, Panos. On 11/4/19 12:08 PM, Roberto Cornacchia wrote: > Hi, > > When defining a UDF, e.g. > CREATE FUNCTION myfunc(a int, b int) RETURNS int > we can have: > > - a scalar implementation > command myfunc(a:int, b:int) :int > > - a bulk implementation > command myfunc(a:bat[:int], b:bat[:int]) :bat[:int] > > - a mixed implementation (e.g. a is a column, b is scalar): > command myfunc(a:bat[:int], b:int) :bat[:int] > > I'm trying to understand what the restrictions are, if any, for
the third
> case. > Is it expected to be allowed in (a combination of) the following
cases?
> > - Interleaved bulk/scalar arguments > command myfunc(a:bat[:int], b:int, c:bat[:int]) :bat[:int] > > - Table-returning function > command myfunc(a:bat[:int], b:int) (:bat[:int], :bat[:int]) > > - Table-returning function, used with a sub-select in input > SELECT * FROM myfunc( (SELECT a, 10 FROM mytable) ); > > In particular I can't get the last one to work. Even though the
sub-select
> has a constant as the second column, this constant seems to be
first
> replicated into a full column and then given to myfunc() as a
column.
> > When the constant parameters are a few, that becomes a waste of
time/space.
> Plus the annoyance of selecting the first value of a constant
column for
> the actual internal implementation, which of course expects a
scalar for
> those arguments. > > Am I missing something? Is there a way to make that work as I
would expect?
> > Thanks, Roberto > > > _______________________________________________ > users-list mailing list > users-list@monetdb.org <mailto:users-list@monetdb.org> > https://www.monetdb.org/mailman/listinfo/users-list > _______________________________________________ users-list mailing list users-list@monetdb.org <mailto:users-list@monetdb.org> https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list