Imad, I hope you succeed with this. Please post a follow-up if you get it working. Could those new functions then be incorporated into a future version of MonetDB, or perhaps easily compiled against the current one? That way users could suggest new useful functions in the future (shame about SQL UDF performance).

Regards!

2016-12-28 14:48 GMT-03:00 imad hajj chahine <imad.hajj.chahine@gmail.com>:
Hi,

After reviewing the other alternatives, SQL and Python UDFs, I was stuck either on performance (SQL UDF) or on usability (Python UDF: it cannot be used with aggregation, and its performance with dates is not great), so I decided to go the hard way with C functions. As a bonus, this lets me change the functionality without worrying about dependencies, which was not the case with the other languages.

The purpose is to create a set of formatting functions for Year, Quarter, Month, Week and Day brackets, and of course I need to create a bulk version of each function for performance.
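To make the idea of bracket labels concrete, here is a minimal standalone sketch in plain C, independent of MonetDB. The function names and the label formats ("2016", "2016-Q4", "2016-03") are assumptions chosen for illustration; the formats actually wanted are not spelled out in this message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers: names and label formats are illustrative
 * assumptions, not part of MonetDB or of the UDF code in this thread.
 * Each returns 0 on success and -1 on bad input or a too-small buffer. */

/* Year label, e.g. "2016". */
int format_year_bracket(char *buf, size_t len, int year)
{
    int n = snprintf(buf, len, "%d", year);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Quarter label, e.g. "2016-Q4"; month is 1..12. */
int format_quarter_bracket(char *buf, size_t len, int year, int month)
{
    int quarter, n;

    if (month < 1 || month > 12)
        return -1;
    quarter = (month - 1) / 3 + 1;   /* months 1-3 -> Q1, ..., 10-12 -> Q4 */
    n = snprintf(buf, len, "%d-Q%d", year, quarter);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}

/* Month label, e.g. "2016-03"; month is 1..12. */
int format_month_bracket(char *buf, size_t len, int year, int month)
{
    int n;

    if (month < 1 || month > 12)
        return -1;
    n = snprintf(buf, len, "%d-%02d", year, month);
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Writing into a caller-supplied buffer with snprintf keeps the formatting logic separate from how the database allocates and stores the resulting string.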

Starting from MTIMEdate_extract_year_bulk, I now have the simple (scalar) function working and can call it successfully from mclient:

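str
UDFyearbracket(str *ret, const date *v)
{
    if (*v == date_nil) {
        *ret = GDKstrdup(str_nil);
    } else {
        int year;
        /* only the year output is requested; day and month are skipped */
        fromdate(*v, NULL, NULL, &year);
        *ret = (str) GDKmalloc(15);
        if (*ret != NULL)
            snprintf(*ret, 15, "%d", year);
    }
    /* GDKstrdup and GDKmalloc both return NULL on allocation failure */
    if (*ret == NULL)
        throw(MAL, "UDF.yearbracket", "memory allocation failure");
    return MAL_SUCCEED;
}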


For the bulk version I get this error in the log:

gdk_atoms.c:1345: strPut: Assertion `(v[i] & 0x80) == 0' failed.

str
UDFBATyearbracket(bat *ret, const bat *bid)
{
    BAT *b, *bn;
    BUN i, n;
    str *y;
    const date *t;

    if ((b = BATdescriptor(*bid)) == NULL)
        throw(MAL, "UDF.BATyearbracket", "Cannot access descriptor");
    n = BATcount(b);

    bn = COLnew(b->hseqbase, TYPE_str, BATcount(b), TRANSIENT);
    if (bn == NULL) {
        BBPunfix(b->batCacheid);
        throw(MAL, "UDF.BATyearbracket", "memory allocation failure");
    }
    bn->tnonil = 1;
    bn->tnil = 0;

    t = (const date *) Tloc(b, 0);
    y = (str *) Tloc(bn, 0);
    for (i = 0; i < n; i++) {
        if (*t == date_nil) {
            *y = GDKstrdup(str_nil);
        } else
            UDFyearbracket(y, t);
        if (strcmp(*y, str_nil) == 0) {
            bn->tnonil = 0;
            bn->tnil = 1;
        }
        y++;
        t++;
    }

    BATsetcount(bn, (BUN) (y - (str *) Tloc(bn, 0)));

    bn->tsorted = BATcount(bn) < 2;
    bn->trevsorted = BATcount(bn) < 2;

    BBPkeepref(*ret = bn->batCacheid);
    BBPunfix(b->batCacheid);
    return MAL_SUCCEED;
}
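
One plausible reading of the assertion, offered as an assumption rather than a confirmed diagnosis: a string column does not keep char pointers in its fixed-width tail. It keeps offsets into a separate string heap, so pointers assigned through Tloc are later misread as offsets and lead to garbage bytes. The toy model below is not MonetDB code; ToyStrColumn, toy_append, toy_get and toy_free are hypothetical names used only to show the offset-plus-heap layout and why values have to go through an append routine.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: NOT MonetDB's actual data structures. */
typedef struct {
    char   *heap;       /* string bytes stored back to back, NUL-terminated */
    size_t  heap_used;  /* bytes of heap in use */
    size_t  heap_cap;   /* bytes of heap allocated */
    size_t *offsets;    /* one fixed-width slot per row: offset into heap */
    size_t  count;      /* rows in use */
    size_t  cap;        /* rows allocated */
} ToyStrColumn;

/* Copy s into the heap and record its offset as the next row.
 * Returns 0 on success, -1 on allocation failure. */
int toy_append(ToyStrColumn *c, const char *s)
{
    size_t len = strlen(s) + 1;          /* include the terminating NUL */

    if (c->count == c->cap) {            /* grow the offset array */
        size_t ncap = c->cap ? c->cap * 2 : 8;
        size_t *no = realloc(c->offsets, ncap * sizeof *no);
        if (no == NULL)
            return -1;
        c->offsets = no;
        c->cap = ncap;
    }
    if (c->heap_used + len > c->heap_cap) {   /* grow the string heap */
        size_t ncap = c->heap_cap ? c->heap_cap * 2 : 64;
        char *nh;
        while (ncap < c->heap_used + len)
            ncap *= 2;
        nh = realloc(c->heap, ncap);
        if (nh == NULL)
            return -1;
        c->heap = nh;
        c->heap_cap = ncap;
    }
    memcpy(c->heap + c->heap_used, s, len);
    c->offsets[c->count++] = c->heap_used;    /* the row stores an offset */
    c->heap_used += len;
    return 0;
}

/* Resolve row i to its string by following the stored offset. */
const char *toy_get(const ToyStrColumn *c, size_t i)
{
    return c->heap + c->offsets[i];
}

void toy_free(ToyStrColumn *c)
{
    free(c->heap);
    free(c->offsets);
    memset(c, 0, sizeof *c);
}
```

A column starts out zero-initialised (ToyStrColumn c = {0};) and is filled only through toy_append. In MonetDB itself the analogous step would be to append each formatted value through the GDK API (BUNappend) instead of writing into Tloc(bn, 0); the exact signature should be checked against the gdk.h of the version in use.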

PS: I am not a C expert, but I can find my way around basic operations and pointers.

Any help or suggestions would be appreciated.

Thank you.
