aggregate function in MAL

Scott Mathieson scott.mathieson at pb.com
Fri Jul 5 12:08:31 CEST 2013


Hi,

I am trying to create a simple aggregate function entirely in MAL and am struggling at the final hurdle. The aggregate simply collects a sample of the values from the named column and returns them as a single pipe-delimited string.

I have the version that aggregates across an entire table (bin_sample). In the sub-aggregate version (subbin_sample), however, I can't work out how to link the group-to-data mapping BAT ('g') to the data BAT ('b').

Can anyone tell me how to use the 'g' BAT to join to the 'b' BAT in the sub... aggregate function so that I get only the data rows for a specific group? Currently I get only 2 rows of data per group, even though I know there should be 100k+ rows in each group.

Here's the code:


function bin_sample(b:bat[:oid, :any]):str;
       value := "";
       bsamp := sample.uniform(b, 1000:wrd);
       barrier (h, t) := iterator.new(bsamp);
              tmp := "" + t;
              type := bat.getTailType(bsamp);
              test := str.startsWith(type, "str");
              # Force the string type to be cast to a string or you just end up with numeric values
              barrier xxxx := test;
                  tmp := str.str(t);
              exit xxxx;

              value := value + tmp;
              value := value + "||";
              redo (h, t) := iterator.next(bsamp);
       exit (h, t);
       return value;
end bin_sample;


function subbin_sample(b:bat[:oid,:any_1],g:bat[:oid,:oid],e:bat[:oid,:any_2],skip_nils:bit) :bat[:oid,:str];
       bn:=bat.new(:oid, :str);

       # check 'e' has some values - this is the list of group IDs in the head
       # 'g' is the group to the data BAT head ID mapping
       # 'b' is the data column BAT that we're aggregating over
       barrier (grpid, t) := iterator.new(e);

              # create a single row BAT of the group id we're processing
              bn_gid := bat.new(:oid, :oid);
              bat.insert(bn_gid, grpid, grpid);

              # join this to the GID->DATA map BAT to get the Group DATA oids
              b_g2dr := bat.reverse(g);
              b_gid2data := algebra.leftjoin(bn_gid, b_g2dr);

              grpval := "";

              # Get the BAT of the actual data to be processed
              #
              # !!! HOW TO GET THIS TO WORK????????
              #
              b_data := algebra.join(b_gid2data, b);
              grpval := bin_sample(b_data);

              # Store the result for this group
              bat.insert(bn, grpid, grpval);
              redo (grpid, t) := iterator.next(e);
       exit (grpid, t);
       return bn;

end subbin_sample;
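
For reference, the intended per-group behaviour can be modelled outside MonetDB. The sketch below is plain Python, not MAL, and uses no MonetDB API: each BAT is represented as a dict from head oid to tail value, and the helper names rows_for_group and sub_bin_sample are hypothetical. It assumes, as the comments in the MAL code above describe, that 'g' maps each data row's head oid to its group id and that 'e' has one entry per group.

```python
import random


def bin_sample(values, n=1000, seed=None):
    """Join up to n sampled values into one string, each followed by '||'."""
    rng = random.Random(seed)
    values = list(values)
    # Take everything when there are n or fewer values, else sample without replacement.
    picked = values if len(values) <= n else rng.sample(values, n)
    return "".join(str(v) + "||" for v in picked)


def rows_for_group(b, g, grpid):
    """Return the tail values of b whose head oid is mapped to grpid in g."""
    return [b[oid] for oid, gid in g.items() if gid == grpid and oid in b]


def sub_bin_sample(b, g, e, n=1000, seed=None):
    """Build one sampled string per group id found in the head of e."""
    return {grpid: bin_sample(rows_for_group(b, g, grpid), n, seed) for grpid in e}


# Toy data: four rows in two groups (assumed layout, for illustration only).
b = {0: "a", 1: "b", 2: "c", 3: "d"}   # data column: oid -> value
g = {0: 0, 1: 1, 2: 0, 3: 1}           # grouping:    oid -> group id
e = {0: "a", 1: "b"}                   # extents:     group id -> representative
result = sub_bin_sample(b, g, e)
print(result)
```

The key step is the selection on the tail of 'g' (the group id) to obtain the matching head oids, followed by a lookup of those oids in 'b'. That is the operation the join in subbin_sample is meant to perform.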

Scott Mathieson,
