Apache Pig

Apache Pig mk Fri, 07/19/2013 - 10:31

In many cases, the data that is supposed to be stored in MonetDB is created by a large-scale data aggregation process. Apache Pig is a popular tool for this task. There is a possibility of directly generating MonetDB binary column files using the MonetDB-Pig-Layer.

Inside a Pig script, this is used as follows:

STORE raw INTO './results/' USING nl.cwi.da.monetdb.loader.hadoop.MonetDBStoreFunc;
This will generate a SQL schema file, a set of column files, and a load command. Keep in mind that currently, only primitive Java types such as Integer, Long, Float, Double, String etc. are supported.