A good strategy for loading bulk data into monetdb

Brian Hood brianh6854 at googlemail.com
Mon Apr 25 14:05:23 CEST 2016


Hi,

Take a look at
https://www.monetdb.org/Documentation/SQLreference/Permissions

You'll need to create a user and have the ADMIN OPTION

Regards,

Brian Hood

On Mon, Apr 25, 2016 at 12:22 PM, Knezevic Nikola <nikkne at gmx.ch> wrote:

> Hi Martin,
>
> if I try using "LOCKED", I'm greeted with:
>
> $ mysql -C -N -B -u reader -pxxx some_schema -h mhost -e "SELECT * FROM
> messages WHERE mDate = '2016-03-30' LIMIT 10;"|mclient -d some_schema -s
> "COPY INTO \"messages__2016-03-30\" FROM STDIN USING DELIMITERS '\t','\n'
> LOCKED" -
> COPY INTO: insufficient privileges: COPY INTO from .. LOCKED requires
> database administrator rights
>
>
> Ok, that means that my user doesn't have db admin rights. Since I couldn't
> find what is the admin role in the documentation, StackOverflow had the
> following:
>
> http://stackoverflow.com/questions/32165339/grant-privileges-to-users-on-monetdb
>
> So, I tried:
> [root at wks038 ~]# mclient -d some_schema -u monetdb
> password:
> Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2015-SP4)
> Database: MonetDB v11.21.19 (Jul2015-SP4),
> 'mapi:monetdb://wks038.intra:50000/some_schema'
> Type \q to quit, \? for a list of available commands
> auto commit mode: on
> sql>grant monetdb to writer;
> operation successful (2.569ms)
> sql>\q
> [root at wks038 ~]# mclient -d some_schema -u writer
> password:
> Welcome to mclient, the MonetDB/SQL interactive terminal (Jul2015-SP4)
> Database: MonetDB v11.21.19 (Jul2015-SP4),
> 'mapi:monetdb://wks038.intra:50000/some_schema'
> Type \q to quit, \? for a list of available commands
> auto commit mode: on
> sql>set role monetdb;
> Role (monetdb) missing
>
> Then I decided to use the db admin to populate the data, but that didn't
> go well either:
>
> $ mysql -C -N -B -u reader -pxxx some_schema -h mhost -e "SELECT * FROM
> messages WHERE mDate = '2016-03-30' LIMIT 10;"|mclient -d some_schema -u
> monetdb -s "COPY INTO \"messages__2016-03-30\" FROM STDIN USING DELIMITERS
> '\t','\n' LOCKED" -
> COPY INTO: no such table 'messages__2016-03-30'
> syntax error, unexpected sqlINT in: "1971276831"
>
>
> BTW, I thought that LOCKED is only applicable for COPY FROM 'file', not
> for STDIN.
>
> Any ideas how to resolve this issues?
>
> Thanks,
> Nikola
>
>
> > Gesendet: Montag, 25. April 2016 um 11:54 Uhr
> > Von: "Martin Kersten" <martin.kersten at cwi.nl>
> > An: users-list at monetdb.org
> > Betreff: Re: A good strategy for loading bulk data into monetdb
> >
> > Hi
> >
> > To avoid the logs you should use the LOCKED version.
> > see https://www.monetdb.org/Documentation/Manuals/SQLreference/CopyInto
> >
> >
> >
> > On 25/04/16 11:45, Knezevic Nikola wrote:
> > > Hi all,
> > >
> > > I'm trying to load large amounts of data into monetdb (migrating from
> mysql). Since I'm using Jul2015-SP4 release, I can't create local files and
> I can't use external tools, my approach is as following:
> > > For each table, I partition the data based on the day, and then:
> > > 10 create one table in monetdb for each day, and a merge table that
> takes all these tables
> > > 20 fetch data from mysql (mysql -B -N -C ...)
> > > 30 _pipe_ data to mclient (that is invoked with mclient ... -c "COPY
> 2000000 RECORDS FROM STDIN INTO ..." -)
> > > 40 load data into corresponding table (the right day)
> > > 50 sleep some seconds
> > > 60 goto 20
> > >
> > > Each daily table is around 6'000'000 rows, but I have some that have
> ~150'000'000 rows.
> > >
> > > However, I noticed that when I'm doing this, I quickly run out of disk
> space, as monetdb is creating really big logs (in
> /data/monetdb/some_schema/sql_logs). Size of that folder is usually 600KB,
> but once the process of migration starts, it goes to several GBs. Not only
> that, but /data/monetdb/some_schema/bat also grows a lot.
> > >
> > > The surprising part, and the motivation for this emails, is how well
> monetdb compresses the data. If I migrate only one day worth of data, and
> then restart monetdbd (once or twice) and run 'analyze some_schema',
> sql_logs size drops to normal levels, and I get a lot of reclaimed space
> (even in bat directory). I tried to put a 'sleep(120)' after each partition
> and running 'analyze some_schema' after each iteration, but couldn't get
> the same behaviour (and it is also quite slow). Is there a way to trigger
> the compression and reduce disk space overhead without restarting mserver,
> given the above constraints?
> > >
> > > Thanks,
> > > Nikola
> > > _______________________________________________
> > > users-list mailing list
> > > users-list at monetdb.org
> > > https://www.monetdb.org/mailman/listinfo/users-list
> > >
> >
> > _______________________________________________
> > users-list mailing list
> > users-list at monetdb.org
> > https://www.monetdb.org/mailman/listinfo/users-list
> >
> _______________________________________________
> users-list mailing list
> users-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/users-list
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.monetdb.org/pipermail/users-list/attachments/20160425/1d7da407/attachment.html>


More information about the users-list mailing list