How to perform copy into in multiple clients at the same time?

Yinjie Lin exialin37 at gmail.com
Fri Dec 28 07:25:09 CET 2018


Hi Roberto,

Thanks for your reply. What you provided is really helpful, and I gave
up the idea of hacking the code. Then I came up with a workaround: I
create a merge table with multiple partition tables, and data are
copied into partition tables so I can use multiple connections
simultaneously. Although it makes things a bit complicated, but it is
better!

Regards,
Yinjie Lin

On Fri, Dec 28, 2018 at 12:04 AM Roberto Cornacchia
<roberto.cornacchia at gmail.com> wrote:
>
> Yinjie,
>
> Have you looked into this? https://dev.monetdb.org/hg/monetdb-java/file/tip/tests/SQLcopyinto.java
>
> It shows how, from Java, you can use mapi to perform a COPY INTO from stdin (so the file stays on the client side).
> I use this approach on production code, it is pretty fast.
>
> Also, this may be of interest to you. I have not tried it, but it sounds like it allows a COPY INTO from a file on the client side: https://dev.monetdb.org/hg/MonetDB?cmd=changeset;node=0b4846d2089c
>
> Lastly, I most definitely second Jennie's suggestion to not pursue the way of disabling transaction checks. It is most certainly a recipe for disaster. I say this as an avid user who has known the internals for 15 years.
>
> Best, Roberto
>
>
> On Thu, 27 Dec 2018 at 16:13 Yinjie Lin <exialin37 at gmail.com> wrote:
>>
>> Hi Jennie,
>>
>> Thanks for your detailed explanation.
>>
>> Actually I'm using Mapi to copy data and send queries to MonetDB. I
>> send `COPY INTO ... FROM STDIN` to the server to transfer data.
>> I know COPY INTO is much faster, but it requires the file to be local
>> on the server. Currently I can only send data from the clients, and
>> this
>> is why I want to do it through multiple connections. I hope it can
>> speed up the copy process.
>>
>> I'm trying to hack the code so that some transaction checks can be
>> ignored, though it is not easy...
>> Another question is that I'm afraid, for example, the low-level
>> storage of data does not support concurrent copy into one table even
>> if I changed the code.
>> So is that possible theoretically?
>>
>> Regards,
>> Yinjie Lin
>>
>> On Thu, Dec 27, 2018 at 10:33 PM Ying Zhang <Y.Zhang at cwi.nl> wrote:
>> >
>> >
>> >
>> > > On 27 Dec 2018, at 07:59, Yinjie Lin <exialin37 at gmail.com> wrote:
>> > >
>> > > Hi,
>> > >
>> > >    I'm new to MonetDB and I think this is a great project!
>> >
>> > Hello Yinjie,
>> >
>> > Thanks for using MonetDB.
>> >
>> >
>> > >
>> > >    I have a problem the same as this one:
>> > > https://stackoverflow.com/questions/9179265/how-would-monetdb-perform-with-multiple-sql-copy-operations-running-at-the-same?answertab=votes#tab-top
>> > > , namely copy data into tables from multiple clients at the same time.
>> > >
>> > >    As the stackoverflow question suggests, MonetDB uses Optimistic
>> > > Concurrency Control, but the behaviour is different from my
>> > > expectation. For example, say I have a table `t` without any
>> > > constraints, and I run the following queries:
>> > >
>> > > client1                                      client2
>> > > -------------------------------------------------------------------
>> > > start transaction;
>> > > insert into t values (1);
>> > >                                           start transaction;
>> > >                                           insert into t values (2);
>> > > commit;
>> > >                                           commit;
>> > >
>> > > I would expect both commits succeed since the rows to be inserted do
>> > > not conflict with each other and do not change original rows in the
>> > > table. And this is the behaviour of some databases with OCC, e.g. TiDB
>> > > (a MySQL compatible database). But MonetDB would rollback the second
>> > > transaction.
>> >
>> > This is because MonetDB doesn’t allow concurrent updates on the same table.  Here is some more information:
>> > https://www.monetdb.org/blog/monetdb-sql-transaction-management-scheme
>> >
>> > > So my question is: is it possible to change this
>> > > behaviour by modifying the code? If so, could you give me some
>> > > guidance or suggestion?
>> >
>> > Changing this behaviour basically means changing the whole transaction manager of MonetDB...
>> > Our recommendation is simply don’t do this.  The best is to serialise the inserts at the client side.
>> > Such concurrency will give you nothing but unnecessary resource contention.
>> >
>> > BTW, if data loading speed is important for what you want to do, please try to avoid using INSERT INTO.
>> > COPY INTO is a much faster alternative, and it is highly parallelised internally (which renders concurrent COPY INTO unnecessary).
>> >
>> > Regards,
>> > Jennie
>> > >
>> > > Thanks in advance!
>> > >
>> > > Best Regards,
>> > > Yinjie Lin
>> > > _______________________________________________
>> > > developers-list mailing list
>> > > developers-list at monetdb.org
>> > > https://www.monetdb.org/mailman/listinfo/developers-list
>> >
>> > _______________________________________________
>> > developers-list mailing list
>> > developers-list at monetdb.org
>> > https://www.monetdb.org/mailman/listinfo/developers-list
>> _______________________________________________
>> developers-list mailing list
>> developers-list at monetdb.org
>> https://www.monetdb.org/mailman/listinfo/developers-list
>
> _______________________________________________
> developers-list mailing list
> developers-list at monetdb.org
> https://www.monetdb.org/mailman/listinfo/developers-list


More information about the developers-list mailing list