Re: Merge table performance on single machine with 4 cores

23 Mar 2017

      Hi Manish
the real reason that multi-threaded execution is slower than single-threaded execution
*in your very case* is the following:
You have ~10m rows and the result of your grouped-aggregation query has ~9.5M rows (groups)
(i.e., most of the groups have only 1 or at most few members; the count of 10-16 in the top-10
you show suggests a (very) long tail of singleton groups ...).
Multi-threading in MonetDB essentially slices the largest base-table in as many slices as there
are cores / threads to be used, and tries to replicate as much as possible of the query plan over
these slices, before combining the partial results into a final single query result.
This works fine and effectively in most cases.
However, group-by queries may be a problem.
With group-by queries, the grouping and aggregation is first done individually for each slice
(in parallel). Then, to get the final result, the individual results are concatenated and 
grouped and aggregated in a single thread, again. This works fine in case there are (only) few groups.
Then, the final ("extra") grouping/aggregation has to handle only few ("pre-aggregated") groups
from the individual slices.
However, with many groups, this extra grouping/aggregation can become expensive,
in your particular case with almost as many groups as original input tuples/rows,
this extra single-threaded grouping/aggregation is as expensive as the single
group by/aggregation in the single-threaded case, essentially making the multi-threaded
case almost twice as expensive as the single-threaded case.
For the time being (?) MonetDB has no knowledge how many groups a group-by will produce,
and thus cannot automatically decide when to run a group-by query or at least the/which
group-by operation(s) in a query single-threaded or multi-threaded --- I'm afraid
even ANALYZED does not produce sufficient statistics to derive from that the number of
groups (possibly "deep" in the query plan).
In general, though, multi-threaded execution should indeed be faster than single-threaded;
however, limited potential for parallelization (e.g., due to operators that cannot
be parallelized (by MonetDB (yet?)) and thus implicitly serialize the remainder of
the query execution plan) and/or resource congestion can limit the effect of multi-
threaded execution, or even eliminate it, resulting in multi-threaded being not better
than single-threaded.
Multi-threaded becoming worse than single-threaded should be the rare exception,
mainly in cases sketched above.
Hope this helps.
Best,
Stefan
ps: to understand in detail why you mergetable on a single machine is slower than
    a single base table, you also might use TRACE and analyze (or share) the output ...
    I suspect the same large group by result problem as sketched above.
----- On Mar 23, 2017, at 2:47 AM, Manish gupta gahoimnshg@gmail.com wrote:
...
Hello Martin,
Thanks for response. Is there any recommendation on what should be mode of
running monetdb server? Given I will be dealing with 100 times bigger than this
data ( but similar in nature ). Since the memory is sufficient ( 128GB physical
) to have relevant data in memory after first call. Does the decision depend on
the properties of data? Should I first ANALYZE the data and check certain
patterns to get the best usage of 16 cores that I will be having?
Regards,
Manish
On Thu, Mar 23, 2017 at 6:35 AM, Martin Kersten < martin.kersten@cwi.nl > wrote:
No it doesn;t. See the traces, the plans are different.
Blocking operators limit multithreaded work and at some point
the cores fight over resources.
regards, Martin
On 23/03/2017 01:54, Manish gupta wrote:
Dear All,
Sorry for spamming. Can anyone suggest how single thread is faster than 4
threads through trace output I attached in previous append. I want to confirm
that in general multi thread performance should be faster for monetdb, and it
should scale with
increasing cores.
Regards,
Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta < gahoimnshg@gmail.com <mailto:
gahoimnshg@gmail.com >> wrote:
Hello Jennie,
Thanks for responding. Here are the answers for the questions you had asked

you can prefix your queries with the TRACE keyword to see where the time went

I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set
"gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run (
--set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but
its slower than 1
thread run.

are the times from running the queries once, e.g. for the first time? have

you tried to run the queries repeatedly to see if there are any changes in the
execution time?
No, I am taking these snapshots after 3-4 times prior runs, so that the data is
in memory, and no IO,

are there any other applications running on your windows machine?

No, apart from Windows, there is no other application running on the machine.

does your windows run in a VM or on physical hardware?

Its running on Physical hardware on my windows machine.
Please let me know if more info is required. In the meantime, I am trying to use
tomograph utility.
Regards,
Mohit
On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang < Y.Zhang@cwi.nl <mailto:
Y.Zhang@cwi.nl >> wrote:
Hai Manish,
For the use case you described here, using merge table won’t help the
performance, because, as Martin mentioned, MonetDB would already use all cores
for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following up
questions:

you can prefix your queries with the TRACE keyword to see where the time went
are the times from running the queries once, e.g. for the first time? have

you tried to run the queries repeatedly to see if there are any changes in the
execution time?
3. are there any other applications running on your windows machine?
4. does your windows run in a VM or on physical hardware?
Regards,
Jennie
...
On 19 Mar 2017, at 04:25, Manish gupta < gahoimnshg@gmail.com <mailto:
gahoimnshg@gmail.com >> wrote:
Dear all,
I am checking the performance of merge table vs single table on single machine
with 4 cores. The fact table contains ~10million records. I am using # MonetDB
5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram
).
I have created a merge table which contains fact table and 3 blank copies of
fact table. I am seeing surprisingly slow performance of merge table queries
compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by
f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+
| L2 | L5 | f_1_1_7 | f_1_1_5 |
+===========+======+=========+=========+
| 125885351 | 14 | SALE | 8084869 |
| 125629268 | 15 | SALE | 6852518 |
| 121733416 | 16 | SALE | 5440605 |
| 110328868 | 13 | SALE | 4842411 |
| 100848236 | 11 | SALE | 8068509 |
| 100719291 | 16 | SALE | 5822736 |
| 93127875 | 11 | SALE | 6529612 |
| 91034094 | 13 | SALE | 3401293 |
| 90766181 | 10 | SALE | 8084900 |
| 88848574 | 10 | SALE | 2660811 |
+-----------+------+---------+---------+
10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by
f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+
| L2 | L5 | f_1_1_7 | f_1_1_5 |
+===========+======+=========+=========+
| 125885351 | 14 | SALE | 8084869 |
| 125629268 | 15 | SALE | 6852518 |
| 121733416 | 16 | SALE | 5440605 |
| 110328868 | 13 | SALE | 4842411 |
| 100848236 | 11 | SALE | 8068509 |
| 100719291 | 16 | SALE | 5822736 |
| 93127875 | 11 | SALE | 6529612 |
| 91034094 | 13 | SALE | 3401293 |
| 90766181 | 10 | SALE | 8084900 |
| 88848574 | 10 | SALE | 2660811 |
+-----------+------+---------+---------+
10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3
blank copies of base table ), merge table query is taking more than 2.5 times
the base table runtime. Why is it so?
Thanks,
Mohit

users-list mailing list
users-list@monetdb.org <mailto: users-list@monetdb.org >
https://www.monetdb.org/mailman/listinfo/users-list <
https://www.monetdb.org/mailman/listinfo/users-list >

users-list mailing list
users-list@monetdb.org <mailto: users-list@monetdb.org >
https://www.monetdb.org/mailman/listinfo/users-list <
https://www.monetdb.org/mailman/listinfo/users-list >

users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list

users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list

users-list mailing list
users-list@monetdb.org
https://www.monetdb.org/mailman/listinfo/users-list
-- 
| Stefan.Manegold@CWI.nl | DB Architectures   (DA) |
| www.CWI.nl/~manegold/  | Science Park 123 (L321) |
| +31 (0)20 592-4212     | 1098 XG Amsterdam  (NL) |

Re: Merge table performance on single machine with 4 cores

Stefan Manegold

Base table query

Merge Table Query