Merge table performance on single machine with 4 cores
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query ================= sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query ================= sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
Hi
Simply because MonetDB will already use all cores whenever it can. This means less inter-process communication and less contention for resources. Furthermore, there might be decisions being taken due to lack of knowledge about the partitions, which should be set using the ANALYZE command.
regards, Martin
On 19/03/2017 04:25, Manish gupta wrote:
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Thanks Martin for explaining. Are you suggesting this expected behavior, since I am running on single machine I dont mind moving away from merge table architecture to base table? I have one more experiment, where I ran monetdb server with single thread ( --set "gdk_nr_threads 1" ) and with ( --set "gdk_nr_threads 4" ). Again contrary to my expectation same query on base table is faster with 1 thread ( around 25% faster ). We are using MonetDB in multi user scenario, so I dont mind using --set "gdk_nr_threads 1" setting as dedicated single core for one user query. But, I am not sure how multi thread query is slower than single thread execution. I also saw that for multi thread mode, all cores were busy only for 6-7 seconds, and rest of the time CPU utilization was ~25%. So looks like threads were running very small part in parallel, and merging in main thread took all the time.
I have trace outputs for both modes, if they can be helpful.
Regards, Mohit.
On Sun, Mar 19, 2017 at 2:42 PM, Martin Kersten martin@monetdb.org wrote:
Hi
Simply because MonetDB will already use all cores whenever it can. This means less inter-process communication and less contention for resources. Furthermore, there might be decisions being taken due to lack of knowledge about the partitions, which should be set using the ANALYZE command.
regards, Martin
On 19/03/2017 04:25, Manish gupta wrote:
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Dear All, Any suggestion/comments on my observation that multi thread monetdb ( --set "gdk_nr_threads 4" ) is slower than single thread ( --set "gdk_nr_threads 1" ) when executing big dataset?
Thanks, Mohit
On Sun, Mar 19, 2017 at 5:19 PM, Manish gupta gahoimnshg@gmail.com wrote:
Thanks Martin for explaining. Are you suggesting this expected behavior, since I am running on single machine I dont mind moving away from merge table architecture to base table? I have one more experiment, where I ran monetdb server with single thread ( --set "gdk_nr_threads 1" ) and with ( --set "gdk_nr_threads 4" ). Again contrary to my expectation same query on base table is faster with 1 thread ( around 25% faster ). We are using MonetDB in multi user scenario, so I dont mind using --set "gdk_nr_threads 1" setting as dedicated single core for one user query. But, I am not sure how multi thread query is slower than single thread execution. I also saw that for multi thread mode, all cores were busy only for 6-7 seconds, and rest of the time CPU utilization was ~25%. So looks like threads were running very small part in parallel, and merging in main thread took all the time.
I have trace outputs for both modes, if they can be helpful.
Regards, Mohit.
On Sun, Mar 19, 2017 at 2:42 PM, Martin Kersten martin@monetdb.org wrote:
Hi
Simply because MonetDB will already use all cores whenever it can. This means less inter-process communication and less contention for resources. Furthermore, there might be decisions being taken due to lack of knowledge about the partitions, which should be set using the ANALYZE command.
regards, Martin
On 19/03/2017 04:25, Manish gupta wrote:
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hai,
On 19 Mar 2017, at 12:49, Manish gupta gahoimnshg@gmail.com wrote:
Thanks Martin for explaining. Are you suggesting this expected behavior, since I am running on single machine I dont mind moving away from merge table architecture to base table?
the main (only?) advantage of using merge table on a single machine to be able to simultaneously load data into multiple partitions of a merge table. this is not for increased performance (because it might actually hurt performance), but merely to allow concurrent load.
I have one more experiment, where I ran monetdb server with single thread ( --set "gdk_nr_threads 1" ) and with ( --set "gdk_nr_threads 4" ). Again contrary to my expectation same query on base table is faster with 1 thread ( around 25% faster ). We are using MonetDB in multi user scenario, so I dont mind using --set "gdk_nr_threads 1" setting as dedicated single core for one user query.
what do you mean exactly with "using MonetDB in multi user scenario”? From what I have seen (on Linux), 4 cores is fairly minimal for MonetDB. 1 core was clearly too few, even for a single MonetDB user. If you’re going to using 1 core to let a single MonetDB instance serving multiple users, I’m not so sure about the performance. Although it’s contrary to your experiences for now.
But, I am not sure how multi thread query is slower than single thread execution. I also saw that for multi thread mode, all cores were busy only for 6-7 seconds, and rest of the time CPU utilization was ~25%. So looks like threads were running very small part in parallel, and merging in main thread took all the time.
I have trace outputs for both modes, if they can be helpful.
The trace outputs are definitely helpful. They help understand where the time went. That’s usually the first question in such cases. You can also general tomograph [1] graphs to see why MonetDB doesn’t use 100%.
Regards, Jennie
[1] http://www.monetdb.org/Documentation/Manuals/MonetDB/Profiler/tomograph
Regards, Mohit.
On Sun, Mar 19, 2017 at 2:42 PM, Martin Kersten martin@monetdb.org wrote: Hi
Simply because MonetDB will already use all cores whenever it can. This means less inter-process communication and less contention for resources. Furthermore, there might be decisions being taken due to lack of knowledge about the partitions, which should be set using the ANALYZE command.
regards, Martin
On 19/03/2017 04:25, Manish gupta wrote: Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hai Manish,
For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following up questions: 1. you can prefix your queries with the TRACE keyword to see where the time went 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie
On 19 Mar 2017, at 04:25, Manish gupta gahoimnshg@gmail.com wrote:
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hello Jennie, Thanks for responding. Here are the answers for the questions you had asked
1. you can prefix your queries with the TRACE keyword to see where the time went I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run ( --set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but its slower than 1 thread run.
2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the data is in memory, and no IO,
3. are there any other applications running on your windows machine? No, apart from Windows, there is no other application running on the machine.
4. does your windows run in a VM or on physical hardware? Its running on Physical hardware on my windows machine.
Please let me know if more info is required. In the meantime, I am trying to use tomograph utility.
Regards, Mohit
On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang Y.Zhang@cwi.nl wrote:
Hai Manish,
For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following up questions:
- you can prefix your queries with the TRACE keyword to see where the
time went 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie
On 19 Mar 2017, at 04:25, Manish gupta gahoimnshg@gmail.com wrote:
Dear all, I am checking the performance of merge table vs single table on
single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ).
I have created a merge table which contains fact table and 3 blank
copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1
group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group
by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table
and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Dear All, Sorry for spamming. Can anyone suggest how single thread is faster than 4 threads through trace output I attached in previous append. I want to confirm that in general multi thread performance should be faster for monetdb, and it should scale with increasing cores.
Regards, Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta gahoimnshg@gmail.com wrote:
Hello Jennie, Thanks for responding. Here are the answers for the questions you had asked
- you can prefix your queries with the TRACE keyword to see where the
time went I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run ( --set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but its slower than 1 thread run.
- are the times from running the queries once, e.g. for the first time?
have you tried to run the queries repeatedly to see if there are any changes in the execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the data is in memory, and no IO,
- are there any other applications running on your windows machine?
No, apart from Windows, there is no other application running on the machine.
- does your windows run in a VM or on physical hardware?
Its running on Physical hardware on my windows machine.
Please let me know if more info is required. In the meantime, I am trying to use tomograph utility.
Regards, Mohit
On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang Y.Zhang@cwi.nl wrote:
Hai Manish,
For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following up questions:
- you can prefix your queries with the TRACE keyword to see where the
time went 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie
On 19 Mar 2017, at 04:25, Manish gupta gahoimnshg@gmail.com wrote:
Dear all, I am checking the performance of merge table vs single table on
single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ).
I have created a merge table which contains fact table and 3 blank
copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1
group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group
by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10;
+-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table
and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
No it doesn;t. See the traces, the plans are different. Blocking operators limit multithreaded work and at some point the cores fight over resources.
regards, Martin On 23/03/2017 01:54, Manish gupta wrote:
Dear All, Sorry for spamming. Can anyone suggest how single thread is faster than 4 threads through trace output I attached in previous append. I want to confirm that in general multi thread performance should be faster for monetdb, and it should scale with increasing cores.
Regards, Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta <gahoimnshg@gmail.com mailto:gahoimnshg@gmail.com> wrote:
Hello Jennie, Thanks for responding. Here are the answers for the questions you had asked 1. you can prefix your queries with the TRACE keyword to see where the time went I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run ( --set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but its slower than 1 thread run. 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the data is in memory, and no IO, 3. are there any other applications running on your windows machine? No, apart from Windows, there is no other application running on the machine. 4. does your windows run in a VM or on physical hardware? Its running on Physical hardware on my windows machine. Please let me know if more info is required. In the meantime, I am trying to use tomograph utility. Regards, Mohit On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang <Y.Zhang@cwi.nl <mailto:Y.Zhang@cwi.nl>> wrote: Hai Manish, For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table. However, the performance degradation is a bit surprising. Some following up questions: 1. you can prefix your queries with the TRACE keyword to see where the time went 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware? Regards, Jennie > On 19 Mar 2017, at 04:25, Manish gupta <gahoimnshg@gmail.com <mailto:gahoimnshg@gmail.com>> wrote: > > Dear all, > I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). > I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries. > > Base table query > ================= > sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; > +-----------+------+---------+---------+ > | L2 | L5 | f_1_1_7 | f_1_1_5 | > +===========+======+=========+=========+ > | 125885351 | 14 | SALE | 8084869 | > | 125629268 | 15 | SALE | 6852518 | > | 121733416 | 16 | SALE | 5440605 | > | 110328868 | 13 | SALE | 4842411 | > | 100848236 | 11 | SALE | 8068509 | > | 100719291 | 16 | SALE | 5822736 | > | 93127875 | 11 | SALE | 6529612 | > | 91034094 | 13 | SALE | 3401293 | > | 90766181 | 10 | SALE | 8084900 | > | 88848574 | 10 | SALE | 2660811 | > +-----------+------+---------+---------+ > 10 tuples (44.2s) > > Merge Table Query > ================= > sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; > +-----------+------+---------+---------+ > | L2 | L5 | f_1_1_7 | f_1_1_5 | > +===========+======+=========+=========+ > | 125885351 | 14 | SALE | 8084869 | > | 125629268 | 15 | SALE | 6852518 | > | 121733416 | 16 | SALE | 5440605 | > | 110328868 | 13 | SALE | 4842411 | > | 100848236 | 11 | SALE | 8068509 | > | 100719291 | 16 | SALE | 5822736 | > | 93127875 | 11 | SALE | 6529612 | > | 91034094 | 13 | SALE | 3401293 | > | 90766181 | 10 | SALE | 8084900 | > | 88848574 | 10 | SALE | 2660811 | > +-----------+------+---------+---------+ > 10 tuples (1m 51s) > > > For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so? > > Thanks, > Mohit > > > _______________________________________________ > users-list mailing list > users-list@monetdb.org <mailto:users-list@monetdb.org> > https://www.monetdb.org/mailman/listinfo/users-list <https://www.monetdb.org/mailman/listinfo/users-list> _______________________________________________ users-list mailing list users-list@monetdb.org <mailto:users-list@monetdb.org> https://www.monetdb.org/mailman/listinfo/users-list <https://www.monetdb.org/mailman/listinfo/users-list>
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hello Martin, Thanks for response. Is there any recommendation on what should be mode of running monetdb server? Given I will be dealing with 100 times bigger than this data ( but similar in nature ). Since the memory is sufficient ( 128GB physical ) to have relevant data in memory after first call. Does the decision depend on the properties of data? Should I first ANALYZE the data and check certain patterns to get the best usage of 16 cores that I will be having?
Regards, Manish
On Thu, Mar 23, 2017 at 6:35 AM, Martin Kersten martin.kersten@cwi.nl wrote:
No it doesn;t. See the traces, the plans are different. Blocking operators limit multithreaded work and at some point the cores fight over resources.
regards, Martin On 23/03/2017 01:54, Manish gupta wrote:
Dear All, Sorry for spamming. Can anyone suggest how single thread is faster than 4 threads through trace output I attached in previous append. I want to confirm that in general multi thread performance should be faster for monetdb, and it should scale with increasing cores.
Regards, Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta <gahoimnshg@gmail.com mailto:gahoimnshg@gmail.com> wrote:
Hello Jennie, Thanks for responding. Here are the answers for the questions you
had asked
1. you can prefix your queries with the TRACE keyword to see where
the time went I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run ( --set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but its slower than 1 thread run.
2. are the times from running the queries once, e.g. for the first
time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the data is in memory, and no IO,
3. are there any other applications running on your windows machine? No, apart from Windows, there is no other application running on the
machine.
4. does your windows run in a VM or on physical hardware? Its running on Physical hardware on my windows machine. Please let me know if more info is required. In the meantime, I am
trying to use tomograph utility.
Regards, Mohit On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang <Y.Zhang@cwi.nl <mailto:
Y.Zhang@cwi.nl>> wrote:
Hai Manish, For the use case you described here, using merge table won’t help
the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some
following up questions: 1. you can prefix your queries with the TRACE keyword to see where the time went 2. are the times from running the queries once, e.g. for the first time? have you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie > On 19 Mar 2017, at 04:25, Manish gupta <gahoimnshg@gmail.com
mailto:gahoimnshg@gmail.com> wrote: > > Dear all, > I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). > I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries. > > Base table query > ================= > sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; > +-----------+------+---------+---------+ > | L2 | L5 | f_1_1_7 | f_1_1_5 | > +===========+======+=========+=========+ > | 125885351 | 14 | SALE | 8084869 | > | 125629268 | 15 | SALE | 6852518 | > | 121733416 | 16 | SALE | 5440605 | > | 110328868 | 13 | SALE | 4842411 | > | 100848236 | 11 | SALE | 8068509 | > | 100719291 | 16 | SALE | 5822736 | > | 93127875 | 11 | SALE | 6529612 | > | 91034094 | 13 | SALE | 3401293 | > | 90766181 | 10 | SALE | 8084900 | > | 88848574 | 10 | SALE | 2660811 | > +-----------+------+---------+---------+ > 10 tuples (44.2s) > > Merge Table Query > ================= > sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; > +-----------+------+---------+---------+ > | L2 | L5 | f_1_1_7 | f_1_1_5 | > +===========+======+=========+=========+ > | 125885351 | 14 | SALE | 8084869 | > | 125629268 | 15 | SALE | 6852518 | > | 121733416 | 16 | SALE | 5440605 | > | 110328868 | 13 | SALE | 4842411 | > | 100848236 | 11 | SALE | 8068509 | > | 100719291 | 16 | SALE | 5822736 | > | 93127875 | 11 | SALE | 6529612 | > | 91034094 | 13 | SALE | 3401293 | > | 90766181 | 10 | SALE | 8084900 | > | 88848574 | 10 | SALE | 2660811 | > +-----------+------+---------+---------+ > 10 tuples (1m 51s) > > > For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so? > > Thanks, > Mohit > > > _______________________________________________ > users-list mailing list > users-list@monetdb.org mailto:users-list@monetdb.org > https://www.monetdb.org/mailman/listinfo/users-list < https://www.monetdb.org/mailman/listinfo/users-list%3E
_______________________________________________ users-list mailing list users-list@monetdb.org <mailto:users-list@monetdb.org> https://www.monetdb.org/mailman/listinfo/users-list <
https://www.monetdb.org/mailman/listinfo/users-list%3E
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hi Manish
the real reason that multi-threaded execution is slower than single-threaded execution *in your very case* is the following:
You have ~10m rows and the result of your grouped-aggregation query has ~9.5M rows (groups) (i.e., most of the groups have only 1 or at most few members; the count of 10-16 in the top-10 you show suggests a (very) long tail of singleton groups ...).
Multi-threading in MonetDB essentially slices the largest base-table in as many slices as there are cores / threads to be used, and tries to replicate as much as possible of the query plan over these slices, before combining the partial results into a final single query result. This works fine and effectively in most cases. However, group-by queries may be a problem. With group-by queries, the grouping and aggregation is first done individually for each slice (in parallel). Then, to get the final result, the individual results are concatenated and grouped and aggregated in a single thread, again. This works fine in case there are (only) few groups. Then, the final ("extra") grouping/aggregation has to handle only few ("pre-aggregated") groups from the individual slices. However, with many groups, this extra grouping/aggregation can become expensive, in your particular case with almost as many groups as original input tuples/rows, this extra single-threaded grouping/aggregation is as expensive as the single group by/aggregation in the single-threaded case, essentially making the multi-threaded case almost twice as expensive as the single-threaded case. For the time being (?) MonetDB has no knowledge how many groups a group-by will produce, and thus cannot automatically decide when to run a group-by query or at least the/which group-by operation(s) in a query single-threaded or multi-threaded --- I'm afraid even ANALYZED does not produce sufficient statistics to derive from that the number of groups (possibly "deep" in the query plan).
In general, though, multi-threaded execution should indeed be faster than single-threaded; however, limited potential for parallelization (e.g., due to operators that cannot be parallelized (by MonetDB (yet?)) and thus implicitly serialize the remainder of the query execution plan) and/or resource congestion can limit the effect of multi- threaded execution, or even eliminate it, resulting in multi-threaded being not better than single-threaded. Multi-threaded becoming worse than single-threaded should be the rare exception, mainly in cases sketched above.
Hope this helps.
Best, Stefan
ps: to understand in detail why you mergetable on a single machine is slower than a single base table, you also might use TRACE and analyze (or share) the output ... I suspect the same large group by result problem as sketched above.
----- On Mar 23, 2017, at 2:47 AM, Manish gupta gahoimnshg@gmail.com wrote:
Hello Martin, Thanks for response. Is there any recommendation on what should be mode of running monetdb server? Given I will be dealing with 100 times bigger than this data ( but similar in nature ). Since the memory is sufficient ( 128GB physical ) to have relevant data in memory after first call. Does the decision depend on the properties of data? Should I first ANALYZE the data and check certain patterns to get the best usage of 16 cores that I will be having?
Regards, Manish
On Thu, Mar 23, 2017 at 6:35 AM, Martin Kersten < martin.kersten@cwi.nl > wrote:
No it doesn;t. See the traces, the plans are different. Blocking operators limit multithreaded work and at some point the cores fight over resources.
regards, Martin On 23/03/2017 01:54, Manish gupta wrote:
Dear All, Sorry for spamming. Can anyone suggest how single thread is faster than 4 threads through trace output I attached in previous append. I want to confirm that in general multi thread performance should be faster for monetdb, and it should scale with increasing cores.
Regards, Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta < gahoimnshg@gmail.com <mailto: gahoimnshg@gmail.com >> wrote:
Hello Jennie, Thanks for responding. Here are the answers for the questions you had asked
- you can prefix your queries with the TRACE keyword to see where the time went
I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server run ( --set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X faster, but its slower than 1 thread run.
- are the times from running the queries once, e.g. for the first time? have
you tried to run the queries repeatedly to see if there are any changes in the execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the data is in memory, and no IO,
- are there any other applications running on your windows machine?
No, apart from Windows, there is no other application running on the machine.
- does your windows run in a VM or on physical hardware?
Its running on Physical hardware on my windows machine.
Please let me know if more info is required. In the meantime, I am trying to use tomograph utility.
Regards, Mohit
On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang < Y.Zhang@cwi.nl <mailto: Y.Zhang@cwi.nl >> wrote:
Hai Manish,
For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all cores for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following up questions:
- you can prefix your queries with the TRACE keyword to see where the time went
- are the times from running the queries once, e.g. for the first time? have
you tried to run the queries repeatedly to see if there are any changes in the execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie
On 19 Mar 2017, at 04:25, Manish gupta < gahoimnshg@gmail.com <mailto: gahoimnshg@gmail.com >> wrote:
Dear all, I am checking the performance of merge table vs single table on single machine with 4 cores. The fact table contains ~10million records. I am using # MonetDB 5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB physical ram ). I have created a merge table which contains fact table and 3 blank copies of fact table. I am seeing surprisingly slow performance of merge table queries compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group by f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table and 3 blank copies of base table ), merge table query is taking more than 2.5 times the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org <mailto: users-list@monetdb.org > https://www.monetdb.org/mailman/listinfo/users-list < https://www.monetdb.org/mailman/listinfo/users-list >
users-list mailing list users-list@monetdb.org <mailto: users-list@monetdb.org > https://www.monetdb.org/mailman/listinfo/users-list < https://www.monetdb.org/mailman/listinfo/users-list >
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Hi Stefan, Thanks for response. This is exactly what I had thought that merging of partitioned data ( because of nature of query and data was issue). Thanks for clarifying that multi core should in general be faster. My queries will mostly have groupby clauses, but rarely with such skewed groups.
Regards, Manish
On Thu, Mar 23, 2017 at 12:12 PM, Stefan Manegold Stefan.Manegold@cwi.nl wrote:
Hi Manish
the real reason that multi-threaded execution is slower than single-threaded execution *in your very case* is the following:
You have ~10m rows and the result of your grouped-aggregation query has ~9.5M rows (groups) (i.e., most of the groups have only 1 or at most few members; the count of 10-16 in the top-10 you show suggests a (very) long tail of singleton groups ...).
Multi-threading in MonetDB essentially slices the largest base-table in as many slices as there are cores / threads to be used, and tries to replicate as much as possible of the query plan over these slices, before combining the partial results into a final single query result. This works fine and effectively in most cases. However, group-by queries may be a problem. With group-by queries, the grouping and aggregation is first done individually for each slice (in parallel). Then, to get the final result, the individual results are concatenated and grouped and aggregated in a single thread, again. This works fine in case there are (only) few groups. Then, the final ("extra") grouping/aggregation has to handle only few ("pre-aggregated") groups from the individual slices. However, with many groups, this extra grouping/aggregation can become expensive, in your particular case with almost as many groups as original input tuples/rows, this extra single-threaded grouping/aggregation is as expensive as the single group by/aggregation in the single-threaded case, essentially making the multi-threaded case almost twice as expensive as the single-threaded case. For the time being (?) MonetDB has no knowledge how many groups a group-by will produce, and thus cannot automatically decide when to run a group-by query or at least the/which group-by operation(s) in a query single-threaded or multi-threaded --- I'm afraid even ANALYZED does not produce sufficient statistics to derive from that the number of groups (possibly "deep" in the query plan).
In general, though, multi-threaded execution should indeed be faster than single-threaded; however, limited potential for parallelization (e.g., due to operators that cannot be parallelized (by MonetDB (yet?)) and thus implicitly serialize the remainder of the query execution plan) and/or resource congestion can limit the effect of multi- threaded execution, or even eliminate it, resulting in multi-threaded being not better than single-threaded. Multi-threaded becoming worse than single-threaded should be the rare exception, mainly in cases sketched above.
Hope this helps.
Best, Stefan
ps: to understand in detail why you mergetable on a single machine is slower than a single base table, you also might use TRACE and analyze (or share) the output ... I suspect the same large group by result problem as sketched above.
----- On Mar 23, 2017, at 2:47 AM, Manish gupta gahoimnshg@gmail.com wrote:
Hello Martin, Thanks for response. Is there any recommendation on what should be mode
of
running monetdb server? Given I will be dealing with 100 times bigger
than this
data ( but similar in nature ). Since the memory is sufficient ( 128GB
physical
) to have relevant data in memory after first call. Does the decision
depend on
the properties of data? Should I first ANALYZE the data and check certain patterns to get the best usage of 16 cores that I will be having?
Regards, Manish
On Thu, Mar 23, 2017 at 6:35 AM, Martin Kersten < martin.kersten@cwi.nl wrote:
No it doesn;t. See the traces, the plans are different. Blocking operators limit multithreaded work and at some point the cores fight over resources.
regards, Martin On 23/03/2017 01:54, Manish gupta wrote:
Dear All, Sorry for spamming. Can anyone suggest how single thread is faster than 4 threads through trace output I attached in previous append. I want to
confirm
that in general multi thread performance should be faster for monetdb,
and it
should scale with increasing cores.
Regards, Mohit.
On Tue, Mar 21, 2017 at 9:23 PM, Manish gupta < gahoimnshg@gmail.com
<mailto:
gahoimnshg@gmail.com >> wrote:
Hello Jennie, Thanks for responding. Here are the answers for the questions you had
asked
- you can prefix your queries with the TRACE keyword to see where the
time went
I am attaching Trace output. 4.trace is TRACE from 4 threads ( --set "gdk_nr_threads=4" ), 1.trace files is TRACE from 1 threads monet server
run (
--set "gdk_nr_threads=1"). I was expecting 4 threads run to be ~2X
faster, but
its slower than 1 thread run.
- are the times from running the queries once, e.g. for the first time?
have
you tried to run the queries repeatedly to see if there are any changes
in the
execution time? No, I am taking these snapshots after 3-4 times prior runs, so that the
data is
in memory, and no IO,
- are there any other applications running on your windows machine?
No, apart from Windows, there is no other application running on the
machine.
- does your windows run in a VM or on physical hardware?
Its running on Physical hardware on my windows machine.
Please let me know if more info is required. In the meantime, I am
trying to use
tomograph utility.
Regards, Mohit
On Tue, Mar 21, 2017 at 1:25 PM, Ying Zhang < Y.Zhang@cwi.nl <mailto: Y.Zhang@cwi.nl >> wrote:
Hai Manish,
For the use case you described here, using merge table won’t help the performance, because, as Martin mentioned, MonetDB would already use all
cores
for the single base table in your merge table.
However, the performance degradation is a bit surprising. Some following
up
questions:
- you can prefix your queries with the TRACE keyword to see where the
time went
- are the times from running the queries once, e.g. for the first time?
have
you tried to run the queries repeatedly to see if there are any changes
in the
execution time? 3. are there any other applications running on your windows machine? 4. does your windows run in a VM or on physical hardware?
Regards, Jennie
On 19 Mar 2017, at 04:25, Manish gupta < gahoimnshg@gmail.com <mailto: gahoimnshg@gmail.com >> wrote:
Dear all, I am checking the performance of merge table vs single table on single
machine
with 4 cores. The fact table contains ~10million records. I am using #
MonetDB
5 server v11.25.9 "Dec2016-SP2" on windows machine ( core i7 16GB
physical ram
). I have created a merge table which contains fact table and 3 blank
copies of
fact table. I am seeing surprisingly slow performance of merge table
queries
compared to base table queries.
Base table query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from t3760_279_1_1
group by
f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (44.2s)
Merge Table Query
sql>select sum(f_1_1_1), count(*), f_1_1_7,f_1_1_5 from mt279_1_1 group
by
f_1_1_7,f_1_1_5 order by sum(f_1_1_1) desc limit 10; +-----------+------+---------+---------+ | L2 | L5 | f_1_1_7 | f_1_1_5 | +===========+======+=========+=========+ | 125885351 | 14 | SALE | 8084869 | | 125629268 | 15 | SALE | 6852518 | | 121733416 | 16 | SALE | 5440605 | | 110328868 | 13 | SALE | 4842411 | | 100848236 | 11 | SALE | 8068509 | | 100719291 | 16 | SALE | 5822736 | | 93127875 | 11 | SALE | 6529612 | | 91034094 | 13 | SALE | 3401293 | | 90766181 | 10 | SALE | 8084900 | | 88848574 | 10 | SALE | 2660811 | +-----------+------+---------+---------+ 10 tuples (1m 51s)
For the same results ( because merge table is containing the base table
and 3
blank copies of base table ), merge table query is taking more than 2.5
times
the base table runtime. Why is it so?
Thanks, Mohit
users-list mailing list users-list@monetdb.org <mailto: users-list@monetdb.org > https://www.monetdb.org/mailman/listinfo/users-list < https://www.monetdb.org/mailman/listinfo/users-list >
users-list mailing list users-list@monetdb.org <mailto: users-list@monetdb.org > https://www.monetdb.org/mailman/listinfo/users-list < https://www.monetdb.org/mailman/listinfo/users-list >
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
-- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) | _______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
participants (5)
-
Manish gupta
-
Martin Kersten
-
Martin Kersten
-
Stefan Manegold
-
Ying Zhang