Sinterklaas score board

From MonetDB
Jump to: navigation, search


An overview of the Sinterklaas contest there This score board is public and anyone can upload his results. The performance evaluation sequence should be the following:

  • generate 3 query sets, query1 query2 and query3
  • flush memory
  • time
  • flush memory
  • time < query1
  • flush memory
  • time < query2
  • flush memory
  • time < query3

The reported time is the "wallclock" time (real)
The queries may be generated with the query generator described here.

Flushing memory can be achieved with the tool Meat, here.

Time format is HH:MM:SS.SS... or

A complete result set for 300 queries generated with the random seed set to 1000 may be found here:



Date Total time Loading Query 1 Query 2 Query 3 Brief description (less than 50 words if any) Team
2011-11-17 19:00 03:04:40.780 00:10:37.648 00:50:58.572 00:54:01.328 01:09:03.230 Naive approach described in the instruction page
"Old" desktop
2011-12-12 12:00 02:04:26.96 02:04:11.72 00:00:03.88 00:00:02.89 00:00:08.47 Out-of-box PostGIS with RTree
103m38.942s mins spent in building the index (included in measurement)
New desktop, SSD
2011-12-12 12:00 00:46:00.25 00:43:19.76 00:00:57.61 00:00:24.36 00:01:18.52 Out-of-box PostgreSQL with BTree on (lon,lat)
29m44.703s mins spent in building the index (included in measurement)
New desktop, SSD
2011-11-18 01:42 00:37:35.040 00:04:21.554 00:10:24.044 00:11:23.522 00:11:25.920 Naive approach MonetDB (default branch, debugging compile)
"new" desktop (HDD)
2011-11-19 11:50 00:29:28.466 00:03:53.876 00:07:55.994 00:08:33.959 00:09:04.637 Naive approach MonetDB (default branch, optimized compile)
"new" desktop (HDD)
2011-11-29 14:50 00:26:14.00 00:06:35.00 00:06:53.00 00:06:53.00 00:06:53.00 Plain VectorWise. Everything in naive SQL. Client-side COPY INTO using 'tm' tool (not optimized).
vectorwise.conf: block_size=64KB group_size=32 minmax_maxsize=8192
2011-11-29 14:50 00:20:23.40 00:19:57.00 00:00:08.80 00:00:08.80 00:00:08.80 Sorted VectorWise. Naive SQL, just CREATE INDEX(lon,lat) DDL
Note: downloading results of one query batch to client is 1.5s (included)
2011-12-09 13:15 00:18:48.187 00:04:35.284 00:04:58.934 00:04:26.856 00:04:47.113 Straightforward implementation in Hadoop.
Uncompressed data is used as input, because splitting gzip is not supported.
#Map tasks: 8, #Reduce tasks:1, HDFS replication factor:1 (HDD).
2011-11-29 14:50 00:12:44.50 00:12:35.00 00:00:01.90 00:00:01.90 00:00:01.90 Vectorwise with BDC. Naive SQL queries, but X100 DDL for BDC (8bits lon,8bits lat).
Note: downloading results of one query batch to client is 1.5s (included)
2011-12-09 20:20 00:05:07.819 00:00:50.209 00:01:25.000 00:01:27.447 00:01:25.163 (Partial) Parallel Scan. To load, copy the data.gz the local machine.
To answer a batch of 300 queries, first load all 300 queries to memory,
then read the data.gz as a stream and partition every 700MB read.
A thread is spawn for each partition to scan and materialize the 300 queries.
Source Code: /net/barcelona/export/scratch1/gast580/pscan.cpp
2011-11-18 14:50 00:04:14.833 00:02:52.119 00:00:27.086 00:00:26.890 00:00:28.738 Stochastic k-d tree Cracking in C compiled with -O3 run on i7-2600 (SDD).
The C program is a stand alone program (not a client/server program).
Each 300 queries batch is answered in one session.
The indexes are not persisted (each query batch start indexing from scratch).
Source Code: /net/barcelona/export/scratch1/gast580/row_store.c
2011-12-10 17:23 00:03:20.030 00:00:59.649 00:00:45.636 00:00:47.536 00:00:47.209 Parallel Scan. Loading: the data.gz is unzipped, partitioned every 700MB read.
A thread is spawned for each partition to parse, convert to binary format
and save it to a new binary file (Total is 18 partitions/file created).
To answer a batch of 300 queries, first load all 300 queries to memory,
then read all the 18 partitions in parallel and materialize the 300 queries.
This is like MapReduce with 18 mappers and 1 reducer with binary input format.
Source Code: /net/barcelona/export/scratch1/gast580/pscan2.cpp
2011-11-19 12:43 00:02:14.199 00:02:01.458 00:00:03.801 00:00:03.120 00:00:05.820 Holistic Stochastic k-d tree Cracking in C with -O3 and using the SSD.
The random-seed for the queries are 1, 2, and 3.
The query outputs are compared to "Monet's Naive" (see the commentary below).
The indexes are persisted (using mmap) on the first load and on each query batch.
Source Code: /net/barcelona/export/scratch1/gast580/hscrack.c
2011-11-23 12:43 00:00:53.994 00:00:45.671 00:00:02.610 00:00:02.253 00:00:03.460 (System tuned) Holistic Stochastic k-d tree Cracking in C with -O3 on SSD.
The dataset is roughly partitioned every 700MB read during loading.
A thread is spawned for each partition to parse, index, and save it to a new file.
Every 16 consecutive queries in the batch are answered in parallel.
SSD sometimes stalls for ~10 seconds (see the commentary).
Source Code: /net/barcelona/export/scratch1/gast580/system_tuned.c


  • [2011-11-17] The "old" desktop is based on Ix with 8GB RAM, 2 x 0.5TB HDD RAID0
  • [2011-11-18] The "new" desktop is based on I7 2600 with 16GB RAM, a 2TB HDD, and 80GB SSD. If you first prepare a single file with all queries and sent it over for execution (mclient -d sint queryfile), it brings query1 down to about 4 minutes. This is likely caused by the query cache.
  • [2011-11-19] The naive solutions involves establishing a separate client session with each query. This costs about 2 sec for a complete batch of 300 queries.
  • [2011-11-19] If you use the LOCKED option, then the load time reduces to about 2 min.
  • [2011-11-19] (Holistic) Stochastic Cracking is a k-d tree approach by alternating the cracking on each dimension. Furthermore, the best results are from the SSD, because it provides better random access. The HDD would lead to 20+ seconds. All the 3 query outputs of the (holistic) stochastic cracking are compared against MonetDB naive's output and they are identical.
  • [2011-11-23] During the loading phase, the runtime for writing the dataset to local machine can vary about ~10 seconds depending on SSD performance. If the SSD is gracious, it will give you a clean new block for writing thus your writing speed will be very fast. At some other time, the SSD needs to recycle/clean a dirty block before giving it to you for writing thus your writing speed will be stalled by several seconds. The reported results here is when SSD gives its best. So, on average it is +10 seconds than the reported runtime :).

Results Verification[edit]

This section shows how to verify the output of your's with MonetDB naive's output.
When producing the output result sets of the 300 queries, redirect it to a file:

./ < query1 > query1.out

The output file (query1.out) contains 300 result sets each followed by a blank line.
We want to compare the output against the "Naive approach MonetDB"'s output.
Since the result sets in the outputs are not sorted, the outputs are not directly comparable.
Both outputs need to be converted into its canonical form by sorting each result set using this C++ code:

// file: canon.cpp
#include <stdio.h>
#include <string.h>
#include <string>
#include <algorithm>
#include <vector>

using namespace std;

int main(){
    char s[10000];
    vector<string> vs;
    while (fgets(s,10000,stdin)){
        if (strlen(s)==1){ // a blank line
            for (int i=0; i<int(vs.size()); i++)
        } else {

The C++ code above will sort each result set (whenever a blank line is found).
This is how to canonicalize both naive monet's output and your's output before comparing:

./ < query1 > query1.naive           # this is the output of the "Naive approach MonetDB"
./ < query1 > query1.out                   # this is the output of the holistic stochastic cracking
g++ -o canon canon.cpp                             # compile the above C++ code
./canon < query1.naive >        # canonicalization of the naive output
./canon < query1.out >            # canonicalization of the cracking output
diff           # compare both output (should output nothing if identical)