Tomograph

mk, Fri, 08/31/2012 - 13:03

The tomograph program uses the information in the event stream to illustrate the parallel processing behavior of individual queries. This information can be used to identify expensive and blocking portions of a query. It helps developers understand the actual parallel behavior and may reveal areas for improvement. Tomograph is an off-line inspection tool and is only available on Linux platforms.

Assume a MonetDB server is running and serving the database “voc”. First, in terminal 1, start tomograph on the database as follows:

    $ tomograph -d voc
    -- Output directed towards cache/voc_*
    -- Stop capturing with or after 32 pages

Then, in terminal 2, start mclient to execute a query:

    $ mclient -d voc -s "select count(*) from tables;"

This triggers tomograph to respond in terminal 1 with output similar to the following:

    -- page 0 :set time zone interval '+02:00' hour to minute
    -- page 1 :select count(*) from tables

This indicates that tomograph has captured two queries. The first query, “set time zone...”, is executed automatically at the start of each mclient session to set the time zone of the client. For each captured query, tomograph generates one page with its execution information. The output for the set time zone query can safely be ignored.

After the real SQL query has finished, go to terminal 1 and press Ctrl-C to terminate tomograph. It then tries to plot the execution information of each captured query into a separate PDF file and to glue the individual PDF files into one file. This requires the tools gnuplot and gs (Ghostscript). If this succeeds, tomograph terminates with a message like the following:

    ^Csignal 2 received
    -- exec:gnuplot cache/voc_00.gpl;
    -- exec:gs -q -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=cache/voc.pdf -dBATCH cache/voc_00.pdf
    -- done: cache/voc.pdf

Otherwise, tomograph terminates with an error message and instructions for manually generating the plots and gluing the PDF files, as shown below:

    ^Csignal 2 received
    -- exec:gnuplot cache/voc_00.gpl;
    <error message>

    To finish the atlas make sure gnuplot is available and run:
    gnuplot cache/voc_00.gpl
    gnuplot cache/voc_01.gpl
    gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=cache/voc.pdf -dBATCH cache/voc_00.pdf cache/voc_01.pdf
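
When several pages were captured, the manual recovery commands above follow a regular pattern, so they can be assembled from the contents of the output directory. The following Python sketch only builds the command lines and does not execute anything. The function name `atlas_commands` is hypothetical, and the sketch assumes that the files are named `<prefix>_NN.gpl` with zero-padded page numbers, as in the listing above, and that each gnuplot script writes a PDF with the same base name.

```python
from pathlib import Path
import shlex


def atlas_commands(prefix):
    """Build the gnuplot and gs command lines for every <prefix>_NN.gpl file.

    `prefix` is the output prefix, e.g. "cache/voc". Nothing is executed;
    the commands are returned as argument lists.
    """
    prefix = Path(prefix)
    # Lexical sorting keeps zero-padded pages in capture order: _00, _01, ...
    scripts = sorted(prefix.parent.glob(prefix.name + "_*.gpl"))
    commands = [["gnuplot", str(script)] for script in scripts]
    if scripts:
        # Assumption: each .gpl script produces a .pdf with the same base name.
        pdfs = [str(script.with_suffix(".pdf")) for script in scripts]
        commands.append(
            ["gs", "-dNOPAUSE", "-sDEVICE=pdfwrite",
             f"-sOUTPUTFILE={prefix}.pdf", "-dBATCH", *pdfs]
        )
    return commands


if __name__ == "__main__":
    # Print the commands for review before running them by hand.
    for command in atlas_commands("cache/voc"):
        print(shlex.join(command))
```

Printing the commands first makes it easy to check them against the instructions tomograph itself reported before running them.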

For each captured query, tomograph produces three files:

    $ ls cache
    voc_00.dat  voc_00.gpl  voc_00.trace  voc_01.dat  voc_01.gpl  voc_01.trace

The “.trace” files contain the event stream with the raw query execution information. Tomograph generates the “.dat” and “.gpl” files by extracting information from the “.trace” files. They can be regenerated by passing the corresponding “.trace” file to the “--input” option of tomograph. The figure below shows an example tomograph graph for one query:

In a tomograph graph, the topmost part (“memory in GB” and “in/oublock”) illustrates the resident set size (RSS) of memory (shown as blue dots) and the I/O activity (reads as grey dots, writes as red dots) as reported by the OS during each heartbeat event. Note that all values displayed here reflect system-wide activity, so they are not limited to MonetDB. Also note that the I/O counts shown here do not reflect the actual amount of data read from or written to the hardware drives, as this information is generally not available to all users of a Linux system.

The second part (“cores”) shows a heat map of the available CPU cores in the system, one line per CPU. This information is gathered from a heartbeat monitor in the server. The arrival rate of the heartbeat events can be controlled with the “--beat” option. The colours show the level of core utilization (white: <2%, yellow: >2%, gold: >25%, orange: >50%, orangered: >75%, red: >90%). Be aware that this information covers the activity of the whole system, not only that of MonetDB.
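
To make the colour scale concrete, the thresholds listed above can be written as a small lookup. This is an illustrative Python sketch, not the code tomograph uses. The function name `core_colour` is hypothetical, and because the text does not say how a utilization of exactly 2% is coloured, the sketch treats it as white.

```python
# Lower bounds (in percent) and colour names as listed in the text,
# ordered from the highest threshold to the lowest.
CORE_COLOURS = [
    (90, "red"),
    (75, "orangered"),
    (50, "orange"),
    (25, "gold"),
    (2, "yellow"),
]


def core_colour(utilization):
    """Map a core utilization percentage to its heat map colour name."""
    for bound, colour in CORE_COLOURS:
        if utilization > bound:
            return colour
    # Assumption: anything at or below 2% is drawn white.
    return "white"
```

For example, a core that is 60% busy during a heartbeat interval falls in the orange band, while an idle core stays white.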

The third part (“worker threads”) shows the activities of all MonetDB worker threads. This part therefore contains the most useful information for finding bottlenecks in query performance. Note that the activities of the software threads here are not directly linked to the CPU activities shown in the second part, as tomograph does not give any information about which core is executing which thread (although some deductions can be made). Along the time axis, we show for each software thread the MAL instructions in progress. A hollow box with smaller height indicates that the thread was waiting for the next instruction to become available for execution. The little red-filled boxes, often placed at the end of a thread, indicate that there is no instruction left at that moment.

Finally, at the bottom of a tomograph, we provide detailed statistics on the most frequently executed MAL instructions. Among other things, the “parallelism” is calculated as: (sum of the maximum worker running time) × (number of cores) ÷ (reported number of ticks spent on each MAL instruction).

A synopsis of the calling conventions:

    tomograph [options]
      -d | --dbname=<database_name>
      -u | --user=<user>
      -P | --password=<password>
      -p | --port=<portnr>
      -h | --host=<hostname>
      -T | --title=<plot title>
      -r | --range=<starttime>-<endtime>[ms,s]
      -i | --input=<profiler event file>
      -o | --output=<dir/file prefix> (default 'cache/<dbname>')
      -b | --beat=<delay> in milliseconds (default 5000)
      -A | --atlas=<number> maximum number of queries (default 1)
      -D | --debug
      -? | --help

The tomograph trace files can also be used for off-line inspection of specific time ranges, without having to contact the MonetDB server and re-run a query. Most notably, the option “--range” zooms into a range of the execution trace. Ranges may be expressed in seconds (s), milliseconds (ms), or microseconds (the default). The trace file can also be used to look up the details of each instruction.
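
The unit handling of a range can be illustrated with a short Python sketch that converts a range string to microseconds. It is not tomograph's own parser. The function name `range_to_microseconds` is hypothetical, and the sketch assumes integer bounds and a single optional unit suffix that applies to both ends, following the synopsis form `<starttime>-<endtime>[ms,s]`.

```python
import re

# Multipliers to microseconds; no suffix means the value is in microseconds.
UNIT_TO_MICROSECONDS = {"": 1, "ms": 1_000, "s": 1_000_000}

# Assumption: integer bounds and one optional trailing unit for both ends.
RANGE_PATTERN = re.compile(r"(\d+)-(\d+)(ms|s)?")


def range_to_microseconds(text):
    """Convert '<starttime>-<endtime>[ms,s]' to (start, end) in microseconds."""
    match = RANGE_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a <starttime>-<endtime>[ms,s] range: {text!r}")
    start, end, unit = match.groups()
    factor = UNIT_TO_MICROSECONDS[unit or ""]
    return int(start) * factor, int(end) * factor
```

Under these assumptions, `range_to_microseconds("100-250ms")` yields the pair 100000 and 250000, which is the same window as the range `100000-250000` given without a unit.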