This article describes how to debug performance issues with the help of Valgrind, Callgrind and KCachegrind.
Callgrind uses runtime instrumentation via the Valgrind framework for its cache simulation and call-graph generation. This way, even shared libraries and dynamically opened plugins can be profiled. The data files generated by Callgrind can be loaded into KCachegrind for browsing the performance results. The package also includes a command line tool that produces ASCII reports from the data files without KCachegrind, but the browsing tools should be enough to debug a performance problem, so the command line tool and its instructions are out of scope of this page.
Installation
Install Valgrind and KCachegrind
Requirements:
Callgrind: part of Valgrind (supports Linux on x86, amd64, ARMv7, ...)
KCachegrind:
  Libraries and development files for KDE 4.4 or higher
  Commands 'dot' (Graphviz) for the call graph and 'objdump' (BinUtils) for the assembler view (these are runtime requirements, not needed for compilation)
QCachegrind (included in the KCachegrind sources):
  Qt 5, or Qt 4.x (x >= 4)
  'dot' binary for the call graph and 'objdump' binary for annotated machine code
Valgrind and KCachegrind are already packaged for the most important Linux distributions. On Ubuntu 14.04 (Trusty Tahr) or higher, installing them takes a single command in a terminal. Graphviz is also needed in order to view the call graph in KCachegrind. You can use apt-get to install all three:
sudo apt-get install valgrind kcachegrind graphviz
or aptitude:
sudo aptitude install valgrind kcachegrind graphviz
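After installation, it can help to confirm that the tools named in the requirements are actually on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of any package:

```shell
# Hypothetical helper: print the name of every given tool that is NOT
# found on the PATH. Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, running missing_tools valgrind kcachegrind dot objdump should print nothing on a correctly prepared system.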
How to run
Start the netconfd-pro server under Callgrind as follows. The CLI parameters can vary:
valgrind --tool=callgrind [callgrind options] your-program [program options]
For example:
valgrind --tool=callgrind netconfd-pro module=ietf-interfaces module=iana-if-type log-level=info no-config access-control=off
For more information on how to use Valgrind with Callgrind, refer to http://valgrind.org/docs/manual/cl-manual.html
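A few Callgrind options are worth deciding on before the run. The sketch below assumes a hypothetical helper, callgrind_cmd, that only prints the command line so it can be reviewed before it is executed. It adds --dump-instr=yes (needed for the Machine Code tab in KCachegrind) and --callgrind-out-file, where Valgrind expands %p to the process ID. The output directory is an assumption chosen for illustration:

```shell
# Hypothetical helper (not part of Valgrind): print a Callgrind command line.
# $1 is the directory for the profile file; the remaining arguments are the
# program to profile and its options.
callgrind_cmd() {
    out_dir=$1
    shift
    printf 'valgrind --tool=callgrind --dump-instr=yes'
    # %%p prints a literal %p, which Valgrind replaces with the process ID.
    printf ' --callgrind-out-file=%s/callgrind.out.%%p' "$out_dir"
    # Append the program and its options unchanged.
    printf ' %s' "$@"
    printf '\n'
}
```

For example, callgrind_cmd /tmp/prof netconfd-pro log-level=info prints a command that can be pasted into the terminal. Arguments containing spaces are not quoted by this sketch and would need quoting by hand.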
Execute the operation that you want to test, then shut down the server. After the cleanup is done, you should see output similar to:
==7729==
==7729== Events : Ir
==7729== Collected : 175808352
==7729==
==7729== I refs: 175,808,352
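When comparing several runs, it is convenient to pull the total instruction count out of this summary. A minimal sketch, assuming the Valgrind messages were saved to a file with the --log-file option; the function name collected_ir and the file name valgrind.log are hypothetical:

```shell
# Hypothetical helper: read Valgrind output on stdin and print the number
# from the "Collected" line, e.g. "==7729== Collected : 175808352".
collected_ir() {
    awk '/^==[0-9]+== Collected/ { print $NF }'
}
```

Usage would look like collected_ir < valgrind.log, which makes it easy to record the count before and after a code change.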
The result is stored in the current working directory in a callgrind.out.XXX file, where XXX is the process identifier:
valgrind --tool=callgrind netconfd-pro module=ietf-interfaces module=iana-if-type log-level=info no-config access-control=off
ls
callgrind.out.7729
You can read this file using a text editor, but it won't be very useful because the format is very cryptic. This is where KCachegrind comes in. You can launch KCachegrind from the command line, or from the program menu if your system installed an entry there. Then open your profile file.
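Since every run leaves its own callgrind.out.XXX file, picking the most recent one by hand gets tedious. The sketch below assumes a hypothetical helper, latest_profile, that prints the newest profile file in a directory (the current directory by default):

```shell
# Hypothetical helper: print the most recently modified callgrind.out.* file
# in the given directory (default: current directory).
latest_profile() {
    dir=${1:-.}
    # ls -t sorts newest first; the error is hidden when nothing matches.
    ls -t "$dir"/callgrind.out.* 2>/dev/null | head -n 1
}
```

The result can be passed straight to the viewer, for instance kcachegrind "$(latest_profile)".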
The first view presents a list of all the profiled functions. For each function you can see its inclusive cost, its self cost and its location.
Once you click on a function, the other views are filled in with information. The view in the upper right part of the window gives some information about the selected function.
This view has several tabs presenting different information:
Types: Presents the types of events that have been recorded. In our case this is not very interesting, since it is just the number of instruction fetches (Ir)
Callers: List of the direct callers
All Callers: List of all the callers, that is, the direct callers and the callers of the callers
Callee Map: A map of the callees. It is a kind of call graph drawn as nested rectangles that represent the cost of the functions
Source code: The source code of the function, if the application has been compiled with debug symbols
And finally, you have another view with data about the selected function.
Again, several tabs:
Callees: The direct callees of the function
Call Graph: The call graph from the function to the end
All Callees: All the callees and the callees of the callees
Caller Map: The map of the callers; again, not really interesting for me
Machine Code: The machine code of the function, if the application has been profiled with the --dump-instr=yes option
You also have several display options and filter features to find exactly what you want and display it the way you want.
The information provided by KCachegrind can be very useful to find which functions take too much time or are called too often.