October 7, 2009

High Performance C++ Profiling

Filed under: Uncategorized — floodyberry @ 10:40 am

[This is about my C++ Profiler which may be found on Google Code under High Performance C++ Profiler]

My interest in code profiling started when I was making hudbot. What with code injection and patching, function hooking, data hijacking, and OpenGL, I knew I had relatively no experience in what I was attempting and that I could easily be producing some amazing slowdowns if I wasn’t careful.

Unfortunately, C++ profilers seem to come in three varieties, all of which have a fatal downside:

  1. Sampling Profilers which are fast, multi-threaded, but inaccurate and have decent output (sometimes too detailed). Some examples are VTune, CodeAnalyst, google-perftools and Sleepy.
  2. Instrumenting Profilers which are accurate, multi-threaded, but slow, and have decent output. Some examples are GlowCode and the now defunct DevPartner Profiler Community Edition.
  3. Instrumenting Profilers which are fast, accurate, but single threaded and have limited output. These range from extremely simple profilers like Peter Kankowski’s Poor Man’s Profiler to the more complicated and full-featured Shiny C++ Profiler.

The obvious outcome is that if you want fast and accurate, like I did, you’ll have to use an existing profiler or write it yourself and instrument your code manually. With a little work, fancy stuff like call trees can be added. Once you get it tested and working, you can start going crazy profiSegmentation fault.

Oh yeah, about that. There are no multi-threaded instrumented profilers that are open source, and depending on how your single threaded profiler works, the results when trying to use it in a multi-threaded environment can range from bad data to outright crashing. It’s possible to patch the profiler to only allow the main thread in, but this adds unnecessary slowdowns and doesn’t address how to profile other threads. This is where my profiler comes in!


Blog at