
trend-prof


Summary: Finds performance trends.
Category: construction
License: BSD License
Owner(s): dsw, simongoldsmith

Trend Profiler

Maintainer: Simon Goldsmith
Maintainer: Daniel S. Wilkerson

Summary

trend-prof aims at finding performance surprises (super-linearities) in C / C++ code based on trends in the runs of the program on small to medium inputs. That is, we predict which parts of the code will not scale well to larger inputs. For each basic block (more or less a line of source code), trend-prof constructs a model that predicts how many times that basic block is executed as a function of input size.

The current trend-prof tool works on C / C++ compiled with gcc (see below). With some hacking, it can be made to work on other programs.

To see what trend-prof is capable of, please see our recent paper:

Measuring Empirical Computational Complexity.
ESEC/FSE'07: The 6th joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
Simon F. Goldsmith, Alex S. Aiken, Daniel S. Wilkerson.

Directory contents

Other documents in this directory that you will want to consult:

Releases

Overview

Inputs:

  • Your C / C++ program compiled with gcc -ftest-coverage -fprofile-arcs
  • Several workloads for your program: Each workload should be annotated with a "size".

Outputs:

  • For each basic block the following are output.
    • a model that predicts number of executions as a function of input size. There are two kinds of models: a linear model like count = 10 + 20n or a power law model like count = 4 * n^1.2.
    • a plot with input size on the horizontal axis and number of executions on the vertical axis. The plot shows the observed number of executions of the basic block and the number of executions the model predicts.
  • For each source file: a copy of that source file where each line is annotated with the model for that line.
  • A summary of all the models for all the basic blocks, ranked from most executions to least (configurable).

The rankings are intended to suggest the places in your code that will have the worst performance as workload sizes increase. trend-prof is different from traditional profilers in that it attempts to predict what will happen at workload sizes larger than were actually measured instead of merely showing what did happen on sizes that were actually measured.

It is important to consider carefully the results that trend-prof reports. It shows graphs of observed execution counts for a basic block (y axis) versus user reported workload size (x axis) and the line of best fit for those points. To the extent that the workload size is a good predictor of the execution count, the points will be tightly clustered and the line will fit well. Similarly, the models trend-prof constructs will be representative of the chosen workloads; the model will be a good predictor of a new workload only to the extent that the new workload is like the ones on which trend-prof was initially run.
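As a rough illustration of the two kinds of models, the sketch below fits a linear model and a power law model to (size, count) points and then extrapolates to a larger size. It assumes ordinary least squares, with the power law fitted as a straight line in log-log space; this page does not say how trend-prof itself computes its fits, and the function names and sample data are made up for the example.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_power_law(xs, ys):
    """Fit y = c * x**k via a line in log-log space; returns (c, k).

    Assumes every x and y is strictly positive.
    """
    log_c, k = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(log_c), k


def predict_power_law(c, k, n):
    """Predicted execution count at input size n."""
    return c * n ** k


if __name__ == "__main__":
    # Hypothetical measurements: execution counts of one basic block.
    sizes = [64, 128, 256, 512, 1024]
    counts = [4 * n ** 1.2 for n in sizes]
    c, k = fit_power_law(sizes, counts)
    print(f"count = {c:.2f} * n^{k:.2f}")
    # Extrapolate well beyond the measured sizes.
    print(f"predicted count at n = 1000000: {predict_power_law(c, k, 1_000_000):.0f}")
```

The extrapolation is only as trustworthy as the fit: if the points are scattered in the plot, or the new workload is unlike the measured ones, the predicted count can be far off.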

The bubble sort example in examples/bubble_sort shows a subtly incorrect use of trend-prof; in particular, notice that trend-prof predicts that line 19 will be executed (input size)^1.84 times even though bubble sort is quadratic in the size of its input. If you understand why, you'll get more out of trend-prof (hint: what's the input size?).

trend-prof shows its work; it is easy to look at intermediate computations. For instance, the maintainers found an off-by-one bug in the bubble sort example by looking at the exact number of times line 19 was executed for various workloads. See the Schema of a Trend Profile for details on trend-prof's intermediate files.

This work was supported by Professor Alex Aiken and was done at UC Berkeley.

Installing

  1. Make sure all of the following programs are installed.
    • gcc and gcov
    • perl
    • the perl module Archive::Zip; running cpan -i Archive::Zip as root will install this module if cpan is set up.
    • gnuplot
  2. Download and unpack the release tarball (or check out the repository) into, say, "~/trend-prof".
  3. Edit Config.mk so that BIN points at where you put trend-prof. For example:
    BIN := /home/simon/trend-prof
    
    Also, delete the line that looks like:
    $(error You must set the BIN variable in Config.mk.  Please see the documentation)
    
  4. If necessary, edit Config.mk so that it can find perl, time, gnuplot, and gcov on your system. The version of gcov pointed to in Config.mk must be the same as the version of gcc with which you compiled your program. Any difference will result in an obscure error message during the run of trend-prof.
  5. Run "make test". This target runs trend-prof on the example profdir in examples/bubble_sort. You can clean up everything "make test" generates with "make test-clean".
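Before running "make test", it can help to confirm that the prerequisites from step 1 are actually visible on your PATH. The following is a minimal sketch, not part of trend-prof: the function names are ours, and it does not check that your gcov version matches your gcc version, which you still need to verify by hand.

```python
import shutil
import subprocess

# Programs listed in step 1 of the installation instructions.
REQUIRED_TOOLS = ["gcc", "gcov", "perl", "gnuplot"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def perl_module_installed(module):
    """Return True if `perl -M<module> -e 1` succeeds, i.e. the module loads."""
    result = subprocess.run(["perl", f"-M{module}", "-e", "1"], capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("not found on PATH:", ", ".join(missing))
    elif not perl_module_installed("Archive::Zip"):
        print("perl module Archive::Zip is not installed")
    else:
        print("all prerequisites found")
```

If Config.mk points at tools outside your PATH, adjust the check accordingly; the sketch only looks where your shell would look.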

An example use of trend-prof

We'll illustrate the use of trend-prof with an example, which is included in the repository under the "examples/quick_sort" directory. Here we detail how that example was written. You can run trend-prof on it by running make test/quick_sort.

Note: Most of the links below this point do not work when this page is viewed at http://trend-prof.tigris.org; however, they should work if you're viewing a local copy of this document that you got as part of a trend-prof tarball (or checkout). Some of the links won't point to anything until you've built examples/quick_sort by running make test/quick_sort.

Suppose we have the following program, which generates a random array of integers and calls qsort() on it: main.c, compare.c, compare.h, Makefile.

Set up your program

  • Install trend-prof (see above).
    
    
  • Create and initialize a directory to store profiling information; in this example we'll use $HOME/quick_sort. Note that profdir/ is a parameterized make target and the string following the / is the path to the directory.
    $ cd $HOME/trend-prof
    $ make profdir/$HOME/quick_sort
    
    Your directory $HOME/quick_sort is now populated with a skeleton of a Trend Profile directory; for example, it now has a "0_config" subdirectory. We suggest that you do not use make profdir/~/quick_sort, as some systems won't expand the ~ the way you want. Specifying an absolute path is always safe: make profdir//home/simon/quick_sort. We also suggest that you put your profdir and your compiled code on a local disk, not a networked file system. During a profiling session, trend-prof does a large amount of disk I/O (approximately number of workloads times size of your program), and writing across a network is likely to slow things down a great deal.
  • Make a directory (or symlink) called "$HOME/quick_sort/0_config/src" that contains your program source. This directory may contain an arbitrarily deep directory hierarchy; trend-prof uses find to locate the files it needs.
    $ cd $HOME/quick_sort/0_config
    $ mkdir src
    (copy your source code there)
    
  • Compile your code with gcc (or g++) as follows. You can use trend-prof without compiling every file in this way, but you will get information and models only for the files you compile with these flags. See "man gcc" and "man gcov" for more information on these flags.
    • Compiling: Compile your source files with gcc -fprofile-arcs -ftest-coverage.
      $ cd $HOME/quick_sort/0_config/src
      $ gcc -fprofile-arcs -ftest-coverage -g -O3 -c main.c -o main.o
      $ gcc -fprofile-arcs -ftest-coverage -g -O3 -c compare.c -o compare.o
      
    • Linking: If you use gcc or g++ to link, pass -fprofile-arcs -ftest-coverage. If you call ld directly, you're on your own, but passing -lgcov is a good start.
      $ gcc -fprofile-arcs -ftest-coverage -g -O3 main.o compare.o -o main
      
      Notice that compiling as above creates main.gcno and compare.gcno and that running main (created as above) creates main.gcda and compare.gcda. We refer to these files as "gcov metadata". trend-prof invokes gcov to gather information about program runs.
    • Constructing Static Libraries: You don't have to do anything special when constructing static libraries.
    • Constructing Dynamic Libraries: As with linking, if you use gcc or g++, pass -fprofile-arcs -ftest-coverage. If you call ld directly, you're on your own, but passing -lgcov is a good start.
    Versions of gcc that are known to work with trend-prof include 3.4 and 4.0. Versions of gcc that are known not to work with trend-prof include 2.95 and 3.3.
  • Make a directory (or symlink) called "$HOME/quick_sort/0_config/obj" that contains the object files for your program. In particular, trend-prof will look for gcov metadata (for example, main.gcda) here. "obj" can just be a symlink to "src" if you're not doing anything fancy with your build process and your .o files are right next to your .c/.cc files.
    $ cd $HOME/quick_sort/0_config
    $ ln -s src obj
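Since you get models only for files compiled with the coverage flags, a quick check for source files that have no matching .gcno file can catch a missed flag early. The sketch below is an assumption-laden helper, not part of trend-prof: it matches files by base name only (main.c with main.gcno), which fits the simple layout in this example but may not fit a fancier build.

```python
import os


def sources_missing_gcno(src_dir, obj_dir, exts=(".c", ".cc", ".cpp")):
    """List source files under src_dir with no same-named .gcno under obj_dir."""
    gcno_stems = set()
    for _root, _dirs, files in os.walk(obj_dir):
        for fname in files:
            stem, ext = os.path.splitext(fname)
            if ext == ".gcno":
                gcno_stems.add(stem)

    missing = []
    for root, _dirs, files in os.walk(src_dir):
        for fname in files:
            stem, ext = os.path.splitext(fname)
            if ext in exts and stem not in gcno_stems:
                missing.append(os.path.join(root, fname))
    return sorted(missing)


if __name__ == "__main__":
    # Paths follow the quick_sort example on this page.
    config = os.path.join(os.path.expanduser("~"), "quick_sort", "0_config")
    for path in sources_missing_gcno(os.path.join(config, "src"),
                                     os.path.join(config, "obj")):
        print("no .gcno found for", path)
```

An empty result does not prove the build is right (for example, stale .gcno files from an older compile would still match), so treat it as a sanity check only.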
    

Set up your workloads

  • Create a file called "0_config/workloads" that lists all the workloads on which you want to run your program and their sizes.

    The script workloads_from_dir will do this for you if your workloads are just simple input files in a directory and what you are thinking of as the "size" of the workload (the number on the X-axis on the plots) is just the concrete size in bytes of the file. Please see examples/bubble_sort/0_config/Makefile for an example use of this technique. Feel free to hack this script or roll your own.

    The file looks something like the following.

    workload:
      size=256
      name=workload1
    
    #you need a blank line to separate workloads
    workload:
      size=512
      # Workload names may not start with whitespace nor contain newlines.
      # Other than that they can contain most keyboard characters.
      # It is up to your run_workload script to make sense of the names.
      name=workload2,stuff=blah still part of the workload name
    
    See the workloads file of examples/quick_sort for another example.
  • Create a script called "0_config/run_workload" that takes as input a workload name (from the "0_config/workloads" file) and runs that workload. The script will be run from the profdir ("$HOME/quick_sort/" in this case). The names of the workloads don't have to resemble anything in particular; trend-prof just passes them to your "run_workload" script. Your script could be as simple as the following. Remember to chmod +x run_workload. Also, remember to 'exec' your program from the script if you can to avoid extra process creation.
    #!/bin/sh
    exec 0_config/src/your_program 0_config/your_inputs/$1
    
    See the run_workload script of examples/quick_sort for a more involved example.
  • Test your script. Doing so will create a .gcda file which records the results of profiling. You can clean out the effect of profiling the runs and restart by deleting all .gcda files. You don't need to be concerned that you are adding a data point to your results because trend-prof will remove all the .gcda files when it starts its profiling run of your workloads.
    $ cd $HOME/quick_sort
    $ 0_config/run_workload size=64,seed=1234
    
  • Your directory structure should look like this:
    $ cd $HOME/quick_sort
    $ find
    .
    ./Makefile
    ./0_config
    ./0_config/src
    ./0_config/src/main
    ./0_config/src/Makefile
    ./0_config/src/main.c
    ./0_config/src/main.gcno
    ./0_config/src/compare.c
    ./0_config/src/compare.h
    ./0_config/src/main.o
    ./0_config/src/compare.gcno
    ./0_config/src/compare.o
    ./0_config/run_workload
    ./0_config/workloads
    
    If you've run main, the following files will also be present.
    ./0_config/src/main.gcda
    ./0_config/src/compare.gcda
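If you would rather roll your own generator for the "0_config/workloads" file, the layout described above is easy to produce. The sketch below is a hypothetical stand-in written for illustration (it is not the workloads_from_dir script, whose behavior we only paraphrase): it emits one entry per input file, uses the file's size in bytes as the workload size, and mirrors the indentation and field order of the example file, on the assumption that matching the example is safe.

```python
import os


def workload_entry(name, size):
    """Format one entry in the layout of the example workloads file."""
    # Names may not start with whitespace nor contain newlines.
    # Rejecting empty names is our own extra guard.
    if not name or name[0].isspace() or "\n" in name:
        raise ValueError(f"invalid workload name: {name!r}")
    return f"workload:\n  size={size}\n  name={name}\n"


def workloads_from_files(input_dir):
    """Build workloads text with one entry per file, sized by byte count."""
    sized = []
    for fname in os.listdir(input_dir):
        path = os.path.join(input_dir, fname)
        if os.path.isfile(path):
            sized.append((os.path.getsize(path), fname))
    sized.sort()  # smallest workload first; the order is our choice
    # Entries are joined with a blank line, which the format requires.
    return "\n".join(workload_entry(name, size) for size, name in sized)


if __name__ == "__main__":
    print(workload_entry("workload1", 256))
```

Whatever names you generate, your run_workload script must be able to turn each one back into a concrete run, and byte size is only a good "size" if it tracks the quantity your program's running time really depends on.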
    

Run Trend Profiler

  • Run trend-prof
    $ cd $HOME/quick_sort
    $ make
    
  • If trend-prof can't match up all the gcov metadata with source code, it will complain and halt the run. The problematic files will be annotated as such in "0_config/source_files". Your options are as follows.
    • Keep going anyway. Just type "make" again and things will resume. In this case some of the data trend-prof gathers won't be matched up to the corresponding source code. The data and models will be reported with the correct source files and line numbers, but trend-prof will not show the source code for those files.
    • (recommended) Edit the "0_config/source_files" file so that gcov metadata is matched up with the correct source file. The source_files file looks as follows.
    SourceFile:
       #how you want the name to show up on results pages
       displayName=main.c
       #the gcov metadata (.gcno) file;  the path is relative to the profdir ($HOME/quick_sort/ in the example)
       gcno=0_config/obj/main.gcno
       #the source file; the path is relative to the profdir ($HOME/quick_sort/ in the example)
       source=0_config/src/main.c
    
    # you need a blank line between entries; comments don't count
    SourceFile:
       displayName=compare.c
       gcno=0_config/obj/compare.gcno
       source=0_config/src/./compare.c
    
  • Point your web browser at the 4_view/index.html file.
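If you need to repair "0_config/source_files" for many files, generating the entries can be less error-prone than editing them by hand. The following sketch formats entries in the layout shown above; the function names, the default directory arguments, and the one-.gcno-per-source naming are assumptions made for the example, so check the result against the file trend-prof generated for you.

```python
def source_file_entry(display_name, gcno, source):
    """Format one SourceFile entry; paths are relative to the profdir."""
    return (
        "SourceFile:\n"
        f"   displayName={display_name}\n"
        f"   gcno={gcno}\n"
        f"   source={source}\n"
    )


def source_files_text(stems, obj_dir="0_config/obj", src_dir="0_config/src", ext=".c"):
    """Build entries for base names such as "main", separated by blank lines."""
    entries = [
        source_file_entry(stem + ext, f"{obj_dir}/{stem}.gcno", f"{src_dir}/{stem}{ext}")
        for stem in stems
    ]
    return "\n".join(entries)


if __name__ == "__main__":
    print(source_files_text(["main", "compare"]))
```

After rewriting the file, type "make" again in the profdir so that trend-prof resumes with the corrected mapping.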

Acknowledgements

Thanks to Scott McPeak for pointing us towards gcov.

Thanks to Johnathon Jamison for trying the tool and pointing out several shortcomings in the documentation.

Thanks to Armando Solar-Lezama for trying the tool.

Thanks to Karl Chen for pointing out the performance bottleneck in the first pre-release version and for suggesting writing a local server as a way to prevent the rendering of the whole output at once.