Maintainer: Simon Goldsmith
Maintainer: Daniel S. Wilkerson
trend-prof aims at finding performance surprises (super-linearities) in C / C++
code based on trends in the runs of the program on small to medium inputs. That
is, we predict which parts of the code will not scale well to larger inputs. For
each basic block (more or less a line of source code), trend-prof constructs a
model that predicts how many times that basic block is executed as a function of
user-reported workload size.
The current trend-prof tool works on C / C++ code compiled with gcc (see
below). With some hacking, it can be made to work on other programs.
To see what trend-prof is capable of, please see our
recent paper:
Measuring Empirical Computational Complexity.
ESEC/FSE'07: The 6th
joint meeting of the European Software Engineering Conference and the
ACM SIGSOFT Symposium on the Foundations of Software Engineering.
Simon F. Goldsmith, Alex S. Aiken, Daniel S. Wilkerson.
Other documents in this directory that you will want to consult.
- Your C / C++ program compiled with gcc -ftest-coverage -fprofile-arcs
- Several workloads for your program: Each workload should be annotated with a
size (the number that will appear on the X-axis of the plots).
- For each basic block, trend-prof outputs the following.
- a model that predicts number of executions as a function of input size.
There are two kinds of models: a linear model like count = 10 + 20n or a
power law model like count = 4 * n^1.2.
- a plot with input size on the horizontal axis and number of executions on the
vertical axis. The plot shows the observed number of executions of the basic
block and the number of executions the model predicts.
- For each source file: a copy of that source file where each line is annotated
with the model for that line.
- A summary of all the models for all the basic blocks, ranked from most
executions to least (configurable).
The rankings are intended to suggest the places in your code that will have
the worst performance as workload sizes increase. trend-prof differs from
traditional profilers in that it attempts to predict what will happen at
workload sizes larger than those actually measured, rather than merely showing
what did happen at the sizes that were measured.
It is important to consider carefully the results that trend-prof reports.
It shows graphs of observed execution counts for a basic block (y axis) versus
user-reported workload size (x axis), and the line of best fit for those points. To
the extent that the workload size is a good predictor of the execution count, the
points will be tightly clustered and the line will fit well. Similarly, the
models trend-prof constructs will be representative of the chosen workloads;
the model will be a good predictor of a new workload only to the extent that the
new workload is like the ones on which trend-prof was initially run.
The bubble sort example in examples/bubble_sort shows a
subtly incorrect use of trend-prof; in particular, notice
that trend-prof predicts that line 19 will be executed (input
size)^1.84 times even though bubble sort is quadratic in the size of
its input. If you understand why, you'll get more out of
trend-prof (hint: what's the input size?).
trend-prof shows its work; it is easy to look at intermediate
computations. For instance, the maintainers found an off-by-one bug in the bubble
sort example by looking at the exact number of times line 19 was executed for
various workloads. See the Schema of a Trend Profile
for details on trend-prof's intermediate files.
This work was supported by Professor Alex Aiken and was
done at UC Berkeley.
- Make sure all of the following programs are installed.
- gcc and gcov
- perl, gnuplot, and time
- the perl module Archive::Zip; running cpan -i
Archive::Zip as root will install this module if cpan is set up.
- Download and unpack the release tarball (or check out the repository)
into, say, "~/trend-prof".
- Edit Config.mk so that BIN points at where you put trend-prof. For
example:
BIN := /home/simon/trend-prof
Also, delete the line that looks like:
$(error You must set the BIN variable in Config.mk. Please see the documentation)
- If necessary, edit Config.mk so that it can find perl, time, gnuplot, and
gcov on your system. The version of gcov pointed to in Config.mk must be the same
as the version of gcc with which you compiled your program. Any difference will
result in an obscure error message during the run of trend-prof.
- run "make test". This target runs trend-prof on the example in the
examples/bubble_sort profdir. You can clean up all the stuff "make
test" generates with "make test-clean".
An example use of trend-prof
We'll illustrate the use of trend-prof with an example.
This example is included in the repository under the
"examples/quick_sort" directory. Here we detail how that example was
written. You can run trend-prof on it by running make
test/quick_sort. Note: Most of the links below this point do not work
when this page is viewed at http://trend-prof.tigris.org; however,
they should work if you're viewing a local copy of this document that
you got as part of a trend-prof tarball (or checkout). Some of the links
won't point to anything until you've built the examples/quick_sort by
running make test/quick_sort.
Suppose we have the following program, which generates a random array of
integers and calls qsort() on it.
Setup your program
- Install trend-prof (see above).
- Create and initialize a directory to store profiling information; in this
example we'll use $HOME/quick_sort. Note that profdir/ is a
parameterized make target and the string following the / is
the path to the directory.
$ cd $HOME/trend-prof
$ make profdir/$HOME/quick_sort
Note that your directory $HOME/quick_sort is now populated with a
skeleton of a Trend Profile directory; for example, it now has a
0_config/ subdirectory.
We suggest that you do not use make profdir/~/quick_sort, as some systems
won't do what you want. Specifying an absolute path is always safe: make
profdir//home/simon/quick_sort. We suggest also that you put your profdir
and your compiled code on a local disk -- not a networked file system. During a
profiling session, trend-prof does a large amount of disk I/O (approximately
the number of workloads times the size of your program), and writing
across a network is likely to slow things down a great deal.
- Make a directory (or symlink) called "$HOME/quick_sort/0_config/src" that contains your program
source. This directory may contain an arbitrarily deep directory hierarchy; trend-prof
uses find to locate the files it needs.
$ cd $HOME/quick_sort/0_config
$ mkdir src
(copy your source code there)
- Compile your code with gcc (or g++) as follows. You can
use trend-prof without compiling every file in this way, but you will get
information and models only for the files you compile with these flags. See "man
gcc" and "man gcov" for more information on these flags.
Versions of gcc that are known to work with trend-prof include
Versions of gcc that are known to not work with trend-prof include
- Compiling: Compile your source files with gcc -fprofile-arcs -ftest-coverage.
$ cd $HOME/quick_sort/0_config/src
$ gcc -fprofile-arcs -ftest-coverage -g -O3 -c main.c -o main.o
$ gcc -fprofile-arcs -ftest-coverage -g -O3 -c compare.c -o compare.o
- Linking: If you use gcc or g++ to link, pass
-fprofile-arcs -ftest-coverage. If you call ld directly,
you're on your own, but passing -lgcov is a good start.
$ gcc -fprofile-arcs -ftest-coverage -g -O3 main.o compare.o -o main
Notice that compiling as above creates
main.gcno and compare.gcno and that running main (created as above) creates
main.gcda and compare.gcda. We refer to these files as "gcov
metadata". trend-prof invokes gcov to gather information about program runs.
- Constructing Static Libraries: You don't have to do anything special when constructing static
libraries.
- Constructing Dynamic Libraries: As is the case for linking, with gcc or
g++, pass -fprofile-arcs -ftest-coverage. If you call ld
directly, you're on your own, but passing -lgcov is a good start.
- Make a directory (or symlink) called "$HOME/quick_sort/0_config/obj" that
contains the object files for your program. In particular,
trend-prof will look for gcov metadata (for example,
main.gcda) here. "obj" can just be a symlink to "src" if
you're not doing anything fancy with your build process and your .o
files are right next to your .c/.cc files.
$ cd $HOME/quick_sort/0_config
$ ln -s src obj
Setup your workloads
- Create a file called "0_config/workloads" that lists all the
workloads on which you want to run your program and their sizes.
The script workloads_from_dir will do this for you if your
workloads are just simple input files in a directory and what you are
thinking of as the "size" of the workload (the number on the X-axis on
the plots) is just the concrete size in bytes of the file. Please see
examples/bubble_sort/0_config/Makefile for an example use of this
technique. Feel free to hack this script or roll your own.
The file looks something like the following.
# you need a blank line to separate workloads
# Workload names may not start with whitespace nor contain newlines.
# Other than that they can contain most keyboard characters.
# It is up to your run_workload script to make sense of the names.
name=workload2,stuff=blah still part of the workload name
See the workloads file of
examples/quick_sort for another example.
- Create a script called "0_config/run_workload" that takes as
input a workload name (from the "0_config/workloads" file) and runs
that workload. The script will be run from the profdir
("$HOME/quick_sort/" in this case). The names of the workloads don't
have to resemble anything in particular; trend-prof just
passes them to your "run_workload" script. Your script could be as
simple as the following. Remember to chmod +x run_workload.
Also, remember to 'exec' your program from the script if you can to
avoid extra process creation.
#!/bin/sh
exec 0_config/src/your_program 0_config/your_inputs/"$1"
See the run_workload
script of examples/quick_sort for a more involved example.
- Test your script. Doing so will create a .gcda file which records
the results of profiling. You can clean out the effects of such test
runs and restart by deleting all .gcda files. You don't need to
be concerned that you are adding a data point to your results, because
trend-prof will remove all the .gcda files when it starts its
profiling run of your workloads.
$ cd $HOME/quick_sort
$ 0_config/run_workload size=64,seed=1234
- Your directory structure should look like this:
$ cd $HOME/quick_sort
If you've run main, the following files will also be present.
Run Trend Profiler
- Run trend-prof by typing make in your profdir.
$ cd $HOME/quick_sort
$ make
- If trend-prof can't match up all the gcov metadata with source code,
it will complain and halt the run. The problematic files will be annotated as
such in "0_config/source_files".
Your options are as follows.
- Keep going anyway. Just type "make" again and things will resume. In this
case some of the data trend-prof gathers won't be matched up to the
corresponding source code. The data and models will be reported with the correct
source files and line numbers, but trend-prof will not show the source code
for those files.
- (recommended) Edit the "0_config/source_files" file so that gcov metadata is
matched up with the correct source file. The source_files file looks as follows.
# how you want the name to show up on results pages
# the gcov metadata (.gcno) file; the path is relative to the profdir ($HOME/quick_sort/ in the example)
# the source file; the path is relative to the profdir ($HOME/quick_sort/ in the example)
# you need a blank line between entries; comments don't count
- Point your web browser at 4_view/index.html to see the results.
Thanks to Scott McPeak for
pointing us towards gcov.
Thanks to Johnathon Jamison for trying the tool and pointing out several
shortcomings in the documentation.
Thanks to Armando Solar-Lezama for trying the tool.
Thanks to Karl
Chen for pointing out the performance bottleneck in the first
pre-release version and for suggesting writing a local server as a way
to prevent the rendering of the whole output at once.