Profile-based optimization (PBO) is a set of performance-improving
code transformations based on the run-time characteristics of your
application.
There are three steps involved in performing this optimization:
Instrumentation - Insert data collection code into the object program.
Data Collection - Run the program with representative data to collect execution profile statistics.
Optimization - Generate optimized code based on the profile data.
To use profile-based optimization with HP C, specify the +I option
(to instrument) and later the +P option (to optimize) on the cc
command line, together with any level of optimization.
With PBO, compile times are short and link times are long, because
code generation happens at link time.
Instrumenting the Code
To instrument your program, use the +I option as follows:
    cc -Aa +I -O -c sample.c           Compile for instrumentation.
    cc -o sample.exe +I -O sample.o    Link to make instrumented executable.
The first command line uses the -O option to perform level 2
optimization and the +I option to instrument the code. The -c
option suppresses linking and creates an intermediate object file
called sample.o. The .o file can be used later in the optimization
phase, avoiding a second compile.
The second command line uses the -o option to link sample.o into
sample.exe. The +I option instruments sample.exe with data
collection code. Note that instrumented programs run slower than
non-instrumented programs. Only use instrumented code to collect
statistics for profile-based optimization.
Collecting Data for Profiling
To collect execution profile statistics, run your instrumented
program with representative data as follows:
    sample.exe < input.file1    Collect execution profile data.
This step creates and logs the profile statistics to a file,
by default called flow.data.
The data collection file is a structured file that may be used to
store the statistics from multiple test runs of different programs
that you may have instrumented.
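As an illustration of collecting statistics from several test runs, the following is a minimal POSIX sh sketch. The helper name collect_profiles is hypothetical and not part of HP C, and the sketch assumes that each run of the instrumented program adds its statistics to the same profile data file.

```shell
# collect_profiles is a hypothetical helper name, not an HP C tool.
# Usage: collect_profiles ./sample.exe input.file1 input.file2 ...
collect_profiles() {
    prog=$1
    shift
    for input in "$@"; do
        # Each run feeds one representative input file to the program on stdin.
        "$prog" < "$input" || return 1
    done
}
```

The function stops at the first failing run, so a crash in the instrumented program is noticed before the optimization step.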
Performing Profile-Based Optimization
To optimize the program based on the previously collected
run-time profile statistics, relink the program as follows:
    cc -o sample.exe +P -O sample.o
An alternative to this procedure is to recompile the source
file in the optimization step:
    cc -o sample.exe +I -O sample.c    Instrumentation.
    sample.exe < input.file1           Data collection.
    cc -o sample.exe +P -O sample.c    Optimization.
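The instrument, collect, and relink steps can be combined in one script. The following is a minimal POSIX sh sketch, not an HP-supplied tool: the names pbo_build, run, and DRY_RUN are hypothetical, and the sketch assumes the executable is created in the current directory. Only the cc options shown earlier in this section are used.

```shell
# pbo_build, run, and DRY_RUN are hypothetical names for illustration.

# Print a command, then execute it unless DRY_RUN is set.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Usage: pbo_build sample.c sample.exe input.file1
pbo_build() {
    src=$1              # C source file
    exe=$2              # executable name, created in the current directory
    input=$3            # representative input data
    obj=${src%.c}.o     # object file produced by cc -c

    # Compile for instrumentation.
    run cc -Aa +I -O -c "$src" || return 1
    # Link to make the instrumented executable.
    run cc -o "$exe" +I -O "$obj" || return 1
    # Run with representative data to collect profile statistics.
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ ./$exe < $input"
    else
        "./$exe" < "$input" || return 1
    fi
    # Relink the same object file, this time optimizing with the profile data.
    run cc -o "$exe" +P -O "$obj"
}
```

Setting DRY_RUN=1 before calling pbo_build prints the four commands without running them, which is a convenient way to check the sequence on a system without the HP C compiler.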
Maintaining Profile Data Files
Profile-based optimization stores execution profile data in
a disk file. By default, this file is called flow.data
and is located in your current working directory.
You can override the default name of the profile data file.
This is useful when working on large programs or on projects with
many different program files.
The FLOW_DATA environment variable can be used to specify the name
of the profile data file with either the +I or +P options.
The +df command line option can be used to specify the name of the
profile data file when used with the +P option.
The +df option takes precedence over the FLOW_DATA environment
variable.
In the following example, the FLOW_DATA environment variable is
used to override the flow.data file name. The profile data is
stored instead in /users/profiles/prog.data.
    % setenv FLOW_DATA /users/profiles/prog.data
    % cc -Aa -c +I +O3 sample.c
    % cc -o sample.exe +I +O3 sample.o
    % sample.exe < input.file1
    % cc -o sample.exe +P +O3 sample.o
In the next example, the +df option is used to override the
flow.data file name with the name /users/profiles/prog.data.
    % cc -Aa -c +I +O3 sample.c
    % cc -o sample.exe +I +O3 sample.o
    % sample.exe < input.file1
    % mv flow.data /users/profiles/prog.data
    % cc -o sample.exe +df /users/profiles/prog.data +P +O3 sample.o
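The order in which the profile data file name is chosen at optimization time (+df first, then FLOW_DATA, then the default flow.data) can be summarized in a short POSIX sh sketch. The helper profile_data_file is hypothetical; it only mirrors the rules stated above and does not call the compiler. Treating an empty FLOW_DATA as unset is an assumption.

```shell
# profile_data_file is a hypothetical helper that mirrors the lookup order
# described in this section for the +P (optimization) step.
# Usage: profile_data_file [name-passed-to-+df]
profile_data_file() {
    if [ -n "${1:-}" ]; then
        echo "$1"            # a name given with +df takes precedence
    elif [ -n "${FLOW_DATA:-}" ]; then
        echo "$FLOW_DATA"    # otherwise the FLOW_DATA environment variable
    else
        echo "flow.data"     # otherwise the default, in the current directory
    fi
}
```

Note that the examples in this section use csh syntax (setenv); in a POSIX shell the variable would instead be set with FLOW_DATA=/users/profiles/prog.data followed by export FLOW_DATA.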
Maintaining Instrumented and Optimized Program Files
You can maintain both instrumented and optimized versions
of a program. You might keep an instrumented version of the program
on hand for development use, and several optimized versions on hand
for performance testing and program distribution.
Take care when maintaining different versions of the executable
file, because the instrumented program file name is used as the
key identifier when storing execution profile data in the data
file.
The optimizer must know this key identifier in order to find the
execution profile data. By default, the key identifier used to
retrieve the profile data is the name of the instrumented program
file that was run for data collection.
When the optimized program file name is different from the
instrumented program file name, you must use the +pgm option to
specify the instrumented program file name. The optimizer uses
this value as the key identifier to retrieve execution profile
data.
In the following example, the instrumented program file name
is sample.inst. The optimized program file name is sample.opt.
The +pgm name option is used to pass the instrumented program
name to the optimizer:
    % cc -Aa -c +I +O3 sample.c
    % cc -o sample.inst +I +O3 sample.o
    % sample.inst < input.file1
    % cc -o sample.opt +P +O3 +pgm sample.inst sample.o
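The rule for when +pgm is required can be expressed as a small POSIX sh sketch. The helper pbo_relink is hypothetical; it prints the relink command instead of running it, and the +O3 level is taken from the example above rather than being required.

```shell
# pbo_relink is a hypothetical helper; it prints the relink command
# rather than running it.
# Usage: pbo_relink instrumented_name optimized_name object_file...
pbo_relink() {
    inst=$1
    opt=$2
    shift 2
    if [ "$inst" = "$opt" ]; then
        # Same name: the default key identifier already matches.
        echo "cc -o $opt +P +O3 $*"
    else
        # Names differ: pass the instrumented name as the key identifier.
        echo "cc -o $opt +P +O3 +pgm $inst $*"
    fi
}
```

For instance, pbo_relink sample.inst sample.opt sample.o prints the same relink command as the last line of the example above.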
Profile-Based Optimization Notes
When using profile-based optimization, please note the following:
- Because the linker performs code generation for profile-based
  optimization, linking object files compiled with +I and +P takes
  more time than linking ordinary object files. However, compile
  times will be relatively fast, because the compiler generates
  only the intermediate code.
- Profile-based optimization has a greater impact on application
  performance at each higher level of optimization.
- Profile-based optimization should be enabled during the final
  stages of application development. To obtain the best
  performance, re-profile and re-optimize your application after
  making source code changes.
- If you use level-4 or profile-based optimization and do not use
  +DA to generate code for a specific version of PA-RISC, note that
  code generation occurs at link time. Therefore, the system on
  which you link, rather than compile, determines the object code
  generated.
- If you use level-4 or profile-based optimization and do not use
  +DS to specify instruction scheduling, note that instruction
  scheduling occurs at link time. Therefore, the system on which
  you link, rather than compile, determines the implementation of
  instruction scheduling.
For more information on profile-based optimization, see the
HP-UX Linker and Libraries Online User Guide.