report — generate a report from a sample file
report [-h, --help] [--version] [--verbose] -i, --in-file FILENAME [-t, --title TITLE] [-o, --report NAME]
[-s, --source-dir DIRECTORY...] [-D, --debug-dir DIRECTORY...] [-b, --binary BINARY...] [--debug-symbol-level LEVEL] [--c++filt[=FILTER]]
[-p, --percent PERCENT] [-d, --depth FRAMES] [--new-hwpf-algorithm] [--no-barriers]
[--cpu CPU] [--number-of-cpus NUM] [--level LEVEL]
[-c, --cache-size SIZE] [-l, --line-size SIZE] [-r, --replacement POLICY] [-n, --number-of-caches NUM] [--group THREAD[,THREAD...]] [--real-thread-id]
-h, --help
Print help message.
--version
Print version information.
--verbose
Show which filenames are tried when searching for debug information and source code.
-i, --in-file FILENAME
Specifies the input file.
-t, --title TITLE
Title of report.
-o, --report NAME
Name of the generated report file. Defaults to report.tsr.
-s, --source-dir DIRECTORY
Additional directories to look for source code in.
-D, --debug-dir DIRECTORY
Additional directories to search for external debug information. By default, the report tool looks in the system global debug directory (/usr/lib/debug), the .debug directory in the same directory as the binary, and the directory containing the binary.
-b, --binary BINARY
Specifies an additional binary containing debug information, to be used if the sampled binary with the same file name cannot be found.
--debug-symbol-level LEVEL
(Experimental) Balances debug symbol detail against processing speed. The available levels are:
no debug symbols
line number
line number and public symbols
full debug info (default)
--c++filt[=FILTER]
Specifies an external symbol demangler program used to translate symbols for presentation. Useful for C++ code compiled with the 'stabs' debugging format. If --c++filt is given without a program, the external program c++filt is used. If --c++filt=FILTER is specified, the program FILTER is used. By default, symbols are not translated.
-p, --percent PERCENT
Percent of total fetches, upgrades or write-backs required for advice to be reported. The default is 1.
-d, --depth FRAMES
Stack depth used to separate issues caused by different calls to the same function. The default is 1. Use 0 to merge all call paths into a function for analysis.
--new-hwpf-algorithm
Use a new, experimental version of the hardware prefetch analysis algorithm that captures complex patterns more thoroughly. For some input sets it consumes more memory and takes longer than the default algorithm.
--no-barriers
Show fusion and blocking advice which would otherwise be suppressed due to detected possible data dependencies.
--cpu CPU
Selects the processor model to use in the analysis. CPU is specified as vendor-id/cpu-id. The default is 'auto'.
The following special processor models are defined:
help
Lists available processor models.
auto
Auto-detects the processor model of the computer the report is being generated on.
--number-of-cpus NUM
Number of physical processors to include in the analysis. Each physical processor may have multiple logical processors (cores/threads). The special value '0' may be used to indicate that auto-detection should be used, which is also the default.
--level LEVEL
Selects the cache level to analyze. The number of available cache levels depends on the selected processor model. The default is to analyze the highest cache level.
-c, --cache-size SIZE
Overrides the cache size specified in the processor model.
-l, --line-size SIZE
Overrides the cache line size specified in the processor model. Must be a power of two.
-r, --replacement POLICY
Cache replacement policy. Must be 'random' or 'lru'. The default is 'random'.
-n, --number-of-caches NUM
Total number of caches to assign threads to. Should match the number of caches of the desired cache level for the intended processor/architecture. Default: Determined by the processor model, cache level and the number of physical processors.
The special value '0' may be used to assign one private cache to each thread in the application.
--group THREAD[,THREAD...]
Manually specify thread to cache mappings. Each instance of the --group parameter creates a new cache with the specified threads. Threads that are left unmapped will be automatically assigned to the created caches.
This option overrides the number of caches specified in the processor model and by the --number-of-caches option.
--real-thread-id
Use real thread IDs instead of virtual ones when specifying cache groups with the --group parameter. Virtual thread IDs are assigned sequentially, starting from 0, in the order they are seen in the sample file. Real thread IDs are the IDs exposed by the operating system during sampling.
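The virtual numbering described for --real-thread-id can be illustrated with a short Python sketch. It is not part of the report tool: the function names and the thread IDs are hypothetical, and it only assumes that a virtual ID is handed out the first time a real thread ID appears.

```python
def assign_virtual_ids(real_ids_in_sample_order):
    """Map each real thread ID to a virtual ID, numbered from 0
    in the order the real IDs are first seen."""
    virtual = {}
    for real_id in real_ids_in_sample_order:
        if real_id not in virtual:
            # The next free virtual ID equals the number of threads seen so far.
            virtual[real_id] = len(virtual)
    return virtual


def group_argument(real_ids, mapping):
    """Build a --group argument that uses virtual IDs for the given real IDs."""
    return "--group " + ",".join(str(mapping[r]) for r in real_ids)


# Hypothetical real thread IDs, in the order they occur in a sample file.
seen = [4711, 4713, 4711, 4712, 4713]
mapping = assign_virtual_ids(seen)
print(mapping)
print(group_argument([4711, 4712], mapping))
```

With --real-thread-id the translation step is unnecessary, because the IDs reported by the operating system are passed to --group directly.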
Example 8. Analyzing sample files using autodetected CPU models
Perform an analysis of sample.smp for the currently running processor on cache level 2:
$ report --level 2 -i sample.smp
The report tool will create a report file named report.tsr by default.
Example 9. Specifying a CPU model
If you are analyzing for a different processor than the one you are running on, specify the --cpu option.
First, use --cpu help to get a list of available CPUs.
$ report --cpu help
Find the processor you want to perform the analysis for, for example the Intel Quad-Core Xeon E5345, which has the model name 'clovertown_4_8'.
Use the manufacturer name together with the model name like this when calling report:
$ report --cpu intel/clovertown_4_8 -i sample.smp
Example 10. Using custom thread to cache mappings
Some effects of communication between threads change depending on how threads are mapped to different caches. It is possible to explicitly specify thread to cache mappings to evaluate such effects.
Assume that the sampled application contains 4 threads and we are interested in what happens in the L2 cache of a system with two coherent caches at this level. Assuming we are only interested in the cases where two threads are mapped to each cache, the following commands will create reports for all unique such cases:
$ report --group 0,1 --group 2,3 --level 2 -o case1 -i sample.smp
$ report --group 0,2 --group 1,3 --level 2 -o case2 -i sample.smp
$ report --group 0,3 --group 1,2 --level 2 -o case3 -i sample.smp
There are other possible permutations as well, but they will be identical to one of the above mappings due to symmetry.
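For larger thread counts, the unique mappings can be enumerated rather than written by hand. The following Python sketch is an illustration only: the helper names are hypothetical, and it assumes two caches with an equal number of threads each. It only prints command lines and uses the same options as the commands in this example.

```python
from itertools import combinations


def balanced_two_cache_mappings(threads):
    """Yield each unique split of the threads into two equally sized groups."""
    threads = sorted(threads)
    if not threads or len(threads) % 2:
        raise ValueError("need a non-empty, even number of threads")
    first, rest = threads[0], threads[1:]
    half = len(threads) // 2
    # Pinning the first thread to the first cache removes mirror-image
    # duplicates such as (2,3 | 0,1), which is the same mapping as (0,1 | 2,3).
    for partners in combinations(rest, half - 1):
        group_a = (first, *partners)
        group_b = tuple(t for t in threads if t not in group_a)
        yield group_a, group_b


def report_command(case_number, groups, sample="sample.smp", level=2):
    """Format one report invocation for the given thread groups."""
    group_args = " ".join("--group " + ",".join(map(str, g)) for g in groups)
    return f"report {group_args} --level {level} -o case{case_number} -i {sample}"


commands = [
    report_command(n, groups)
    for n, groups in enumerate(balanced_two_cache_mappings(range(4)), start=1)
]
for command in commands:
    print(command)
```

For four threads this yields the three mappings listed above. The count grows quickly with the number of threads, so it may be worth limiting the enumeration to the mappings that the target system can actually produce.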