A non-temporal data issue is reported when Freja finds places where accessed cache lines are nearly always evicted from the cache before being reused. Until they are evicted, however, these cache lines still occupy cache space that could otherwise be put to better use. See Section 5.3, “Non-Temporal Data” for more information about non-temporal data.
Using non-temporal prefetches on the non-temporal data can prevent the data from being cached in this cache level. This does not hurt performance, since the data would have been evicted from the cache before being reused anyway, but it may improve performance by leaving more cache space for other data that can be successfully cached, and for data of other threads and processes that are sharing the cache. See Section 5.3.5.1, “Non-Temporal Prefetches” for more information.
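As an illustration of what inserting a non-temporal prefetch can look like in source code, here is a minimal sketch in C. It assumes a GCC- or Clang-compatible compiler, whose `__builtin_prefetch` built-in takes a locality argument where 0 means the data has no temporal locality. The function name `sum_streaming` and the constant `PREFETCH_DISTANCE` are hypothetical and not part of Freja; a suitable prefetch distance has to be found by measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch. */
#define PREFETCH_DISTANCE 64

/* Sum a large array that is read once and not reused before it would be
 * evicted, which is the access pattern this issue type reports. */
double sum_streaming(const double *data, size_t n)
{
    assert(data != NULL || n == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, rw = 0 (read), locality = 0 (no
             * temporal locality). For brevity this issues one prefetch
             * per element; one per cache line would be enough. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        }
#endif
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same result with or without it. Whether it changes cache behavior depends on the compiler and the processor.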
This issue type is normally only included when analyzing the highest cache level, that is, the cache level closest to memory, since non-temporal prefetches affect this cache level in most processors.
The most important statistics reported in the non-temporal data issue are for the non-temporal reuses, that is, data reuses when the cache line has been evicted from the first level cache.
The non-temporal data issue has the following sections:
Statistics for the reuses of the non-temporal data
This section shows the fetch ratio and number of fetches of the reuses of the non-temporal data. When using non-temporal prefetches, you want the fetch ratio of these reuses to be as high as possible, so that you do not cause additional fetches by preventing caching of the data.
The section also shows the percentage of all fetches of the application that are caused by these non-temporal reuses. The larger the percentage of all fetches, the greater the potential for performance improvement.
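The two percentages in this section can be reproduced with simple arithmetic. The sketch below assumes that the fetch ratio is the percentage of memory accesses that cause a cache line fetch. The function names and the idea of working from raw counts are hypothetical, since Freja reports the percentages directly.

```c
#include <assert.h>

/* Fetch ratio in percent: the share of accesses that cause a fetch
 * (assumed definition, see the text above). */
double fetch_ratio_percent(unsigned long fetches, unsigned long accesses)
{
    assert(accesses > 0 && fetches <= accesses);
    return 100.0 * (double)fetches / (double)accesses;
}

/* Share of all fetches of the application that are caused by the
 * non-temporal reuses, in percent. */
double share_of_all_fetches_percent(unsigned long reuse_fetches,
                                    unsigned long total_fetches)
{
    assert(total_fetches > 0 && reuse_fetches <= total_fetches);
    return 100.0 * (double)reuse_fetches / (double)total_fetches;
}
```

For example, reuses with 950 fetches out of 1000 accesses have a fetch ratio of 95%, so preventing caching of that data costs few additional fetches. If those reuses account for 300 of the application's 1200 fetches, they represent 25% of all fetches.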
Last instructions to touch the data before it is evicted
This section lists the instructions that were last to touch the cache lines before they were evicted, and the percentage of the time each instruction was last.
For each instruction you also get an estimate of the cache size that would be required for the fetch ratio of the reuses where the instruction was last to touch the data to fall below 80%.
For example, if you have done the analysis for a 2 MB cache, and the non-temporal data requires a 200 MB cache not to be evicted, you can safely insert non-temporal prefetches since no current processor has a cache that large. On the other hand, if the non-temporal data only requires 6 MB of cache to fit, inserting non-temporal prefetches may cause unnecessary fetches on processors that have 6 MB or larger caches.
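The reasoning in this example can be expressed as a small check. This is a sketch only: the function `nt_prefetch_is_safe` and its parameter `largest_target_cache_mb`, meaning the largest cache among the processors the application is expected to run on, are hypothetical and not something Freja provides.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if the data would be evicted before reuse even on the
 * largest cache the application is expected to run on, so that a
 * non-temporal prefetch cannot cause additional fetches there. */
bool nt_prefetch_is_safe(double required_cache_mb,
                         double largest_target_cache_mb)
{
    assert(required_cache_mb > 0.0 && largest_target_cache_mb > 0.0);
    /* Strictly greater: a cache of exactly the required size would
     * already be able to hold the data. */
    return required_cache_mb > largest_target_cache_mb;
}
```

With the numbers from the example, data that needs 200 MB is safe to prefetch non-temporally when the largest target cache is 8 MB, while data that needs only 6 MB is not safe on a processor with a 6 MB or 8 MB cache.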
First instructions to touch the data after it is evicted
This section lists the instructions that were first to touch the cache lines after they were evicted, and the percentage of the time each instruction was first.
Statistics for the instruction group containing the last instructions to touch the data before it is evicted.
Instructions in instruction group
Instructions in the instruction group containing the last instructions to touch the data before it is evicted.
Statistics for the loop containing the last instructions to touch the data before it is evicted.
Instructions in the loop containing the last instructions to touch the data before it is evicted.