The Prefetch Cache module is a performance-enhancing module included in some processors of the PIC32MX family.
It consists of a Prefetch Buffer, combined with a small Program Flash Memory (PFM) Cache memory.
When running at high clock rates, Wait states must be inserted into PFM read transactions to meet the access time of the PFM. Wait states can be hidden from the core by prefetching and storing instructions in a temporary holding area that the CPU can access quickly. Although the data path to the CPU is 32 bits wide, the data path to the PFM is 128 bits wide. This wide data path provides the same bandwidth to the CPU as a 32-bit path running at four times the frequency.
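The bandwidth claim above is plain arithmetic, and a minimal sketch can make it concrete. The helper name `peak_bandwidth_bps` is hypothetical, and the model is idealized: it assumes one transfer per clock and ignores Wait states and bus overhead.

```c
#include <stdint.h>

/* Idealized peak bandwidth, in bits per second, of a data path that
 * moves width_bits on every clock cycle at freq_hz.
 * Assumption: no Wait states and no bus overhead are modeled. */
uint64_t peak_bandwidth_bps(uint32_t width_bits, uint64_t freq_hz)
{
    return (uint64_t)width_bits * freq_hz;
}
```

Under these assumptions, a 128-bit path clocked at 20 MHz and a 32-bit path clocked at 80 MHz both come out at 2,560,000,000 bits per second.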
There are two main functions that the Prefetch Cache module performs:
- Caching instructions when they are accessed, and
- Prefetching instructions from the PFM before they are needed.
The cache holds a subset of the cacheable memory in temporary holding spaces known as cache lines. Each cache line has a tag describing what it is currently holding, and the address where it is mapped. Normally, the cache lines just hold a copy of what is currently in memory to make data available to the CPU without Wait states.
CPU-requested data may or may not be in the cache.
A cache-miss occurs if the CPU requests cacheable data that is not in the cache. In this case, a read is performed to the PFM at the correct address, and the data is supplied to both the cache and the CPU.
A cache-hit occurs if the cache contains the data that the CPU requests. In the case of a cache-hit, data is supplied to the CPU without Wait states.
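The hit and miss behavior can be illustrated with a small host-side model. This is only a sketch, not the hardware's actual logic: the type and function names are hypothetical, the 16-byte line size is taken from the prefetch description in this section, and the line count and round-robin replacement policy are assumptions chosen for simplicity.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_LINE_BYTES 16u  /* 16-byte cache line, as described in the text */
#define MODEL_NUM_LINES  16u  /* assumption: line count chosen for illustration */

typedef struct {
    bool     valid;           /* line currently holds a copy of memory */
    uint32_t tag;             /* line-aligned address the line is mapped to */
} cache_line_t;

typedef struct {
    cache_line_t line[MODEL_NUM_LINES];
    unsigned     next_victim; /* assumption: round-robin replacement */
    unsigned     hits;
    unsigned     misses;
} cache_model_t;

/* Mark every line invalid and clear the counters. */
void cache_init(cache_model_t *c)
{
    memset(c, 0, sizeof *c);
}

/* Look up addr in the model. Returns true on a cache-hit. On a
 * cache-miss, the line is filled (standing in for the PFM read) and
 * false is returned. */
bool cache_access(cache_model_t *c, uint32_t addr)
{
    uint32_t tag = addr & ~(MODEL_LINE_BYTES - 1u);

    for (unsigned i = 0; i < MODEL_NUM_LINES; i++) {
        if (c->line[i].valid && c->line[i].tag == tag) {
            c->hits++;
            return true;
        }
    }

    c->line[c->next_victim].valid = true;
    c->line[c->next_victim].tag = tag;
    c->next_victim = (c->next_victim + 1u) % MODEL_NUM_LINES;
    c->misses++;
    return false;
}
```

In this model, the first access to an address misses and fills a line. Later accesses to any address within the same 16-byte line then hit, while an access to the following line misses again.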
The second main function of the Prefetch Cache module is to prefetch instructions. The module calculates the address of the next cache line and performs a read of the PFM to get the next 16-byte cache line. This line is placed into a 16-byte-wide prefetch cache buffer in anticipation of executing straight-line code.
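The next-line address calculation can be sketched as follows. The function names are hypothetical, and the sketch only shows the address arithmetic implied by a 16-byte line; it does not model how or when the hardware issues the prefetch.

```c
#include <stdint.h>

#define PREFETCH_LINE_BYTES 16u  /* 16-byte cache line, as described in the text */

/* Base address of the 16-byte line that contains addr. */
uint32_t line_base(uint32_t addr)
{
    return addr & ~(PREFETCH_LINE_BYTES - 1u);
}

/* Address of the line that follows the one containing addr, i.e. the
 * line a next-line prefetcher would fetch for straight-line code. */
uint32_t next_line_addr(uint32_t addr)
{
    return line_base(addr) + PREFETCH_LINE_BYTES;
}
```

For example, an instruction at 0x9D000004 lies in the line starting at 0x9D000000, so the line to prefetch starts at 0x9D000010.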
The PIC32MX Prefetch Cache Reference Manual covers the PIC32MX prefetch cache in detail.
Which feature helps in which situations?
The prefetch mechanism helps to increase pure MIPS performance when:
- The code is very linear (i.e., it has few jumps or loops)
- The code is being executed from KSEG1 (which can’t be cached)
The cache helps in all other cases by:
- Keeping a copy of program instructions of short, tight loops in fast cache RAM
- Allowing frequently used code to be locked into the cache for as long as desired. This can help to ensure there is a fixed latency when executing, for example, an interrupt routine.
- Caching data, such as coefficients, data tables or text strings, in the fast cache RAM.
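Whether code runs cached or uncached depends on which MIPS32 kernel segment its virtual address falls in, which is why the KSEG1 case above relies on the prefetch alone. A minimal sketch of the address test follows. The helper names are hypothetical, and the sample addresses used below (0x9D000000 and 0xBD000000) are only examples of the same flash location seen through KSEG0 and KSEG1.

```c
#include <stdbool.h>
#include <stdint.h>

/* MIPS32 kernel segments, selected by the top three address bits:
 *   KSEG0 = 0x80000000..0x9FFFFFFF (cacheable)
 *   KSEG1 = 0xA0000000..0xBFFFFFFF (uncached)
 * Both segments map onto the same physical memory. */
bool is_kseg0(uint32_t va)
{
    return (va & 0xE0000000u) == 0x80000000u;
}

bool is_kseg1(uint32_t va)
{
    return (va & 0xE0000000u) == 0xA0000000u;
}

/* Convert a KSEG0 address to the KSEG1 alias of the same location. */
uint32_t kseg0_to_kseg1(uint32_t va)
{
    return va | 0x20000000u;
}
```

With these helpers, 0x9D000000 is reported as KSEG0 and its KSEG1 alias is 0xBD000000.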
Data Memory SRAM accesses can’t be cached, regardless of whether data or program instructions are being accessed. However, this is not a significant issue, since the SRAM runs at core speed and therefore does not require any wait states.
Some PIC32MX devices implement wait state settings for Data Memory SRAM access. Please consult your device datasheet, and ensure you configure the setting to optimize Data Memory SRAM performance in your application.
Basic Configuration
The primary SFR used to control PFM wait states and configure the module is the CHECON (Cache Control) register.
The primary SFR used to control Data SRAM wait states is the BMXCON (Bus Matrix Control) register.
On reset, both the cache and prefetch functions are disabled, and the Data SRAM and PFM wait states are set to their maximum values.
Applications are therefore strongly advised to always enable these modules.
From a performance perspective, there is no disadvantage to turning on these modules. From a power perspective, if you are running at 30 MHz or lower (where you will be using 0 wait states), leave the prefetch turned off, as it provides no advantage and consumes power. The cache, on the other hand, helps to reduce power, since accessing instructions in the cache requires less power than accessing them from flash.
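Choosing the PFM wait state count from the system clock can be sketched as a small lookup. The function name `pfm_wait_states` is hypothetical. The 0-wait-state limit of 30 MHz and the value of 2 at 80 MHz come from this section. The 60 MHz boundary for 1 wait state is an assumption and should be checked against the wait state table in your device datasheet.

```c
#include <stdint.h>

/* Number of PFM wait states for a given system clock, in Hz.
 * From the text: 0 wait states at 30 MHz or lower, 2 at 80 MHz.
 * Assumption: the 1-wait-state range ends at 60 MHz; verify this
 * boundary in the device datasheet before relying on it. */
unsigned pfm_wait_states(uint32_t sysclk_hz)
{
    if (sysclk_hz <= 30000000u) {
        return 0u;
    }
    if (sysclk_hz <= 60000000u) {  /* assumed boundary */
        return 1u;
    }
    return 2u;
}
```

The returned value is what an application would program into the PFM wait state setting of CHECON. Following the power advice above, a result of 0 also suggests leaving the prefetch turned off.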
The following example initialization code enables the prefetch cache for KSEG0 instructions, sets the Data SRAM wait states to 0, and sets the PFM wait states to 2 as per Table 31-12 (reproduced below) for a PIC32MX795F512L running at 80 MHz: