The memory hierarchy of high-performance and embedded processors has been shown to be one of the major energy consumers. Extrapolating current trends, this fraction is likely to increase in the near future. In this paper, a technique is proposed
which uses an additional mini cache, called the L0-cache, located between the
I-cache and the CPU core. This mechanism can provide the instruction stream to
the data path and, when managed properly, can significantly reduce the utilization of the more expensive I-cache.
Five techniques are proposed and evaluated that dynamically analyze the program's instruction access behavior and proactively guide the L0-cache.
The basic idea is that only the most frequently executed portion of the code should
be stored in the L0-cache, since this is where the program spends most of its
time. Experimental results indicate that more than 60% of the energy dissipated in the I-cache subsystem can be saved.
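The intuition behind the L0-cache can be illustrated with a minimal simulation sketch. The cache geometry and per-access energy figures below are hypothetical placeholders, not the paper's measured values; the point is only to show why a tiny cache in front of the I-cache saves energy when a hot loop dominates the instruction stream.

```python
# Illustrative sketch (not the paper's model): a small direct-mapped L0-cache
# placed between the CPU core and the I-cache. All parameters are assumptions.

L0_LINES = 8          # number of L0 lines (assumed)
LINE_SIZE = 16        # bytes per line (assumed)
E_L0 = 1.0            # energy per L0 access, arbitrary units (assumed)
E_ICACHE = 10.0       # energy per I-cache access, arbitrary units (assumed)

def simulate(trace):
    """Return (hits, accesses, energy) for an instruction-address trace."""
    l0 = [None] * L0_LINES
    hits = 0
    energy = 0.0
    for addr in trace:
        line = addr // LINE_SIZE
        idx = line % L0_LINES
        energy += E_L0              # the L0 is probed on every fetch
        if l0[idx] == line:
            hits += 1               # served from L0; the I-cache stays idle
        else:
            energy += E_ICACHE      # miss: fall through to the I-cache
            l0[idx] = line          # fill the L0 line
    return hits, len(trace), energy

# A tight loop dominates real instruction streams, so most fetches hit in L0:
# 16 instructions (4 bytes each) executed 100 times.
trace = list(range(0, 64, 4)) * 100
hits, total, energy = simulate(trace)
baseline = total * E_ICACHE         # every fetch goes straight to the I-cache
print(f"L0 hit rate: {hits/total:.1%}, energy saved vs. baseline: "
      f"{1 - energy/baseline:.1%}")
```

After the first loop iteration fills the four L0 lines, every subsequent fetch hits in the cheap L0, so the expensive I-cache is accessed only a handful of times; this is the "store only the most frequently executed code" effect in miniature.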