Author retrospective: Improving data cache performance by pre-executing instructions under a cache miss

Author(s):  
Trevor Mudge
Author(s):  
B. Shameedha Begum ◽  
N. Ramasubramanian

Embedded systems are designed for applications ranging from hard real-time control to mobile computing, each of which demands a different cache design for best performance. Because real-time applications place stringent requirements on performance, the cache subsystem plays a significant role. Reconfigurable caches meet performance requirements in this context, but existing designs tend to reconfigure only associativity and size. This article proposes a novel reconfigurable, intelligent L1 data cache based on the choice of replacement algorithm. An intelligent embedded data cache and a dynamically reconfigurable intelligent embedded data cache were implemented in Verilog-2001 and evaluated for cache performance. Measurements with two different replacement strategies show that, for sequential applications, the hit rate improves by 40% over LRU and 21% over MRU, which significantly improves the performance of embedded real-time applications.
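The reported percentages depend on the authors' workloads and hardware design, but the underlying effect — that replacement policy, not just size or associativity, drives hit rate for sequential access — can be sketched with a toy fully associative simulator (illustrative only; this is not the article's Verilog implementation). On a cyclic sequential sweep one block larger than the cache, LRU thrashes while MRU retains most of the working set:

```python
def simulate(accesses, capacity, policy):
    """Toy fully associative cache; returns the hit rate for LRU or MRU replacement."""
    cache = []  # ordered list: index 0 = least recently used, last = most recently used
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.remove(block)
            cache.append(block)          # refresh recency on a hit
        else:
            if len(cache) == capacity:
                if policy == "LRU":
                    cache.pop(0)         # evict least recently used block
                else:                    # "MRU"
                    cache.pop()          # evict most recently used block
            cache.append(block)
    return hits / len(accesses)

# Cyclic sequential sweep over 5 blocks with a 4-block cache:
trace = list(range(5)) * 100
lru_rate = simulate(trace, 4, "LRU")    # LRU thrashes: every access misses
mru_rate = simulate(trace, 4, "MRU")    # MRU keeps most of the loop resident
```

This is the classic pathological case for LRU under sequential reuse, which is consistent with the article's observation that the best replacement strategy is application-dependent and worth reconfiguring at run time.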


2005 ◽  
Vol 33 (3) ◽  
pp. 41-48 ◽  
Author(s):  
Afrin Naz ◽  
Mehran Rezaei ◽  
Krishna Kavi ◽  
Philip Sweany
Keyword(s):  

2005 ◽  
Vol 14 (03) ◽  
pp. 605-617 ◽  
Author(s):  
SUNG WOO CHUNG ◽  
HYONG-SHIK KIM ◽  
CHU SHIK JHON

In scalable CC-NUMA multiprocessors, it is crucial to reduce the average memory access time. For applications whose second-level (L2) cache is large enough, we propose a split L2 cache that utilizes the surplus space. The split L2 cache is composed of a traditional LRU cache and an RVC (Remote Victim Cache), which stores only data from the remote memory address range. It thus reduces the average L2 miss time by keeping remote blocks that would otherwise be discarded. Although the split cache does not reduce the miss rate, it is observed to reduce total execution time by up to 27%. It even outperforms an LRU cache of double the size.
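The lookup and eviction flow of such a split cache can be sketched as follows. This is a minimal behavioral model under stated assumptions (class and method names are hypothetical, and the real design would operate on sets and tags in hardware): evicted blocks enter the RVC only if they map to remote memory, and an RVC hit promotes the block back into the main LRU cache instead of paying the remote-memory latency:

```python
from collections import OrderedDict

class SplitL2:
    """Sketch of a split L2: a main LRU cache plus a Remote Victim Cache (RVC)
    that holds only blocks evicted from remote memory address ranges."""

    def __init__(self, main_size, rvc_size, is_remote):
        self.main = OrderedDict()       # block -> present; insertion order tracks LRU
        self.rvc = OrderedDict()
        self.main_size = main_size
        self.rvc_size = rvc_size
        self.is_remote = is_remote      # predicate: does this block live in remote memory?

    def access(self, block):
        if block in self.main:          # ordinary L2 hit
            self.main.move_to_end(block)
            return "main-hit"
        if block in self.rvc:           # RVC hit: cheaper than re-fetching from remote memory
            self.rvc.pop(block)
            self._insert(block)
            return "rvc-hit"
        self._insert(block)             # miss: fetch from (local or remote) memory
        return "miss"

    def _insert(self, block):
        if len(self.main) == self.main_size:
            victim, _ = self.main.popitem(last=False)   # evict the LRU block
            if self.is_remote(victim):                  # only remote blocks enter the RVC
                if len(self.rvc) == self.rvc_size:
                    self.rvc.popitem(last=False)
                self.rvc[victim] = True
        self.main[block] = True
```

The design point the abstract describes follows from this asymmetry: local blocks are cheap to re-fetch, so spending the surplus L2 space on remote victims shortens the average miss penalty even though the overall miss count is unchanged.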

