Cache-Line Decay: A Mechanism to Reduce Cache Leakage Power

Author(s):  
Stefanos Kaxiras ◽  
Zhigang Hu ◽  
Girija Narlikar ◽  
Rae McLellan
Keyword(s):  
2018 ◽  
Vol 6 (2) ◽  
pp. 1
Author(s):  
SEKHAR REDDY M. CHANDRA ◽  
REDDY P. RAMANA ◽  
◽  

2018 ◽  
pp. 47-53
Author(s):  
B. Z. Shmeylin ◽  
E. A. Alekseeva

In this paper the tasks of managing the directory in coherence maintenance systems in multiprocessor systems with a large number of processors are solved. In microprocessor systems with a large number of processors (MSLP) the problem of maintaining the coherence of processor caches is significantly complicated. This is due to increased traffic on the memory buses and increased complexity of interprocessor communications. This problem is solved in various ways. In this paper, we propose the use of Bloom filters used to accelerate the determination of an element’s belonging to a certain array. In this article, such filters are used to establish the fact that the processor belongs to some subset of the processors and determine if the processor has a cache line in the set. In the paper, the processes of writing and reading information in the data shared between processors are discussed in detail, as well as the process of data replacement from private caches. The article also shows how the addresses of cache lines and processor numbers are removed from the Bloom filters. The system proposed in this paper allows significantly speeding up the implementation of operations to maintain cache coherence in the MSLP as compared to conventional systems. In terms of performance and additional hardware and software costs, the proposed system is not inferior to the most efficient of similar systems, but on some applications and significantly exceeds them.


2020 ◽  
Vol 11 (1) ◽  
pp. 129
Author(s):  
Po-Yu Kuo ◽  
Ming-Hwa Sheu ◽  
Chang-Ming Tsai ◽  
Ming-Yan Tsai ◽  
Jin-Fa Lin

The conventional shift register consists of master and slave (MS) latches with each latch receiving the data from the previous stage. Therefore, the same data are stored in two latches separately. It leads to consuming more electrical power and occupying more layout area, which is not satisfactory to most circuit designers. To solve this issue, a novel cross-latch shift register (CLSR) scheme is proposed. It significantly reduced the number of transistors needed for a 256-bit shifter register by 48.33% as compared with the conventional MS latch design. To further verify its functions, this CLSR was implemented by using TSMC 40 nm CMOS process standard technology. The simulation results reveal that the proposed CLSR reduced the average power consumption by 36%, cut the leakage power by 60.53%, and eliminated layout area by 34.76% at a supply voltage of 0.9 V with an operating frequency of 250 MHz, as compared with the MS latch.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1718
Author(s):  
Neha Gupta ◽  
Ambika Prasad Shah ◽  
Sajid Khan ◽  
Santosh Kumar Vishvakarma ◽  
Michael Waltl ◽  
...  

This paper proposes an error-tolerant reconfigurable VDD (R-VDD) scaled SRAM architecture, which significantly reduces the read and hold power using the supply voltage scaling technique. The data-dependent low-power 10T (D2LP10T) SRAM cell is used for the R-VDD scaled architecture with the improved stability and lower power consumption. The R-VDD scaled SRAM architecture is developed to avoid unessential read and hold power using VDD scaling. In this work, the cells are implemented and analyzed considering a technologically relevant 65 nm CMOS node. We analyze the failure probability during read, write, and hold mode, which shows that the proposed D2LP10T cell exhibits the lowest failure rate compared to other existing cells. Furthermore, the D2LP10T cell design offers 1.66×, 4.0×, and 1.15× higher write, read, and hold stability, respectively, as compared to the 6T cell. Moreover, leakage power, write power-delay-product (PDP), and read PDP has been reduced by 89.96%, 80.52%, and 59.80%, respectively, compared to the 6T SRAM cell at 0.4 V supply voltage. The functional improvement becomes even more apparent when the quality factor (QF) is evaluated, which is 458× higher for the proposed design than the 6T SRAM cell at 0.4 V supply voltage. A significant improvement of power dissipation, i.e., 46.07% and 74.55%, can also be observed for the R-VDD scaled architecture compared to the conventional array for the respective read and hold operation at 0.4 V supply voltage.


2011 ◽  
Vol 20 (06) ◽  
pp. 1019-1035 ◽  
Author(s):  
SAMBHU NATH PRADHAN ◽  
M. TILAK KUMAR ◽  
SANTANU CHATTOPDHYAY

In this paper, a heuristic based on genetic algorithm to realize multi-output Boolean function as three-level AND-OR-XOR network performing area power trade-off is presented. All the previous works dealt with the minimization of number of product terms only in the two sum-of-product-expressions representing a Boolean function during AND-OR-XOR network synthesis. To the best of knowledge this is the first ever effort to incorporate total power, that is, dynamic and leakage power along with the area (in terms of number of product terms) during three-level AND-OR-XOR networks synthesis. The synthesis process, without changing the delay performance results in lesser number of product terms compared to those reported in the literature. It also enumerates the trade-offs present in the solution space for different weights associated with area, dynamic power, and leakage power of the resulting circuit.


2015 ◽  
Vol 2015 ◽  
pp. 1-4
Author(s):  
Wei Zhang ◽  
Biyun L. Jackson ◽  
Ke Sun ◽  
Jae Young Lee ◽  
Shyh-Jer Huang ◽  
...  

The scalability of In2Se3, one of the phase change materials, is investigated. By depositing the material onto a nanopatterned substrate, individual In2Se3nanoclusters are confined in the nanosize pits with well-defined shape and dimension permitting the systematic study of the ultimate scaling limit of its use as a phase change memory element. In2Se3of progressively smaller volume is heated inside a transmission electron microscope operating in diffraction mode. The volume at which the amorphous-crystalline transition can no longer be observed is taken as the ultimate scaling limit, which is approximately 5 nm3for In2Se3. The physics for the existence of scaling limit is discussed. Using phase change memory elements in memory hierarchy is believed to reduce its energy consumption because they consume zero leakage power in memory cells. Therefore, the phase change memory applications are of great importance in terms of energy saving.


Sign in / Sign up

Export Citation Format

Share Document