Cache-Line Decay: A Mechanism to Reduce Cache Leakage Power

A Large “Read” and “Write” Margins, Low Leakage Power, Six-Transistor 90-nm CMOS SRAM

IEICE Transactions on Electronics ◽

10.1587/transele.e94.c.530 ◽

2011 ◽

Vol E94-C (4) ◽

pp. 530-538

Author(s):

Tadayoshi ENOMOTO ◽

Nobuaki KOBAYASHI

Keyword(s):

Leakage Power ◽

Low Leakage


LEAKAGE POWER REDUCTION USING MULTI MODAL DRIVEN HIERARCHICAL POWER MODE SWITCHES

i-manager's Journal on Circuits and Systems ◽

10.26634/jcir.6.2.14291 ◽

2018 ◽

Vol 6 (2) ◽

pp. 1

Author(s):

SEKHAR REDDY M. CHANDRA ◽

REDDY P. RAMANA

Keyword(s):

Power Reduction ◽

Leakage Power ◽

Power Mode


THE PROBLEM OF PROVIDING CACHE COHERENCE IN MULTIPROCESSOR SYSTEMS WITH MANY PROCESSORS

Issues of radio electronics ◽

10.21778/2218-5453-2018-5-47-53 ◽

2018 ◽

pp. 47-53

Author(s):

B. Z. Shmeylin ◽

E. A. Alekseeva

Keyword(s):

Cache Coherence ◽

Bloom Filters ◽

Multiprocessor Systems ◽

Cache Line ◽

Maintenance Systems ◽

Processor Caches ◽

Conventional Systems ◽

Additional Hardware

This paper addresses directory management for coherence maintenance in multiprocessor systems with a large number of processors (MSLP). In such systems, maintaining the coherence of processor caches becomes significantly harder because of increased traffic on the memory buses and more complex interprocessor communication. Several solutions to this problem exist. This paper proposes using Bloom filters, which accelerate testing whether an element belongs to a set. Here the filters establish whether a processor belongs to a given subset of processors and whether that processor has a cache line in the set. The paper discusses in detail how data shared between processors is written and read, and how data is replaced in private caches. It also shows how cache-line addresses and processor numbers are removed from the Bloom filters. The proposed system significantly speeds up cache-coherence operations in the MSLP compared with conventional systems. In performance and in additional hardware and software cost, it is not inferior to the most efficient comparable systems, and on some applications it significantly exceeds them.
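The membership test and removal described in this abstract can be illustrated with a minimal sketch. It assumes a counting Bloom filter (counters instead of bits), which is one common way to support removal; the abstract does not say which variant the authors use. The class name, the sizes, and the (processor, address) key format are hypothetical.

```python
import hashlib


class CountingBloomFilter:
    """Counting Bloom filter: counters instead of bits, so items can be removed.

    Assumption: this variant is chosen for illustration only.
    """

    def __init__(self, num_counters=1024, num_hashes=3):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, item):
        # Derive num_hashes indexes by salting a SHA-256 digest with the hash number.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, item):
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def remove(self, item):
        # Only call this for items that were previously added.
        for idx in self._indexes(item):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))


# Hypothetical usage: record that processor 3 caches the line at address 0x7F40.
sharers = CountingBloomFilter()
sharers.add((3, 0x7F40))
```

A negative answer from `might_contain` is exact, so a directory built this way could skip coherence messages to processors that are certainly not sharers; a positive answer may be a false positive and still needs confirmation.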


The epsilon-approximation to discrete VT assignment for leakage power minimization

Proceedings of the 2009 International Conference on Computer-Aided Design - ICCAD '09 ◽

10.1145/1687399.1687453 ◽

2009 ◽

Cited By ~ 2

Author(s):

Yujia Feng ◽

Shiyan Hu

Keyword(s):

Leakage Power ◽

Power Minimization


A Novel Cross-Latch Shift Register Scheme for Low Power Applications

Applied Sciences ◽

10.3390/app11010129 ◽

2020 ◽

Vol 11 (1) ◽

pp. 129

Author(s):

Po-Yu Kuo ◽

Ming-Hwa Sheu ◽

Chang-Ming Tsai ◽

Ming-Yan Tsai ◽

Jin-Fa Lin

Keyword(s):

Power Consumption ◽

Supply Voltage ◽

Electrical Power ◽

Average Power ◽

Leakage Power ◽

Shift Register ◽

CMOS Process ◽

Average Power Consumption ◽

Simulation Results ◽

Layout Area

The conventional shift register consists of master and slave (MS) latches, with each latch receiving data from the previous stage. As a result, the same data are stored in two separate latches, which consumes more electrical power and occupies more layout area than most circuit designers find acceptable. To solve this issue, a novel cross-latch shift register (CLSR) scheme is proposed. It reduces the number of transistors needed for a 256-bit shift register by 48.33% compared with the conventional MS latch design. To verify its function, the CLSR was implemented in a TSMC 40 nm standard CMOS process. The simulation results show that, compared with the MS latch design, the proposed CLSR reduces average power consumption by 36%, leakage power by 60.53%, and layout area by 34.76% at a supply voltage of 0.9 V and an operating frequency of 250 MHz.
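The duplication in the conventional design can be seen in a small behavioral sketch. It models only the master-slave baseline, not the proposed CLSR, whose circuit is not described in this abstract; the class and function names are hypothetical.

```python
class MasterSlaveStage:
    """One bit of a conventional shift register: two latches per bit."""

    def __init__(self):
        self.master = 0
        self.slave = 0


def shift(stages, data_in):
    """Advance the register by one clock cycle."""
    # First clock phase: each master samples the previous stage's slave output.
    prev = data_in
    for stage in stages:
        stage.master, prev = prev, stage.slave
    # Second clock phase: each slave copies its own master.
    for stage in stages:
        stage.slave = stage.master


register = [MasterSlaveStage() for _ in range(4)]
for bit in [1, 0, 1, 1]:
    shift(register, bit)

# After every full cycle, each stage holds the same bit twice.
contents = [stage.slave for stage in register]
duplicated = all(stage.master == stage.slave for stage in register)
```

In this model an N-bit register needs 2N latches, which is the redundancy the abstract identifies as the source of extra power and area.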


Stacked keeper with body bias: A new approach to reduce leakage power for low power VLSI design

2014 IEEE International Conference on Advanced Communications, Control and Computing Technologies ◽

10.1109/icaccct.2014.7019482 ◽

2014 ◽

Cited By ~ 3

Author(s):

K. N. Bhargav ◽

A. Suresh ◽

Gaurav Saini

Keyword(s):

Low Power ◽

VLSI Design ◽

Leakage Power ◽

New Approach ◽

Low Power VLSI Design ◽

Body Bias


Error-Tolerant Reconfigurable VDD 10T SRAM Architecture for IoT Applications

Electronics ◽

10.3390/electronics10141718 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1718

Author(s):

Neha Gupta ◽

Ambika Prasad Shah ◽

Sajid Khan ◽

Santosh Kumar Vishvakarma ◽

Michael Waltl ◽

...

Keyword(s):

Power Dissipation ◽

Supply Voltage ◽

Leakage Power ◽

Functional Improvement ◽

Scaling Technique ◽

IoT Applications ◽

SRAM Cell ◽

Improved Stability ◽

Supply Voltage Scaling ◽

Power Delay Product

This paper proposes an error-tolerant reconfigurable VDD (R-VDD) scaled SRAM architecture, which significantly reduces read and hold power using the supply voltage scaling technique. The R-VDD scaled architecture uses the data-dependent low-power 10T (D2LP10T) SRAM cell, which offers improved stability and lower power consumption. The architecture is designed to avoid unnecessary read and hold power through VDD scaling. In this work, the cells are implemented and analyzed at a technologically relevant 65 nm CMOS node. An analysis of the failure probability in read, write, and hold modes shows that the proposed D2LP10T cell has the lowest failure rate among the existing cells compared. Furthermore, the D2LP10T cell offers 1.66×, 4.0×, and 1.15× higher write, read, and hold stability, respectively, than the 6T cell. Moreover, leakage power, write power-delay product (PDP), and read PDP are reduced by 89.96%, 80.52%, and 59.80%, respectively, compared with the 6T SRAM cell at a 0.4 V supply voltage. The functional improvement becomes even more apparent when the quality factor (QF) is evaluated: it is 458× higher for the proposed design than for the 6T SRAM cell at a 0.4 V supply voltage. The R-VDD scaled architecture also reduces power dissipation by 46.07% and 74.55% for read and hold operations, respectively, compared with the conventional array at a 0.4 V supply voltage.
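The abstract mixes "N× higher" ratios with "reduced by p%" figures. A short sketch, using only the percentages quoted above, converts the reductions into ratios so the two kinds of number can be compared; the function name is hypothetical.

```python
def reduction_factor(percent_reduction):
    """Convert 'reduced by p percent' into 'the baseline is N times larger'."""
    remaining = 1.0 - percent_reduction / 100.0
    return 1.0 / remaining


# Percentages quoted in the abstract, relative to the 6T cell at 0.4 V.
quoted = [("leakage power", 89.96), ("write PDP", 80.52), ("read PDP", 59.80)]
for name, pct in quoted:
    print(f"{name}: {pct}% lower, so the 6T value is about "
          f"{reduction_factor(pct):.2f}x the D2LP10T value")
```

For example, an 89.96% reduction leaves 10.04% of the original value, so the 6T leakage power is roughly ten times that of the proposed cell.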


Optimization of Static Power, Leakage Power and Delay of Full Adder Circuit Using Dual Threshold MOSFET Based Design and T-Spice Simulation

2009 International Conference on Advances in Recent Technologies in Communication and Computing ◽

10.1109/artcom.2009.28 ◽

2009 ◽

Cited By ~ 2

Author(s):

Anindya Ghosh ◽

Debapriyo Ghosh

Keyword(s):

Full Adder ◽

Leakage Power ◽

Spice Simulation ◽

Static Power ◽

Power Leakage


AND-OR-XOR NETWORK SYNTHESIS WITH AREA-POWER TRADE-OFF

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126611007736 ◽

2011 ◽

Vol 20 (06) ◽

pp. 1019-1035 ◽

Cited By ~ 1

Author(s):

SAMBHU NATH PRADHAN ◽

M. TILAK KUMAR ◽

SANTANU CHATTOPADHYAY

Keyword(s):

Boolean Function ◽

Solution Space ◽

Leakage Power ◽

Total Power ◽

Synthesis Process ◽

Network Synthesis ◽

Trade Off ◽

Delay Performance ◽

Trade Offs ◽

Power Trade

This paper presents a genetic-algorithm-based heuristic that realizes a multi-output Boolean function as a three-level AND-OR-XOR network while trading off area against power. All previous work on AND-OR-XOR network synthesis minimized only the number of product terms in the two sum-of-products expressions that represent a Boolean function. To the best of our knowledge, this is the first effort to incorporate total power, that is, dynamic and leakage power, along with area (measured as the number of product terms) into three-level AND-OR-XOR network synthesis. Without changing the delay performance, the synthesis process yields fewer product terms than those reported in the literature. It also enumerates the trade-offs present in the solution space for different weights associated with the area, dynamic power, and leakage power of the resulting circuit.
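The effect of the weights on the chosen solution can be sketched as follows. The abstract does not give the cost function, so this assumes a plain weighted sum of normalized metrics; the two candidate networks and their values are invented purely for illustration and are not results from the paper.

```python
def weighted_cost(area, dynamic_power, leakage_power, w_area, w_dyn, w_leak):
    # Assumed cost: weighted sum of metrics normalized to a reference solution.
    return w_area * area + w_dyn * dynamic_power + w_leak * leakage_power


# Hypothetical candidates: (area, dynamic power, leakage power), normalized.
candidates = {
    "A": (1.00, 0.80, 0.90),  # larger but lower power
    "B": (0.85, 1.00, 1.05),  # smaller but higher power
}


def best(weights):
    """Return the candidate name with the lowest weighted cost."""
    return min(candidates, key=lambda name: weighted_cost(*candidates[name], *weights))


area_only = best((1.0, 0.0, 0.0))   # weights favor area alone
power_only = best((0.0, 0.5, 0.5))  # weights favor total power alone
```

Under these assumptions, an area-only weighting selects candidate B and a power-only weighting selects candidate A, which is the kind of trade-off the abstract says is enumerated across the solution space.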


Scalability of Phase Change Materials in Nanostructure Template

International Journal of Photoenergy ◽

10.1155/2015/253296 ◽

2015 ◽

Vol 2015 ◽

pp. 1-4

Author(s):

Wei Zhang ◽

Biyun L. Jackson ◽

Ke Sun ◽

Jae Young Lee ◽

Shyh-Jer Huang ◽

...

Keyword(s):

Phase Change ◽

Phase Change Materials ◽

Scaling Limit ◽

Phase Change Memory ◽

Leakage Power ◽

Memory Element ◽

Transmission Electron ◽

Diffraction Mode ◽

Memory Applications ◽

Change Memory

The scalability of In₂Se₃, one of the phase change materials, is investigated. By depositing the material onto a nanopatterned substrate, individual In₂Se₃ nanoclusters are confined in nanosize pits of well-defined shape and dimension, permitting a systematic study of the ultimate scaling limit of the material as a phase change memory element. In₂Se₃ of progressively smaller volume is heated inside a transmission electron microscope operating in diffraction mode. The volume at which the amorphous-crystalline transition can no longer be observed is taken as the ultimate scaling limit, which is approximately 5 nm³ for In₂Se₃. The physics behind the existence of a scaling limit is discussed. Using phase change memory elements in the memory hierarchy is believed to reduce its energy consumption because the memory cells consume zero leakage power. Phase change memory applications are therefore of great importance for energy saving.
