Precise Cache Profiling for Studying Radiation Effects

2021 ◽  
Vol 20 (3) ◽  
pp. 1-25
Author(s):  
James Marshall ◽  
Robert Gifford ◽  
Gedare Bloom ◽  
Gabriel Parmer ◽  
Rahul Simha

Increased access to space has led to an increase in the usage of commodity processors in radiation environments. These processors are vulnerable to transient faults such as single event upsets that may cause bit-flips in processor components. Caches in particular are vulnerable due to their relatively large area, yet are often omitted from fault injection testing because many processors do not provide direct access to cache contents and they are often not fully modeled by simulators. The performance benefits of caches make disabling them undesirable, and the presence of error correcting codes is insufficient to correct for increasingly common multiple bit upsets. This work explores building a program’s cache profile by collecting cache usage information at an instruction granularity via commonly available on-chip debugging interfaces. The profile provides a tighter bound than cache utilization for cache vulnerability estimates (50% for several benchmarks). This can be applied to reduce the number of fault injections required to characterize behavior by at least two-thirds for the benchmarks we examine. The profile enables future work in hardware fault injection for caches that avoids the biases of existing techniques.
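The gap between cache utilization and actual vulnerability can be illustrated with a small lifetime calculation. The sketch below is not the authors' profiling method. It assumes a hypothetical, time-ordered trace of `(time, line, op)` events, assumes that a write replaces the whole line, and assumes that evictions are clean. Under those assumptions a bit flip only matters if it lands in a window that ends in a read, so the vulnerable fraction can be far smaller than the fraction of time a line holds valid data.

```python
def cache_bounds(trace, num_lines, total_cycles):
    """Return (utilization, vulnerable_fraction) in line-cycles.

    trace: time-ordered list of (time, line, op), op in
    {"fill", "read", "write", "evict"}. Hypothetical format, for
    illustration only.
    """
    fill_time = {}   # line -> time it became valid
    last_event = {}  # line -> time of latest fill/read/write
    valid = 0        # line-cycles holding valid data
    vulnerable = 0   # line-cycles in which a flip would later be read

    for time, line, op in trace:
        if op == "fill":
            fill_time[line] = time
            last_event[line] = time
        elif op == "write":
            # Assumed full-line overwrite: earlier flips are masked.
            last_event[line] = time
        elif op == "read":
            # A flip anywhere since the last event would be consumed.
            vulnerable += time - last_event[line]
            last_event[line] = time
        elif op == "evict":
            # Assumed clean eviction: the tail window is never read.
            valid += time - fill_time.pop(line)
            del last_event[line]
        else:
            raise ValueError(f"unknown op: {op}")

    # Lines still resident at the end count as valid until the end.
    for start in fill_time.values():
        valid += total_cycles - start

    capacity = num_lines * total_cycles
    return valid / capacity, vulnerable / capacity


example_trace = [
    (0, 0, "fill"), (1, 1, "fill"), (2, 0, "read"),
    (4, 1, "write"), (6, 1, "read"),
    (10, 0, "evict"), (10, 1, "evict"),
]
utilization, vulnerable_fraction = cache_bounds(example_trace, 2, 10)
```

For this toy trace the two lines hold valid data for 19 of 20 line-cycles, yet only 4 line-cycles end in a read, which is the kind of tightening the abstract describes.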

2014 ◽  
Vol 60 (1) ◽  
pp. 92-97 ◽  
Author(s):  
Mariusz Węgrzyn ◽  
Janusz Sosnowski

The paper presents the extent of fault effects in FPGA-based systems and concentrates on transient faults (induced by single event upsets, SEUs) within the FPGA configuration memory. An original method for detailed analysis of fault effect propagation is presented. It targets microprocessor-based FPGA systems and uses the developed fault injection technique. Faults are injected at the HDL description level of the microprocessor using special simulators and supplementary programs developed for this purpose. The methodology is illustrated for the soft PicoBlaze microprocessor running three programs. The results reveal some problems with fault handling at the software level.


2011 ◽  
Vol 8 (2) ◽  
pp. 308-314 ◽  
Author(s):  
Marta Portela-Garcia ◽  
Celia Lopez-Ongil ◽  
Mario Garcia Valderas ◽  
Luis Entrena

2011 ◽  
Vol 1 (4) ◽  
pp. 265-270 ◽  
Author(s):  
Sho Endo ◽  
Takeshi Sugawara ◽  
Naofumi Homma ◽  
Takafumi Aoki ◽  
Akashi Satoh

Author(s):  
Zhenyu Qi ◽  
Yan Zhang ◽  
Mircea Stan

Corner-based design and verification are based on worst-case analysis, which introduces over-pessimism and large area and power overheads and leads to unnecessary energy consumption. Typical case-based design and verification maximize energy efficiency through design margin reduction and adaptive computation, thus helping achieve sustainable computing. Dynamically adapting to manufacturing, environmental, and usage variations is the key to shaving unnecessary design margins. This requires on-chip modules that can sense and configure design parameters both globally and locally to maximize computation efficiency and maintain this efficiency over the lifetime of the system. This chapter presents an adaptive threshold compensation scheme using a transimpedance amplifier and adaptive body biasing to overcome the effects of temperature variation, reliability degradation, and process variation. The effectiveness and versatility of the scheme are demonstrated with two example applications, one as a temperature-aware design to maintain the I_ON to I_OFF current ratio, the other as a reliability sensor for NBTI (Negative Bias Temperature Instability).


Author(s):  
Matteo Sonza Reorda ◽  
Luca Sterpone ◽  
Massimo Violante

Transient faults have become an increasing concern in the past few years, as the smaller geometries of newer, highly miniaturized silicon manufacturing technologies have brought to the mass market failure mechanisms traditionally confined to niche markets such as electronic equipment for avionic, space, or nuclear applications. This chapter presents the origin of transient faults, discusses how they propagate, outlines the models devised to represent them, and finally discusses the state-of-the-art design techniques that can be used to detect and correct them. The concepts of hardware, data, and time redundancy are presented, and their implementations to cope with transient faults affecting storage elements, combinational logic, and IP cores (e.g., processor cores) typically found in a System-on-Chip are discussed.


Electronics ◽  
2020 ◽  
Vol 9 (12) ◽  
pp. 2074
Author(s):  
J.-Carlos Baraza-Calvo ◽  
Joaquín Gracia-Morán ◽  
Luis-J. Saiz-Adalid ◽  
Daniel Gil-Tomás ◽  
Pedro-J. Gil-Vicente

Due to transistor shrinking, intermittent faults are a major concern in current digital systems. This work presents an adaptive fault tolerance mechanism based on error correction codes (ECC), able to modify its behavior when the error conditions change without increasing the redundancy. As a case example, we have designed a mechanism that can detect intermittent faults and swap from an initial generic ECC to a specific ECC capable of tolerating one intermittent fault. We have inserted the mechanism in the memory system of a 32-bit RISC processor and validated it by using VHDL simulation-based fault injection. We have used two (39, 32) codes: a single error correction–double error detection (SEC–DED) and a code developed by our research group, called EPB3932, capable of correcting single errors and double and triple adjacent errors that include a bit previously tagged as error-prone. The results of injecting transient, intermittent, and combinations of intermittent and transient faults show that the proposed mechanism works properly. As an example, the percentage of failures and latent errors is 0% when injecting a triple adjacent fault after an intermittent stuck-at fault. We have synthesized the adaptive fault tolerance mechanism proposed in two types of FPGAs: non-reconfigurable and partially reconfigurable. In both cases, the overhead introduced is affordable in terms of hardware, time and power consumption.
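To make the (39, 32) SEC-DED baseline mentioned above concrete, here is a minimal sketch of one textbook construction: an extended Hamming code with six Hamming parity bits plus one overall parity bit. It is not the authors' implementation and it does not model the EPB3932 code. The bit layout, the function names `encode` and `decode`, and the status strings are assumptions chosen for illustration.

```python
PARITY_POSITIONS = (1, 2, 4, 8, 16, 32)
# Positions 1..38 that are not powers of two carry the 32 data bits.
DATA_POSITIONS = [p for p in range(1, 39) if p not in PARITY_POSITIONS]


def encode(data):
    """Encode a 32-bit integer into a 39-bit SEC-DED codeword (bit list)."""
    if not 0 <= data < 2 ** 32:
        raise ValueError("data must fit in 32 bits")
    word = [0] * 39
    for i, pos in enumerate(DATA_POSITIONS):
        word[pos] = (data >> i) & 1
    # Each Hamming parity bit covers the positions whose index has that bit set.
    for p in PARITY_POSITIONS:
        parity = 0
        for pos in range(1, 39):
            if pos & p and pos != p:
                parity ^= word[pos]
        word[p] = parity
    # Position 0 holds the overall parity, making the total weight even.
    word[0] = sum(word[1:]) % 2
    return word


def decode(word):
    """Return (data, status); data is None when the error is uncorrectable."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 39):
        if word[pos]:
            syndrome ^= pos
    overall = sum(word) % 2

    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd overall parity: treat as a single error at index `syndrome`
        # (index 0 means the overall parity bit itself flipped).
        if syndrome >= 39:
            return None, "uncorrectable"
        word[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: double error detected.
        return None, "double_error"

    data = 0
    for i, pos in enumerate(DATA_POSITIONS):
        data |= word[pos] << i
    return data, status
```

Flipping any one of the 39 bits of `encode(x)` should make `decode` return `x` with status `"corrected"`, while flipping any two bits should yield `"double_error"`. Correcting adjacent double or triple errors, as the abstract describes for EPB3932, needs a different parity-check matrix than this one.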


Eureka ◽  
2014 ◽  
Vol 4 (1) ◽  
pp. 35-39 ◽  
Author(s):  
Aaron Melnyk

Experimental observation of the ‘trapped rainbow’ in the visible is demonstrated using tapered hollow Bragg waveguides. These waveguides spatially disperse an input spectrum into its frequency components, and vertical out-of-plane radiation was observed at wavelength-dependent positions along the entire length of the waveguide. The experimental observation is corroborated by a brief theoretical analysis and simulation. These devices form the foundation for future work involving integration into a micro-spectrometer for eventual lab-on-chip use.


Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 709
Author(s):  
Abhishek Das ◽  
Nur A. Touba

Technology scaling has led to an increase in the density and capacity of on-chip caches. This has raised throughput by allowing more low-latency memory transfers. With the reduction in size of SRAMs and the development of emerging technologies, e.g., STT-MRAM, for on-chip cache memories, the reliability of such memories becomes a major concern. Traditional error correcting codes, e.g., Hamming codes and orthogonal Latin square codes, suffer either from high decoding latency, which leads to lower overall throughput, or from high memory overhead. In this paper, a new single error correcting code based on a shared majority voting logic is presented. The proposed codes trade off decoding latency in order to improve the memory overhead posed by orthogonal Latin square codes. A latency optimization technique is also proposed which lowers the decoding latency by incurring a slight memory overhead. It is shown that the proposed codes achieve better redundancy compared to orthogonal Latin square codes. The proposed codes are also shown to achieve lower decoding latency compared to Hamming codes. Thus, the proposed codes achieve a balanced trade-off between memory overhead and decoding latency, which makes them highly suitable for on-chip cache memories, which have stringent throughput and memory overhead constraints.

