An Empirical Study on Data Center System Failure Diagnosis

Author(s):  
Montri Wiboonrat
2018 ◽  
Vol 204 ◽  
pp. 169-182 ◽  
Author(s):  
Imad Eddine Kaid ◽  
Ahmed Hafaifa ◽  
Mouloud Guemana ◽  
Nadji Hadroug ◽  
Abdellah Kouzou ◽  
...  

2012 ◽  
Vol 529 ◽  
pp. 431-435
Author(s):  
Zu Ming Xiao ◽  
Zhan Guo ◽  
Xin Hua Huang ◽  
Li Xue Mei

The overhaul equipment currently used in vehicle maintenance has several disadvantages. A diagnosis system that is easy to learn and that directly indicates the cause of a failure would therefore be valuable. This paper presents a PC-based system for diagnosing failures in automotive electronic control systems. The vehicle's output signals are collected and fed to the computer, where software compares the signal waveforms so that the cause of an electronic control failure can be diagnosed conveniently.


Author(s):  
Mohammad Tradat ◽  
Bahgat Sammakia ◽  
Husam Alissa ◽  
Kourosh Nemati

Data center availability is vital, and during a cooling system failure the inlet temperature of the IT equipment rises rapidly until it reaches a threshold beyond which the equipment starts throttling or shuts down from overheating. It is therefore especially important to understand such failures and their effects. This study presents an experimental investigation and analysis of a facility-level cooling system failure scenario in which a chilled water interruption is introduced to the data center. Quantitative instrumentation, including wireless temperature and pressure sensors, was used to measure the discrete air inlet temperature and the pressure differential through the cold aisle enclosure, respectively. In addition, Intelligent Platform Management Interface (IPMI) and cooling system data during failure and recovery are reported. Furthermore, the IT equipment performance and response in open and contained environments were simulated and compared. Finally, an experiment-based analysis of the Ride Through Time (RTT) of servers during a chilled water interruption of the cooling infrastructure is presented. The results showed that for all three classes of servers tested during the cooling failure, cold aisle containment (CAC) helped keep the servers cooler for longer. The containment provided a barrier between the hot and cold air streams and caused a slight negative pressure to build up, which allowed the servers to pull cold air from the underfloor plenum. In addition, the results show that the effect of CAC on IT equipment performance and response can vary depending on the servers' airflow and generation, and hence on the types of servers deployed in the cold aisle enclosure. Moreover, compared to the discrete sensors, the IPMI inlet temperature sensors underestimate the RTT by 42% and 12% for the CAC and open cases, respectively.
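As an illustration of how a ride-through time might be extracted from logged inlet temperatures, the following is a minimal Python sketch. It assumes that RTT is the elapsed time from the first sample (taken at the start of the cooling interruption) until the inlet temperature first reaches a chosen threshold; the abstract does not state the exact definition or threshold used, so the function names `ride_through_time` and `underestimate_pct`, the threshold, and the sample data are all hypothetical.

```python
from typing import Optional, Sequence


def ride_through_time(
    times_s: Sequence[float],
    temps_c: Sequence[float],
    threshold_c: float,
) -> Optional[float]:
    """Elapsed time (s) from the first sample until temps_c first reaches
    threshold_c, using linear interpolation between samples.

    Assumption: the first sample marks the start of the cooling failure.
    Returns None if the threshold is never reached in the recorded window.
    """
    if len(times_s) != len(temps_c) or len(times_s) == 0:
        raise ValueError("times_s and temps_c must be non-empty and equal in length")
    if temps_c[0] >= threshold_c:
        return 0.0
    for i in range(1, len(times_s)):
        if temps_c[i] >= threshold_c:
            t_prev, t_cur = times_s[i - 1], times_s[i]
            y_prev, y_cur = temps_c[i - 1], temps_c[i]
            # y_prev < threshold_c <= y_cur here, so the denominator is positive.
            frac = (threshold_c - y_prev) / (y_cur - y_prev)
            return t_prev + frac * (t_cur - t_prev) - times_s[0]
    return None


def underestimate_pct(reference_rtt_s: float, measured_rtt_s: float) -> float:
    """Percentage by which measured_rtt_s falls short of reference_rtt_s."""
    return 100.0 * (reference_rtt_s - measured_rtt_s) / reference_rtt_s


# Hypothetical data: one sample per minute, with an illustrative 30 C threshold.
times = [0.0, 60.0, 120.0, 180.0]
temps = [20.0, 24.0, 28.0, 32.0]
rtt = ride_through_time(times, temps, threshold_c=30.0)
```

Applying `ride_through_time` to both the discrete-sensor series and the IPMI series, and then passing the two results to `underestimate_pct`, would give a comparison of the kind reported in the abstract, under the assumptions stated above.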


2013 ◽  
Vol 717 ◽  
pp. 347-349
Author(s):  
Chun Hui Pan ◽  
Huan Dong Wang

It is difficult to directly determine the main cause of a failure in the hydraulic system of a machine tool. The transmission principle and structural characteristics of the hydraulic system must be fully understood before any troubleshooting starts. The root cause of the failure can then be found and eliminated through judgment and analysis of the fault phenomena. This paper analyzes the characteristics of hydraulic system failures and summarizes the frequently employed methods of hydraulic system failure diagnosis. The authors hope to provide some help with machine hydraulic system failure diagnosis in the workplace.


Author(s):  
TAKEHISA KOHDA ◽  
KOICHI INOUE

The software for a failure diagnosis system can be represented in terms of production rules; the condition part represents a system failure condition and the conclusion part corresponds to a cause of the system failure or an appropriate protective action to be taken. This paper proposes a safety assessment model of the software to evaluate its contribution to the risk caused by the entire failure diagnosis system. The proposed risk criterion considers not only the reliability of hardware components of the failure diagnosis system, but also the reliability characteristics of the system to be monitored. Conventional verification and validation methods of rule-based systems assume that the software reliability can be achieved by maintaining the consistent relation between condition parts and conclusion parts. However, the risk criterion derived in this paper shows that the software for a failure diagnosis system cannot be optimized without considering these environmental factors.
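The production-rule representation described above can be sketched in a few lines of Python. This is only an illustration of the condition-part / conclusion-part structure, not the authors' safety assessment model; the `Rule` class, the `diagnose` function, the sensor names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Readings = Dict[str, float]


@dataclass(frozen=True)
class Rule:
    """A production rule: a condition part and a conclusion part."""
    name: str
    condition: Callable[[Readings], bool]  # system failure condition
    conclusion: str                        # cause of failure or protective action


def diagnose(rules: List[Rule], readings: Readings) -> List[str]:
    """Return the conclusions of all rules whose condition holds."""
    return [rule.conclusion for rule in rules if rule.condition(readings)]


# Hypothetical rules over hypothetical sensor readings.
RULES = [
    Rule(
        name="low_pressure",
        condition=lambda r: r["pressure_kpa"] < 100.0,
        conclusion="cause: pump failure (hypothetical)",
    ),
    Rule(
        name="over_temperature",
        condition=lambda r: r["temp_c"] > 90.0,
        conclusion="action: shut down the monitored system (hypothetical)",
    ),
]

conclusions = diagnose(RULES, {"pressure_kpa": 50.0, "temp_c": 95.0})
```

Note that this sketch only checks whether each condition fires. It does not model the reliability of the sensors feeding `readings` or of the monitored system, which are the environmental factors the abstract argues must be taken into account.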

