Data Center Housing the World’s 3rd Fastest Supercomputer: Above Floor Thermal Measurements Compared to CFD Analysis

Author(s):  
Roger Schmidt ◽  
Madhusudan Iyengar ◽  
Joe Caricari

With the ever-increasing heat dissipated by IT equipment housed in data centers, it is becoming more important to project the changes that can occur in the data center as newer, higher-powered hardware is installed. The computational fluid dynamics (CFD) software that is available has improved over the years, and some CFD software specific to data center thermal analysis has been developed. This has improved the timeliness of quick analyses of the effects of installing new hardware in the data center. But it is critically important that this software give the user a good account of the effects of adding the new hardware, and it is the purpose of this paper to examine a large cluster installation and compare the CFD analysis with environmental measurements obtained from the same site. This paper shows measurements and CFD analysis of high-powered racks, as high as 27 kW, clustered such that heat fluxes in some regions of the data center exceeded 700 W/ft2 (7535 W/m2). This paper describes the thermal profile of a high performance computing cluster located in an IBM data center and a comparison of that cluster modeled with CFD software. The high performance Advanced Simulation and Computing (ASC) cluster, developed and manufactured by IBM, is code named ASC Purple. It is the world's 3rd fastest supercomputer [1], operating at a peak performance of 77.8 TFlop/s. ASC Purple, which employs IBM pSeries p575, Model 9118, contains more than 12,000 processors, 50 terabytes of memory, and 2 petabytes of globally accessible disk space. The cluster was first tested in the IBM development lab in Poughkeepsie, NY, and then shipped to Lawrence Livermore National Laboratory in Livermore, California, where it was installed to support the U.S. national security mission. Detailed measurements were taken in both data centers of electronic equipment power usage, perforated floor tile airflow, cable cutout airflow, computer room air conditioning (CRAC) airflow, and electronic equipment inlet air temperatures, and were reported in Schmidt [2]; only the IBM Poughkeepsie results are reported here, along with a comparison to CFD modeling results. In some areas of the Poughkeepsie data center, there were regions that exceeded the equipment inlet air temperature specifications by a significant amount. These areas are highlighted and reasons given why they failed to meet the criteria. The modeling results by region showed trends that compared somewhat favorably, but some rack thermal profiles deviated quite significantly from the measurements.
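As a quick sanity check on the quoted heat flux conversion (a minimal sketch, not from the paper), 700 W/ft2 does indeed correspond to roughly 7535 W/m2 with 1 ft = 0.3048 m:

```python
# Sanity check of the quoted heat flux conversion: W/ft^2 -> W/m^2.
FT_PER_M = 0.3048  # metres per foot

def wft2_to_wm2(q_wft2: float) -> float:
    """Convert a heat flux from W/ft^2 to W/m^2 (1 ft = 0.3048 m)."""
    return q_wft2 / (FT_PER_M ** 2)

print(wft2_to_wm2(700.0))  # ~7534.7 W/m^2, matching the ~7535 W/m^2 quoted
```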

2010 ◽  
Vol 132 (2) ◽  
Author(s):  
Roger Schmidt ◽  
Madhusudan Iyengar ◽  
Joe Caricari

With the ever-increasing heat dissipated by information technology (IT) equipment housed in data centers, it is becoming more important to project the changes that can occur in the data center as newer, higher-powered hardware is installed. The computational fluid dynamics (CFD) software that is available has improved over the years, and CFD software specific to data center thermal analysis has also been developed. This has improved the timeliness of quick analyses of the effects of installing new hardware in the data center. But it is critically important that this software give the user a good account of the effects of adding the new hardware. It is the purpose of this paper to examine a large cluster installation and compare the CFD analysis with environmental measurements obtained from the same site. This paper shows measurements and CFD data for high-powered racks, as high as 27 kW, clustered such that heat fluxes in some regions of the data center exceeded 700 W per square foot. This paper describes the thermal profile of a high performance computing cluster located in a data center and a comparison of that cluster modeled via CFD. The high performance advanced simulation and computing (ASC) cluster had a peak performance of 77.8 TFlop/s and employed more than 12,000 processors, 50 Tbytes of memory, and 2 Pbytes of globally accessible disk space. The cluster was first tested in the manufacturer's development laboratory in Poughkeepsie, New York, and then shipped to Lawrence Livermore National Laboratory in Livermore, California, where it was installed to support the national security mission of the U.S. Detailed measurements were taken in both data centers and were previously reported. The Poughkeepsie results are reported here along with a comparison to CFD modeling results. In some areas of the Poughkeepsie data center, there were regions that exceeded the equipment inlet air temperature specifications by a significant amount. These areas are highlighted and reasons given why they failed to meet the criteria. The modeling results by region showed trends that compared somewhat favorably, but some rack thermal profiles deviated quite significantly from the measurements.


Author(s):  
Roger Schmidt ◽  
Madhusudan Iyengar

The heat dissipated by large servers and switching equipment is reaching levels that make it very difficult to cool these systems in data centers or telecommunications rooms. Some of the highest powered systems dissipate upwards of 4000 W/ft2 (43,000 W/m2) based on the equipment footprint. When systems dissipate this amount of heat and are then clustered together within a data center, significant cooling challenges can result. This paper describes the thermal profile of three data center layouts (two are of the same data center at different points in time, with different layouts). Detailed measurements of all three were taken: electronic equipment power usage; perforated floor tile airflow; cable cutout airflow; computer room air conditioning (CRAC) airflow, temperatures, and power usage; and electronic equipment inlet air temperatures. Although detailed measurements were recorded, this paper focuses on the macro-level results for the data center to see whether patterns present themselves that might be helpful for future guidelines on data center layout for optimized cooling. Specifically, areas of the data center where racks have similar inlet air temperatures are examined relative to the rack and CRAC unit layout.
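As an illustrative sketch of the kind of macro-level grouping described (not code or data from the paper; rack names, temperatures, and band thresholds are invented), measured rack inlet temperatures can be binned into bands and the bands mapped back to rack positions:

```python
# Hypothetical sketch: bin rack inlet temperatures into bands to reveal
# macro-level patterns in the layout. All data and thresholds are invented.
rack_inlet_temps = {          # rack id -> measured inlet air temperature (deg C)
    "A1": 18.2, "A2": 19.1, "B1": 24.7, "B2": 25.3, "C1": 31.0,
}

def band(temp_c: float) -> str:
    if temp_c < 20.0:
        return "cool"
    if temp_c < 27.0:
        return "recommended"
    return "hot spot"

groups: dict[str, list[str]] = {}
for rack, temp in rack_inlet_temps.items():
    groups.setdefault(band(temp), []).append(rack)

print(groups)  # {'cool': ['A1', 'A2'], 'recommended': ['B1', 'B2'], 'hot spot': ['C1']}
```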


Author(s):  
Chris Muller ◽  
Chuck Arent ◽  
Henry Yu

Lead-free manufacturing regulations, the reduction in circuit board feature sizes, and the miniaturization of components to improve hardware performance have combined to make data center IT equipment more prone to attack by corrosive contaminants. Manufacturers are under pressure to control contamination in the data center environment, and maintaining acceptable limits is now critical to the continued reliable operation of datacom and IT equipment. This paper discusses ongoing reliability issues with electronic equipment in data centers and presents updates on contamination concerns, standards activities, and case studies from several different locations illustrating the successful application of contamination assessment, control, and monitoring programs to eliminate electronic equipment failures.
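For context, contamination severity in IT spaces is commonly assessed from reactive (copper coupon) corrosion rates classified per ISA-71.04. A minimal sketch of that classification follows, using the standard's published copper class boundaries; the helper function is illustrative, not code from the paper:

```python
# Illustrative sketch: classify gaseous-contamination severity from a copper
# corrosion coupon reading, using the ISA-71.04 class boundaries
# (angstroms of corrosion film growth per month).
def isa_71_04_class(copper_angstroms_per_month: float) -> str:
    if copper_angstroms_per_month < 300:
        return "G1 (mild)"
    if copper_angstroms_per_month < 1000:
        return "G2 (moderate)"
    if copper_angstroms_per_month < 2000:
        return "G3 (harsh)"
    return "GX (severe)"

print(isa_71_04_class(250))  # 'G1 (mild)' -- the level usually targeted for IT spaces
```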


Author(s):  
Siddharth Bhopte ◽  
Dereje Agonafer ◽  
Roger Schmidt ◽  
Bahgat Sammakia

In a typical raised floor data center with alternating hot and cold aisles, air enters the front of each rack over the entire height of the rack. Since the heat loads of data processing equipment continue to increase at a rapid rate, it is a challenge to maintain the inlet temperatures of all the racks within the stated requirements. A facility manager has discretion in deciding the data center room layout, but a wrong decision will eventually lead to equipment failure. There are many complex decisions to be made early in the design as the data center evolves, such as optimizing the raised floor plenum, placing the floor tiles, and minimizing local hot spots. These configuration adjustments affect rack inlet air temperatures, one of the keys to effective thermal management. In this paper, a raised floor data center with 4.5 kW racks is considered. There are four rows of racks in an alternating hot and cold aisle arrangement, with six racks installed in each row. Two CRAC units supply chilled air to the data center through the pressurized plenum. The effects of plenum depth, floor tile placement, and ceiling height on the rack inlet air temperature are discussed, and plots are presented over the defined range. A multivariable approach to optimizing the data center room layout to minimize the rack inlet air temperature is then proposed. Significant improvement over the initial model is shown using the multivariable design optimization approach, and the results are used to present guidelines for optimal data center performance.
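As a hedged illustration of the multivariable optimization described (the paper's actual objective values come from CFD runs, not from this toy surrogate), one could wrap a surrogate model of the maximum rack inlet temperature in a bounded optimizer; the quadratic surrogate and bounds below are invented for demonstration:

```python
# Hypothetical sketch of multivariable layout optimization: minimize the
# maximum rack inlet temperature over plenum depth and ceiling height.
# The quadratic surrogate stands in for the CFD model and is invented.
from scipy.optimize import minimize

def max_inlet_temp(x):
    plenum_depth_m, ceiling_height_m = x
    # Invented surrogate: cooler with a deeper plenum and taller ceiling,
    # with diminishing returns around a best point.
    return 25.0 + 8.0 * (0.9 - plenum_depth_m) ** 2 + 3.0 * (3.5 - ceiling_height_m) ** 2

result = minimize(
    max_inlet_temp,
    x0=[0.4, 2.7],                       # initial layout guess (m)
    bounds=[(0.3, 1.2), (2.4, 4.0)],     # feasible plenum depths / ceiling heights
)
print(result.x, result.fun)  # layout that minimizes the surrogate's inlet temperature
```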


Author(s):  
Tianyi Gao ◽  
James Geer ◽  
Russell Tipton ◽  
Bruce Murray ◽  
Bahgat G. Sammakia ◽  
...  

The heat dissipated by high performance IT equipment such as servers and switches in data centers is increasing rapidly, which makes thermal management even more challenging. IT equipment is typically designed to operate at a rack inlet air temperature between 10°C and 35°C. The newest published environmental standards for operating IT equipment, proposed by ASHRAE, specify a long-term recommended dry bulb IT air inlet temperature range of 18°C to 27°C. For the short-term specification, the largest allowable inlet temperature range is between 5°C and 45°C. Failure to maintain these specifications significantly degrades the performance and reliability of these electronic devices. Thus, understanding the cooling system is of paramount importance for the design and operation of data centers. In this paper, a hybrid cooling system is numerically modeled and investigated. The numerical modeling is conducted using a commercial computational fluid dynamics (CFD) code. The hybrid cooling strategy mounts in-row cooling units between the server racks to assist the raised floor air cooling. The effects of several input variables, including rack heat load and heat density, rack airflow rate, in-row cooling unit coolant flow rate and temperature, in-row coil effectiveness, centralized cooling unit supply airflow rate, non-uniformity in rack heat load, and raised floor height, are studied parametrically. Their detailed effects on the rack inlet air temperatures and the in-row cooler performance are presented. The modeling results and corresponding analyses are used to develop general installation and operation guidance for the in-row cooler strategy in a data center.
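One of the swept parameters, the in-row coil effectiveness, can be illustrated with the usual effectiveness relation for a cooling coil. This sketch (an assumption-labeled illustration, not the paper's code) estimates the coil's leaving air temperature from its effectiveness and the entering conditions:

```python
# Illustrative sketch: leaving-air temperature of an in-row cooling coil from
# its effectiveness, using T_out = T_air_in - eff * (T_air_in - T_water_in),
# the standard relation when air is the minimum-capacity stream.
def coil_leaving_air_temp(t_air_in_c: float, t_water_in_c: float,
                          effectiveness: float) -> float:
    return t_air_in_c - effectiveness * (t_air_in_c - t_water_in_c)

# Example: 35 C hot-aisle air, 15 C chilled water, 70% effective coil.
print(coil_leaving_air_temp(35.0, 15.0, 0.7))  # 21.0 C supplied back to the cold aisle
```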


Author(s):  
Prabjit Singh ◽  
Levente Klein ◽  
Dereje Agonafer ◽  
Jimil M. Shah ◽  
Kanan D. Pujara

The energy used by information technology (IT) equipment and the supporting data center equipment keeps rising as data center proliferation continues unabated. To contain rising computing costs, data center administrators are resorting to cost-cutting measures such as not tightly controlling temperature and humidity levels and, in many cases, installing air-side economizers, with the associated risk of introducing particulate and gaseous contamination into their data centers. The ASHRAE TC9.9 subcommittee on Mission Critical Facilities, Data Centers, Technology Spaces, and Electronic Equipment has accommodated data center administrators by allowing short-period excursions outside the recommended temperature-humidity range into the allowable classes A1-A3. Under worst case conditions, the ASHRAE A3 envelope allows electronic equipment to operate at temperature and humidity as high as 24°C and 85% relative humidity for short, but undefined, periods of time. This paper addresses the IT equipment reliability issues arising from operation in high humidity and high temperature conditions, with particular attention paid to whether it is possible to determine all-encompassing x-factors that capture the effects of temperature and relative humidity on equipment reliability. The role of particulate and gaseous contamination and the aggravating effects of high temperature and high relative humidity are presented and discussed. A method to determine the temperature and humidity x-factors, based on testing in experimental data centers located in polluted geographies, is proposed.
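The x-factor referred to here is conventionally a relative failure rate: the observed failure rate at a test condition normalized by the rate at a reference condition. A minimal computation follows as a hedged sketch; the failure counts and exposure times are invented for illustration:

```python
# Hypothetical sketch: a temperature-humidity x-factor as a relative failure
# rate, i.e. failures per machine-month at a condition divided by the same
# quantity at the reference condition. All counts below are invented.
def failure_rate(failures: int, machine_months: float) -> float:
    return failures / machine_months

baseline = failure_rate(4, 1200.0)  # e.g. a tightly controlled reference environment
test = failure_rate(9, 1150.0)      # e.g. a hot, humid, polluted test data center

x_factor = test / baseline
print(round(x_factor, 2))           # >1 means the harsher condition raises failures
```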


2013 ◽  
Vol 135 (3) ◽  
Author(s):  
Dustin W. Demetriou ◽  
H. Ezzat Khalifa

This paper expands on the work of Demetriou and Khalifa (2013, "Thermally Aware, Energy-Based Load Placement in Open-Aisle, Air-Cooled Data Centers," ASME J. Electron. Packag., 135(3), p. 030906), which investigated practical IT load placement options in open-aisle, air-cooled data centers. That study found that a robust approach was to use real-time temperature measurements at the inlet of the racks and to remove IT load from the servers with the warmest inlet temperatures. By considering the holistic optimization of the data center load placement strategy and the cooling infrastructure, over a range of data center IT utilization levels, the present study investigated (i) the effect of ambient temperatures on data center operation, (ii) the consolidation of servers by completely shutting them off, (iii) a strategy complementary to the earlier work that increases the IT load beginning with the servers that have the coldest inlet temperatures, and (iv) the development of load placement rules via either static (i.e., during data center benchmarking) or dynamic (using real-time data from the current thermal environment) allocation. In all of these case studies, a key finding was that holistic optimization of the data center and its cooling infrastructure yields significant savings in the cooling infrastructure's power consumption by reducing the CRAH airflow rate. In many cases, these savings can be larger than those from providing higher temperature chilled water from the refrigeration units. Therefore, the path to realizing the industry's goal of higher IT equipment inlet temperatures to improve energy efficiency should be through both a reduction in airflow rate and an increase in supply air temperature, not necessarily through higher CRAH supply air temperatures alone.
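A minimal sketch of the inlet-temperature-based placement rule described above (shed load from the warmest inlets first); the server names, temperatures, and loads are hypothetical, and the real strategy operates on live measurements rather than a static list:

```python
# Hypothetical sketch of the load placement rule: shed IT load starting from
# the servers with the warmest measured inlet air temperatures.
servers = [                    # (name, inlet temp in deg C, current load in kW)
    ("s1", 22.5, 3.0), ("s2", 27.8, 3.0), ("s3", 25.1, 3.0), ("s4", 29.4, 3.0),
]

def shed_load(servers, kw_to_remove: float):
    plan = []
    # Warmest inlet first, per the thermally aware placement strategy.
    for name, temp, load in sorted(servers, key=lambda s: s[1], reverse=True):
        if kw_to_remove <= 0:
            break
        removed = min(load, kw_to_remove)
        plan.append((name, removed))
        kw_to_remove -= removed
    return plan

print(shed_load(servers, 4.5))  # [('s4', 3.0), ('s2', 1.5)]
```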


Author(s):  
A. A. Zatsarinny ◽  
K. I. Volovich ◽  
S. A. Denisov ◽  
Yu. S. Ionenkov ◽  
V. A. Kondrashev

This article discusses a methodology for assessing the effectiveness of a high-performance research platform. The assessment is carried out for the example of the "Informatika" Center for Collective Use (CCU), established at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences (FRC CSC RAS) for solving problems in the synthesis of new materials. The main objective of the "Informatika" CCU is to conduct research using the software and hardware of the FRC CSC RAS data center, including for the benefit of third-party organizations and research teams. The general characteristics of the "Informatika" CCU are presented, including the main characteristics of its scientific equipment, work organization, and capabilities. The hybrid high-performance computing cluster of the FRC CSC RAS (HHPCC) is part of the FRC CSC RAS data center and also part of the "Informatika" CCU. The HHPCC provides computing resources in the form of cloud services, as software as a service (SaaS) and platform as a service (PaaS). With the aid of special technologies, scientific services are delivered to researchers in the form of subject-oriented applications. Based on an analysis of the structure and operating principles of the Informatika Center, key performance indicators have been developed, taking into account its specific tasks, to characterize its various aspects of activity (development, operations, and performance). Evaluating CCU efficiency involves calculating, on the basis of the developed indicators, overall (generalized) indicators that characterize the efficiency of CCU operation in various areas; an integral indicator showing the overall CCU efficiency is also calculated. To develop the overall performance indicators and the integral performance indicator, the weighted average method and the analytic hierarchy process are suggested. The procedure for determining the partial performance indicators is considered. Specific features of the choice of CCU performance indicators for new materials synthesis problems have been identified that characterize the capabilities of the computing complex in creating a virtualization environment (peak performance of the computing system, real performance of the computing system on specialized benchmarks, equipment loading with applied tasks, and program code efficiency).
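A minimal sketch of the weighted-average aggregation described (partial indicators rolled up into overall indicators per activity area, then into one integral indicator); the indicator names, normalized scores, and weights below are hypothetical:

```python
# Hypothetical sketch: roll normalized partial indicators (0..1) up into
# overall indicators per activity area, then into a single integral indicator,
# using weighted averages. All names, scores, and weights are invented.
def weighted_average(scores: dict[str, float], weights: dict[str, float]) -> float:
    total_weight = sum(weights.values())
    return sum(scores[k] * weights[k] for k in scores) / total_weight

development = weighted_average(
    {"peak_perf": 0.8, "real_perf": 0.6}, {"peak_perf": 0.4, "real_perf": 0.6})
utilization = weighted_average(
    {"load": 0.7, "code_efficiency": 0.5}, {"load": 0.5, "code_efficiency": 0.5})

integral = weighted_average(
    {"development": development, "utilization": utilization},
    {"development": 0.5, "utilization": 0.5},
)
print(round(integral, 3))  # single figure of merit for the CCU
```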


Author(s):  
Dan Comperchio ◽  
Sameer Behere

Data center cooling systems have long been burdened by high redundancy requirements, resulting in inefficient system designs that satisfy a risk-averse operating environment. As attitudes, technologies, and sustainability awareness change within the industry, data centers are beginning to realize higher levels of energy efficiency without sacrificing operational security. By exploiting the increased temperature and humidity tolerances of information technology equipment (ITE), data center mechanical systems can leverage ambient conditions to operate in economization mode for more hours of the year. Economization is one of the most effective ways for data centers to reduce their energy consumption and carbon footprint: as outside air temperatures and conditions become more favorable for cooling the data center, mechanical cooling through vapor-compression cycles is reduced or eliminated entirely. One favorable method for utilizing low outside air temperatures without sacrificing indoor air quality is to deploy rotary heat wheels to transfer heat between the data center return air and the outside air without introducing outside air into the white space. A corrugated metal wheel rotates through two opposing airstreams with differing thermal gradients to provide a net cooling effect at significantly lower electrical energy use than traditional mechanical cooling topologies. To further extend the impact of economization, data centers can also raise operating temperatures significantly beyond those traditionally found in comfort cooling applications. Increasing the dry bulb temperature supplied to the inlet of the ITE, together with an elevated temperature rise across the equipment, significantly reduces the energy use within a data center.
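As a hedged illustration of the heat wheel's role, the sensible heat it moves can be estimated with the usual effectiveness relation; the airflow, temperatures, and effectiveness below are invented, and balanced airstreams are assumed:

```python
# Illustrative sketch: sensible heat recovered by a rotary heat wheel between
# the data center return air and the outside air stream. Values are invented.
AIR_DENSITY = 1.2   # kg/m^3, approximate
AIR_CP = 1005.0     # J/(kg K), specific heat of air

def heat_wheel_cooling_kw(airflow_m3s: float, t_return_c: float,
                          t_outside_c: float, effectiveness: float) -> float:
    """Sensible cooling (kW) assuming balanced airflows and the effectiveness
    relation Q = eff * m_dot * cp * (T_return - T_outside)."""
    m_dot = AIR_DENSITY * airflow_m3s
    return effectiveness * m_dot * AIR_CP * (t_return_c - t_outside_c) / 1000.0

# 20 m^3/s of 35 C return air against 15 C outside air, 75% effective wheel:
print(round(heat_wheel_cooling_kw(20.0, 35.0, 15.0, 0.75), 1))  # ~361.8 kW
```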


Author(s):  
Zahra Bouramdane ◽  
Abdellah Bah ◽  
Mohammed Alaoui ◽  
Nadia Martaj

Although thermoacoustic devices comprise simple components, the design of these machines is very challenging. In order to predict the behavior and optimize the performance of a thermoacoustic refrigerator driven by a standing-wave thermoacoustic engine, considering changes in the geometrical parameters, two approaches are presented in this paper. The first is based on CFD analysis, in which a 2D model is implemented to investigate the influence of stack parameters on the refrigerator's performance, to analyze the time variation of the temperature gradient across the stack, and to examine the refrigerator's performance in terms of refrigeration temperature. The second is an optimization algorithm based on simplified linear thermoacoustic theory, applied to designing thermoacoustic refrigerators with different stack parameters and operating conditions. Simulation results show that the engine produced a high-powered acoustic wave with a pressure amplitude of 23 kPa and a frequency of 584 Hz, and that this wave establishes a temperature difference across the refrigeration stack, with a cooling temperature of 292.8 K, when the stacks are positioned next to the pressure antinode. The results from the algorithm make it possible to design a high-performance thermoacoustic refrigerator by picking the appropriate parameters.
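As an order-of-magnitude check on the reported working frequency (a sketch assuming air at room temperature as the acoustic medium, which may differ from the paper's working gas), the wavelength and the spacing between pressure antinodes that guides stack placement follow directly from the sound speed:

```python
# Illustrative sketch: acoustic wavelength at the reported 584 Hz working
# frequency, and the pressure-antinode spacing relevant to stack placement.
SOUND_SPEED = 343.0  # m/s, air at ~20 C (assumption; the working gas may differ)

frequency_hz = 584.0
wavelength_m = SOUND_SPEED / frequency_hz
print(round(wavelength_m, 3))      # ~0.587 m
print(round(wavelength_m / 2, 3))  # ~0.294 m between successive pressure antinodes
```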

