Data Center Housing High Performance Supercomputer Cluster: Above Floor Thermal Measurements Compared To CFD Analysis

2010 ◽  
Vol 132 (2) ◽  
Author(s):  
Roger Schmidt ◽  
Madhusudan Iyengar ◽  
Joe Caricari

With the ever increasing heat dissipated by information technology (IT) equipment housed in data centers, it is becoming more important to project the changes that can occur in the data center as newer, higher-powered hardware is installed. The computational fluid dynamics (CFD) software that is available has improved over the years, and CFD software specific to data center thermal analysis has also been developed. This has improved the timeliness of providing quick analysis of the effects of installing new hardware in the data center. But it is critically important that this software accurately report to the user the effects of adding this new hardware. It is the purpose of this paper to examine a large cluster installation and compare the CFD analysis with environmental measurements obtained from the same site. This paper shows measurements and CFD data for high-powered racks, as high as 27 kW, clustered such that heat fluxes in some regions of the data center exceeded 700 W per square foot. This paper describes the thermal profile of a high performance computing cluster located in a data center and a comparison of that cluster modeled via CFD. The high performance advanced simulation and computing (ASC) cluster had a peak performance of 77.8 TFlop/s and employed more than 12,000 processors, 50 Tbytes of memory, and 2 Pbytes of globally accessible disk space. The cluster was first tested in the manufacturer’s development laboratory in Poughkeepsie, New York, and then shipped to Lawrence Livermore National Laboratory in Livermore, California, where it was installed to support the national security mission of the U.S. Detailed measurements were taken in both data centers and were previously reported. The Poughkeepsie results are reported here along with a comparison to CFD modeling results. In some areas of the Poughkeepsie data center, there were regions that exceeded the equipment inlet air temperature specifications by a significant amount. These areas are highlighted, and reasons are given for why they failed to meet the criteria. The modeling results by region showed trends that compared somewhat favorably, but some rack thermal profiles deviated quite significantly from the measurements.
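To put the quoted density in perspective, here is a minimal Python sketch of how a regional heat flux can be computed from rack powers and floor area; the rack count and the 380 ft2 region area are hypothetical, chosen only so the result lands near the 700 W per square foot figure cited above.

```python
# Illustrative regional heat-flux estimate (hypothetical layout values).
FT2_PER_M2 = 10.7639  # square feet per square metre

def regional_heat_flux(rack_powers_w, region_area_ft2):
    """Return the regional heat flux in W/ft^2 and W/m^2."""
    total_w = sum(rack_powers_w)
    w_per_ft2 = total_w / region_area_ft2
    return w_per_ft2, w_per_ft2 * FT2_PER_M2

# Example: ten 27 kW racks concentrated over a 380 ft^2 patch of raised floor.
flux_ft2, flux_m2 = regional_heat_flux([27_000] * 10, 380.0)
print(f"{flux_ft2:.0f} W/ft^2  ({flux_m2:.0f} W/m^2)")  # ~711 W/ft^2 (~7650 W/m^2)
```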

Author(s):  
Roger Schmidt ◽  
Madhusudan Iyengar ◽  
Joe Caricari

With the ever increasing heat dissipated by IT equipment housed in data centers, it is becoming more important to project the changes that can occur in the data center as newer, higher-powered hardware is installed. The computational fluid dynamics (CFD) software that is available has improved over the years, and some CFD software specific to data center thermal analysis has been developed. This has improved the timeliness of providing quick analysis of the effects of installing new hardware in the data center. But it is critically important that this software accurately report to the user the effects of adding this new hardware. It is the purpose of this paper to examine a large cluster installation and compare the CFD analysis with environmental measurements obtained from the same site. This paper shows measurements and CFD analysis of high-powered racks, as high as 27 kW, clustered such that heat fluxes in some regions of the data center exceeded 700 W/ft2 (7535 W/m2). This paper describes the thermal profile of a high performance computing cluster located in an IBM data center and a comparison of that cluster modeled with CFD software. The high performance Advanced Simulation and Computing (ASC) cluster, developed and manufactured by IBM, is code named ASC Purple. It is the world’s 3rd fastest supercomputer [1], operating at a peak performance of 77.8 TFlop/s. ASC Purple, which employs the IBM pSeries p575, Model 9118, contains more than 12,000 processors, 50 terabytes of memory, and 2 petabytes of globally accessible disk space. The cluster was first tested in the IBM development lab in Poughkeepsie, NY, and then shipped to Lawrence Livermore National Laboratory in Livermore, California, where it was installed to support the U.S. national security mission. Detailed measurements of electronic equipment power usage, perforated floor tile airflow, cable cutout airflow, computer room air conditioning (CRAC) airflow, and electronic equipment inlet air temperatures were taken in both data centers and were reported in Schmidt [2]; only the IBM Poughkeepsie results are reported here, along with a comparison to CFD modeling results. In some areas of the Poughkeepsie data center, there were regions that exceeded the equipment inlet air temperature specifications by a significant amount. These areas are highlighted, and reasons are given for why they failed to meet the criteria. The modeling results by region showed trends that compared somewhat favorably, but some rack thermal profiles deviated quite significantly from the measurements.
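As a purely illustrative sketch of how CFD predictions and site measurements of rack inlet temperature can be compared, the snippet below computes a per-rack error and flags racks exceeding an inlet specification; the rack labels, temperatures, and the 32 deg C limit are placeholders, not data from the paper.

```python
# Sketch: compare CFD-predicted rack inlet temperatures with measurements and flag
# racks exceeding an inlet specification. All values below are placeholders.
measured_c  = {"R01": 24.1, "R02": 31.6, "R03": 27.9}   # hypothetical measurements, deg C
predicted_c = {"R01": 23.0, "R02": 28.4, "R03": 28.8}   # hypothetical CFD results, deg C
SPEC_LIMIT_C = 32.0                                      # assumed inlet specification, deg C

errors = {rack: predicted_c[rack] - measured_c[rack] for rack in measured_c}
mae = sum(abs(e) for e in errors.values()) / len(errors)
over_spec = [rack for rack, t in measured_c.items() if t > SPEC_LIMIT_C]

print(f"per-rack error (CFD - measured): {errors}")
print(f"mean absolute error: {mae:.1f} C; racks over spec: {over_spec}")
```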


Author(s):  
A. A. Zatsarinny ◽  
K. I. Volovich ◽  
S. A. Denisov ◽  
Yu. S. Ionenkov ◽  
V. A. Kondrashev

This article discusses a methodology for assessing the effectiveness of a high-performance research platform. The assessment is illustrated with the example of the "Informatika" Center for Collective Use (CCU), established at the Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences (FRC CSC RAS) for solving problems in the synthesis of new materials. The main objective of the "Informatika" CCU is to conduct research using the software and hardware of the FRC CSC RAS data center, including research performed for the benefit of third-party organizations and research teams. The general characteristics of the "Informatika" CCU are presented, including the main characteristics of its scientific equipment, its work organization, and its capabilities. The hybrid high-performance computing cluster of the FRC CSC RAS (HHPCC) is part of the FRC CSC RAS data center and also part of the "Informatika" CCU. The HHPCC provides computing resources in the form of cloud services, namely Software as a Service (SaaS) and Platform as a Service (PaaS). With the aid of special technologies, scientific services are delivered to researchers in the form of subject-oriented applications. Based on an analysis of the structure and operating principles of the "Informatika" CCU, key performance indicators have been developed that take into account its specific tasks and characterize its various aspects of activity (development, operations, and performance). Evaluating CCU efficiency involves calculating, on the basis of the developed indicators, overall (generalized) indicators that characterize the efficiency of CCU operation in various areas. An integral indicator is also calculated that shows the overall efficiency of the CCU. To derive the overall performance indicators and the integral performance indicator, it is suggested to use weighted-average methods and the analytic hierarchy process. The procedure for determining partial performance indicators is considered. Specific features of the choice of CCU performance indicators for problems in the synthesis of new materials have been identified; these characterize the capabilities of the computing complex in creating a virtualization environment (peak performance of the computing system, real performance of the computing system on specialized benchmarks, equipment loading with applied tasks, and program code efficiency).
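The weighted-average aggregation of partial indicators into an integral indicator, as described above, can be sketched as follows; the indicator names, normalized values, and weights are invented for illustration and are not those developed in the article.

```python
# Sketch of a weighted-average integral performance indicator for a shared-use
# computing centre. Indicator values (normalized to [0, 1]) and weights are hypothetical.
partial_indicators = {
    "peak_performance":  0.85,   # e.g. fraction of design peak achieved
    "real_performance":  0.70,   # e.g. measured on specialized benchmarks
    "equipment_loading": 0.60,   # utilization by applied tasks
    "code_efficiency":   0.75,
}
weights = {
    "peak_performance":  0.20,
    "real_performance":  0.35,
    "equipment_loading": 0.25,
    "code_efficiency":   0.20,
}
assert abs(sum(weights.values()) - 1.0) < 1e-9  # weights must sum to one

integral = sum(partial_indicators[k] * weights[k] for k in partial_indicators)
print(f"integral efficiency indicator: {integral:.3f}")  # 0.715 with these numbers
```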


Author(s):  
Zahra Bouramdane ◽  
Abdellah Bah ◽  
Mohammed Alaoui ◽  
Nadia Martaj

Although thermoacoustic devices comprise simple components, the design of these machines is very challenging. In order to predict the behavior and optimize the performance of a thermoacoustic refrigerator driven by a standing-wave thermoacoustic engine, considering changes in geometrical parameters, two approaches are presented in this paper. The first approach is based on CFD analysis, in which a 2D model is implemented to investigate the influence of the stack parameters on refrigerator performance, to analyze the time variation of the temperature gradient across the stack, and to examine refrigerator performance in terms of refrigeration temperature. The second approach is based on an optimization algorithm built on the simplified linear thermoacoustic theory, applied to designing thermoacoustic refrigerators with different stack parameters and operating conditions. Simulation results show that the engine produces a high-powered acoustic wave with a pressure amplitude of 23 kPa and a frequency of 584 Hz, and that this wave establishes a temperature difference across the refrigeration stack with a cooling temperature of 292.8 K when the stacks are positioned next to the pressure antinode. The results from the algorithm make it possible to design a thermoacoustic refrigerator with high performance by selecting the appropriate parameters.
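As a rough, back-of-the-envelope companion to the reported 584 Hz operating frequency, the sketch below relates frequency to acoustic wavelength and resonator length for an assumed working gas; the sound speed and the half-wavelength resonator assumption are illustrative and are not taken from the paper.

```python
# Back-of-the-envelope acoustics at the reported 584 Hz operating frequency.
# The working gas (hence the sound speed) and the resonator type are assumptions.
f = 584.0   # operating frequency, Hz (reported in the abstract)
a = 343.0   # assumed speed of sound, m/s (air at about 20 C)

wavelength = a / f                 # acoustic wavelength, m
half_wave_length = wavelength / 2  # length of a half-wavelength resonator, m

print(f"wavelength ~ {wavelength * 100:.1f} cm; half-wave resonator ~ {half_wave_length * 100:.1f} cm")
# In a half-wavelength resonator the pressure antinodes sit at the closed ends, which is
# why placing the stack next to an antinode maximizes the available temperature span.
```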


Author(s):  
Magnus K. Herrlin ◽  
Michael K. Patterson

Increased Information and Communications Technology (ICT) capability and the improved energy efficiency of today’s server platforms have created opportunities for the data center operator. However, these platforms also challenge the capability of many data center cooling systems. New design considerations are necessary to effectively cool high-density data centers. Challenges exist in both the capital and operational costs of the thermal management of ICT equipment. This paper details how air cooling can be used to address both challenges and provide a low Total Cost of Ownership (TCO) and a highly energy-efficient design at high heat densities. We consider trends in heat generation from servers and how the resulting densities can be effectively cooled. A number of key factors are reviewed and appropriate design considerations developed to air-cool 2000 W/ft2 (21,500 W/m2). Although greater engineering effort is required, such data centers can be built with current technology, hardware, and best practices. The density limitations are shown to arise primarily from an airflow management and cooling system controls perspective. Computational Fluid Dynamics (CFD) modeling is discussed as a key part of the analysis, allowing high-density designs to be successfully implemented. Well-engineered airflow management and control systems designed to minimize airflow by preventing mixing of cold and hot airstreams allow high heat densities. Energy efficiency is gained by treating the whole equipment room as part of the airflow management strategy, making use of the extended environmental ranges now recommended, and implementing air-side economizers.
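A minimal sketch of the underlying airflow sizing relation, Q = m_dot * cp * dT, which governs how much air such high-density designs must move; the air properties, rack power, and temperature rise below are illustrative assumptions, not figures from the paper.

```python
# Sketch: airflow needed to remove a given heat load at an assumed air-side temperature
# rise, from Q = m_dot * cp * dT. All numbers are illustrative assumptions.
RHO_AIR = 1.15        # kg/m^3, warm data-centre air (assumed)
CP_AIR = 1005.0       # J/(kg K), specific heat of air
M3S_TO_CFM = 2118.88  # cubic metres per second to cubic feet per minute

def required_airflow_cfm(heat_load_w, delta_t_c):
    """Volumetric airflow (CFM) to absorb heat_load_w with a delta_t_c air temperature rise."""
    m_dot = heat_load_w / (CP_AIR * delta_t_c)  # mass flow, kg/s
    return (m_dot / RHO_AIR) * M3S_TO_CFM

# Example: a 30 kW rack cooled with a 12 C air-side temperature rise.
print(f"{required_airflow_cfm(30_000, 12.0):.0f} CFM")  # roughly 4600 CFM
```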


2021 ◽  
Author(s):  
Herve Gross ◽  
Antoine Mazuyer

Abstract Evaluating large basin-scale formations for CO2 sequestration is one of the most important challenges for our industry. The technical complexity and the quantification of risks associated with these operations call for new reservoir engineering and reservoir simulation tools. The impact of multiple coupled physical phenomena, the century timescale, and the basin-sized models involved in these operations force us to completely take apart and revisit the numerical backbone of existing simulation tools. We need a reservoir simulation tool designed for scalability and portability on high-performance computing architectures. To achieve this, we propose a new, open-source, multiphysics, multilevel simulation tool called GEOSX. This tool is jointly created by Lawrence Livermore National Laboratory, Stanford University, and Total. It is designed for scalability on multiple CPUs and multiple GPUs and offers a suite of physics solvers that can be extended easily while achieving a balance between performance and portability. GEOSX initially targets multiphysics simulations with coupled geomechanics, flow, and transport, but with its open architecture it provides access to high-performance physics solvers as building blocks for other multiphysics problems and gives users a suite of tools for numerical optimization across platforms. In this paper, we introduce GEOSX, expose its fundamental architectural principles, and show an example of modeling geological CO2 sequestration on real data. We demonstrate our ability to simulate coupled fluid and rock poromechanical interactions over long time periods and basin-scale dimensions. GEOSX demonstrates its usefulness for such complex and large problems and proves to be scalable and portable across multiple high-performance systems.
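To illustrate what sequential coupling of flow and mechanics looks like in the abstract, here is a generic fixed-point coupling loop sketch; the function names and structure are placeholders and do not represent the GEOSX API or its actual solution strategy.

```python
# Generic sketch of a sequentially coupled flow/mechanics time loop, in the spirit of
# the coupled poromechanics described above. The solver functions are placeholders and
# do not correspond to GEOSX code.
def solve_flow(pressure, displacement, dt):
    ...  # placeholder: advance pore pressure, with porosity/stress feedback from mechanics
    return pressure

def solve_mechanics(pressure, displacement):
    ...  # placeholder: update displacement and stress given the new pore pressure
    return displacement

def run(pressure, displacement, dt, n_steps, n_coupling_iters=5):
    for _ in range(n_steps):
        for _ in range(n_coupling_iters):  # outer fixed-point iteration couples the physics
            pressure = solve_flow(pressure, displacement, dt)
            displacement = solve_mechanics(pressure, displacement)
    return pressure, displacement
```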


Author(s):  
Sadegh Khalili ◽  
Srikanth Rangarajan ◽  
Bahgat Sammakia ◽  
Vadim Gektin

Abstract Increasing power densities in data centers due to the rise of Artificial Intelligence (AI), high-performance computing (HPC), and machine learning compel engineers to develop new cooling strategies and designs for high-density data centers. Two-phase cooling is one of the promising technologies that exploits the latent heat of the fluid. This technology is much more effective in removing high heat fluxes than using the sensible heat of the fluid, and it requires lower coolant flow rates. The latent heat also implies more uniformity in the temperature of a heated surface. Despite the benefits of two-phase cooling, the phase change adds complexities to a system when multiple evaporators (potentially exposed to different heat fluxes) are connected to one coolant distribution unit (CDU). In this paper, a commercial pumped two-phase cooling system is investigated at the rack level. Seventeen 2-rack-unit (RU) servers from two distinct models are retrofitted and deployed in the rack. The flow rate and pressure distribution across the rack are studied at various filling ratios. Also investigated is the transient behavior of the cooling system due to a step change in the information technology (IT) load.
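The flow-rate advantage of latent-heat cooling mentioned above can be illustrated with a simple comparison of single-phase and two-phase coolant flow rates for the same heat load; the fluid properties and exit vapor quality below are assumptions, not values from the study.

```python
# Sketch: coolant flow needed to remove 1 kW using sensible heat (single phase) versus
# latent heat (two phase). Property values are assumed, loosely refrigerant-like.
Q_W = 1000.0      # heat load per evaporator, W
CP = 1400.0       # liquid specific heat, J/(kg K) (assumed)
DT = 10.0         # allowed liquid temperature rise, K (assumed)
H_FG = 190e3      # latent heat of vaporization, J/kg (assumed)
X_EXIT = 0.6      # exit vapor quality (assumed)

m_single = Q_W / (CP * DT)        # kg/s for single-phase (sensible) cooling
m_two = Q_W / (H_FG * X_EXIT)     # kg/s for two-phase (latent) cooling
print(f"single-phase: {m_single * 1000:.1f} g/s, two-phase: {m_two * 1000:.1f} g/s "
      f"(~{m_single / m_two:.0f}x lower flow)")
```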


Author(s):  
Satyam Saini ◽  
Kaustubh K. Adsul ◽  
Pardeep Shahi ◽  
Amirreza Niazmand ◽  
Pratik Bansode ◽  
...  

Abstract Modern-day data center administrators are finding it increasingly difficult to lower the costs incurred in the mechanical cooling of their IT equipment. This is especially true for high-performance computing facilities supporting workloads such as artificial intelligence, bitcoin mining, and deep learning. Airside economization, or free air cooling, has long been available as a technology for reducing mechanical cooling costs. In free air cooling, under favorable ambient conditions of temperature and humidity, outside air can be used to cool the IT equipment. In doing so, the IT equipment is exposed to sub-micron particulate and gaseous contaminants that may enter the data center facility with the cooling airflow. The present investigation uses a computational approach to model the airflow paths of particulate contaminants entering the IT equipment, using a commercially available CFD code. A discrete phase particle modeling approach is chosen to calculate the trajectories of the dispersed contaminants. A standard RANS approach is used to model the airflow, and the particles are superimposed on the flow field by the CFD solver using Lagrangian particle tracking. The server geometry was modeled in 2-D with a combination of rectangular and cylindrical obstructions. This was done to understand the effect of changes in obstruction type and aspect ratio on particle distribution. Identifying such discrete areas of contaminant proliferation, based on the concentration fields arising from changing geometries, will help with the mitigation of particulate-contamination-related failures in data centers.
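A minimal sketch of one-way-coupled Lagrangian particle tracking of the kind described, in which a particle is advanced through a prescribed carrier flow with a Stokes-drag response; the flow field, particle relaxation time, and time step are illustrative and are not the study's model.

```python
# Minimal one-way-coupled Lagrangian tracking sketch: a small particle relaxes toward a
# prescribed 2-D carrier flow through Stokes drag. All values are illustrative.
import math

def carrier_velocity(x, y):
    """Placeholder carrier flow field (m/s); not the CFD solution from the study."""
    return 1.5, 0.2 * math.sin(20.0 * x)

def track(x, y, vx, vy, tau_p=2e-3, dt=1e-4, steps=2000):
    """Advance one particle; tau_p is the particle relaxation time (s)."""
    path = [(x, y)]
    for _ in range(steps):
        ux, uy = carrier_velocity(x, y)
        vx += (ux - vx) / tau_p * dt  # Stokes drag pulls the particle toward the fluid velocity
        vy += (uy - vy) / tau_p * dt
        x, y = x + vx * dt, y + vy * dt
        path.append((x, y))
    return path

print(track(0.0, 0.05, 0.0, 0.0)[-1])  # final particle position after 0.2 s
```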


2020 ◽  
Vol 10 (4) ◽  
pp. 32
Author(s):  
Sayed Ashraf Mamun ◽  
Alexander Gilday ◽  
Amit Kumar Singh ◽  
Amlan Ganguly ◽  
Geoff V. Merrett ◽  
...  

Servers in a data center are underutilized due to over-provisioning, which contributes heavily to the high power consumption of data centers. Recent research on optimizing the energy consumption of High Performance Computing (HPC) data centers mostly focuses on the consolidation of Virtual Machines (VMs) and on dynamic voltage and frequency scaling (DVFS). These approaches are inherently hardware-based, are frequently unique to individual systems, and often rely on simulation due to lack of access to HPC data centers. Other approaches require profiling information on the jobs in the HPC system to be available before run-time. In this paper, we propose a reinforcement learning based approach that jointly optimizes profit and energy in the allocation of jobs to available resources, without the need for such prior information. The approach is implemented in a software scheduler used to allocate real applications from the Princeton Application Repository for Shared-Memory Computers (PARSEC) benchmark suite to a number of hardware nodes realized with Odroid-XU3 boards. Experiments show that the proposed approach increases the profit earned by 40% while simultaneously reducing energy consumption by 20% compared with a heuristic-based approach. We also present a network-aware server consolidation algorithm called Bandwidth-Constrained Consolidation (BCC) for HPC data centers, which can address the under-utilization problem of the servers. Our experiments show that the BCC consolidation technique can reduce the power consumption of a data center by up to 37%.
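As a toy illustration of a joint profit-and-energy objective in a Q-learning style allocator, the sketch below learns a preferred node per load state; the state encoding, reward weights, and node cost figures are invented and are not the scheduler proposed in the paper.

```python
# Toy Q-learning sketch: allocate an incoming job to one of several nodes while trading
# off profit against energy. Reward weights and per-node cost figures are hypothetical.
import random

NODES = 4
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2       # learning rate, discount, exploration rate
W_PROFIT, W_ENERGY = 1.0, 0.5           # assumed weighting of the two objectives

q = [[0.0] * NODES for _ in range(2)]   # states: 0 = light cluster load, 1 = heavy load

def reward(node, state):
    profit = 10.0 - 2.0 * state                 # hypothetical revenue for the job
    energy = 3.0 + 1.5 * node + 2.0 * state     # hypothetical energy cost of this placement
    return W_PROFIT * profit - W_ENERGY * energy

for _ in range(5000):
    state = random.randint(0, 1)
    if random.random() < EPS:
        node = random.randrange(NODES)                           # explore
    else:
        node = max(range(NODES), key=lambda a: q[state][a])      # exploit
    next_state = random.randint(0, 1)
    q[state][node] += ALPHA * (reward(node, state) + GAMMA * max(q[next_state]) - q[state][node])

print("preferred node per state:", [max(range(NODES), key=lambda a: q[s][a]) for s in range(2)])
```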

