Analysis on the Cooling Efficiency of High-Performance Multicore Processors according to Cooling Methods

Seung-Gu Kang; Hong-Jun Choi; Jin-Woo Ahn; Jae-Hyung Park; Jong-Myon Kim; Cheol-Hong Kim

doi:10.9708/jksci.2011.16.7.001

High Performance Parallelization of COMPSYN on a Cluster of Multicore Processors with GPUs

Procedia Computer Science ◽

10.1016/j.procs.2012.04.103 ◽

2012 ◽

Vol 9 ◽

pp. 966-975

Author(s):

Ferdinando Alessi ◽

Annalisa Massini ◽

Roberto Basili

Keyword(s):

High Performance ◽

Multicore Processors

High Performance Topology-Aware Communication in Multicore Processors

Chapman & Hall/CRC Computational Science - Scientific Computing with Multicore and Accelerators ◽

10.1201/b10376-30 ◽

2010 ◽

pp. 443-460

Author(s):

Hari Subramoni ◽

Fabrizio Petrini ◽

Virat Agarwal ◽

Davide Pasetto

Keyword(s):

High Performance ◽

Multicore Processors

NUMA-Aware DGEMM Based on 64-Bit ARMv8 Multicore Processors Architecture

Electronics ◽

10.3390/electronics10161984 ◽

2021 ◽

Vol 10 (16) ◽

pp. 1984

Author(s):

Wei Zhang ◽

Zihao Jiang ◽

Zhiguang Chen ◽

Nong Xiao ◽

Yang Ou

Keyword(s):

Energy Efficiency ◽

High Performance ◽

Multicore Processors ◽

Matrix Multiplication ◽

Memory Access ◽

Double Precision ◽

Competitive Performance ◽

General Matrix ◽

Remarkable Improvement ◽

Task Independence

Double-precision general matrix multiplication (DGEMM) is an essential kernel for measuring the potential performance of an HPC platform. ARMv8-based system-on-chips (SoCs) have become candidates for next-generation HPC systems thanks to their highly competitive performance and energy efficiency. It is therefore worthwhile to design a high-performance DGEMM for ARMv8-based SoCs. However, as ARMv8-based SoCs integrate more and more cores, modern CPUs adopt non-uniform memory access (NUMA). NUMA limits the performance and scalability of DGEMM when many threads access remote NUMA domains, which makes it challenging to develop a high-performance DGEMM on multi-NUMA architectures. We present a NUMA-aware method that reduces the number of cross-die and cross-chip memory access events. The critical enabler for NUMA-aware DGEMM is to exploit two levels of parallelism, between and within NUMA nodes, in a purely threaded implementation, which provides task independence and data locality for each NUMA node. We have implemented NUMA-aware DGEMM in OpenBLAS and evaluated it on a dual-socket server with 48-core processors based on the Kunpeng920 architecture. The results show that NUMA-aware DGEMM effectively reduces the number of cross-die and cross-chip memory accesses, which significantly improves the scalability of DGEMM and increases its performance by 17.1% on average, with a maximum improvement of 21.9%.
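To make the idea of two levels of parallelism concrete, here is a minimal sketch of one possible work split. It is not the authors' OpenBLAS implementation: it assumes the rows of the output matrix are first divided among NUMA nodes and then among the threads of each node, so that every thread owns a contiguous row block inside its node's slice. The helper names `split_range` and `two_level_partition`, the row-wise split, and the example sizes (4 nodes with 24 threads each) are all illustrative assumptions.

```python
def split_range(start, stop, parts):
    """Split [start, stop) into `parts` contiguous chunks whose sizes differ by at most one."""
    size, extra = divmod(stop - start, parts)
    chunks = []
    lo = start
    for i in range(parts):
        hi = lo + size + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


def two_level_partition(m, nodes, threads_per_node):
    """Map (node, thread) to a row range of the m-row output matrix.

    Level 1 splits rows across NUMA nodes; level 2 splits each node's
    slice across that node's threads. Hypothetical scheme for illustration.
    """
    plan = {}
    for node, (n_lo, n_hi) in enumerate(split_range(0, m, nodes)):
        for thread, rows in enumerate(split_range(n_lo, n_hi, threads_per_node)):
            plan[(node, thread)] = rows
    return plan


# Assumed example sizes, not taken from the paper.
plan = two_level_partition(m=9600, nodes=4, threads_per_node=24)
print(plan[(0, 0)], plan[(3, 23)])
```

Under this assumed scheme, each node's row slice could be allocated in that node's local memory, so threads would mostly touch local data; the abstract does not say how the actual implementation places data.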

Designing of High Performance Multicore Processor with Improved Cache Configuration and Interconnect

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Emerging Research Surrounding Power Consumption and Performance Issues in Utility Computing ◽

10.4018/978-1-4666-8853-7.ch009 ◽

2016 ◽

pp. 204-219

Author(s):

Ram Prasad Mohanty ◽

Ashok Kumar Turuk ◽

Bibhudatta Sahoo

Keyword(s):

High Performance ◽

Multicore Processors ◽

Multicore Processor ◽

Cache Size ◽

L2 Cache ◽

Internal Network ◽

On Chip ◽

L1 And L2 ◽

The Impact ◽

Cache Configuration

The growing number of cores increases the demand for a powerful memory subsystem, which has led to larger caches in multicore processors. Caches give processing elements a faster, higher-bandwidth local memory to work with. In this chapter, the impact of cache size on the performance of multicore processors is analyzed by varying the L1 and L2 cache sizes of a multicore processor with internal network (MPIN) modeled on the Niagara architecture. As the number of cores increases, traditional on-chip interconnects such as the bus and the crossbar prove inefficient and scale poorly. To overcome the scalability and efficiency issues of these conventional interconnects, ring-based designs have been proposed. The effect of the interconnect on the performance of multicore processors is analyzed, and a novel scalable on-chip interconnection mechanism (INoC) for multicore processors is proposed. Benchmark results are obtained with a full-system simulator. The results show that the proposed INoC significantly reduces execution time compared with the MPIN.

Influence of the ambient temperature on the cooling efficiency of the high performance cooling device with thermosiphon effect

EPJ Web of Conferences ◽

10.1051/epjconf/201818002073 ◽

2018 ◽

Vol 180 ◽

pp. 02073

Author(s):

Patrik Nemec ◽

Milan Malcho

Keyword(s):

Ambient Temperature ◽

High Performance ◽

Convection Heat Transfer ◽

Heat Removal ◽

Measuring Method ◽

High Heat ◽

Heat Fluxes ◽

High Heat Flux ◽

Cooling Efficiency ◽

Cooling Device

This work deals with the experimental measurement and calculation of the cooling efficiency of a cooling device based on heat pipe technology. The device described in the article is capable of transferring high heat fluxes from electronic elements to the surroundings. The work describes the device, its working principle, and its construction. The main factors affecting the dissipation of high heat flux from electronic elements through the cooling device to the surroundings are the condenser construction, its capacity, and the method of heat removal. The experimental part describes the method used to measure the cooling efficiency of the device as a function of ambient temperature in the range of -20 to 40 °C at a heat load of 750 W from the electronic components. The measured results are compared with results calculated from the physical phenomena of boiling, condensation, and natural convection heat transfer.

Parallel Skyline Computation Exploiting the Lattice Structure

Journal of Database Management ◽

10.4018/jdm.2015100102 ◽

2015 ◽

Vol 26 (4) ◽

pp. 18-43 ◽

Cited By ~ 7

Author(s):

Markus Endres ◽

Werner Kießling

Keyword(s):

High Performance ◽

Lattice Structure ◽

Multicore Processors ◽

Optimization Techniques ◽

Skyline Query ◽

Research Attention ◽

Evaluation Strategies ◽

Hardware Architectures ◽

Parallel Evaluation ◽

New Algorithms

The problem of Skyline computation has attracted considerable research attention in the last decade. A Skyline query selects those tuples from a dataset that are optimal with respect to a set of designated preference attributes. Since multicore processors are going mainstream, it has become imperative to develop parallel algorithms that fully exploit the advantages of such modern hardware architectures. In this paper, the authors present high-performance parallel Skyline algorithms based on the lattice structure generated by a Skyline query. For this, they propose different evaluation strategies and compare several data structures for the parallel evaluation of Skyline queries. The authors present novel optimization techniques for lattice-based Skyline algorithms based on pruning and on removing one unrestricted attribute domain. They demonstrate through comprehensive experiments on synthetic and real datasets that their new algorithms outperform state-of-the-art multicore Skyline techniques for low-cardinality domains. The authors' algorithms have linear runtime complexity and take full advantage of modern hardware architectures.
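As a small illustration of what a Skyline query returns, the sketch below implements the definition directly with a naive quadratic scan. It is not the lattice-based parallel algorithm described in the abstract. It assumes that lower values are preferred on every attribute, and the `hotels` data (price, distance) is made up for the example.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` on every attribute and
    strictly better on at least one (lower is better, by assumption)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical tuples: (price, distance to the beach).
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]
print(skyline(hotels))  # prints [(50, 8), (60, 2), (80, 1)]
```

Here `(70, 5)` is dropped because `(60, 2)` is both cheaper and closer, and `(90, 9)` is dropped because `(50, 8)` beats it on both attributes; the remaining three tuples each represent a different trade-off.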

High Performance Memory Requests Scheduling Technique for Multicore Processors

2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems ◽

10.1109/hpcc.2012.26 ◽

2012 ◽

Cited By ~ 5

Author(s):

Walid El-Reedy ◽

Ali A. El-Moursy ◽

Hossam A.H. Fahmy

Keyword(s):

High Performance ◽

Multicore Processors ◽

Scheduling Technique

FILM COOLING OVER A FLAT PLATE WITH COOLANT SUPPLY IN TO TRIANGULAR INDENTATION

Industrial Heat Engineering ◽

10.31472/ihe.3.2018.01 ◽

2018 ◽

Vol 40 (3) ◽

pp. 5-11

Author(s):

A.A. Khalatov ◽

N.A. Panchenko ◽

O.O. Petliak

Keyword(s):

Coriolis Force ◽

Film Cooling ◽

High Performance ◽

Turbine Blades ◽

Engineering Thermophysics ◽

Cooling Efficiency ◽

National Academy Of Sciences ◽

Peak Displacement ◽

Sst Turbulence Model ◽

Film Cooling Efficiency

Modern high-performance gas turbine engines operate at flow temperatures exceeding the melting temperature of the blade materials, which makes blade cooling necessary. However, the traditional film cooling scheme is characterized by secondary vortex structures that destroy the coolant film. Among the existing alternative film cooling schemes that protect turbine blades from high temperatures, the scheme with triangular dimples has demonstrated good results under stationary conditions. This cooling scheme was patented and tested at the Institute of Engineering Thermophysics, National Academy of Sciences of Ukraine. To determine the feasibility of such a scheme, it is necessary to consider the effect of blade rotation on film cooling efficiency. This work presents a theoretical investigation of the film cooling efficiency of this scheme under rotation. The study was performed in the ANSYS CFX package using the SST turbulence model. The blowing ratio was varied from 0.5 to 2.0. Numerical simulations were performed for rotation rates corresponding to the dominant influence of the Coriolis force (10 and 100 rpm) and of centrifugal forces (3000, 5000, and 7000 rpm). The simulations show that rotation only weakly affects the average film cooling efficiency in the Coriolis-dominated regime, but causes a peak displacement of the local adiabatic efficiency at 7000 rpm, where the flow lines become distorted.

Droplet-Based Microfluidic Thermal Management Methods for High Performance Electronic Devices

Micromachines ◽

10.3390/mi10020089 ◽

2019 ◽

Vol 10 (2) ◽

pp. 89 ◽

Cited By ~ 2

Author(s):

Zhibin Yan ◽

Mingliang Jin ◽

Zhengguang Li ◽

Guofu Zhou ◽

Lingling Shui

Keyword(s):

Heat Flux ◽

Thermal Management ◽

High Performance ◽

Rapid Development ◽

Electronic Devices ◽

High Heat ◽

High Heat Flux ◽

Electrowetting On Dielectric ◽

Handling Methods ◽

Cooling Methods

Advanced thermal management methods have been a key issue for the rapid development of the electronics industry following Moore’s law. Droplet-based microfluidic cooling technologies are considered promising solutions to the major challenges of high heat flux removal and nonuniform temperature distribution in confined spaces for high-performance electronic devices. In this paper, we review the state-of-the-art droplet-based microfluidic cooling methods in the literature, including the basic theory of electrocapillarity; cooling applications of continuous electrowetting (CEW), electrowetting (EW), and electrowetting-on-dielectric (EWOD); and jumping-droplet microfluidic liquid handling methods. Droplet-based microfluidic cooling methods have shown an attractive capability for microscale liquid manipulation and relatively high heat flux removal for hot spots. Recommendations are made for further research on developing advanced liquid coolant materials and optimizing system operation parameters.

High Performance Metal-Based Nanocomposite Thermal Interface Materials Toward Enhanced Cooling Efficiency in Electronic Applications

2018 IEEE 68th Electronic Components and Technology Conference (ECTC) ◽

10.1109/ectc.2018.00091 ◽

2018 ◽

Author(s):

Nirup Nagabandi ◽

Cengiz Yegin ◽

Mustafa Akbulut

Keyword(s):

High Performance ◽

Thermal Interface Materials ◽

Cooling Efficiency ◽

Thermal Interface ◽

Interface Materials
