scholarly journals MPI Runtime Error Detection with MUST: Advances in Deadlock Detection

2013 ◽  
Vol 21 (3-4) ◽  
pp. 109-121 ◽  
Author(s):  
Tobias Hilbrich ◽  
Joachim Protze ◽  
Martin Schulz ◽  
Bronis R. de Supinski ◽  
Matthias S. Müller

The widely used Message Passing Interface (MPI) is complex and rich. As a result, application developers require automated tools to avoid and to detect MPI programming errors. We present the Marmot Umpire Scalable Tool (MUST) that detects such errors with significantly increased scalability. We present improvements to our graph-based deadlock detection approach for MPI, which cover future MPI extensions. Our enhancements also check complex MPI constructs that no previous graph-based detection approach handled correctly. Finally, we present optimizations for the processing of MPI operations that reduce runtime deadlock detection overheads. Existing approaches often require 𝒪(p) analysis time per MPI operation, forpprocesses. We empirically observe that our improvements lead to sub-linear or better analysis time per operation for a wide range of real world applications.

Author(s):  
Anoosheh Niavarani-Kheirier ◽  
Masoud Darbandi ◽  
Gerry E. Schneider

The main objective of the current work is to utilize Lattice Boltzmann Method (LBM) for simulating buoyancy-driven flow considering the hybrid thermal lattice Boltzmann equation (HTLBE). After deriving the required formulations, they are validated against a wide range of Rayleigh numbers in buoyancy-driven square cavity problem. The performance of the method is investigated on parallel machines using Message Passing Interface (MPI) library and implementing domain decomposition technique to solve problems with large order of computations. The achieved results show that the code is highly efficient to solve large scale problems with excellent speedup.


Author(s):  
Raed AlDhubhani ◽  
Fathy Eassa ◽  
Faisal Saeed

Deadlock detection is one of the main issues of software testing in High Performance Computing (HPC) and also inexascale computing areas in the near future. Developing and testing programs for machines which have millions of cores is not an easy task. HPC program consists of thousands (or millions) of parallel processes which need to communicate with each other in the runtime. Message Passing Interface (MPI) is a standard library which provides this communication capability and it is frequently used in the HPC. Exascale programs are expected to be developed using MPI standard library. For parallel programs, deadlock is one of the expected problems. In this paper, we discuss the deadlock detection for exascale MPI-based programs where the scalability and efficiency are critical issues. The proposed method detects and flags the processes and communication operations which are potential to cause deadlocks in a scalable and efficient manner. MPI benchmark programs were used to test the proposed method.


2015 ◽  
Vol 8 (10) ◽  
pp. 3333-3348 ◽  
Author(s):  
W. He ◽  
C. Beyer ◽  
J. H. Fleckenstein ◽  
E. Jang ◽  
O. Kolditz ◽  
...  

Abstract. The open-source scientific software packages OpenGeoSys and IPhreeqc have been coupled to set up and simulate thermo-hydro-mechanical-chemical coupled processes with simultaneous consideration of aqueous geochemical reactions faster and easier on high-performance computers. In combination with the elaborated and extendable chemical database of IPhreeqc, it will be possible to set up a wide range of multiphysics problems with numerous chemical reactions that are known to influence water quality in porous and fractured media. A flexible parallelization scheme using MPI (Message Passing Interface) grouping techniques has been implemented, which allows an optimized allocation of computer resources for the node-wise calculation of chemical reactions on the one hand and the underlying processes such as for groundwater flow or solute transport on the other. This technical paper presents the implementation, verification, and parallelization scheme of the coupling interface, and discusses its performance and precision.


Author(s):  
Raed AlDhubhani ◽  
Fathy Eassa ◽  
Faisal Saeed

Deadlock detection is one of the main issues of software testing in High Performance Computing (HPC) and also inexascale computing areas in the near future. Developing and testing programs for machines which have millions of cores is not an easy task. HPC program consists of thousands (or millions) of parallel processes which need to communicate with each other in the runtime. Message Passing Interface (MPI) is a standard library which provides this communication capability and it is frequently used in the HPC. Exascale programs are expected to be developed using MPI standard library. For parallel programs, deadlock is one of the expected problems. In this paper, we discuss the deadlock detection for exascale MPI-based programs where the scalability and efficiency are critical issues. The proposed method detects and flags the processes and communication operations which are potential to cause deadlocks in a scalable and efficient manner. MPI benchmark programs were used to test the proposed method.


2020 ◽  
Vol 15 ◽  
Author(s):  
Weiwen Zhang ◽  
Long Wang ◽  
Theint Theint Aye ◽  
Juniarto Samsudin ◽  
Yongqing Zhu

Background: Genotype imputation as a service is developed to enable researchers to estimate genotypes on haplotyped data without performing whole genome sequencing. However, genotype imputation is computation intensive and thus it remains a challenge to satisfy the high performance requirement of genome wide association study (GWAS). Objective: In this paper, we propose a high performance computing solution for genotype imputation on supercomputers to enhance its execution performance. Method: We design and implement a multi-level parallelization that includes job level, process level and thread level parallelization, enabled by job scheduling management, message passing interface (MPI) and OpenMP, respectively. It involves job distribution, chunk partition and execution, parallelized iteration for imputation and data concatenation. Due to the design of multi-level parallelization, we can exploit the multi-machine/multi-core architecture to improve the performance of genotype imputation. Results: Experiment results show that our proposed method can outperform the Hadoop-based implementation of genotype imputation. Moreover, we conduct the experiments on supercomputers to evaluate the performance of the proposed method. The evaluation shows that it can significantly shorten the execution time, thus improving the performance for genotype imputation. Conclusion: The proposed multi-level parallelization, when deployed as an imputation as a service, will facilitate bioinformatics researchers in Singapore to conduct genotype imputation and enhance the association study.


Photonics ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 3
Author(s):  
Shun Qin ◽  
Wai Kin Chan

Accurate segmented mirror wavefront sensing and control is essential for next-generation large aperture telescope system design. In this paper, a direct tip–tilt and piston error detection technique based on model-based phase retrieval with multiple defocused images is proposed for segmented mirror wavefront sensing. In our technique, the tip–tilt and piston error are represented by a basis consisting of three basic plane functions with respect to the x, y, and z axis so that they can be parameterized by the coefficients of these bases; the coefficients then are solved by a non-linear optimization method with the defocus multi-images. Simulation results show that the proposed technique is capable of measuring high dynamic range wavefront error reaching 7λ, while resulting in high detection accuracy. The algorithm is demonstrated as robust to noise by introducing phase parameterization. In comparison, the proposed tip–tilt and piston error detection approach is much easier to implement than many existing methods, which usually introduce extra sensors and devices, as it is a technique based on multiple images. These characteristics make it promising for the application of wavefront sensing and control in next-generation large aperture telescopes.


2021 ◽  
Vol 11 (8) ◽  
pp. 3397
Author(s):  
Gustavo Assunção ◽  
Nuno Gonçalves ◽  
Paulo Menezes

Human beings have developed fantastic abilities to integrate information from various sensory sources exploring their inherent complementarity. Perceptual capabilities are therefore heightened, enabling, for instance, the well-known "cocktail party" and McGurk effects, i.e., speech disambiguation from a panoply of sound signals. This fusion ability is also key in refining the perception of sound source location, as in distinguishing whose voice is being heard in a group conversation. Furthermore, neuroscience has successfully identified the superior colliculus region in the brain as the one responsible for this modality fusion, with a handful of biological models having been proposed to approach its underlying neurophysiological process. Deriving inspiration from one of these models, this paper presents a methodology for effectively fusing correlated auditory and visual information for active speaker detection. Such an ability can have a wide range of applications, from teleconferencing systems to social robotics. The detection approach initially routes auditory and visual information through two specialized neural network structures. The resulting embeddings are fused via a novel layer based on the superior colliculus, whose topological structure emulates spatial neuron cross-mapping of unimodal perceptual fields. The validation process employed two publicly available datasets, with achieved results confirming and greatly surpassing initial expectations.


2021 ◽  
Vol 48 (4) ◽  
pp. 3-3
Author(s):  
Ingo Weber

Blockchain is a novel distributed ledger technology. Through its features and smart contract capabilities, a wide range of application areas opened up for blockchain-based innovation [5]. In order to analyse how concrete blockchain systems as well as blockchain applications are used, data must be extracted from these systems. Due to various complexities inherent in blockchain, the question how to interpret such data is non-trivial. Such interpretation should often be shared among parties, e.g., if they collaborate via a blockchain. To this end, we devised an approach codify the interpretation of blockchain data, to extract data from blockchains accordingly, and to output it in suitable formats [1, 2]. This work will be the main topic of the keynote. In addition, application developers and users of blockchain applications may want to estimate the cost of using or operating a blockchain application. In the keynote, I will also discuss our cost estimation method [3, 4]. This method was designed for the Ethereum blockchain platform, where cost also relates to transaction complexity, and therefore also to system throughput.


Energies ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 2284
Author(s):  
Krzysztof Przystupa ◽  
Mykola Beshley ◽  
Olena Hordiichuk-Bublivska ◽  
Marian Kyryk ◽  
Halyna Beshley ◽  
...  

The problem of analyzing a big amount of user data to determine their preferences and, based on these data, to provide recommendations on new products is important. Depending on the correctness and timeliness of the recommendations, significant profits or losses can be obtained. The task of analyzing data on users of services of companies is carried out in special recommendation systems. However, with a large number of users, the data for processing become very big, which causes complexity in the work of recommendation systems. For efficient data analysis in commercial systems, the Singular Value Decomposition (SVD) method can perform intelligent analysis of information. With a large amount of processed information we proposed to use distributed systems. This approach allows reducing time of data processing and recommendations to users. For the experimental study, we implemented the distributed SVD method using Message Passing Interface, Hadoop and Spark technologies and obtained the results of reducing the time of data processing when using distributed systems compared to non-distributed ones.


Sign in / Sign up

Export Citation Format

Share Document