Parallelization of an Unsteady ALE Solver with Deforming Mesh Using OpenACC

2017 ◽  
Vol 2017 ◽  
pp. 1-16 ◽  
Author(s):  
Wenpeng Ma ◽  
Zhonghua Lu ◽  
Wu Yuan ◽  
Xiaodong Hu

This paper presents a parallel, GPU-based, deforming-mesh-enabled unsteady numerical solver for moving-body problems using OpenACC. Both 2D and 3D parallel algorithms based on spring-like deforming-mesh methods are proposed and implemented through the OpenACC programming model. These algorithms are then coupled with an unstructured-mesh numerical solver with an implicit time-integration scheme, making the full GPU version of the solver capable of handling unsteady calculations with a deforming mesh. Experimental results show that the proposed parallel deforming-mesh algorithm achieves over 2.5x speedup on a K20 GPU compared with 20 OpenMP threads on Intel E5-2658 v2 CPU cores. Both 2D and 3D cases are used to validate the efficiency, correctness, and accuracy of the present solver.
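As a rough illustration of the kind of kernel such a spring-analogy method offloads, the sketch below shows one Jacobi-style relaxation sweep over a 2D unstructured mesh, parallelized with an OpenACC loop directive. This is not the paper's code: the CSR neighbour arrays, the stiffness weights, and the boundary-motion handling are assumptions, and a real solver would iterate this sweep (and move data once, not per call) until the interior nodes equilibrate.

```cpp
// Minimal sketch of one spring-analogy smoothing sweep, offloaded with OpenACC.
// All array layouts and names are hypothetical.
void spring_sweep(int n_nodes, int n_edges,
                  const int* off,      // CSR offsets, size n_nodes+1
                  const int* nbr,      // neighbour node ids, size n_edges
                  const double* w,     // spring stiffness per edge, size n_edges
                  const int* fixed,    // 1 = node with prescribed (boundary) motion
                  const double* x, const double* y,   // current coordinates
                  double* xn, double* yn)              // updated coordinates
{
    #pragma acc parallel loop copyin(off[0:n_nodes+1], nbr[0:n_edges], w[0:n_edges], \
                                     fixed[0:n_nodes], x[0:n_nodes], y[0:n_nodes]) \
                              copyout(xn[0:n_nodes], yn[0:n_nodes])
    for (int i = 0; i < n_nodes; ++i) {
        if (fixed[i]) { xn[i] = x[i]; yn[i] = y[i]; continue; }
        double sx = 0.0, sy = 0.0, sw = 0.0;
        for (int k = off[i]; k < off[i + 1]; ++k) {
            sx += w[k] * x[nbr[k]];
            sy += w[k] * y[nbr[k]];
            sw += w[k];
        }
        // Each free node relaxes to the stiffness-weighted average of its neighbours.
        xn[i] = sx / sw;
        yn[i] = sy / sw;
    }
}
```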

2013 ◽  
Vol 2013 ◽  
pp. 1-16 ◽  
Author(s):  
Marzio Sala ◽  
Pénélope Leyland ◽  
Angelo Casagrande

A parallel adaptive pseudo-transient Newton-Krylov-Schwarz (αΨNKS) method for the solution of compressible flows is presented. Multidimensional upwind residual distribution schemes are used for the space discretisation, while an implicit time-marching scheme is employed for the discretisation of the (pseudo-)time derivative. The linear system arising from the Newton method applied to the resulting nonlinear system is solved by means of Krylov iterations with Schwarz-type preconditioners. A scalable and efficient data structure for the αΨNKS procedure is presented. The main computational kernels are considered, and an extensive analysis is reported comparing the Krylov accelerators and the preconditioning techniques. Results, obtained on a distributed-memory computer, are presented for 2D and 3D problems of aeronautical interest on unstructured grids.
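For readers unfamiliar with the method class, the following is a standard pseudo-transient continuation formulation consistent with the description above; the notation (mass matrix M, pseudo-time step Δτ, SER update) is assumed rather than taken from the paper.

```latex
% One pseudo-time step: an (inexact) Newton update for the steady residual R(u),
% with the pseudo-time term regularising the Jacobian.
\left(\frac{1}{\Delta\tau^{\,n}}\,M \;+\; \left.\frac{\partial R}{\partial u}\right|_{u^{n}}\right)\delta u
  \;=\; -\,R(u^{n}),
\qquad u^{n+1} = u^{n} + \delta u,
\qquad
\Delta\tau^{\,n+1} = \Delta\tau^{\,n}\,
  \frac{\lVert R(u^{n-1})\rVert}{\lVert R(u^{n})\rVert}
  \quad \text{(SER growth).}
% The linear system is solved with a Krylov method (e.g. GMRES) preconditioned
% by an additive-Schwarz operator assembled from the subdomain blocks.
```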


Author(s):  
A. L. Sayeth Saabith ◽  
Elankovan Sundararajan ◽  
Azuraliza Abu Bakar

The Apriori algorithm is a classical association rule mining (ARM) algorithm widely used for generating frequent itemsets. However, the original Apriori algorithm has limitations: it needs to scan the dataset many times to discover all frequent itemsets and it generates a huge number of candidate itemsets. To overcome these limitations, researchers have made many improvements to Apriori, such as candidate generation, mining without candidate generation, transaction reduction, partitioning, and sampling. When it comes to mining massive data, these algorithms fail to be efficient because of processing capacity, storage capacity, and main memory constraints. Therefore, parallel and distributed algorithms have been developed to perform large-scale ARM computations on multiple processors. However, the problems with most parallel and distributed frameworks are the overhead of managing a distributed system, the lack of a high-level parallel programming language, and node failures. Hadoop-MapReduce is an efficient, scalable, and simplified programming model for massive data processing, and it is also available in cloud environments. Cloud computing offers huge computing resources and capacity to address big data challenges. Recently, many parallel algorithms have been proposed on Hadoop-MapReduce to enhance the performance of the Apriori algorithm, but a drawback remains: since multiple scans over the dataset are needed to generate candidate itemsets, they consume more execution time. The aim of this study is to propose a parallel Transaction Reduction MapReduce Apriori algorithm (TRMR-Apriori), which removes unnecessary transaction items and transactions from the dataset in a parallel manner to overcome the above problems. The experiments show that TRMR-Apriori achieves better execution time for discovering frequent itemsets than previous sequential ARM algorithms such as Apriori, AprioriTid, Eclat, and FP-Growth, and than previous parallel algorithms such as PApriori, MRApriori, and Modified Apriori, under different conditions in a homogeneous computing environment using the Hadoop-MapReduce platform in the cloud. Overall, TRMR-Apriori shows its strength in extracting frequent itemsets from massive datasets in the cloud.
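To make the transaction-reduction idea concrete, the sketch below prunes items that cannot appear in any frequent itemset and drops transactions left too short for the next pass. It illustrates only the sequential concept, not the paper's Hadoop-MapReduce implementation; all names and the two-pass structure are assumptions.

```cpp
// Illustrative transaction reduction before the next candidate-generation pass.
#include <string>
#include <unordered_map>
#include <vector>

using Transaction = std::vector<std::string>;

std::vector<Transaction> reduce_transactions(const std::vector<Transaction>& db,
                                             std::size_t min_support,
                                             std::size_t k /* next itemset size */)
{
    // First pass: count item frequencies.
    std::unordered_map<std::string, std::size_t> count;
    for (const auto& t : db)
        for (const auto& item : t)
            ++count[item];

    // Second pass: keep only frequent items; drop transactions that can no
    // longer contain any k-itemset.
    std::vector<Transaction> reduced;
    for (const auto& t : db) {
        Transaction kept;
        for (const auto& item : t)
            if (count[item] >= min_support)
                kept.push_back(item);
        if (kept.size() >= k)
            reduced.push_back(std::move(kept));
    }
    return reduced;
}
```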


PLoS ONE ◽  
2021 ◽  
Vol 16 (4) ◽  
pp. e0250306
Author(s):  
Jonas Latt ◽  
Christophe Coreixas ◽  
Joël Beny

We present a novel, hardware-agnostic implementation strategy for lattice Boltzmann (LB) simulations, which yields massive performance on homogeneous and heterogeneous many-core platforms. Based solely on C++17 Parallel Algorithms, our approach does not rely on any language extensions, external libraries, vendor-specific code annotations, or pre-compilation steps. Thanks in particular to a recently proposed GPU back-end for C++17 Parallel Algorithms, we show that a single code can compile and reach state-of-the-art performance on both many-core CPU and GPU environments for the solution of a given non-trivial fluid dynamics problem. The proposed strategy is tested with six different, commonly used implementation schemes to assess the performance impact of memory access patterns on different platforms. Nine different LB collision models are included in the tests and exhibit good performance, demonstrating the versatility of our parallel approach. This work shows that it is less necessary than ever to draw a distinction between research and production software, as a concise and generic LB implementation yields performance comparable to that achievable in a hardware-specific programming language. The results also highlight the performance gains achieved by modern many-core CPUs and their apparent capability to narrow the gap with the traditionally much faster GPU platforms. All code is made available to the community in the form of the open-source project stlbm, which serves both as stand-alone simulation software and as a collection of reusable patterns for the acceleration of pre-existing LB codes.
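The core pattern behind this strategy is to express each per-cell kernel as a single C++17 parallel algorithm call, letting the execution policy decide whether the loop runs on CPU threads or, with a compiler such as nvc++ -stdpar, on a GPU. The sketch below shows that pattern with a toy relaxation step standing in for a real collision model; the actual stlbm kernels, lattice, and data layout are more elaborate, and the names used here are illustrative only.

```cpp
// Per-cell kernel dispatched through a C++17 parallel algorithm.
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

constexpr int Q = 9;            // D2Q9 populations per cell
constexpr double omega = 1.0;   // relaxation frequency (placeholder value)

void collide(std::vector<double>& f /* size = n_cells * Q */)
{
    const std::size_t n_cells = f.size() / Q;
    std::vector<std::size_t> cells(n_cells);
    std::iota(cells.begin(), cells.end(), std::size_t{0});

    // One algorithm call expresses the whole collision step; the execution
    // policy selects multi-core CPU or GPU execution.
    std::for_each(std::execution::par_unseq, cells.begin(), cells.end(),
                  [fp = f.data()](std::size_t c) {
                      double* pop = fp + c * Q;
                      // Toy "collision": relax every population towards the
                      // cell average (stand-in for the BGK equilibrium).
                      double rho = 0.0;
                      for (int q = 0; q < Q; ++q) rho += pop[q];
                      const double feq = rho / Q;
                      for (int q = 0; q < Q; ++q)
                          pop[q] += omega * (feq - pop[q]);
                  });
}
```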


2013 ◽  
Vol 65 (3) ◽  
pp. 362-379 ◽  
Author(s):  
J. Du ◽  
F. Fang ◽  
C.C. Pain ◽  
I.M. Navon ◽  
J. Zhu ◽  
...  

2000 ◽  
Vol 18 (2) ◽  
pp. 197-205 ◽  
Author(s):  
V.I. VOLKOV ◽  
V.A. GORDEYCHUK ◽  
N.S. ES'KOV ◽  
O.M. KOZYREV

The paper addresses Rayleigh–Taylor instability (RTI) problems and presents a method for describing an interface with markers on an Eulerian mesh, which is implemented in the MAH-3 code. The proposed method allows a more accurate description of the evolution of the interface driven by Rayleigh–Taylor perturbations and preserves the symmetry of the interface under the corresponding symmetry of the problem (planar, cylindrical, or spherical). The method employs an unstructured triangular mesh of markers. Its capabilities are demonstrated on 2D and 3D Rayleigh–Taylor instability problems.
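As a schematic of what "markers on an Eulerian mesh" means in practice, the sketch below advects marker points through an interpolated Eulerian velocity field with a forward Euler step. It is not the MAH-3 algorithm: the triangular connectivity between markers, the symmetry-preserving machinery, and the velocity interpolation are all omitted, and the callable is assumed to return something like std::array<double, 3>.

```cpp
// Schematic advection of interface markers through an Eulerian velocity field.
#include <array>
#include <vector>

struct Marker { double x, y, z; };

// 'velocity' is any callable returning the interpolated Eulerian velocity
// at a marker position (hypothetical interface).
template <class VelocityField>
void advect_markers(std::vector<Marker>& markers, const VelocityField& velocity, double dt)
{
    for (auto& m : markers) {
        const std::array<double, 3> u = velocity(m.x, m.y, m.z);
        m.x += dt * u[0];
        m.y += dt * u[1];
        m.z += dt * u[2];
    }
}
```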


Author(s):  
Oubay Hassan ◽  
Kenneth Morgan ◽  
Nigel Weatherill

A review of a procedure for the simulation of time-dependent, inviscid and turbulent viscous, compressible flows involving geometries that change in time is presented. The adopted discretization technique employs unstructured meshes and both explicit and implicit time-stepping schemes. A dual time-stepping procedure and an ALE formulation enable flows involving moving boundary components to be included. Techniques that have been developed to maintain the validity of the unstructured mesh and to allow for the capture of moving flow features are also reviewed. Using the in-house developed techniques, some examples are included to demonstrate the use of the approach for the simulation of a number of flows of practical industrial interest.
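The dual time-stepping idea mentioned above can be summarised by a standard formulation; the expression below is consistent with common ALE/BDF2 dual-time schemes rather than taken verbatim from the paper, with u the conserved variables, V the moving cell volume, Δt the physical time step, and τ a pseudo-time.

```latex
% BDF2-discretised physical time derivative plus the ALE spatial residual,
% driven to zero by marching an unsteady residual R* in pseudo-time:
R^{*}(u) \;=\;
\frac{3\,V^{\,n+1}u \;-\; 4\,V^{\,n}u^{\,n} \;+\; V^{\,n-1}u^{\,n-1}}{2\,\Delta t}
\;+\; R_{\mathrm{ALE}}(u),
\qquad
\frac{\partial\!\left(V^{\,n+1}u\right)}{\partial \tau} + R^{*}(u) = 0 .
% Convergence in tau recovers the implicit solution at physical time level n+1,
% so steady-state acceleration techniques can be reused for unsteady flows.
```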


2013 ◽  
Vol 401-403 ◽  
pp. 1859-1863
Author(s):  
Qing Yang ◽  
Jun Liu ◽  
Huan Wang ◽  
Wen Li Zhou ◽  
Hua Yu

Understanding traffic per unit time at the cell level in a cellular data network can be of great help to mobile operators in improving network performance, and it is important for network design and resource optimization. In this paper, we describe three methods for counting the traffic per unit time per cell. Moreover, we compare the results of the three methods through the deviation distribution of the traffic and a time-complexity analysis. Our work is distinguished from other related work by its use of big data containing around 1.4 billion records and 20 thousand cells. Overall, we expect this paper to deliver important insights into cellular data network resource optimization.
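The three counting methods themselves are not spelled out in this abstract; the sketch below only shows the baseline group-by they all approximate, aggregating traffic volume per (cell, time bin). The record fields and key encoding are assumptions for illustration.

```cpp
// Baseline aggregation of traffic per (cell, time bin).
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Record {
    std::uint32_t cell_id;
    std::uint64_t timestamp;  // seconds
    std::uint64_t bytes;      // payload size carried by the record
};

// Returns total bytes keyed by (cell_id, time bin), with bins of 'bin_seconds'.
std::unordered_map<std::uint64_t, std::uint64_t>
traffic_per_cell_per_bin(const std::vector<Record>& records, std::uint64_t bin_seconds)
{
    std::unordered_map<std::uint64_t, std::uint64_t> totals;
    for (const auto& r : records) {
        const std::uint64_t bin = r.timestamp / bin_seconds;
        const std::uint64_t key =
            (static_cast<std::uint64_t>(r.cell_id) << 32) | (bin & 0xffffffffu);
        totals[key] += r.bytes;
    }
    return totals;
}
```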


Computers ◽  
2018 ◽  
Vol 7 (3) ◽  
pp. 44 ◽  
Author(s):  
Thinh Cao ◽  
Koichi Yamada ◽  
Muneyuki Unehara ◽  
Izumi Suzuki ◽  
Do Nguyen

The paper discusses the use of parallel computation to obtain rough set approximations from large-scale information systems in which missing data exist in both condition and decision attributes. To date, many studies have focused on missing condition data, but very few have accounted for missing decision data, especially in ever-growing datasets. One of the approaches for dealing with missing data in condition attributes is called twofold rough approximations. The paper aims to extend this approach to deal with missing data in the decision attribute. In addition, computing twofold rough approximations is computationally very intensive, so the approach is not suitable when input datasets are large. We propose parallel algorithms to compute twofold rough approximations on large-scale datasets. Our method is based on MapReduce, a distributed programming model for processing large-scale data. We introduce the original sequential algorithm first and then the parallel version. A comparison between the two approaches through experiments shows that our proposed parallel algorithms are suitable for, and perform efficiently on, large-scale datasets that have missing data in condition and decision attributes.
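For orientation, the sketch below computes the classical rough-set lower and upper approximations for a complete information table. The twofold approximations studied in the paper additionally handle missing condition and decision values, which this sketch does not attempt; types and names are illustrative.

```cpp
// Classical rough-set lower/upper approximations over equivalence classes.
#include <map>
#include <set>
#include <vector>

using Object = int;
using ConditionKey = std::vector<int>;  // values of the condition attributes

struct Approximations { std::set<Object> lower, upper; };

Approximations approximate(const std::map<Object, ConditionKey>& condition,
                           const std::set<Object>& target /* one decision class */)
{
    // Group objects into equivalence classes by identical condition values.
    std::map<ConditionKey, std::set<Object>> classes;
    for (const auto& [obj, key] : condition) classes[key].insert(obj);

    Approximations a;
    for (const auto& [key, members] : classes) {
        bool any = false, all = true;
        for (Object o : members) {
            if (target.count(o)) any = true; else all = false;
        }
        if (all) a.lower.insert(members.begin(), members.end());  // class fully inside target
        if (any) a.upper.insert(members.begin(), members.end());  // class intersects target
    }
    return a;
}
```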

