GPU-Accelerated Laplace Equation Model Development Based on CUDA Fortran

Water ◽  
2021 ◽  
Vol 13 (23) ◽  
pp. 3435
Author(s):  
Boram Kim ◽  
Kwang Seok Yoon ◽  
Hyung-Jun Kim

In this study, a CUDA Fortran-based GPU-accelerated Laplace equation model was developed and applied to several cases. The Laplace equation is one of the governing equations used to physically analyze groundwater flow, and it admits analytical solutions. Numerical models of this kind require a large amount of data to reproduce the flow with high accuracy and therefore demand considerable computational time. To shorten the computation time, CUDA technology was applied: large-scale parallel computations were performed on the GPU, and the program was written to minimize the number of data transfers between the CPU and GPU. A GPU consists of many ALUs specialized for graphics processing and can therefore perform far more concurrent computations than a CPU. The results of the GPU-accelerated model were compared with the analytical solution of the Laplace equation to verify their accuracy, and they were in good agreement. As the number of grid points increased, the computational time of the GPU-accelerated model decreased steadily relative to that of the CPU-based Laplace equation model; overall, the computational time was reduced by a factor of up to about 50.
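The paper does not include code, but the core of such a model is a stencil update of the discretized Laplace equation. Below is a minimal CPU-side NumPy sketch of a Jacobi iteration, the kind of update the CUDA Fortran kernels parallelize while keeping the grid resident on the GPU to avoid CPU-GPU transfers; function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def jacobi_laplace(phi, tol=1e-6, max_iter=100_000):
    """Solve the 2D Laplace equation on a uniform grid by Jacobi iteration.

    phi holds the boundary values; interior values are repeatedly replaced by
    the average of their four neighbours (five-point stencil) until the
    maximum change per sweep falls below tol.
    """
    for it in range(max_iter):
        new = phi.copy()
        new[1:-1, 1:-1] = 0.25 * (phi[:-2, 1:-1] + phi[2:, 1:-1] +
                                  phi[1:-1, :-2] + phi[1:-1, 2:])
        if np.max(np.abs(new - phi)) < tol:
            return new, it
        phi = new
    return phi, max_iter

# Example: unit square, top boundary held at 1, the other boundaries at 0.
n = 101
phi0 = np.zeros((n, n))
phi0[0, :] = 1.0
solution, iterations = jacobi_laplace(phi0)
```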

2021 ◽  
Author(s):  
Maha Mdini ◽  
Takemasa Miyoshi ◽  
Shigenori Otsuka

In the era of modern science, scientists have developed numerical models to predict and understand weather and ocean phenomena based on fluid dynamics. While these models have shown high accuracy at kilometer scales, they require massive computing resources because of their computational complexity. In recent years, new approaches to solving these models based on machine learning have been put forward. The results suggested that it is possible to reduce the computational complexity by using Neural Networks (NNs) instead of classical numerical simulations. In this project, we aim to shed light on different ways of accelerating physical models using NNs. We test two approaches, the Data-Driven Statistical Model (DDSM) and the Hybrid Physical-Statistical Model (HPSM), and compare their performance to the classical Process-Driven Physical Model (PDPM). The DDSM emulates the physical model with a NN. The HPSM, also known as super-resolution, uses a low-resolution version of the physical model and maps its outputs to the original high-resolution domain via a NN. To evaluate these two methods, we measured their accuracy and their computation time. Our results from idealized experiments with a quasi-geostrophic model show that the HPSM reduces the computation time by a factor of 3 and is capable of predicting the output of the physical model with high accuracy up to 9.25 days. The DDSM, however, reduces the computation time by a factor of 4 and can predict the physical model output with acceptable accuracy only within 2 days. These first results are promising and imply the possibility of bringing complex physical models into real-time systems with lower-cost computing resources in the future.
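To make the HPSM (super-resolution) idea concrete, here is a hedged PyTorch sketch of a small network that maps a coarse model field onto the high-resolution grid. The architecture, layer sizes, and names are assumptions for illustration only, not the authors' network.

```python
import torch
import torch.nn as nn

class SuperResolutionNet(nn.Module):
    """Toy HPSM-style network: upsample a coarse field, then refine it with convolutions."""

    def __init__(self, scale: int = 4):
        super().__init__()
        self.model = nn.Sequential(
            nn.Upsample(scale_factor=scale, mode="bilinear", align_corners=False),
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),
        )

    def forward(self, coarse_field):
        # coarse_field: (batch, 1, ny_lo, nx_lo) output of the low-resolution physical model
        return self.model(coarse_field)

net = SuperResolutionNet(scale=4)
coarse = torch.randn(8, 1, 32, 32)   # e.g. low-resolution quasi-geostrophic fields
fine = net(coarse)                   # (8, 1, 128, 128) high-resolution estimate
```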


2021 ◽  
Author(s):  
Brett W. Larsen ◽  
Shaul Druckmann

Lateral and recurrent connections are ubiquitous in biological neural circuits. The strong computational abilities of feedforward networks have been extensively studied; on the other hand, while certain roles for lateral and recurrent connections in specific computations have been described, a more complete understanding of the role and advantages of recurrent computations that might explain their prevalence remains an important open challenge. Previous key studies by Minsky and later by Roelfsema argued that the sequential, parallel computations for which recurrent networks are well suited can be highly effective approaches to complex computational problems. Such "tag propagation" algorithms perform repeated, local propagation of information and were introduced in the context of detecting connectedness, a task that is challenging for feedforward networks. Here, we advance the understanding of the utility of lateral and recurrent computation by first performing a large-scale empirical study of neural architectures for the computation of connectedness to explore feedforward solutions more fully and establish robustly the importance of recurrent architectures. In addition, we highlight a tradeoff between computation time and performance and demonstrate hybrid feedforward/recurrent models that perform well even in the presence of varying computational time limitations. We then generalize tag propagation architectures to multiple, interacting propagating tags and demonstrate that these are efficient computational substrates for more general computations by introducing and solving an abstracted biologically inspired decision-making task. More generally, our work clarifies and expands the set of computational tasks that can be solved efficiently by recurrent computation, yielding hypotheses for structure in population activity that may be present in such tasks.

Author Summary: Lateral and recurrent connections are ubiquitous in biological neural circuits; intriguingly, this stands in contrast to the majority of current-day artificial neural network research, which primarily uses feedforward architectures except in the context of temporal sequences. This raises the possibility that part of the difference in computational capabilities between real neural circuits and artificial neural networks is accounted for by the role of recurrent connections, and as a result a more detailed understanding of the computational role played by such connections is of great importance. Making effective comparisons between architectures is a subtle challenge, however, and in this paper we leverage the computational capabilities of large-scale machine learning to robustly explore how differences in architectures affect a network's ability to learn a task. We first focus on the task of determining whether two pixels are connected in an image, which has an elegant and efficient recurrent solution: propagate a connected label or tag along paths. Inspired by this solution, we show that it can be generalized in many ways, including propagating multiple tags at once and changing the computation performed on the result of the propagation. To illustrate these generalizations, we introduce an abstracted decision-making task related to foraging in which an animal must determine whether it can avoid predators in a random environment. Our results shed light on the set of computational tasks that can be solved efficiently by recurrent computation and how these solutions may appear in neural activity.
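For readers unfamiliar with tag propagation, the following is a minimal NumPy sketch of the recurrent idea on the connectedness task: a tag placed at one pixel is repeatedly spread to neighbouring foreground pixels by a local update until it either reaches the target pixel or stops spreading. Names and the 4-neighbour update are illustrative, not the networks studied in the paper.

```python
import numpy as np

def connected(image, seed, target, max_steps=None):
    """Tag-propagation test of connectedness on a binary image.

    A tag is placed at `seed` and repeatedly propagated to 4-neighbouring
    foreground pixels (a local, parallelizable update); `seed` and `target`
    are connected iff the tag eventually reaches `target`.
    """
    tag = np.zeros_like(image, dtype=bool)
    tag[seed] = bool(image[seed])
    steps = max_steps or image.size
    for _ in range(steps):
        spread = tag.copy()
        spread[1:, :] |= tag[:-1, :]
        spread[:-1, :] |= tag[1:, :]
        spread[:, 1:] |= tag[:, :-1]
        spread[:, :-1] |= tag[:, 1:]
        spread &= image.astype(bool)          # tags only live on foreground pixels
        if spread[target]:
            return True
        if np.array_equal(spread, tag):       # fixed point reached without tagging target
            return False
        tag = spread
    return bool(tag[target])

img = np.array([[1, 1, 0, 1],
                [0, 1, 0, 1],
                [0, 1, 1, 1]])
print(connected(img, (0, 0), (0, 3)))  # True: a 4-connected path exists
```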


Author(s):  
Masanobu Hasebe ◽  
Shigeru Tabeta

Most ocean models employ the hydrostatic approximation because the horizontal scale is usually much larger than the vertical scale in oceanic phenomena. Under the hydrostatic approximation, the dynamic pressure is neglected and the momentum equation in the vertical direction need not be solved. However, for buoyant jets from the sea bottom, such as submarine groundwater discharge and hydrothermal plumes, the dynamic pressure cannot be neglected and the vertical momentum equation must be taken into account. Non-hydrostatic analysis requires so much computation time that it is usually difficult to calculate the current field over a wide ocean area with this approach. On the other hand, analysis assuming the hydrostatic approximation needs less computational time and usually gives reasonable results for large-scale ocean phenomena such as tidal currents. In the present study, the authors developed a new type of ocean model for multi-scale analysis, which simultaneously conducts hydrostatic analysis for wide-area phenomena and non-hydrostatic analysis for the detailed flow around the buoyant jet. The application limit of the hydrostatic approximation for ocean models was investigated, and a method for dynamically coupling the hydrostatic zone with the non-hydrostatic zone was developed. From a theoretical consideration employing the parameters δ and ε, which represent the ratio of the grid sizes Δz to Δx and the ratio of vertical to horizontal velocity, respectively, it was found that the hydrostatic approximation can be applied if δε and ε² are sufficiently small. To examine the developed method, simulations of a lock-exchange problem and a vertical jet under an oscillating current were conducted. The results of the present model were similar to those of a fully non-hydrostatic model when the hydrostatic approximation was applied in the region where δε < 0.005 and ε² < 0.005.
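The applicability criterion stated in the abstract is simple enough to express directly; here is a small Python sketch using the thresholds reported above. The function name and example values are illustrative assumptions.

```python
def hydrostatic_applicable(dz, dx, w, u, threshold=0.005):
    """Check the applicability criterion for the hydrostatic approximation.

    delta = dz/dx (grid aspect ratio), eps = w/u (vertical-to-horizontal
    velocity ratio); the approximation is taken to hold when both
    delta*eps and eps**2 are below the threshold reported in the paper.
    """
    delta = dz / dx
    eps = abs(w) / abs(u)
    return (delta * eps < threshold) and (eps ** 2 < threshold)

# Example: 1 m vertical / 100 m horizontal grid, 1 cm/s vertical vs 0.5 m/s horizontal flow
print(hydrostatic_applicable(dz=1.0, dx=100.0, w=0.01, u=0.5))  # True
```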


2015 ◽  
Vol 12 (03) ◽  
pp. 1550019 ◽  
Author(s):  
George Markou

In this paper, a numerical investigation of the limits of an automatic procedure for the generation of embedded steel reinforcement inside hexahedral finite elements (FEs) is presented. In detailed 3D reinforced concrete simulations, mapping the reinforcement grid inside the concrete hexahedral FEs is performed using the end-point coordinates of the rebar reinforcement macro-elements. This procedure is computationally demanding, and in the case of large-scale models the computational time required for reinforcement mesh generation becomes excessive. This research work aims to study and present the limitations of the embedded mesh generation method proposed by Markou and Papadrakakis, through the use of a 64-bit operating system. The embedded mesh generation method is integrated with a filtering algorithm that locates and discards relatively short embedded rebar elements resulting from the arbitrary positioning of the embedded rebar macro-elements and the nonprismatic geometry of the hexahedral mesh. The computational robustness and efficiency of the integrated embedded mesh generation method are demonstrated through the analysis of three numerical models. The first two are full-scale 2-story and 7-story RC structures, while the third is a full-scale RC bridge with a trapezoidal section and a total span of 100 m. Through the third numerical implementation, the computational capacity of the integrated embedded rebar mesh generation method is investigated.
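The filtering step described above can be illustrated with a short Python sketch that discards embedded rebar segments that are very short relative to their host hexahedral element. The data layout, the length ratio, and the names are hypothetical and are not the thresholds used by Markou and Papadrakakis.

```python
import numpy as np

def filter_short_rebar_segments(segments, hexa_sizes, min_ratio=0.05):
    """Discard embedded rebar segments that are very short relative to their host element.

    segments: iterable of (p_start, p_end, elem_id) tuples produced by clipping each
    rebar macro-element against the hexahedral mesh.
    hexa_sizes: characteristic length of each hexahedral element, indexed by elem_id.
    """
    kept = []
    for p_start, p_end, elem_id in segments:
        length = np.linalg.norm(np.asarray(p_end) - np.asarray(p_start))
        if length >= min_ratio * hexa_sizes[elem_id]:
            kept.append((p_start, p_end, elem_id))
    return kept
```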


2011 ◽  
Vol 8 (2) ◽  
pp. 83-87
Author(s):  
Sathyanarayanan Raghavan ◽  
Raphael. I. Okereke ◽  
Suresh K. Sitaraman

Modeling the viscoelastic relaxation of polymer materials is important for understanding the thermo-mechanical behavior of organic microelectronic systems. However, incorporating viscoelastic behavior into numerical models makes them compute-intensive. This paper presents a different technique for incorporating polymer viscoelastic behavior into numerical models such that the computation time is not adversely affected and the accuracy of the results is not compromised. In the proposed "pseudo viscoelastic" modeling technique, the modulus of the viscoelastic material is computed as a function of the time and temperature loading history outside of the finite-element simulation, and is then input into the simulation as a thermo-elastic material that incorporates the viscoelastic relaxation of the material. This paper compares the warpage results obtained with the proposed technique against a complete viscoelastic simulation model and experimental data: the maximum warpage predicted using the proposed technique agrees within 10% with the results obtained from a "full" viscoelastic model. It is also shown through some of our simulations that the proposed technique can yield a computational time saving of more than 50% and a hard disk space saving of 65%.
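One common way to precompute such a history-dependent modulus is a Prony-series relaxation modulus combined with a time-temperature shift. The following Python sketch shows that general idea under stated assumptions (WLF shift with illustrative constants, a user-supplied Prony series); it is not the authors' exact formulation, and all names are hypothetical.

```python
import numpy as np

def wlf_shift(T, T_ref=25.0, C1=17.4, C2=51.6):
    """Williams-Landel-Ferry time-temperature shift factor (illustrative constants)."""
    return 10.0 ** (-C1 * (T - T_ref) / (C2 + (T - T_ref)))

def relaxation_modulus(t_reduced, E_inf, E_i, tau_i):
    """Prony-series relaxation modulus E(t) = E_inf + sum_i E_i * exp(-t/tau_i)."""
    return E_inf + np.sum(E_i * np.exp(-t_reduced / tau_i))

def pseudo_viscoelastic_modulus(times, temps, E_inf, E_i, tau_i):
    """Precompute an effective 'thermo-elastic' modulus for each load step.

    The reduced time is accumulated along the prescribed time-temperature history;
    the resulting modulus table is what would be supplied to the FE model as a
    time/temperature-dependent elastic property.
    """
    moduli, t_red = [], 0.0
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        t_red += dt / wlf_shift(temps[k])
        moduli.append(relaxation_modulus(t_red, E_inf, E_i, tau_i))
    return np.array(moduli)
```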


2019 ◽  
Author(s):  
Liqun Cao ◽  
Jinzhe Zeng ◽  
Mingyuan Xu ◽  
Chih-Hao Chin ◽  
Tong Zhu ◽  
...  

Combustion is an important class of reactions that affects people's daily lives and the development of aerospace technology. Exploring the reaction mechanism contributes to the understanding of combustion and the more efficient use of fuels. Ab initio quantum mechanical (QM) calculations are precise but limited by their computational cost for large-scale systems. In order to carry out reactive molecular dynamics (MD) simulations of combustion accurately and quickly, we develop the MFCC-combustion method in this study, which calculates the interactions between atoms using a QM method at the MN15/6-31G(d) level. Each molecule in the system is treated as a fragment, and whenever the distance between any two atoms in different molecules is within 3.5 Å, a new fragment involving the two molecules is produced in order to account for the two-body interaction. The deviations of MFCC-combustion from full-system calculations are within a few kcal/mol, and the results clearly show that the calculated energies of the different systems using MFCC-combustion are close to convergence once the distance threshold for the two-body QM interactions is larger than 3.5 Å. Methane combustion was studied with the MFCC-combustion method to explore the combustion mechanism of the methane-oxygen system.
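The fragmentation rule can be sketched in a few lines of Python: each molecule is a one-body fragment, and a two-molecule fragment is added whenever the closest pair of atoms from two different molecules falls within the cutoff. This is a hedged illustration of the scheme's flavour, not the MFCC-combustion implementation; names and data layout are assumptions.

```python
import numpy as np
from itertools import combinations

def build_fragments(molecules, cutoff=3.5):
    """Generate one-body and two-body fragments from a list of molecules.

    molecules: list of (n_atoms, 3) coordinate arrays in Angstrom, one per molecule.
    Returns the indices of the one-body fragments and the (i, j) pairs for which a
    two-molecule fragment is created (minimum interatomic distance within the cutoff).
    """
    one_body = list(range(len(molecules)))
    two_body = []
    for i, j in combinations(range(len(molecules)), 2):
        # minimum distance between any atom of molecule i and any atom of molecule j
        diff = molecules[i][:, None, :] - molecules[j][None, :, :]
        if np.min(np.linalg.norm(diff, axis=-1)) < cutoff:
            two_body.append((i, j))
    return one_body, two_body
```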


2018 ◽  
Author(s):  
Pavel Pokhilko ◽  
Evgeny Epifanovsky ◽  
Anna I. Krylov

Using single-precision floating-point representation reduces the size of data and the computation time by a factor of two relative to the double precision conventionally used in electronic structure programs. For large-scale calculations, such as those encountered in many-body theories, the reduced memory footprint alleviates memory and input/output bottlenecks. The reduced data size can lead to additional gains due to improved parallel performance on CPUs and various accelerators. However, using single precision can potentially reduce the accuracy of computed observables. Here we report an implementation of coupled-cluster and equation-of-motion coupled-cluster methods with single and double excitations in single precision. We consider both the standard implementation and one using Cholesky decomposition or resolution-of-the-identity representations of the electron-repulsion integrals. Numerical tests illustrate that when single precision is used in correlated calculations, the loss of accuracy is insignificant, and a pure single-precision implementation can be used for computing energies, analytic gradients, excited states, and molecular properties. In addition to pure single-precision calculations, our implementation allows one to follow a single-precision calculation with clean-up iterations, fully recovering double-precision results while retaining significant savings.
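The "clean-up iteration" strategy can be illustrated on a much simpler problem than the coupled-cluster amplitude equations: run an iterative solver in float32 first, then refine the converged result with a few float64 iterations. The Python sketch below shows this general idea on a linear system solved by Jacobi iteration; it is an analogy under stated assumptions, not the authors' electronic structure code.

```python
import numpy as np

def solve_with_cleanup(A, b, tol32=1e-5, tol64=1e-10, max_iter=1000):
    """Iterative solve in float32 followed by 'clean-up' iterations in float64."""

    def jacobi(A, b, x, tol, dtype):
        A, b, x = A.astype(dtype), b.astype(dtype), x.astype(dtype)
        D = np.diag(A)
        R = A - np.diagflat(D)
        for _ in range(max_iter):
            x_new = (b - R @ x) / D
            if np.max(np.abs(x_new - x)) < tol:
                return x_new
            x = x_new
        return x

    x32 = jacobi(A, b, np.zeros(len(b)), tol32, np.float32)   # cheap single-precision pass
    return jacobi(A, b, x32, tol64, np.float64)               # double-precision clean-up

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(solve_with_cleanup(A, b))   # ~ [0.0909, 0.6364]
```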


Energies ◽  
2020 ◽  
Vol 14 (1) ◽  
pp. 176
Author(s):  
Iñigo Aramendia ◽  
Unai Fernandez-Gamiz ◽  
Adrian Martinez-San-Vicente ◽  
Ekaitz Zulueta ◽  
Jose Manuel Lopez-Guede

Large-scale energy storage systems (ESS) are growing in popularity due to the increase in energy production from renewable sources, which generally have a random, intermittent nature. Several redox flow batteries have been presented as alternatives to classical ESS; the scalability, design flexibility, and long life cycle of the vanadium redox flow battery (VRFB) have made it stand out. In a VRFB cell, which consists of two electrodes and an ion exchange membrane, the electrolyte flows through the electrodes, where the electrochemical reactions take place. Computational Fluid Dynamics (CFD) simulations are a very powerful tool for developing feasible numerical models to enhance the performance and lifetime of VRFBs. This review aims to present and discuss the numerical models developed in this field and, particularly, to analyze the different types of flow fields and patterns that can be found in the literature. The numerical studies presented in this review are a helpful tool for evaluating several key parameters important for optimizing energy systems based on redox flow technologies.


2019 ◽  
Vol 17 (06) ◽  
pp. 947-975 ◽  
Author(s):  
Lei Shi

We investigate distributed learning with a coefficient-based regularization scheme under the framework of kernel regression methods. Compared with classical kernel ridge regression (KRR), the algorithm under consideration does not require the kernel function to be positive semi-definite and hence provides a simple paradigm for designing indefinite kernel methods. The distributed learning approach partitions a massive data set into several disjoint data subsets and then produces a global estimator by averaging the local estimators trained on each subset. The ease of partitioning and of running the algorithm on each subset in parallel leads to a substantial reduction in computation time compared with the standard approach of running the original algorithm on the entire sample. We establish the first minimax-optimal rates of convergence for the distributed coefficient-based regularization scheme with indefinite kernels. We thus demonstrate that, compared with distributed KRR, the algorithm considered here is more flexible and effective for regression problems on large-scale data sets.
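The divide-and-average structure of the approach can be sketched briefly in Python: fit a coefficient-regularized kernel estimator on each data subset, then average the local predictions. The penalty on the coefficient vector (rather than the RKHS norm) is what removes the need for a positive semi-definite kernel; the exact optimization problem, kernel, and parameter choices below are illustrative assumptions, not the paper's.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def fit_local(X, y, lam=1e-2, sigma=1.0):
    """Coefficient-based regularized estimator on one subset.

    Finds alpha minimizing ||K alpha - y||^2 + lam * n * ||alpha||^2 for
    f(x) = sum_i alpha_i K(x, x_i); K need not be positive semi-definite.
    """
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K.T @ K + lam * len(y) * np.eye(len(y)), K.T @ y)
    return X, alpha, sigma

def distributed_predict(X, y, X_test, n_parts=4):
    """Partition the data, fit a local estimator on each part, average the predictions."""
    preds = []
    for Xp, yp in zip(np.array_split(X, n_parts), np.array_split(y, n_parts)):
        anchors, alpha, sigma = fit_local(Xp, yp)
        preds.append(gaussian_kernel(X_test, anchors, sigma) @ alpha)
    return np.mean(preds, axis=0)
```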


2021 ◽  
Vol 9 (6) ◽  
pp. 635
Author(s):  
Hyeok Jin ◽  
Kideok Do ◽  
Sungwon Shin ◽  
Daniel Cox

Coastal dunes are important morphological features for both ecosystems and coastal hazard mitigation. Because understanding and predicting dune erosion is very important, various numerical models have been developed to improve prediction accuracy. In the present study, a process-based model (XBeachX) was tested and calibrated to improve the accuracy of simulated dune erosion from a storm event by adjusting the model coefficients and comparing the results with large-scale experimental data. The breaker slope coefficient was calibrated to predict cross-shore wave transformation more accurately. To improve the prediction of the dune erosion profile, the coefficients related to skewness and asymmetry were adjusted. Moreover, the bermslope coefficient was calibrated to improve the simulation of the bermslope near the dune face. Model performance was assessed based on model-data comparisons. The calibrated XBeachX successfully predicted wave transformation and dune erosion. In addition, the results obtained for two other similar dune erosion experiments with the same calibrated coefficient set matched the observed wave and profile data well. However, the prediction of underwater sandbar evolution remains a challenge.
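The calibration loop described above amounts to scoring candidate coefficient values against the measured profiles. A minimal Python sketch of that step is shown below; `run_model` stands in for a hypothetical wrapper that runs XBeachX with one candidate value (e.g. of the breaker slope or bermslope coefficient) and returns the simulated post-storm profile on the measurement grid, and RMSE is only one possible skill metric.

```python
import numpy as np

def rmse(observed, modeled):
    return float(np.sqrt(np.mean((np.asarray(observed) - np.asarray(modeled)) ** 2)))

def calibrate_coefficient(candidates, run_model, observed_profile):
    """Pick the coefficient value whose simulated profile best matches the data."""
    scores = {c: rmse(observed_profile, run_model(c)) for c in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```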

