Parallel computations of the step response of a floor heater with the use of a graphics processing unit. Part 2: results and their evaluation

2013 ◽  
Vol 61 (4) ◽  
pp. 949-954
Author(s):  
J. Gołębiowski ◽  
J. Forenc

Using the models and algorithms presented in the first part of the article, the spatio-temporal distribution of the step response of a floor heater was determined. The results are presented as heating curves and temperature profiles of the heater at selected time instants. The computational results were verified by comparing them with the solution obtained with a commercial program, NISA. Additionally, the distribution of the average time constant of the thermal processes occurring in the heater was determined. The use of a graphics processing unit in numerical computations based on the conjugate gradient method was analysed. It was shown that using a graphics processing unit is profitable when solving linear systems of equations with dense coefficient matrices. For a sparse matrix, the speed-up depends on the number of its non-zero elements.
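For readers unfamiliar with the conjugate gradient method mentioned above, the following is a minimal CPU-only sketch in plain Python. It is not the authors' implementation: the function names (`matvec`, `dot`, `conjugate_gradient`), the dense list-of-rows storage, and the tolerance are assumptions made for illustration. It shows why the method suits a GPU: each iteration is dominated by one matrix-vector product and a few vector operations, all of which parallelize naturally.

```python
def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows (illustrative storage)."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)              # residual b - A x for the initial guess x = 0
    p = list(r)              # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:      # stop once the residual norm is small
            break
        Ap = matvec(A, p)            # the dominant cost of each iteration
        alpha = rs_old / dot(p, Ap)  # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old       # keeps the new direction A-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive-definite example (values chosen for illustration).
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

With a dense matrix, `matvec` performs the same amount of work in every row, whereas with a sparse matrix the work per row depends on how many non-zero entries it holds, which is consistent with the abstract's remark that the sparse speed-up depends on the number of non-zero elements.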

Author(s):  
Franz Pichler ◽  
Gundolf Haase

A finite element code is developed in which all of the computationally expensive steps are performed on a graphics processing unit via the THRUST and PARALUTION libraries. The code focuses on the simulation of transient problems, where the computations repeated at every time step dominate the computational cost. It is used to solve partial and ordinary differential equations as they arise in thermal-runaway simulations of automotive batteries. The speed-up obtained by using the graphics processing unit for every critical step is compared against the single-core and multi-threaded solutions, which are also supported by the chosen libraries. In this way, a high total speed-up on the graphics processing unit is achieved without the need to program a single classical Compute Unified Device Architecture kernel.


Author(s):  
Aaron F. Shinn ◽  
S. P. Vanka

A semi-implicit pressure-based multigrid algorithm for solving the incompressible Navier-Stokes equations was implemented on a Graphics Processing Unit (GPU) using CUDA (Compute Unified Device Architecture). The multigrid method employed was the Full Approximation Scheme (FAS), which is used for solving nonlinear equations. The algorithm is applied to the 2D driven cavity problem and compared with the CPU version of the code (written in Fortran) to assess the computational speed-up.


2012 ◽  
Vol 4 (3) ◽  
pp. 63-84
Author(s):  
Jonathan Cazalas ◽  
Ratan K. Guha

The efficient processing of spatio-temporal data streams is an area of intense research. However, all methods rely on an unsuitable processor (Govindaraju, 2004), namely a CPU, to evaluate concurrent, continuous spatio-temporal queries over these data streams. This paper presents a performance model of the execution of spatio-temporal queries over the authors’ GEDS framework (Cazalas & Guha, 2010). GEDS is a scalable, Graphics Processing Unit (GPU)-based framework that employs computation sharing and parallel processing paradigms to deliver scalability in the evaluation of continuous spatio-temporal queries over spatio-temporal data streams. Experimental evaluation shows the scalability and efficacy of GEDS in spatio-temporal data streaming environments and demonstrates that the parallel processing power provided by GEDS outweighs the costs associated with memory transfers. To move beyond the analysis of specific algorithms over the GEDS framework, the authors developed an abstract performance model detailing the relationship between the CPU and the GPU. From this model, they extrapolate a list of attributes common to successful GPU-based applications, thereby providing insight into which algorithms and applications are best suited to the GPU and an estimated theoretical speedup for such applications.


2014 ◽  
Vol 519-520 ◽  
pp. 102-107
Author(s):  
Yu Fei Yu ◽  
Bin Yan ◽  
Biao Wang ◽  
Lei Li ◽  
Yu Han ◽  
...  

An acceleration strategy for the TV-ADM reconstruction algorithm in Compton scattering tomography (CST) is proposed. First, exploiting the sparsity of the CST projection matrices, the CSR and ELL sparse matrix formats are used to store them, which greatly reduces memory consumption. Then, a sparse matrix-vector multiplication (SpMV) method is used to accelerate the projection and back-projection processes. Finally, exploiting its parallel structure, TV-ADM is computed on a Graphics Processing Unit (GPU). Numerical experiments show that TV-ADM with the presented acceleration strategy can achieve a 96-fold speedup and a 224-fold memory compression ratio without loss of precision.
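To make the CSR storage and SpMV steps concrete, here is a minimal plain-Python sketch. It is an illustration only, not the paper's GPU code: the helper names `dense_to_csr` and `csr_spmv` and the small example matrix are assumptions. CSR keeps only the non-zero values, their column indices, and one offset per row, which is where the memory saving for a sparse projection matrix comes from.

```python
def dense_to_csr(dense):
    """Convert a dense list-of-rows matrix to CSR arrays.

    Returns (values, col_idx, row_ptr): the non-zero values in row order,
    the column index of each value, and the offset at which each row starts.
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row = start of the next
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Only the stored non-zeros of row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


# Hypothetical 3x3 example with five non-zero entries.
dense = [[1.0, 0.0, 2.0],
         [0.0, 0.0, 3.0],
         [4.0, 5.0, 0.0]]
values, col_idx, row_ptr = dense_to_csr(dense)
y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Each output row is computed independently of the others, so on a GPU the outer loop maps onto parallel threads. The ELL format mentioned in the abstract instead pads every row to the same number of stored entries, which makes memory access more regular at the cost of some padding.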


Author(s):  
Mohammad Y Al-Shorman ◽  
Majd M Al-Kofahi

A fast, highly parallelized simulation of a unidirectional ultrasonic pulse propagating in a two-dimensional environment is presented. The pulse intensity versus time is recorded using an array of unidirectional ultrasonic receivers placed at known locations and arranged in a small circle around the transmitter. To speed up the simulation, the OpenCL 2.0 heterogeneous compute language is used on a graphics processing unit. The simulation result is then compared with experimental data to validate its accuracy. By comparing simulated and experimental data, the collected intensity–time profiles can be used to map an environment. Environments can be mapped using not only direct reflections but also higher-order reflections from objects that are not directly seen by the transmitter. With the help of this simulation, subtle characteristics of an environment, such as a slight tilt or curvature, can be measured. The front end of the simulation is written in C#, while the back end is written in C/C++ and OpenCL.


2021 ◽  
pp. 106-109
Author(s):  
Denis Kravchuk

Optical contrast between different blood particles allows optoacoustic imaging to visualize the distribution of blood particles (erythrocytes, taking oxygen saturation into account) and the delivery of drugs to organs through blood vessels. An algorithm for calculating the ultrasonic field produced by optoacoustic interaction has been developed to speed up calculations on the GPU. An architecture for fast reconstruction of an optoacoustic signal based on graphics processing unit (GPU) programming is proposed. Used in combination with the pre-migration method, the algorithm improves the resolution and sharpness of the optoacoustic image of the simulated biological tissues. Thanks to the advanced GPU computing architecture, time-consuming central processing unit (CPU) computations are accelerated with high computational efficiency.

