scholarly journals Graphics processing unit (GPU) implementation of image processing algorithms to improve system performance of the control acquisition, processing, and image display system (CAPIDS) of the micro-angiographic fluoroscope (MAF)

Author(s):  
S. N. Swetadri Vasan ◽  
Ciprian N. Ionita ◽  
A. H. Titus ◽  
A. N. Cartwright ◽  
D. R. Bednarek ◽  
...  
Author(s):  
Liam Dunn ◽  
Patrick Clearwater ◽  
Andrew Melatos ◽  
Karl Wette

Abstract The F-statistic is a detection statistic used widely in searches for continuous gravitational waves with terrestrial, long-baseline interferometers. A new implementation of the F-statistic is presented which accelerates the existing "resampling" algorithm using graphics processing units (GPUs). The new implementation runs between 10 and 100 times faster than the existing implementation on central processing units without sacrificing numerical accuracy. The utility of the GPU implementation is demonstrated on a pilot narrowband search for four newly discovered millisecond pulsars in the globular cluster Omega Centauri using data from the second Laser Interferometer Gravitational-Wave Observatory observing run. The computational cost is 17:2 GPU-hours using the new implementation, compared to 1092 core-hours with the existing implementation.


2016 ◽  
Vol 15 (10) ◽  
pp. 7160-7163
Author(s):  
Gurpreet Kaur ◽  
Sonika Jindal

Image Segmentations play a heavy role in areas such as computer vision and image processing due to its broad usage and immense applications. Because of the large importance of image segmentation a number of algorithms have been proposed and different approaches have been adopted. Segmentation divides an image into distinct regions containing each pixel with similar attributes. The objective of apportioning is to simplify and/or alter the representation of an image into something that is more meaningful and more comfortable to break down. This paper discusses the various techniques implemented for image segmentation and discusses the various Computations that can be performed on the graphics processing unit (GPU) by means of the CUDA architecture in order to achieve fast performance and increase the utilization of available system resources.


Author(s):  
Daniel S Abdi ◽  
Lucas C Wilcox ◽  
Timothy C Warburton ◽  
Francis X Giraldo

We present a Graphics Processing Unit (GPU)-accelerated nodal discontinuous Galerkin method for the solution of the three-dimensional Euler equations that govern the motion and thermodynamic state of the atmosphere. Acceleration of the dynamical core of atmospheric models plays an important practical role in not only getting daily forecasts faster, but also in obtaining more accurate (high resolution) results within a given simulation time limit. We use algorithms suitable for the single instruction multiple thread architecture of GPUs to accelerate our model by two orders of magnitude relative to one core of a CPU. Tests on one node of the Titan supercomputer show a speedup of up to 15 times using the K20X GPU as compared to that on the 16-core AMD Opteron CPU. The scalability of the multi-GPU implementation is tested using 16,384 GPUs, which resulted in a weak scaling efficiency of about 90%. Finally, the accuracy and performance of our GPU implementation is verified using several benchmark problems representative of different scales of atmospheric dynamics.


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Xingyi Zhang ◽  
Bangju Wang ◽  
Zhuanlian Ding ◽  
Jin Tang ◽  
Juanjuan He

Membrane algorithms are a new class of parallel algorithms, which attempt to incorporate some components of membrane computing models for designing efficient optimization algorithms, such as the structure of the models and the way of communication between cells. Although the importance of the parallelism of such algorithms has been well recognized, membrane algorithms were usually implemented on the serial computing device central processing unit (CPU), which makes the algorithms unable to work in an efficient way. In this work, we consider the implementation of membrane algorithms on the parallel computing device graphics processing unit (GPU). In such implementation, all cells of membrane algorithms can work simultaneously. Experimental results on two classical intractable problems, the point set matching problem and TSP, show that the GPU implementation of membrane algorithms is much more efficient than CPU implementation in terms of runtime, especially for solving problems with a high complexity.


Author(s):  
Bojan Novak

The random forest ensemble learning with the Graphics Processing Unit (GPU) version of prefix scan method is presented. The efficiency of the implementation of the random forest algorithm depends critically on the scan (prefix sum) algorithm. The prefix scan is used in the depth-first implementation of optimal split point computation. Described are different implementations of the prefix scan algorithms. The speeds of the algorithms depend on three factors: the algorithm itself, which could be improved, the programming skills, and the compiler. In parallel environments, things are even more complicated and depend on the programmer´s knowledge of the Central Processing Unit (CPU) or the GPU architecture. An efficient parallel scan algorithm that avoids bank conflicts is crucial for the prefix scan implementation. In our tests, multicore CPU and GPU implementation based on NVIDIA´s CUDA is compared.


Sign in / Sign up

Export Citation Format

Share Document