scholarly journals DEVELOPING PARALLEL COMPUTING ALGORITHMS USING GPU’S TO DETERMINE OIL AND GAS RESERVES PRESENTED IN THE UPSTREAM (EXPLORATION) SECTOR

Author(s):  
Stefan Boodoo ◽  
Ajay Joshi

Oil and Gas companies keep exploring every new possible method to increase the likelihood of finding a commercial hydrocarbon bearing prospect. Well logging generates gigabytes of data from various probes and sensors. After processing, a prospective reservoir will indicate areas of oil, gas, water and reservoir rock. Incorporating High Performance Computing (HPC) methodologies will allow for thousands of potential wells to be indicative of its hydrocarbon bearing potential. This study will present the use of the Graphics Processing Unit (GPU) computing as another method of analyzing probable reserves. Raw well log data from the Kansas Geological Society (1999-2018) forms the basis of the data analysis. Parallel algorithms are developed and make use of Nvidia’s Compute Unified Device Architecture (CUDA). The results gathered highlight a 5 times speedup using a Nvidia GeForce GT 330M GPU as compared to an Intel Core i7 740QM Central Processing Unit (CPU). The processed results display depth wise areas of shale and rock formations as well as water, oil and/or gas reserves.

2017 ◽  
Vol 3 ◽  
pp. e127 ◽  
Author(s):  
Rory Mitchell ◽  
Eibe Frank

We present a CUDA-based implementation of a decision tree construction algorithm within the gradient boosting library XGBoost. The tree construction algorithm is executed entirely on the graphics processing unit (GPU) and shows high performance with a variety of datasets and settings, including sparse input matrices. Individual boosting iterations are parallelised, combining two approaches. An interleaved approach is used for shallow trees, switching to a more conventional radix sort-based approach for larger depths. We show speedups of between 3× and 6× using a Titan X compared to a 4 core i7 CPU, and 1.2× using a Titan X compared to 2× Xeon CPUs (24 cores). We show that it is possible to process the Higgs dataset (10 million instances, 28 features) entirely within GPU memory. The algorithm is made available as a plug-in within the XGBoost library and fully supports all XGBoost features including classification, regression and ranking tasks.


2017 ◽  
Vol 14 (1) ◽  
pp. 789-795
Author(s):  
V Saveetha ◽  
S Sophia

Parallel data clustering aims at using algorithms and methods to extract knowledge from fat databases in rational time using high performance architectures. The computational challenge faced by cluster analysis due to increasing capacity of data can be overcome by exploiting the power of these architectures. The recent development in parallel power of Graphics Processing Unit enables low cost high performance solutions for general purpose applications. The Compute Unified Device Architecture programming model provides application programming interface methods to handle data proficiently on Graphics Processing Unit for iterative clustering algorithms like K-Means. The existing Graphics Processing Unit based K-Means algorithms highly focus on improvising the speedup of the algorithms and fall short to handle the high time spent on transfer of data between the Central Processing Unit and Graphics Processing Unit. A competent K-Means algorithm is proposed in this paper to lessen the transfer time by introducing a novel approach to check the convergence of the algorithm and utilize the pinned memory for direct access. This algorithm outperforms the other algorithms by maximizing parallelism and utilizing the memory features. The relative speedups and the validity measure for the proposed algorithm is elevated when compared with K-Means on Graphics Processing Unit and K-Means using Flag on Graphics Processing Unit. Thus the planned approach proves that communication overhead can be reduced in K-Means clustering.


Computation ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Håvard H. Holm ◽  
André R. Brodtkorb ◽  
Martin L. Sætra

In this work, we examine the performance, energy efficiency, and usability when using Python for developing high-performance computing codes running on the graphics processing unit (GPU). We investigate the portability of performance and energy efficiency between Compute Unified Device Architecture (CUDA) and Open Compute Language (OpenCL); between GPU generations; and between low-end, mid-range, and high-end GPUs. Our findings showed that the impact of using Python is negligible for our applications, and furthermore, CUDA and OpenCL applications tuned to an equivalent level can in many cases obtain the same computational performance. Our experiments showed that performance in general varies more between different GPUs than between using CUDA and OpenCL. We also show that tuning for performance is a good way of tuning for energy efficiency, but that specific tuning is needed to obtain optimal energy efficiency.


Author(s):  
Baldomero Imbernón ◽  
Antonio Llanes ◽  
José-Matías Cutillas-Lozano ◽  
Domingo Giménez

Virtual screening (VS) methods aid clinical research by predicting the interaction of ligands with pharmacological targets. The computational requirements of VS, along with the size of the databases, propitiate the use of high-performance computing. METADOCK is a tool for the application of metaheuristics to VS in heterogeneous clusters of computers based on central processing unit (CPU) and graphics processing unit (GPU). HYPERDOCK represents a step forward; the exploration for satisfactory metaheuristics is systematically approached by means of hyperheuristics working on top of the metaheuristic schema of METADOCK. Multiple metaheuristics are explored, so the process is computationally demanding. HYPERDOCK exploits the parallelism of METADOCK and includes parallelism at its own level. The different levels of parallelism can be used to exploit the parallelism offered by computational systems composed of multicore CPU + multi-GPUs. The efficient exploitation of these systems enables HYPERDOCK to improve ligand–receptor binding.


Author(s):  
K. Bhargavi ◽  
Sathish Babu B.

The GPUs (Graphics Processing Unit) were mainly used to speed up computation intensive high performance computing applications. There are several tools and technologies available to perform general purpose computationally intensive application. This chapter primarily discusses about GPU parallelism, applications, probable challenges and also highlights some of the GPU computing platforms, which includes CUDA, OpenCL (Open Computing Language), OpenMPC (Open MP extended for CUDA), MPI (Message Passing Interface), OpenACC (Open Accelerator), DirectCompute, and C++ AMP (C++ Accelerated Massive Parallelism). Each of these platforms is discussed briefly along with their advantages and disadvantages.


2017 ◽  
Vol 10 (13) ◽  
pp. 251
Author(s):  
Ankush Rai ◽  
Jagadeesh Kannan R

This research work presents a novel central processing unit-graphics processing unit (CPU-GPU) computing scheme for multiple object trackingduring a surveillance operation. This facilitates nonlinear computational jobs to avail completion of computation in minimal processing time for tracking function. The work is divided into two essential objectives. First is to dynamically divide the processing operations into parallel units, and second is to reduce the communication between CPU-GPU processing units.


Author(s):  
Wisoot Sanhan ◽  
Kambiz Vafai ◽  
Niti Kammuang-Lue ◽  
Pradit Terdtoon ◽  
Phrut Sakulchangsatjatai

Abstract An investigation of the effect of the thermal performance of the flattened heat pipe on its double heat sources acting as central processing unit and graphics processing unit in laptop computers is presented in this work. A finite element method is used for predicting the flattening effect of the heat pipe. The cylindrical heat pipe with a diameter of 6 mm and the total length of 200 mm is flattened into three final thicknesses of 2, 3, and 4 mm. The heat pipe is placed under a horizontal configuration and heated with heater 1 and heater 2, 40 W in combination. The numerical model shows good agreement compared with the experimental data with the standard deviation of 1.85%. The results also show that flattening the cylindrical heat pipe to 66.7 and 41.7% of its original diameter could reduce its normalized thermal resistance by 5.2%. The optimized final thickness or the best design final thickness for the heat pipe is found to be 2.5 mm.


Author(s):  
Alan Gray ◽  
Kevin Stratford

Leading high performance computing systems achieve their status through use of highly parallel devices such as NVIDIA graphics processing units or Intel Xeon Phi many-core CPUs. The concept of performance portability across such architectures, as well as traditional CPUs, is vital for the application programmer. In this paper we describe targetDP, a lightweight abstraction layer which allows grid-based applications to target data parallel hardware in a platform agnostic manner. We demonstrate the effectiveness of our pragmatic approach by presenting performance results for a complex fluid application (with which the model was co-designed), plus separate lattice quantum chromodynamics particle physics code. For each application, a single source code base is seen to achieve portable performance, as assessed within the context of the Roofline model. TargetDP can be combined with Message Passing Interface (MPI) to allow use on systems containing multiple nodes: we demonstrate this through provision of scaling results on traditional and graphics processing unit-accelerated large scale supercomputers.


Electronics ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 884
Author(s):  
Stefano Rossi ◽  
Enrico Boni

Methods of increasing complexity are currently being proposed for ultrasound (US) echographic signal processing. Graphics Processing Unit (GPU) resources allowing massive exploitation of parallel computing are ideal candidates for these tasks. Many high-performance US instruments, including open scanners like ULA-OP 256, have an architecture based only on Field-Programmable Gate Arrays (FPGAs) and/or Digital Signal Processors (DSPs). This paper proposes the implementation of the embedded NVIDIA Jetson Xavier AGX module on board ULA-OP 256. The system architecture was revised to allow the introduction of a new Peripheral Component Interconnect Express (PCIe) communication channel, while maintaining backward compatibility with all other embedded computing resources already on board. Moreover, the Input/Output (I/O) peripherals of the module make the ultrasound system independent, freeing the user from the need to use an external controlling PC.


2020 ◽  
Vol 16 (12) ◽  
pp. 7232-7238
Author(s):  
Giuseppe M. J. Barca ◽  
Jorge L. Galvez-Vallejo ◽  
David L. Poole ◽  
Alistair P. Rendell ◽  
Mark S. Gordon

Sign in / Sign up

Export Citation Format

Share Document