Accelerating 3-D GPU-based Motion Tracking for Ultrasound Strain Elastography Using Sum-Tables: Analysis and Initial Results

2019 ◽  
Vol 9 (10) ◽  
pp. 1991 ◽  
Author(s):  
Bo Peng ◽  
Shasha Luo ◽  
Zhengqiu Xu ◽  
Jingfeng Jiang

With 3-D ultrasound data now available, considerable research effort is being devoted to developing 3-D ultrasound strain elastography (USE) systems. Because 3-D motion tracking, a core component of any 3-D USE system, is computationally intensive, much of that effort aims to accelerate it. In the literature, the concept of the Sum-Table has been used in a serial computing environment to reduce the burden of computing signal correlation, which is the single most computationally intensive component of 3-D motion tracking. In this study, parallel programming on graphics processing units (GPUs) is used in conjunction with the Sum-Table concept to improve the computational efficiency of 3-D motion tracking. To our knowledge, sum-tables have not previously been used in a GPU environment for 3-D motion tracking. Our main objective is to investigate the feasibility of using the sum-table-based normalized correlation coefficient (ST-NCC) method for GPU-accelerated 3-D USE. More specifically, two implementations of the ST-NCC method, proposed by Lewis et al. and by Luo-Konofagou, are compared against each other, with the conventional method for calculating the normalized correlation coefficient (NCC) used as the baseline. All three methods were implemented using compute unified device architecture (CUDA; Version 9.0, Nvidia Inc., CA, USA) and tested on a professional GeForce GTX TITAN X card (Nvidia Inc., CA, USA). Using 3-D ultrasound data acquired during a tissue-mimicking phantom experiment, both displacement tracking accuracy and computational efficiency were evaluated for the three methods. Based on the data investigated, we found that on the GPU platform the Luo-Konofagou method still improves computational efficiency (17–46%) compared with the classic NCC method implemented on the same GPU platform. The Lewis method, however, either does not improve computational efficiency in some configurations or improves it at a lower rate (7–23%) in the GPU parallel computing environment. Comparable displacement tracking accuracy was obtained by both methods.
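The sum-table idea mentioned in this abstract can be illustrated with a short 1-D sketch. It is not the authors' 3-D CUDA implementation: the function names, the test signal, the restriction to non-negative lags, and the choice of an NCC without mean removal are all assumptions made for illustration. A running-sum table turns each windowed sum into a single subtraction, so the energy terms (and, per lag, the cross term) can be reused across many tracking windows instead of being re-summed.

```python
import math

def sum_table(values):
    """Running-sum table: table[i] holds the sum of values[0:i]."""
    table = [0.0]
    for v in values:
        table.append(table[-1] + v)
    return table

def window_sum(table, start, length):
    """Sum of `length` samples starting at `start`, read from the table in O(1)."""
    return table[start + length] - table[start]

def ncc_direct(ref, tgt, start, lag, length):
    """Baseline NCC (assumed form, no mean removal): all sums recomputed."""
    a = ref[start:start + length]
    b = tgt[start + lag:start + lag + length]
    num = sum(x * y for x, y in zip(a, b))
    return num / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))

def ncc_sum_table(ref, tgt, starts, lags, length):
    """NCC for every (lag, window start) pair using sum-tables.

    Returns a dict mapping each lag to a list of NCC values, one per start.
    Assumes len(ref) == len(tgt) and non-negative lags.
    """
    ref_energy = sum_table([x * x for x in ref])
    tgt_energy = sum_table([y * y for y in tgt])
    result = {}
    for lag in lags:
        # One table of lagged products per lag, shared by all windows.
        prod = sum_table([ref[i] * tgt[i + lag] for i in range(len(ref) - lag)])
        row = []
        for start in starts:
            num = window_sum(prod, start, length)
            e_ref = window_sum(ref_energy, start, length)
            e_tgt = window_sum(tgt_energy, start + lag, length)
            row.append(num / math.sqrt(e_ref * e_tgt))
        result[lag] = row
    return result

# Hypothetical test signal: the target is the reference delayed by 3 samples.
TRUE_SHIFT = 3

def echo(i):
    return math.sin(0.3 * i) + 0.1 * math.cos(1.7 * i)

ref = [echo(i) for i in range(64)]
tgt = [echo(i - TRUE_SHIFT) for i in range(64)]
lags = range(0, 7)
starts = [0, 8, 16, 24]
table_ncc = ncc_sum_table(ref, tgt, starts, lags, length=16)
best_lag = max(lags, key=lambda lag: table_ncc[lag][0])
```

In this sketch the saving comes from building each table once and reading it for every window; how well that reuse carries over to a GPU is the question the abstract investigates.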

2015 ◽  
Vol 87 ◽  
pp. 47-60 ◽  
Author(s):  
Alexandru-Ciprian Zăvoianu ◽  
Edwin Lughofer ◽  
Werner Koppelstätter ◽  
Günther Weidenholzer ◽  
Wolfgang Amrhein ◽  
...  

2021 ◽  
Vol 38 (2) ◽  
Author(s):  
Nicholas Torres Okita ◽  
Tiago A. Coimbra ◽  
José Ribeiro ◽  
Martin Tygel

ABSTRACT. Graphics processing units (GPUs) are a well-known alternative to traditional multi-core CPU processing, performing dozens of times faster on parallel tasks. Another newer computing paradigm is cloud computing as a replacement for traditional in-house clusters, offering seemingly unlimited computational power, no maintenance costs, and cutting-edge technology, dynamically on user demand. Previously, these two tools were used to accelerate the estimation of Common Reflection Surface (CRS) traveltime parameters, in both the zero-offset and finite-offset domains, delivering very satisfactory results with large time savings from GPU devices alongside cost savings on the cloud. This work extends those results by using GPUs on the cloud to accelerate Offset Continuation Trajectory (OCT) traveltime parameter estimation. The results show that the time and cost savings from using GPU devices are even larger than those seen in the CRS results, being up to fifty times faster and sixty times cheaper. This analysis reaffirms that it is possible to save both time and money when using GPU devices on the cloud, and concludes that the larger the data sets and the more computationally intensive the traveltime operators, the larger the improvements.

Keywords: cloud computing, GPU, seismic processing.

Portuguese title (translated): Extending the use of graphics cards in the cloud for savings in seismic data regularization.


2013 ◽  
Vol 694-697 ◽  
pp. 927-935 ◽  
Author(s):  
Yi Sun ◽  
Tao Ma ◽  
Chia Yung Han ◽  
Joseph Ross ◽  
William Wee

This paper presents a simple and accurate coordinate transformation method for extending the tracking space of the Intersense IS-900 spatial and motion tracking system. It uses multiple pre-configured emitter towers to form the emitter constellation, without resorting to a surveyor machine. The proposed approach uses the differences between the positional coordinate readings from each emitter tower, taken over a set of commonly viewed spatial points, to calculate the parameters that define the coordinate transformation. With this method, tracking across the entire emitter constellation achieves an error of less than 0.5 inches in most of the working space, and as low as 0.2 inches in the frontal part of the working space.
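The abstract does not give the transformation equations, so the following is only a generic illustration of estimating a coordinate transformation from commonly viewed points. It swaps in a standard 2-D least-squares rigid fit (rotation plus translation) rather than the paper's own procedure, and every name and number in it is hypothetical. Centring both point sets removes the translation, leaving the rotation to be found from coordinate differences alone.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst.

    src and dst are equal-length lists of (x, y) readings of the same
    physical points, expressed in two different coordinate frames.
    """
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n
    dot = 0.0
    cross = 0.0
    for (xs, ys), (xd, yd) in zip(src, dst):
        # Differences from the centroid do not depend on the translation.
        ax, ay = xs - cx_s, ys - cy_s
        bx, by = xd - cx_d, yd - cy_d
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, tx, ty

def apply_rigid_2d(theta, tx, ty, point):
    """Map one (x, y) point through the fitted transform."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1] + tx, s * point[0] + c * point[1] + ty)

# Hypothetical readings of four shared points in the frame of "tower B".
points_b = [(0.0, 0.0), (2.0, 0.5), (1.0, 3.0), (-1.5, 2.0)]
# Made-up ground truth used to synthesise the matching "tower A" readings.
TRUE_THETA, TRUE_TX, TRUE_TY = 0.4, 10.0, -2.0
points_a = [apply_rigid_2d(TRUE_THETA, TRUE_TX, TRUE_TY, p) for p in points_b]

theta, tx, ty = fit_rigid_2d(points_b, points_a)
```

With noise-free synthetic readings the fit recovers the transform used to generate them; with real readings the residual after applying the fit gives a rough measure of tracking error in the shared region.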


2018 ◽  
Vol 10 (1) ◽  
pp. 168781401775196 ◽  
Author(s):  
Ping Wang ◽  
Yabo Wang ◽  
He Huang ◽  
Feng Ru ◽  
Quan Pan

To improve neurological recovery in hand neurorehabilitation, target-oriented, intensive, repetitive activities of daily living are used, such as training with recognition of hand gestures during robot-aided exercise. In this article, a cascade control algorithm that integrates electromyography bio-feedback into hand gesture recognition is proposed. The outer loop performs trajectory motion tracking with a Kinect-based gesture decoding classifier, and the inner loop performs torque control with electromyography bio-feedback in real time. The proposed method improves tracking accuracy: the tracking error is reduced from 70.56 to 28.07 in the simulation experiment. The initial test shows that the proposed method, with its additional torque control, allows active assistance at the human–machine interface of other rehabilitation robots in the future.


2019 ◽  
Vol 47 (2) ◽  
pp. 643-650 ◽  
Author(s):  
Alexander Jöhl ◽  
Stefanie Ehrbar ◽  
Matthias Guckenberger ◽  
Stephan Klöck ◽  
Mirko Meboldt ◽  
...  

2001 ◽  
Vol 1 (3) ◽  
pp. 235-244 ◽  
Author(s):  
Jonathan F. Gerhard ◽  
David Rosen ◽  
Janet K. Allen ◽  
Farrokh Mistree

Geographically distributed engineers must collaboratively develop, build and test solutions to design-manufacture problems to be competitive in the global marketplace. Engineers operate in a distributed system in which separate entities communicate cooperatively: ideas and information requests are generated anywhere within the system, rapid turn-around is essential, and multiple projects must be handled simultaneously. In this paper we present a prototype platform-independent framework that integrates distributed and heterogeneous software resources to support the computationally intensive activities in the product realization process. This framework, PRE-RMI, is based on an experimental event-based communications model; it has been coded in Java and uses the RMI messaging system. We describe its usage in a distributed product realization environment, the Rapid Tooling TestBed. PRE-RMI is compared to a previous environment, called P2, which was based on Java Servlet technology. PRE-RMI is adaptable to different design processes, is modular and extensible, is robust to network and computing failures, and is far preferable to P2. Further, we demonstrate the successful integration of CAD, CAE, design, and manufacturing software tools and resources in this flexible distributed computing environment.


2020 ◽  
Author(s):  
Nairit Sur ◽  
Leonardo Cristella ◽  
Adriano Di Florio ◽  
Vincenzo Mastrapasqua

The demand for computational resources is steadily increasing in experimental high energy physics as the current collider experiments continue to accumulate huge amounts of data and physicists pursue more complex and ambitious analysis strategies. This is especially true in the fields of hadron spectroscopy and flavour physics, where the analyses often depend on complex multidimensional unbinned maximum-likelihood fits with several dozen free parameters, aimed at studying the internal structure of hadrons. Graphics processing units (GPUs) are among the most sophisticated and versatile parallel computing architectures and are becoming popular tools for high energy physicists to meet their computational demands. GooFit is an upcoming open-source tool interfacing ROOT/RooFit to the CUDA platform on NVIDIA GPUs; it acts as a bridge between the MINUIT minimization algorithm and a parallel processor, allowing probability density functions to be estimated on multiple cores simultaneously. In this article, a full-fledged amplitude analysis framework developed using GooFit is tested for its speed and reliability. The four-dimensional fitter framework, one of the first of its kind to be built on GooFit, is geared towards the search for exotic tetraquark states in the [[EQUATION]] decays and can also be seamlessly adapted for other similar analyses. The GooFit fitter, running on GPUs, shows a remarkable improvement in computing speed compared to a ROOT/RooFit implementation of the same analysis running on multi-core CPU clusters. Furthermore, it shows sensitivity to components with small contributions to the overall fit. It has the potential to be a powerful tool for sensitive and computationally intensive physics analyses.
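To make the phrase "unbinned maximum-likelihood fit" concrete, here is a minimal sketch in plain Python. It does not use GooFit, ROOT/RooFit or MINUIT; the single-Gaussian model, the fixed width, the synthetic events and the coarse grid scan standing in for a real minimizer are all assumptions for illustration. It shows that the negative log-likelihood is a sum of one independent probability density evaluation per event, which is the part that lends itself to evaluation on many cores at once.

```python
import math
import random

def gaussian_pdf(x, mu, sigma):
    """Normalised Gaussian density evaluated at a single event x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def unbinned_nll(events, mu, sigma):
    """Negative log-likelihood: one independent PDF evaluation per event."""
    return -sum(math.log(gaussian_pdf(x, mu, sigma)) for x in events)

# Synthetic events (hypothetical data, fixed seed for repeatability).
random.seed(1)
SIGMA = 1.0
events = [random.gauss(5.0, SIGMA) for _ in range(2000)]

# Coarse scan over the mean; a real analysis would use a proper minimizer.
grid = [4.0 + 0.01 * k for k in range(201)]
best_mu = min(grid, key=lambda mu: unbinned_nll(events, mu, SIGMA))

# With the width fixed, the exact minimum of this NLL is the sample mean.
sample_mean = sum(events) / len(events)
```

Each call to `unbinned_nll` touches every event, and a minimizer calls it many times, so the cost grows with both the number of events and the number of free parameters. That is the workload the abstract describes moving to GPUs.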

