Parallel computing with graphics processing units for high-speed Monte Carlo simulation of photon migration

2008 ◽  
Vol 13 (6) ◽  
pp. 060504 ◽  
Author(s):  
Erik Alerstam ◽  
Tomas Svensson ◽  
Stefan Andersson-Engels
2012 ◽  
Vol 26 (9) ◽  
pp. 1679-1697 ◽  
Author(s):  
L.A. Abbas-Turki ◽  
S. Vialle ◽  
B. Lapeyre ◽  
P. Mercier

Nanophotonics ◽  
2020 ◽  
Vol 9 (13) ◽  
pp. 4097-4108 ◽  
Author(s):  
Moustafa Ahmed ◽  
Yas Al-Hadeethi ◽  
Ahmed Bakry ◽  
Hamed Dalir ◽  
Volker J. Sorger

The technologically relevant task of feature extraction from data in deep-learning systems is routinely accomplished by repeated fast Fourier transforms (FFTs) performed electronically in prevalent domain-specific architectures such as graphics processing units (GPUs). However, electronic systems are limited with respect to power dissipation and delay, due to wire-charging challenges related to interconnect capacitance. Here we present a silicon photonics-based architecture for convolutional neural networks that harnesses the phase property of light to perform FFTs efficiently by executing the convolution as a multiplication in the Fourier domain. The algorithmic execution time is determined by the time of flight of the signal through this photonic reconfigurable passive FFT 'filter' circuit and is on the order of tens of picoseconds. A sensitivity analysis shows that this optical processor must be thermally phase stabilized, corresponding to a few degrees. Furthermore, we find that for a small sample number, the obtainable number of convolutions per {time, power, and chip area} outperforms GPUs by about two orders of magnitude. Lastly, we show that, conceptually, the optical FFT and convolution-processing performance is directly linked to optoelectronic device-level performance, and that improvements in plasmonics, metamaterials, or nanophotonics are fueling next-generation densely interconnected intelligent photonic circuits, relevant to edge computing in 5G networks, by processing tensor operations optically.
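
The central idea in the abstract above, executing a convolution as a multiplication in the Fourier domain, can be illustrated numerically. The following is a minimal sketch, not the authors' photonic implementation: it assumes a circular (periodic) convolution and uses a naive O(n^2) discrete Fourier transform as a stand-in for an FFT. The function names and the sample values are hypothetical and chosen only for illustration.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        # The inverse transform carries the 1/n normalization.
        out.append(s / n if inverse else s)
    return out


def circular_convolve_direct(a, b):
    """Circular convolution computed directly from its definition."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def circular_convolve_fourier(a, b):
    """Same convolution: transform, multiply element-wise, transform back."""
    spectrum_a, spectrum_b = dft(a), dft(b)
    product = [x * y for x, y in zip(spectrum_a, spectrum_b)]
    # Inputs are real, so the imaginary parts are only rounding noise.
    return [v.real for v in dft(product, inverse=True)]


# Hypothetical sample data (equal lengths are assumed).
signal = [1.0, 2.0, 3.0, 4.0]
kernel = [0.5, 0.25, 0.0, 0.25]

direct = circular_convolve_direct(signal, kernel)
fourier = circular_convolve_fourier(signal, kernel)
```

Both routes give the same result up to floating-point rounding. In this sketch the transforms dominate the cost; the abstract's claim is that a passive optical circuit performs that transform step within the signal's time of flight.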


2007 ◽  
Vol 51 (2T) ◽  
pp. 82-85 ◽  
Author(s):  
Y. Nakashima ◽  
Y. Higashizono ◽  
N. Nishino ◽  
H. Kawano ◽  
M.K. Islam ◽  
...  

2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Author(s):  
Carlos Couder-Castañeda ◽  
Carlos Ortiz-Alemán ◽  
Mauricio Gabriel Orozco-del-Castillo ◽  
Mauricio Nava-Flores

An implementation using CUDA technology on a single graphics processing unit (GPU) and on several GPUs is presented for the forward modeling of gravitational fields from a three-dimensional volumetric ensemble composed of unit prisms of constant density. We compared the performance results obtained with the GPUs against a previous version coded in OpenMP with MPI, and we analyzed the results on both platforms. Today, the use of GPUs represents a breakthrough in parallel computing, which has led to the development of a wide variety of applications. Nevertheless, in some applications the decomposition of the tasks is not trivial, as can be appreciated in this paper. Unlike a trivial decomposition of the domain, we propose to decompose the problem by sets of prisms and to use different memory spaces per processing CUDA core, avoiding the performance decay that would result from the constant calls to kernel functions needed in a parallelization by observation points. The design and implementation are the main contributions of this work, because the parallelization scheme implemented is not trivial. The performance results obtained are comparable to those of a small processing cluster.
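
The decomposition by sets of prisms described in the abstract above can be sketched on a CPU. This is an illustrative sketch under stated assumptions, not the authors' CUDA code: each prism is approximated by a point mass at its center (the exact prism formula is not reproduced here), threads stand in for GPU cores, and every name below is hypothetical. Each worker receives a subset of prisms, accumulates their contribution at all observation points into its own private buffer, and the buffers are summed at the end.

```python
from concurrent.futures import ThreadPoolExecutor

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2


def gz_point_mass(obs, center, mass):
    """Vertical attraction at obs from a point mass (z positive downward).

    Assumption: the prism is replaced by a point mass at its center.
    """
    dx = center[0] - obs[0]
    dy = center[1] - obs[1]
    dz = center[2] - obs[2]
    r2 = dx * dx + dy * dy + dz * dz
    return G * mass * dz / (r2 ** 1.5)


def partial_field(prism_chunk, observations):
    """One worker: private buffer covering every observation point."""
    buf = [0.0] * len(observations)
    for center, mass in prism_chunk:
        for i, obs in enumerate(observations):
            buf[i] += gz_point_mass(obs, center, mass)
    return buf


def forward_model(prisms, observations, n_workers=4):
    """Split the prisms into sets, compute partial fields, then reduce."""
    chunks = [prisms[w::n_workers] for w in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: partial_field(c, observations),
                                 chunks))
    # Reduction step: sum the private buffers point by point.
    return [sum(p[i] for p in partials) for i in range(len(observations))]


# Hypothetical model: 1 m^3 prisms of density 2670 kg/m^3 below the surface.
prisms = [((x + 0.5, y + 0.5, z + 0.5), 2670.0)
          for x in range(4) for y in range(4) for z in range(1, 3)]
observations = [(float(x), 2.0, 0.0) for x in range(5)]

field = forward_model(prisms, observations)
reference = partial_field(prisms, observations)  # serial check
```

Because every worker writes only to its own buffer, no synchronization is needed until the final reduction. The result matches the serial computation up to floating-point summation order.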

