integrated gpu
Recently Published Documents


Total documents: 26 (five years: 2)
H-index: 5 (five years: 0)

Electronics ◽ 2021 ◽ Vol 10 (19) ◽ pp. 2386
Author(s): Raúl Nozal ◽ Jose Luis Bosque

Heterogeneous systems are the core architecture of most computing systems, from high-performance computing nodes to embedded devices, due to their excellent performance and energy efficiency. Efficiently programming these systems has become a major challenge due to the complexity of their architectures and the efforts required to provide them with co-execution capabilities that can fully exploit the applications. There are many proposals to simplify the programming and management of acceleration devices and multi-core CPUs. However, in many cases, portability and ease of use compromise the efficiency of different devices—even more so when co-executing. Intel oneAPI, a new and powerful standards-based unified programming model, built on top of SYCL, addresses these issues. In this paper, oneAPI is provided with co-execution strategies to run the same kernel between different devices, enabling the exploitation of static and dynamic policies. This work evaluates the performance and energy efficiency for a well-known set of regular and irregular HPC benchmarks, using two heterogeneous systems composed of an integrated GPU and CPU. Static and dynamic load balancers are integrated and evaluated, highlighting single and co-execution strategies and the most significant key points of this promising technology. Experimental results show that co-execution is worthwhile when using dynamic algorithms and improves the efficiency even further when using unified shared memory.


2021 ◽ Vol 11 (2) ◽ pp. 24
Author(s): Mirco De Marchi ◽ Francesco Lumpp ◽ Enrico Martini ◽ Michele Boldo ◽ Stefano Aldegheri ◽ ...

Many modern programmable embedded devices contain CPUs and a GPU that share the same system memory on a single die. Such a unified memory architecture (UMA) allows programmers to implement different communication models between the CPU and the integrated GPU (iGPU). Although the simpler model guarantees implicit synchronization at the cost of performance, the more advanced model uses the zero-copy paradigm to eliminate explicit data copying between CPU and iGPU, significantly improving performance and energy savings. Meanwhile, the Robot Operating System (ROS) has become a de facto reference standard for developing robotic applications. It allows for application reuse and the easy integration of software blocks in complex cyber-physical systems. Although ROS compliance is strongly required for software portability and reuse, it can lead to performance loss and forfeit the benefits of zero-copy communication. In this article, we present efficient techniques to implement CPU–iGPU communication while guaranteeing compliance with the ROS standard. We show how the key features of each communication model are maintained and the corresponding overhead introduced by ROS compliance.


Author(s): Caleb Adams ◽ Allen Spain ◽ Jackson Parker ◽ Matthew Hevert ◽ James Roach ◽ ...

Author(s): Santosh Kumar

TSV-interconnect-based 3D/2.5D packaging has gained significant attention since its introduction in FPGAs (for die partitioning) and HBM-integrated GPU modules (for gaming applications). The performance potential offered by this technology is unequalled by any other packaging platform today. High-end applications like deep learning, datacenter networking, AR/VR, and autonomous driving are becoming real, thereby pushing the limits of other current packaging platforms. Fueled by increasing bandwidth needs for moving data in cloud-computing and supercomputing applications, performance-driven markets have adopted 3D stacked technologies one after another. Imaging, as the first market adopter of 3D integration, is propelling the market with an increasing number of sensors in smartphones and tablets, including 3D imaging. TSV-based products can be classified in three ranges: low, middle, and high-end. The middle and high-end product markets, such as CMOS image sensors, memory cubes, and interposers, are based on a via-middle process. In low-end products, we can also find TSVs based on via-middle (e.g., in Apple's fingerprint sensor), but for cost reasons the MEMS industry mostly uses a via-last process, which is cheaper than a via-middle process. TSV's penetration rate in low-end products will remain stable, with the main source of growth being RF filters in smartphone front-end modules, whose number keeps increasing in order to support the different frequency bands used in 5G mobile communication protocols. This presentation will discuss the market and technology trends of TSV-based 3D/2.5D packaging.


2017 ◽ Vol 23 (3) ◽ pp. 827-836
Author(s): Qiong Wang ◽ Ning Li ◽ Li Shen ◽ Zhiying Wang
