GPU Accelerated Path Tracing of Massive Scenes

2021 ◽  
Vol 40 (2) ◽  
pp. 1-17
Author(s):  
Milan Jaroš ◽  
Lubomír Říha ◽  
Petr Strakoš ◽  
Matěj Špeťko

This article presents a solution to path tracing of massive scenes on multiple GPUs. Our approach analyzes the memory access pattern of a path tracer and defines how the scene data should be distributed across up to 16 GPUs with minimal effect on performance. The key concept is that the parts of the scene that have the highest amount of memory accesses are replicated on all GPUs. We propose two methods for maximizing the performance of path tracing when working with partially distributed scene data. Both methods work on the memory management level and therefore path tracer data structures do not have to be redesigned, making our approach applicable to other path tracers with only minor changes in their code. As a proof of concept, we have enhanced the open-source Blender Cycles path tracer. The approach was validated on scenes of sizes up to 169 GB. We show that only 1–5% of the scene data needs to be replicated to all machines for such large scenes. On smaller scenes we have verified that the performance is very close to rendering a fully replicated scene. In terms of scalability we have achieved a parallel efficiency of over 94% using up to 16 GPUs.
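The replication idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the chunk granularity, the per-chunk access counts, the `replicate_fraction` parameter, and the round-robin placement of the remaining chunks are all assumptions made only for illustration. The function name `plan_placement` is hypothetical.

```python
from typing import Dict, List, Tuple


def plan_placement(
    access_counts: List[int], num_gpus: int, replicate_fraction: float
) -> Tuple[List[int], Dict[int, int]]:
    """Split scene chunks into a replicated set and a distributed set.

    access_counts[i] is an assumed, pre-measured number of memory
    accesses to chunk i. The hottest chunks are replicated on every
    GPU; the rest are assigned round-robin to a single owner GPU.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0.0 <= replicate_fraction <= 1.0:
        raise ValueError("replicate_fraction must be within [0, 1]")

    # Number of chunks to replicate (rounded down).
    n_replicated = int(len(access_counts) * replicate_fraction)

    # Chunk ids ordered from most to least accessed; ties broken by id.
    by_heat = sorted(
        range(len(access_counts)), key=lambda i: (-access_counts[i], i)
    )

    replicated = sorted(by_heat[:n_replicated])
    # Remaining chunks get exactly one owner GPU each (assumed policy).
    owners = {
        chunk: position % num_gpus
        for position, chunk in enumerate(by_heat[n_replicated:])
    }
    return replicated, owners


if __name__ == "__main__":
    counts = [5, 100, 1, 50, 2, 3, 80, 4, 0, 7]
    replicated, owners = plan_placement(counts, num_gpus=2, replicate_fraction=0.2)
    print("replicated on all GPUs:", replicated)
    print("single-owner chunks:", owners)
```

With a replication fraction in the 1–5% range reported in the abstract, only the few hottest chunks would land in the replicated set; how a real path tracer measures accesses and migrates data is outside the scope of this sketch.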

Author(s):  
Eduardo H. M. Cruz ◽  
Matthias Diener ◽  
Laércio L. Pilla ◽  
Philippe O. A. Navaux

Current and future architectures rely on thread-level parallelism to sustain performance growth. These architectures have introduced a complex memory hierarchy, consisting of several cores organized hierarchically with multiple cache levels and NUMA nodes. These memory hierarchies can have an impact on the performance and energy efficiency of parallel applications as the importance of memory access locality is increased. In order to improve locality, the analysis of the memory access behavior of parallel applications is critical for mapping threads and data. Nevertheless, most previous work relies on indirect information about the memory accesses, or does not combine thread and data mapping, resulting in less accurate mappings. In this paper, we propose the Sharing-Aware Memory Management Unit (SAMMU), an extension to the memory management unit that allows it to detect the memory access behavior in hardware. With this information, the operating system can perform online mapping without any previous knowledge about the behavior of the application. In the evaluation with a wide range of parallel applications (NAS Parallel Benchmarks and PARSEC Benchmark Suite), performance was improved by up to 35.7% (10.0% on average) and energy efficiency was improved by up to 11.9% (4.1% on average). These improvements happened due to a substantial reduction of cache misses and interconnection traffic.
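As a rough software illustration of sharing-aware thread mapping, the sketch below builds a sharing matrix from a memory access trace and greedily pairs the threads that share the most pages. SAMMU detects this behavior in hardware; the trace format of `(thread_id, page)` tuples, the page-level granularity, and the greedy pairing heuristic are assumptions made here, not details taken from the paper.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, List, Tuple


def sharing_matrix(
    trace: Iterable[Tuple[int, int]], num_threads: int
) -> List[List[int]]:
    """Count, for each thread pair, the pages both threads accessed."""
    threads_per_page = defaultdict(set)
    for thread_id, page in trace:
        threads_per_page[page].add(thread_id)

    matrix = [[0] * num_threads for _ in range(num_threads)]
    for thread_ids in threads_per_page.values():
        for a, b in combinations(sorted(thread_ids), 2):
            matrix[a][b] += 1
            matrix[b][a] += 1
    return matrix


def greedy_pairs(matrix: List[List[int]]) -> List[Tuple[int, int]]:
    """Pair threads greedily, strongest sharing first (assumed heuristic).

    Each pair could then be placed on cores that share a cache.
    """
    n = len(matrix)
    candidates = sorted(
        ((matrix[a][b], a, b) for a, b in combinations(range(n), 2)),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    used = set()
    pairs = []
    for _, a, b in candidates:
        if a not in used and b not in used:
            pairs.append((a, b))
            used.update((a, b))
    return pairs


if __name__ == "__main__":
    trace = [(0, 10), (2, 10), (0, 11), (2, 11), (1, 20), (3, 20), (0, 20)]
    matrix = sharing_matrix(trace, num_threads=4)
    print(greedy_pairs(matrix))
```

A mapping policy used by an operating system would also need to place data on NUMA nodes and react to changes over time; this sketch covers only the thread-pairing step.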


2021 ◽  
Author(s):  
Zhen Yu

With the development of modern computers, memory latency has become a key bottleneck for the performance of computer systems, and much research has since targeted improving the performance of the memory hierarchy. In this thesis, we examine the behavior of dynamically allocated data structures (DADS) and programs with irregular access patterns (PIAP), which use dynamic memory management or algorithms with unpredictable behaviour. By simulating several DADS and PIAP applications, we find that general cache management policies cannot effectively use the scarce cache resources for such workloads. We explore the use of mathematical formulas applied in signal processing to improve the performance of the memory hierarchy.


Author(s):  
Aleix Roca Nonell ◽  
Balazs Gerofi ◽  
Leonardo Bautista-Gomez ◽  
Dominique Martinet ◽  
Vicenç Beltran Querol ◽  
...  

2005 ◽  
Vol 23 (2) ◽  
pp. 146-196 ◽  
Author(s):  
Maurice Herlihy ◽  
Victor Luchangco ◽  
Paul Martin ◽  
Mark Moir

Author(s):  
Yuto Nakano ◽  
Shinsaku Kiyomoto ◽  
Yutaka Miyake

2021 ◽  
Vol 244 ◽  
pp. 07001
Author(s):  
Anatoliy Nyrkov ◽  
Konstantin Ianiushkin ◽  
Andrey Nyrkov ◽  
Yulia Romanova ◽  
Vagiz Gaskarov

Recent achievements in high-performance computing significantly narrow the performance gap between single-node and multi-node computing, and open up opportunities for systems with remote shared memory. The combination of in-memory storage, remote direct memory access, and remote calls requires rethinking how data is organized, protected, and queried in distributed systems. The reviewed models let us implement new interpretations of distributed algorithms, allowing us to validate different approaches to avoiding race conditions and to decreasing resource acquisition or synchronization time. In this paper, we describe a data model for mixed memory access, with an analysis of optimized data structures. We also provide the results of experiments, which compare the performance of data structures operating with different approaches, evaluate the limitations of these models, and show that the model does not always meet expectations. The purpose of this paper is to assist developers in designing data structures that help to achieve architectural benefits or improve the design of existing distributed systems.

