Online Thread and Data Mapping Using the Memory Management Unit

2017 ◽  
Author(s):  
Eduardo H. M. Cruz ◽  
Philippe O. A. Navaux

Current computer architectures include complex memory hierarchies that introduce different memory access times. One of the solutions adopted to reduce access time is to increase the locality of memory accesses through thread and data mapping. In this doctoral thesis, novel solutions are proposed to identify a mapping that optimizes memory access, using the memory management unit to monitor the accesses. In the experimental evaluation, the solutions improved performance by up to 39% and energy efficiency by up to 12.2%. This was due to a substantial reduction in the number of cache misses, inter-processor traffic, and accesses to remote memory banks.

Author(s):  
Eduardo H. M. Cruz ◽  
Matthias Diener ◽  
Laércio L. Pilla ◽  
Philippe O. A. Navaux

Current and future architectures rely on thread-level parallelism to sustain performance growth. These architectures have introduced a complex memory hierarchy, consisting of several cores organized hierarchically with multiple cache levels and NUMA nodes. This hierarchy affects the performance and energy efficiency of parallel applications, because it increases the importance of memory access locality. To improve locality, threads and data must be mapped according to the memory access behavior of the application, so analyzing that behavior is critical. However, most previous work relies on indirect information about the memory accesses, or does not combine thread and data mapping, resulting in less accurate mappings. In this paper, we propose the Sharing-Aware Memory Management Unit (SAMMU), an extension to the memory management unit that allows it to detect the memory access behavior in hardware. With this information, the operating system can perform online mapping without any previous knowledge about the behavior of the application. In the evaluation with a wide range of parallel applications (NAS Parallel Benchmarks and PARSEC Benchmark Suite), performance was improved by up to 35.7% (10.0% on average) and energy efficiency was improved by up to 11.9% (4.1% on average). These improvements resulted from a substantial reduction of cache misses and interconnection traffic.
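The general idea of combining thread and data mapping can be illustrated with a minimal Python sketch. It is not the SAMMU mechanism or the authors' algorithm: the trace format of `(thread_id, page_id)` samples, the function names, the greedy pairing heuristic, and the "most accesses wins" page placement are all assumptions made here for illustration only.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_sharing_matrix(accesses, num_threads):
    """Count how much each pair of threads shares pages.

    `accesses` is a hypothetical trace: an iterable of (thread_id, page_id).
    """
    threads_per_page = defaultdict(Counter)
    for tid, page in accesses:
        threads_per_page[page][tid] += 1

    matrix = [[0] * num_threads for _ in range(num_threads)]
    for counts in threads_per_page.values():
        for a, b in combinations(sorted(counts), 2):
            # Assumption: sharing is the smaller of the two access counts.
            shared = min(counts[a], counts[b])
            matrix[a][b] += shared
            matrix[b][a] += shared
    return matrix, threads_per_page


def pair_threads(matrix):
    """Thread mapping (illustrative): greedily pair threads that share most."""
    n = len(matrix)
    candidates = sorted(
        ((matrix[a][b], a, b) for a, b in combinations(range(n), 2)),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    assigned, pairs = set(), []
    for _, a, b in candidates:
        if a not in assigned and b not in assigned:
            pairs.append((a, b))
            assigned.update((a, b))
    return pairs  # with an odd thread count, one thread stays unpaired


def map_pages(threads_per_page, thread_to_node):
    """Data mapping (illustrative): put each page on the node that uses it most."""
    placement = {}
    for page, counts in threads_per_page.items():
        per_node = Counter()
        for tid, count in counts.items():
            per_node[thread_to_node[tid]] += count
        # Ties are broken toward the lowest node id.
        placement[page] = min(per_node, key=lambda node: (-per_node[node], node))
    return placement
```

In this sketch, each pair returned by `pair_threads` would be placed on cores that share a cache or NUMA node, and the resulting thread-to-node assignment is then passed to `map_pages` so that pages follow the threads that access them.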


1988 ◽  
Author(s):  
A. K. Goksel ◽  
R. H. Krambeck ◽  
P. P. Thomas ◽  
M. S. Tsay

1993 ◽  
Vol 28 (11) ◽  
pp. 1078-1083 ◽  
Author(s):  
R.A. Heald ◽  
J.C. Holst

1991 ◽  
Vol 19 (4) ◽  
pp. 109-116 ◽  
Author(s):  
Alberto R. Cunha ◽  
Carlos N. Ribeiro ◽  
José A. Marques
