scholarly journals THE ART AND SCIENCE OF GPU AND MULTI-CORE PROGRAMMING

2014 ◽  
pp. 64-72
Author(s):  
Robert E. Hiromoto

This paper examines the computational programming issues that arise from the introduction of GPUs and multi-core computer systems. The discussions and analyses examine the implication of two principles (spatial and temporal locality) that provide useful metrics to guide programmers in designing and implementing efficient sequential and parallel application programs. Spatial and temporal locality represents a science of information flow and is relevant in the development of highly efficient computational programs. The art of high performance programming is to take combinations of these principles an d unravel the bottlenecks and latencies as sociate with the architecture for each manufacturer computer system, and develop appropriate coding and/or task scheduling schemes to mitigate or eliminate these latencies.

Author(s):  
Minjing Dong ◽  
Hanting Chen ◽  
Yunhe Wang ◽  
Chang Xu

Network pruning is widely applied to deep CNN models due to their heavy computation costs and achieves high performance by keeping important weights while removing the redundancy. Pruning redundant weights directly may hurt global information flow, which suggests that an efficient sparse network should take graph properties into account. Thus, instead of paying more attention to preserving important weight, we focus on the pruned architecture itself. We propose to use graph entropy as the measurement, which shows useful properties to craft high-quality neural graphs and enables us to propose efficient algorithm to construct them as the initial network architecture. Our algorithm can be easily implemented and deployed to different popular CNN models and achieve better trade-offs.


Sign in / Sign up

Export Citation Format

Share Document