A 1080p H.264/AVC Baseline Residual Encoder for a Fine-Grained Many-Core System

Author(s):  
Zhibin Xiao ◽  
Bevan M. Baas
Keyword(s):  
Author(s):  
Irfan Uddin

The microthreaded many-core architecture is comprised of multiple clusters of fine-grained multi-threaded cores. The management of concurrency is supported in the instruction set architecture of the cores and the computational work in application is asynchronously delegated to different clusters of cores, where the cluster is allocated dynamically. Computer architects are always interested in analyzing the complex interaction amongst the dynamically allocated resources. Generally a detailed simulation with a cycle-accurate simulation of the execution time is used. However, the cycle-accurate simulator for the microthreaded architecture executes at the rate of 100,000 instructions per second, divided over the number of simulated cores. This means that the evaluation of a complex application executing on a contemporary multi-core machine can be very slow. To perform efficient design space exploration we present a co-simulation environment, where the detailed execution of instructions in the pipeline of microthreaded cores and the interactions amongst the hardware components are abstracted. We present the evaluation of the high-level simulation framework against the cycle-accurate simulation framework. The results show that the high-level simulator is faster and less complicated than the cycle-accurate simulator but with the cost of losing accuracy.


2020 ◽  
Vol 1 (2) ◽  
pp. 101-123
Author(s):  
Hiroaki Shiokawa ◽  
Yasunori Futamura

This paper addressed the problem of finding clusters included in graph-structured data such as Web graphs, social networks, and others. Graph clustering is one of the fundamental techniques for understanding structures present in the complex graphs such as Web pages, social networks, and others. In the Web and data mining communities, the modularity-based graph clustering algorithm is successfully used in many applications. However, it is difficult for the modularity-based methods to find fine-grained clusters hidden in large-scale graphs; the methods fail to reproduce the ground truth. In this paper, we present a novel modularity-based algorithm, \textit{CAV}, that shows better clustering results than the traditional algorithm. The proposed algorithm employs a cohesiveness-aware vector partitioning into the graph spectral analysis to improve the clustering accuracy. Additionally, this paper also presents a novel efficient algorithm \textit{P-CAV} for further improving the clustering speed of CAV; P-CAV is an extension of CAV that utilizes the thread-based parallelization on a many-core CPU. Our extensive experiments on synthetic and public datasets demonstrate the performance superiority of our approaches over the state-of-the-art approaches.


2020 ◽  
Vol 138 ◽  
pp. 32-47
Author(s):  
Aaron Stillmaker ◽  
Brent Bohnenstiehl ◽  
Lucas Stillmaker ◽  
Bevan Baas

2016 ◽  
Vol 65 (5) ◽  
pp. 1453-1466 ◽  
Author(s):  
Michael A. Skitsas ◽  
Chrysostomos A. Nicopoulos ◽  
Maria K. Michael
Keyword(s):  

Author(s):  
Anuja A. Tapase ◽  
Siddheshwar V. Patil ◽  
Dinesh B. Kulkarni

2016 ◽  
Vol 26 (03) ◽  
pp. 1750037 ◽  
Author(s):  
Xiaofeng Zhou ◽  
Lu Liu ◽  
Zhangming Zhu

Network-on-Chip (NoC) has become a promising design methodology for the modern on-chip communication infrastructure of many-core system. To guarantee the reliability of traffic, effective fault-tolerant scheme is critical to NoC systems. In this paper, we propose a fault-tolerant deflection routing (FTDR) to address faults on links and router by redundancy technique. The proposed FTDR employs backup links and a redundant fault-tolerant unit (FTU) at router-level to sustain the traffic reliability of NoC. Experimental results show that the proposed FTDR yields an improvement of routing performance and fault-tolerant capability over the reported fault-tolerant routing schemes in average flit deflection rate, average packet latency, saturation throughput and reliability by up to 13.5%, 9.8%, 10.6% and 17.5%, respectively. The layout area and power consumption are increased merely 3.5% and 2.6%.


Sign in / Sign up

Export Citation Format

Share Document