Application-Specific SoC Design Using Core Mapping to 3D Mesh NoCs with Nonlinear Area Optimization and Simulated Annealing

Jan Moritz Joseph; Dominik Ermel; Lennart Bamberg; Alberto García-Ortiz; Thilo Pionteck

doi:10.3390/technologies8010010

Technologies ◽

10.3390/technologies8010010 ◽

2020 ◽

Vol 8 (1) ◽

pp. 10

Author(s):

Jan Moritz Joseph ◽

Dominik Ermel ◽

Lennart Bamberg ◽

Alberto García-Ortiz ◽

Thilo Pionteck

Keyword(s):

Simulated Annealing ◽

Cost Function ◽

Linear Models ◽

Nonlinear Models ◽

General Purpose ◽

Network Graph ◽

Systems On Chip ◽

On Chip ◽

Core Mapping ◽

Application Specific

Core mapping, in which a core graph is mapped to a network graph to minimize communication, is a common design problem for Systems-on-Chip interconnected by a Network-on-Chip. In conventional multiprocessors, this mapping is area-agnostic because the cores in the core graph are uniform and therefore iso-area. This changes for Systems-on-Chip, because tasks are mapped to specific blocks rather than general-purpose cores, so the area of these cores varies and novel mapping methods are required. In this paper, we propose an area-aware cost function for simulated annealing. Furthermore, we advocate the use of nonlinear models because the area is nonlinear: a semi-definite program (SDP) can be used, as it is sufficiently fast and shows 20% better area than conventional linear models. Our cost function allows for up to 16.4% better area, 2% better communication (bandwidth times hop distance), and 13.8% better total bandwidth in the network compared to the standard approach, which accounts for network communication and also uses cores with varying areas.
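
The communication metric named in this abstract (bandwidth times hop distance) and the use of simulated annealing for core mapping can be illustrated with a minimal sketch. It is not the authors' implementation: the area-aware cost term and the SDP-based area model are not reproduced, and the function names, the geometric cooling schedule, and the swap move are assumptions chosen for illustration only.

```python
import math
import random


def hop_distance(a, b):
    """Manhattan distance between two (x, y, z) router positions in a 3D mesh."""
    return sum(abs(p - q) for p, q in zip(a, b))


def communication_cost(mapping, traffic):
    """Sum of bandwidth * hop distance over all communicating core pairs."""
    return sum(bw * hop_distance(mapping[src], mapping[dst])
               for (src, dst), bw in traffic.items())


def anneal_mapping(cores, mesh_dims, traffic, steps=5000,
                   t_start=10.0, t_end=0.01, seed=0):
    """Map cores to mesh nodes with simulated annealing (communication only).

    Assumption: the mesh has at least two nodes and at least as many nodes
    as cores. Unused nodes are modelled as None placeholders.
    """
    rng = random.Random(seed)
    nodes = [(x, y, z)
             for x in range(mesh_dims[0])
             for y in range(mesh_dims[1])
             for z in range(mesh_dims[2])]
    if len(cores) > len(nodes):
        raise ValueError("more cores than mesh nodes")

    slots = list(cores) + [None] * (len(nodes) - len(cores))
    rng.shuffle(slots)

    def to_mapping(s):
        return {core: node for core, node in zip(s, nodes) if core is not None}

    current = communication_cost(to_mapping(slots), traffic)
    best, best_slots = current, slots[:]
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        i, j = rng.sample(range(len(slots)), 2)
        slots[i], slots[j] = slots[j], slots[i]          # propose a swap
        candidate = communication_cost(to_mapping(slots), traffic)
        accept = (candidate <= current
                  or rng.random() < math.exp((current - candidate) / t))
        if accept:
            current = candidate
            if candidate < best:
                best, best_slots = candidate, slots[:]
        else:
            slots[i], slots[j] = slots[j], slots[i]      # undo the swap
    return to_mapping(best_slots), best
```

A call such as `anneal_mapping(["a", "b", "c"], (2, 2, 2), {("a", "b"): 5.0, ("b", "c"): 1.0})` returns a core-to-node dictionary together with its communication cost. An area-aware variant would add a further term to the cost; its form is not given in the abstract and is therefore left out here.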

Adaptive multi-layer techniques for increased system dependability

it - Information Technology ◽

10.1515/itit-2014-1082 ◽

2015 ◽

Vol 57 (3) ◽

Author(s):

Lars Bauer ◽

Jörg Henkel ◽

Andreas Herkersdorf ◽

Michael A. Kochte ◽

Johannes M. Kühn ◽

...

Keyword(s):

General Purpose ◽

System Level ◽

Coarse Grained ◽

Common Goal ◽

Fine Grained ◽

Reconfigurable Processors ◽

Systems On Chip ◽

On Chip ◽

Application Specific ◽

Heterogeneous Mpsoc

Achieving system-level dependability is a demanding task. The manifold requirements and dependability threats can no longer be statically addressed at individual abstraction layers. Instead, all components of future multi-processor systems-on-chip (MPSoCs) have to contribute to this common goal in an adaptive manner. In this paper we target a generic heterogeneous MPSoC that combines general purpose processors along with dedicated application-specific hard-wired accelerators, fine-grained reconfigurable processors, and coarse-grained reconfigurable architectures. We present different

Network Delays and Link Capacities in Application-Specific Wormhole NoCs

VLSI Design ◽

10.1155/2007/90941 ◽

2007 ◽

Vol 2007 ◽

pp. 1-15 ◽

Cited By ~ 32

Author(s):

Zvika Guz ◽

Isask'har Walter ◽

Evgeny Bolotin ◽

Israel Cidon ◽

Ran Ginosar ◽

...

Keyword(s):

Interconnection Networks ◽

Packet Delay ◽

Capacity Allocation ◽

Delay Model ◽

Capacity Assignment ◽

Systems On Chip ◽

Individual Capacity ◽

On Chip ◽

Analytical Delay Model ◽

Application Specific

Network-on-chip- (NoC-) based application-specific systems on chip, where information traffic is heterogeneous and delay requirements may largely vary, require individual capacity assignment for each link in the NoC. This is in contrast to the standard approach of on- and off-chip interconnection networks which employ uniform-capacity links. Therefore, the allocation of link capacities is an essential step in the automated design process of NoC-based systems. The algorithm should minimize the communication resource costs under Quality-of-Service timing constraints. This paper presents a novel analytical delay model for virtual channeled wormhole networks with nonuniform links and applies the analysis in devising an efficient capacity allocation algorithm which assigns link capacities such that packet delay requirements for each flow are satisfied.

Modeling and Simulation of Network-on-Chip Systems with DEVS and DEUS

The Scientific World JOURNAL ◽

10.1155/2014/982569 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Michele Amoretti

Keyword(s):

Modeling And Simulation ◽

Discrete Event ◽

General Purpose ◽

System Specification ◽

Modeling Framework ◽

Steep Learning Curve ◽

Networks On Chip ◽

Systems On Chip ◽

Communication Architectures ◽

On Chip

Networks on-chip (NoCs) provide enhanced performance, scalability, modularity, and design productivity as compared with previous communication architectures for VLSI systems on-chip (SoCs), such as buses and dedicated signal wires. Since the NoC design space is very large and high dimensional, evaluation methodologies rely heavily on analytical modeling and simulation. Unfortunately, there is no standard modeling framework. In this paper we illustrate how to design and evaluate NoCs by integrating the Discrete Event System Specification (DEVS) modeling framework and the simulation environment called DEUS. The advantage of such an approach is that both DEVS and DEUS support modularity—the former being a sound and complete modeling framework and the latter being an open, general-purpose platform, characterized by a steep learning curve and the possibility to simulate any system at any level of detail.

The TaPaSCo Open-Source Toolflow

Journal of Signal Processing Systems ◽

10.1007/s11265-021-01640-8 ◽

2021 ◽

Author(s):

Carsten Heinz ◽

Jaco Hofmann ◽

Jens Korinth ◽

Lukas Sommer ◽

Lukas Weber ◽

...

Keyword(s):

Open Source ◽

System Integration ◽

Design Space Exploration ◽

Systems On Chip ◽

Spatially Distributed ◽

On Chip ◽

Many Core ◽

Hardware Designs ◽

Application Specific

The integration of FPGA-based accelerators into a complete heterogeneous system is a challenging task faced by many researchers and engineers, especially now that FPGAs enjoy increasing popularity as implementation platforms for efficient, application-specific accelerators for domains such as signal processing, machine learning and intelligent storage. To lift the burden of system integration from the developers of accelerators, the open-source TaPaSCo framework presented in this work provides an automated toolflow for the construction of heterogeneous many-core architectures from custom processing elements, and a simple, uniform programming interface to utilize spatially distributed, parallel computation on FPGAs. TaPaSCo aims to increase the scalability and portability of FPGA designs through automated design space exploration, greatly simplifying the scaling of hardware designs and facilitating iterative growth and portability across FPGA devices and families. This work describes TaPaSCo with its primary design abstractions and shows how TaPaSCo addresses portability and extensibility of FPGA hardware designs for systems-on-chip. A study of successful projects using TaPaSCo shows its versatility and can serve as inspiration and reference for future users, with more details on the usage of TaPaSCo presented in an in-depth case study and a short overview of the workflow.

Simulated Annealing Based Placement Optimization for Reconfigurable Systems-on-Chip

2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus) ◽

10.1109/eiconrus.2019.8657251 ◽

2019 ◽

Author(s):

Gavrilov Sergey ◽

Zheleznikov Daniil ◽

Chochaev Rustam

Keyword(s):

Simulated Annealing ◽

Reconfigurable Systems ◽

Placement Optimization ◽

Systems On Chip ◽

On Chip

Partitioning Algorithm Based on Simulated Annealing for Reconfigurable Systems-on-Chip

Problems of advanced micro- and nanoelectronic systems development ◽

10.31114/2078-7707-2018-1-199-204 ◽

2018 ◽

pp. 199-204

Author(s):

S.V. Gavrilov ◽

D.A. Zheleznikov ◽

R. Chochaev ◽

V.M. Khvatov ◽

...

Keyword(s):

Simulated Annealing ◽

Reconfigurable Systems ◽

Systems On Chip ◽

Partitioning Algorithm ◽

On Chip

Performance analysis of general purpose and digital signal processor kernels for heterogeneous systems-on-chip

Advances in Radio Science ◽

10.5194/ars-1-171-2003 ◽

2003 ◽

Vol 1 ◽

pp. 171-175

Author(s):

T. von Sydow ◽

H. Blume ◽

T. G. Noll

Keyword(s):

Digital Signal Processor ◽

Design Space Exploration ◽

Heterogeneous Systems ◽

Digital Signal ◽

General Purpose ◽

Optimization Techniques ◽

Product Cycle ◽

Systems On Chip ◽

Programmable Architecture ◽

On Chip

Various reasons like technology progress, flexibility demands, shortened product cycle time and shortened time to market have brought up the possibility and necessity to integrate different architecture blocks on one heterogeneous System-on-Chip (SoC). Architecture blocks like programmable processor cores (DSP- and GPP-kernels), embedded FPGAs as well as dedicated macros will be integral parts of such a SoC. Especially programmable architecture blocks and associated optimization techniques are discussed in this contribution. Design space exploration and thus the choice which architecture blocks should be integrated in a SoC is a challenging task. Crucial to this exploration is the evaluation of the application domain characteristics and the costs caused by individual architecture blocks integrated on a SoC. An ATE-cost function has been applied to examine the performance of the aforementioned programmable architecture blocks. Therefore, representative discrete devices have been analyzed. Furthermore, several architecture dependent optimization steps and their effects on the cost ratios are presented.

On-Chip Reconfigurable Hardware Accelerators for Popcount Computations

International Journal of Reconfigurable Computing ◽

10.1155/2016/8972065 ◽

2016 ◽

Vol 2016 ◽

pp. 1-11 ◽

Cited By ~ 3

Author(s):

Valery Sklyarov ◽

Iouliia Skliarova ◽

João Silva

Keyword(s):

Processing System ◽

General Purpose ◽

Reconfigurable Hardware ◽

Hardware Accelerators ◽

Programmable Logic ◽

Combinatorial Search ◽

Pci Express ◽

Systems On Chip ◽

Search Data ◽

On Chip

Popcount computations are widely used in such areas as combinatorial search, data processing, statistical analysis, and bio- and chemical informatics. In many practical problems the size of initial data is very large and increase in throughput is important. The paper suggests two types of hardware accelerators that are (1) designed in FPGAs and (2) implemented in Zynq-7000 all programmable systems-on-chip with partitioning of algorithms that use popcounts between software of ARM Cortex-A9 processing system and advanced programmable logic. A three-level system architecture that includes a general-purpose computer, the problem-specific ARM, and reconfigurable hardware is then proposed. The results of experiments and comparisons with existing benchmarks demonstrate that although throughput of popcount computations is increased in FPGA-based designs interacting with general-purpose computers, communication overheads (in experiments with PCI express) are significant and actual advantages can be gained if not only popcount but also other types of relevant computations are implemented in hardware. The comparison of software/hardware designs for Zynq-7000 all programmable systems-on-chip with pure software implementations in the same Zynq-7000 devices demonstrates increase in performance by a factor ranging from 5 to 19 (taking into account all the involved communication overheads between the programmable logic and the processing systems).
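
For readers unfamiliar with the operation, a popcount returns the number of set bits in a word. The sketch below shows a common branch-free software formulation for 32-bit values (often called the SWAR method); it is offered only as a software reference point and is not taken from the paper, whose accelerators are hardware designs. The function name is an assumption.

```python
def popcount32(x):
    """Count the set bits of a 32-bit unsigned integer without a loop."""
    x &= 0xFFFFFFFF                                   # restrict to 32 bits
    x = x - ((x >> 1) & 0x55555555)                   # sums of 2-bit groups
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)    # sums of 4-bit groups
    x = (x + (x >> 4)) & 0x0F0F0F0F                   # sums of 8-bit groups
    # Multiplying by 0x01010101 adds the four byte sums into the top byte.
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24
```

Each step adds neighbouring bit fields in parallel, which is the same tree-of-adders idea that a hardware popcount circuit can implement directly. In Python 3.10 and later, the built-in `int.bit_count()` gives the same result for non-negative integers.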

Scalable performance monitoring of application specific multiprocessor Systems-on-Chip

2013 IEEE 8th International Conference on Industrial and Information Systems ◽

10.1109/iciinfs.2013.6732002 ◽

2013 ◽

Cited By ~ 3

Author(s):

Jude Angelo Ambrose ◽

Vito Cassisi ◽

Daniel Murphy ◽

Tuo Li ◽

Darshana Jayasinghe ◽

...

Keyword(s):

Performance Monitoring ◽

Multiprocessor Systems ◽

Systems On Chip ◽

On Chip ◽

Application Specific

ARCHER: Communication-based predictive architecture selection for application specific multiprocessor Systems-on-Chip

2015 IEEE International Symposium on Circuits and Systems (ISCAS) ◽

10.1109/iscas.2015.7168658 ◽

2015 ◽

Author(s):

Jude Angelo Ambrose ◽

Nick Higgins ◽

Mrinal Chakravarthy ◽

Shivam Garg ◽

Tuo Li ◽

...

Keyword(s):

Multiprocessor Systems ◽

Systems On Chip ◽

Selection For ◽

On Chip ◽

Application Specific
