Orchestrating Multiple Data-Parallel Kernels on Multiple Devices

Author(s): Janghaeng Lee, Mehrzad Samadi, Scott Mahlke

2021
Author(s): Rui Huang

Current trends in autonomous driving combine on-vehicle and roadside smart devices to perform collaborative data sensing and computing, so as to achieve comprehensive and stable decision making. The integrated system is usually known as C-V2X. However, several challenges have significantly hindered the development and adoption of such systems, for example the difficulty of accessing the multiple data protocols of multiple devices at the bottom layer, and the centralized deployment of computing power. Therefore, this work proposes a novel framework for the design of C-V2X systems. First, a highly aggregated architecture is designed that fully integrates multiple traffic data resources. Then a multilevel information fusion model is designed around the multiple sensors used in vehicle-road coordination; the model can adapt to different detection environments, detection mechanisms, and time frames. Finally, a lightweight and efficient identity-based authentication method is given that realizes bidirectional authentication between end devices and edge gateways.
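The abstract leaves the authentication scheme itself unspecified. As a rough illustration of what bidirectional (mutual) authentication between an end device and an edge gateway can look like, here is a minimal challenge-response sketch over identity-derived keys; the key-generation center, the `MASTER` secret, and all function names are our assumptions, not the paper's method.

```python
import hmac
import hashlib
import secrets

# Hypothetical sketch: MASTER models the secret held by an assumed
# key-generation center (KGC) that derives per-identity keys. A real
# identity-based scheme would use public parameters instead.
MASTER = b"demo-master-secret"

def identity_key(identity: str) -> bytes:
    """KGC derives a per-identity key and provisions it to that party."""
    return hmac.new(MASTER, identity.encode(), hashlib.sha256).digest()

def respond(key: bytes, challenge: bytes) -> bytes:
    """Prove knowledge of `key` by MACing a fresh challenge."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

def mutual_auth(device_key: bytes, gateway_key: bytes) -> bool:
    """Bidirectional authentication: each side challenges the other."""
    # Gateway challenges the device ...
    c1 = secrets.token_bytes(16)
    ok1 = hmac.compare_digest(respond(device_key, c1), respond(gateway_key, c1))
    # ... and the device challenges the gateway.
    c2 = secrets.token_bytes(16)
    ok2 = hmac.compare_digest(respond(gateway_key, c2), respond(device_key, c2))
    return ok1 and ok2
```

Authentication succeeds only when both sides hold the same identity-derived key, e.g. `mutual_auth(identity_key("obu-01"), identity_key("obu-01"))`.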


2014 · Vol E97.D (11) · pp. 2827-2834
Author(s): Ittetsu TANIGUCHI, Junya KAIDA, Takuji HIEDA, Yuko HARA-AZUMI, Hiroyuki TOMIYAMA

2013 · Vol E96.D (10) · pp. 2268-2271
Author(s): Junya KAIDA, Yuko HARA-AZUMI, Takuji HIEDA, Ittetsu TANIGUCHI, Hiroyuki TOMIYAMA, ...

1997 · Vol 6 (1) · pp. 3-27
Author(s): Corinne Ancourt, Fabien Coelho, François Irigoin, Ronan Keryell

High Performance Fortran (HPF) was developed to support data parallel programming for single-instruction multiple-data (SIMD) and multiple-instruction multiple-data (MIMD) machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.
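As a concrete illustration of the index arithmetic such a compiler must synthesize, the sketch below assumes a one-dimensional BLOCK distribution over `p` processors (the function names are ours, not HPF's) and computes owners, local indices, and tight local loop bounds under the owner-computes rule.

```python
from math import ceil

# Sketch of HPF-style BLOCK distribution arithmetic: a global array of
# n elements is split into contiguous blocks, one per processor.

def block_size(n, p):
    """Elements held by each processor (last block may be shorter)."""
    return ceil(n / p)

def owner(i, n, p):
    """Processor that stores global element i."""
    return i // block_size(n, p)

def to_local(i, n, p):
    """Local index of global element i on its owning processor."""
    return i % block_size(n, p)

def local_bounds(pid, n, p, lo, hi):
    """Tight global bounds of the iterations in [lo, hi) that pid owns."""
    b = block_size(n, p)
    return max(lo, pid * b), min(hi, (pid + 1) * b)
```

For `n = 10, p = 4` the block size is 3, so element 7 lives on processor 2 at local index 1, and processor 3's local loop over `[0, 10)` runs only over the single iteration 9.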


1997 · Vol 07 (02) · pp. 145-156
Author(s): Manish Gupta, Edith Schonberg

For a program with sufficient parallelism, reducing synchronization costs is an important objective for achieving efficient execution. This paper presents a novel methodology for reducing synchronization costs of programs compiled for SPMD execution. This methodology combines data flow analysis with communication analysis to determine the ordering between production and consumption of data on different processors, which helps in identifying redundant synchronization. The resulting framework is more powerful than any previously presented, as it provides the first algorithm that can eliminate synchronization messages even from computations that need communication. We show that several commonly occurring computation patterns, such as reductions and stencil computations with a reciprocal producer-consumer relationship between processors, lend themselves well to this optimization, an observation that is confirmed by an examination of some HPF benchmark programs. Our framework also recognizes situations where the synchronization needs of multiple data transfers can be satisfied by a single synchronization message. While applicable to all shared memory machines as well, this analysis is especially useful for those with a flexible cache-coherence protocol, as it identifies efficient ways of moving data directly from producers to consumers, often without any extra synchronization.
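The core observation, that a communication message can itself supply the ordering a separate synchronization message would otherwise enforce, can be sketched as a reachability check. The event encoding below (program-order and send/receive edges between named events) is our own illustration, not the paper's representation.

```python
from collections import defaultdict

def happens_before(edges):
    """Transitive closure of an ordering relation over event ids."""
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
    changed = True
    while changed:
        changed = False
        for a in list(succ):
            for b in list(succ[a]):
                new = succ[b] - succ[a]
                if new:
                    succ[a] |= new
                    changed = True
    return succ

def redundant_syncs(dependences, order_edges):
    """A sync message for a cross-processor dependence (w, r) is redundant
    when w already happens-before r via program order plus messages."""
    hb = happens_before(order_edges)
    return [d for d in dependences if d[1] in hb[d[0]]]
```

In a stencil exchange, processor 0 writes the halo (`w0`), sends it (`s0` to `r1`), and processor 1 then uses it (`u1`); the dependence `(w0, u1)` is covered by the message, so no extra synchronization is needed.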


1992 · Vol 21 (5) · pp. 363-386
Author(s): Bradley K. Seevers, Michael J. Quinn, Philip J. Hatcher

1995 · Vol 4 (3) · pp. 193-201
Author(s): Dan Williams, Luc Bauwens

This article describes the porting and optimization of an explicit, time-dependent computational fluid dynamics code on an 8,192-node MasPar MP-1. The MasPar is a very fine-grained, single-instruction, multiple-data (SIMD) parallel computer. The code uses the flux-corrected transport algorithm. We describe the techniques used to port and optimize the code, and the behavior of a test problem. The test problem used to benchmark the flux-corrected transport code on the MasPar was a two-dimensional exploding shock with periodic boundary conditions. We discuss the performance that our code achieved on the MasPar, and compare its performance on the MasPar with its performance on other architectures. The comparisons show that the performance of the code on the MasPar is slightly better than on a CRAY Y-MP for a functionally equivalent, optimized two-dimensional code.
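The article does not reproduce the MasPar code itself. As a rough illustration of the flux-corrected transport idea it uses (a low-order upwind step plus a limited antidiffusive correction, Zalesak-style), a minimal one-dimensional sketch for constant-speed advection with periodic boundaries might look like this; the 2-D exploding-shock code is of course far more involved.

```python
def fct_step(u, c):
    """One FCT step for u_t + a*u_x = 0 with Courant number c in (0, 1]."""
    n = len(u)
    ip = lambda i: (i + 1) % n  # periodic right neighbor
    # Low-order (upwind) and high-order (Lax-Wendroff) fluxes at i+1/2,
    # with the factor dt/dx already folded into c.
    fl = [c * u[i] for i in range(n)]
    fh = [c * (u[i] + 0.5 * (1 - c) * (u[ip(i)] - u[i])) for i in range(n)]
    # Transported-diffused (monotone low-order) solution.
    utd = [u[i] - (fl[i] - fl[i - 1]) for i in range(n)]
    # Antidiffusive fluxes, limited so that no new extrema appear.
    a = [fh[i] - fl[i] for i in range(n)]
    umax = [max(u[i - 1], u[i], u[ip(i)], utd[i - 1], utd[i], utd[ip(i)])
            for i in range(n)]
    umin = [min(u[i - 1], u[i], u[ip(i)], utd[i - 1], utd[i], utd[ip(i)])
            for i in range(n)]
    rp, rm = [0.0] * n, [0.0] * n
    for i in range(n):
        pp = max(0.0, a[i - 1]) - min(0.0, a[i])   # total inflow into cell i
        pm = max(0.0, a[i]) - min(0.0, a[i - 1])   # total outflow from cell i
        rp[i] = min(1.0, (umax[i] - utd[i]) / pp) if pp > 0 else 0.0
        rm[i] = min(1.0, (utd[i] - umin[i]) / pm) if pm > 0 else 0.0
    cc = [min(rp[ip(i)], rm[i]) if a[i] >= 0 else min(rp[i], rm[ip(i)])
          for i in range(n)]
    return [utd[i] - (cc[i] * a[i] - cc[i - 1] * a[i - 1]) for i in range(n)]
```

The conservative flux form telescopes over a periodic domain, so the total of `u` is preserved to rounding, and the limiter keeps the solution within the initial bounds.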


1993 · Vol 28 (1) · pp. 44-47
Author(s): Bradley K. Seevers, Michael J. Quinn, Philip J. Hatcher

1994 · Vol 3 (3) · pp. 169-186
Author(s): Matt Rosing, Robert Schnabel

The goal of the research described in this article is to develop flexible language constructs for writing large data parallel numerical programs for distributed memory (multiple instruction multiple data [MIMD]) multiprocessors. Previously, several models have been developed to support synchronization and communication. Models for global synchronization include single instruction multiple data (SIMD), single program multiple data (SPMD), and sequential programs annotated with data distribution statements. The two primary models for communication are implicit communication based on shared memory and explicit communication based on messages. None of these models by itself seems sufficient to permit the natural and efficient expression of the variety of algorithms that occur in large scientific computations. In this article, we give an overview of a new language that combines many of these programming models in a clean manner. This is done in a modular fashion such that different models can be combined to support large programs. Within a module, the selection of a model depends on the algorithm and its efficiency requirements. We give an overview of the language and discuss some of the critical implementation details.
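To make the SPMD-with-explicit-messages model among those surveyed concrete, the following sketch runs the same function on every "rank" (here simulated with stdlib threads) and communicates only through explicit messages (one queue per rank); it is our illustration of the model, not the article's language.

```python
import threading
from queue import Queue

def spmd_sum(values):
    """SPMD sketch: each rank holds one value; a ring exchange of partial
    sums leaves every rank with the global total (an allreduce)."""
    p = len(values)
    inbox = [Queue() for _ in range(p)]  # explicit message channels
    result = [None] * p

    def rank(r):
        # Every rank executes this same program, parameterized by r.
        acc = carry = values[r]
        for _ in range(p - 1):
            inbox[(r + 1) % p].put(carry)  # explicit send to right neighbor
            carry = inbox[r].get()         # explicit receive from the left
            acc += carry
        result[r] = acc

    threads = [threading.Thread(target=rank, args=(r,)) for r in range(p)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

After `p - 1` rounds each rank has accumulated every other rank's value, so `spmd_sum([1, 2, 3, 4])` yields the total on all four ranks.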

