Implementation of Special Function Unit for Vertex Shader Processor Using Hybrid Number System

2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Avni Agarwal ◽  
P. Harsha ◽  
Swati Vasishta ◽  
S. Sivanantham

The world of 3D graphics computing has undergone a revolution in the recent past, making devices more computationally intensive and providing high-end imaging to the user. The OpenGL ES standard documents the requirements of a graphics processing unit. A prime feature of this standard is the special function unit (SFU), which performs all the required mathematical computations on the vertex information corresponding to the image. This paper presents a low-cost, high-performance SFU architecture with improved speed and reduced area. A hybrid number system is employed in order to reduce the complexity of operations by suitably switching between the logarithmic number system (LNS) and the binary number system (BNS). In this work, a reduction in area and a higher operating frequency are achieved with almost the same power consumption as existing implementations.
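The abstract does not detail the SFU datapath, but the motivation for an LNS/BNS hybrid can be illustrated with a minimal Python sketch (all function names here are illustrative, not from the paper): in the log domain, multiplication becomes addition and square root becomes a halving of the exponent, which is why switching into LNS for such operations reduces hardware complexity.

```python
import math

def to_lns(x):
    """Convert a positive BNS (linear) value to its LNS representation (log base 2)."""
    return math.log2(x)

def from_lns(l):
    """Convert back from the log domain to the linear (BNS) domain."""
    return 2.0 ** l

def lns_multiply(a, b):
    """Multiplication in BNS becomes a cheap addition in LNS."""
    return from_lns(to_lns(a) + to_lns(b))

def lns_sqrt(x):
    """Square root becomes a division of the log value by 2 (a shift in hardware)."""
    return from_lns(to_lns(x) / 2)
```

In actual hardware the conversions themselves are the costly step, which is why a hybrid design only switches into LNS when the operation mix justifies it.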

2015 ◽  
Vol 2015 ◽  
pp. 1-16
Author(s):  
Muhammad Asif ◽  
Imtiaz A. Taj ◽  
S. M. Ziauddin ◽  
Maaz Bin Ahmad ◽  
M. Tahir

One of the key requirements for mobile devices is to provide high-performance computing at low power consumption. The processors used in these devices provide specific hardware resources to handle computationally intensive video processing and interactive graphical applications. Moreover, processors designed for low-power applications may impose limitations on the availability and usage of resources, which presents additional challenges to system designers. Owing to the specific design of the JZ47x series of mobile application processors, a hybrid software-hardware implementation scheme for the H.264/AVC encoder is proposed in this work. The proposed scheme distributes the encoding tasks among hardware and software modules. A series of optimization techniques is developed to speed up memory access and data transfer among memories. Moreover, an efficient data-reuse design is proposed for the deblock-filter video processing unit to reduce memory accesses. Furthermore, fine-grained macroblock (MB) level parallelism is effectively exploited, and a pipelined approach is proposed for efficient utilization of the hardware processing cores. Finally, based on the parallelism in the proposed design, encoding tasks are distributed between two processing cores. Experiments show that, owing to the proposed techniques, the hybrid encoder is 12 times faster than a highly optimized sequential encoder.
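The pipelined, MB-level parallelism described above can be sketched in a few lines of Python (the stage functions are stand-ins, not the paper's actual encoder stages): one worker produces per-macroblock results while a second consumes them through a queue, so the two stages overlap instead of running back to back.

```python
from queue import Queue
from threading import Thread

def encode_stage(mb_indices, out_q):
    # Hypothetical first pipeline stage: per-macroblock "encoding" work.
    for mb in mb_indices:
        out_q.put(mb * mb)  # stand-in for an encoded macroblock
    out_q.put(None)         # sentinel: stage finished

def filter_stage(in_q, results):
    # Hypothetical second stage: consumes encoded MBs as they arrive,
    # overlapping with the first stage instead of waiting for it to finish.
    while True:
        item = in_q.get()
        if item is None:
            break
        results.append(item + 1)  # stand-in for deblock filtering

q, results = Queue(), []
t1 = Thread(target=encode_stage, args=(range(4), q))
t2 = Thread(target=filter_stage, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
```

In the paper's setting the two "workers" are the JZ47x's two processing cores and its dedicated video hardware, but the producer-consumer structure is the same.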


2012 ◽  
Vol 463-464 ◽  
pp. 1073-1076
Author(s):  
Helmar Alvares ◽  
Eliana Prado Lopes Aude ◽  
Ernesto Prado Lopes

This work proposes a web-based laboratory where researchers share the facilities of a simulation environment for parallel algorithms that solve the scheduling problem known as the Job Shop Problem (JSP). The environment supports multi-language platforms and uses a low-cost, high-performance graphics processing unit (GPU) connected to a Java application server to help design more efficient solutions for the JSP. Within a single web environment, one can analyze and compare different methods and meta-heuristics. Each newly developed method is stored in an environment library and made available to all other users of the environment. This growing collection of openly accessible solution methods should allow rapid convergence towards optimal solutions for the JSP. The algorithm uses the parallel architecture of the system to handle threads. Each thread represents a job operation, and the number of threads scales with the problem's size. The threads exchange information in order to find the best solution. This cooperation decreases response times by one to two orders of magnitude.
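For readers unfamiliar with the JSP, a minimal Python sketch of the objective being optimized may help (this is a generic schedule evaluator, not the paper's GPU algorithm): each job is a sequence of (machine, duration) operations, an operation can start only when both its job and its machine are free, and the makespan is the time at which the last job finishes.

```python
def makespan(jobs):
    """Makespan of a JSP instance under a naive round-robin dispatch rule:
    operations are released one per job per round, each starting as soon
    as both its job and its machine are free.
    jobs: list of jobs, each a list of (machine, duration) operations.
    """
    machine_free = {}           # machine id -> time it becomes free
    job_free = [0] * len(jobs)  # job id -> time its previous op finished
    pending = [list(ops) for ops in jobs]
    while any(pending):
        for j, ops in enumerate(pending):
            if not ops:
                continue
            machine, dur = ops.pop(0)
            start = max(job_free[j], machine_free.get(machine, 0))
            job_free[j] = machine_free[machine] = start + dur
    return max(job_free)
```

A meta-heuristic such as those hosted in the proposed environment searches over operation orderings to minimize exactly this quantity; the GPU version evaluates many candidate orderings concurrently, one thread per job operation.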


2021 ◽  
Author(s):  
Vishesh Kumar Tanwar ◽  
Balasubramanian Raman ◽  
Amitesh Singh Rajput ◽  
Rama Bhargava

<div>The key benefits of cloud services, such as low cost, access flexibility, and mobility, have attracted users worldwide to utilize deep learning algorithms for computer vision tasks. Untrusted third parties maintain these cloud servers, and users are always concerned about sharing their confidential data with them. In this paper, we address these concerns by developing SecureDL, a privacy-preserving image recognition model for encrypted data over the cloud. Additionally, we propose a block-based image encryption scheme to protect images' visual information. The scheme combines an order-preserving permutation ordered binary number system with pseudo-random matrices. The encryption scheme is shown to be secure from a probabilistic viewpoint and against various cryptographic attacks. Experiments are performed on several image recognition datasets, and the recognition accuracy achieved on encrypted data is close to that on non-encrypted data. SecureDL avoids the storage and computational overheads incurred by fully-homomorphic and multi-party-computation-based secure recognition schemes.</div>
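The abstract does not specify the scheme's construction, so the following Python sketch illustrates only the generic block-permutation idea behind block-based image encryption (the function names and the use of a key-seeded PRNG shuffle are assumptions for illustration; SecureDL additionally uses an order-preserving permutation ordered binary number system and pseudo-random matrices, which are not reproduced here).

```python
import random

def encrypt_blocks(image, block, key):
    """Toy sketch: split a flattened image into fixed-size blocks and
    permute them with a key-seeded PRNG, hiding spatial structure.
    This is only the permutation step, not the full SecureDL scheme."""
    blocks = [image[i:i + block] for i in range(0, len(image), block)]
    perm = list(range(len(blocks)))
    random.Random(key).shuffle(perm)
    return [blocks[p] for p in perm], perm

def decrypt_blocks(cipher_blocks, perm):
    """Invert the permutation: cipher block i came from plain position perm[i]."""
    plain = [None] * len(perm)
    for cipher_pos, plain_pos in enumerate(perm):
        plain[plain_pos] = cipher_blocks[cipher_pos]
    return [px for b in plain for px in b]
```

In a real deployment the permutation would come from a proper cipher rather than `random`, and the block contents themselves would also be transformed so that recognition can still run on the encrypted data.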


2020 ◽  
Vol 12 (3) ◽  
pp. 415 ◽  
Author(s):  
Qiang Yin ◽  
You Wu ◽  
Fan Zhang ◽  
Yongsheng Zhou

With the development of polarimetric synthetic aperture radar (PolSAR), quantitative parameter inversion has seen great progress, especially in the field of soil parameter inversion, where it has achieved good results in applications. However, PolSAR datasets are often many terabytes in size, and this huge amount of data directly affects the efficiency of the inversion. Therefore, the efficiency of soil moisture and roughness inversion has become a problem in the application of this PolSAR technique. A parallel realization of multiple inversion models for PolSAR data, based on a graphics processing unit (GPU), is proposed in this paper. The method utilizes the high-performance parallel computing capability of a GPU to optimize the realization of the surface inversion models for polarimetric SAR data. Three classical forward scattering models and their corresponding inversion algorithms are analyzed. They differ in their polarimetric data requirements, application situations, and inversion performance. Specifically, the inversion process for PolSAR data is mainly improved through the highly concurrent threads of the GPU. According to the inversion process, various optimization strategies are applied, such as parallel task allocation, instruction-level optimization, optimized data storage, and optimized data transmission between the CPU and GPU. The advantages of a GPU in processing computationally intensive data are shown in the experiments, where the efficiency of soil roughness and moisture inversion is increased by one to two orders of magnitude.
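The property that makes this inversion GPU-friendly is that each pixel is processed independently. A minimal Python sketch (the per-pixel model below is a made-up stand-in, since the abstract does not give the actual forward models) shows the structure: a pure per-pixel function mapped over the image in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def invert_pixel(sigma0):
    """Hypothetical stand-in for a per-pixel inversion model, mapping a
    backscatter value to a soil-parameter estimate. The real forward
    scattering models and their inversions are not given in the abstract."""
    return 0.5 * sigma0 + 0.1

def invert_image(pixels, workers=4):
    # Each pixel is independent, so the inversion is embarrassingly
    # parallel -- the same property the GPU implementation exploits with
    # thousands of concurrent threads instead of a few pool workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(invert_pixel, pixels))
```

On a GPU the pool is replaced by a kernel launch with one thread per pixel, and the remaining engineering effort goes into the memory-layout and CPU-GPU transfer optimizations the paper describes.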


Author(s):  
K. Bhargavi ◽  
Sathish Babu B.

GPUs (graphics processing units) were mainly used to speed up computationally intensive high-performance computing applications. Several tools and technologies are available for developing general-purpose, computationally intensive applications. This chapter primarily discusses GPU parallelism, applications, and probable challenges, and also highlights some of the GPU computing platforms, including CUDA, OpenCL (Open Computing Language), OpenMPC (OpenMP extended for CUDA), MPI (Message Passing Interface), OpenACC (Open Accelerator), DirectCompute, and C++ AMP (C++ Accelerated Massive Parallelism). Each of these platforms is discussed briefly, along with its advantages and disadvantages.

