Implementation of Special Function Unit for Vertex Shader Processor Using Hybrid Number System

2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Avni Agarwal ◽  
P. Harsha ◽  
Swati Vasishta ◽  
S. Sivanantham

The world of 3D graphics computing has undergone a revolution in the recent past, making devices more computationally intensive and providing high-end imaging to the user. The OpenGL ES standard documents the requirements of a graphics processing unit. A prime feature of this standard is the special function unit (SFU), which performs all the required mathematical computations on the vertex information corresponding to the image. This paper presents a low-cost, high-performance SFU architecture with improved speed and reduced area. A hybrid number system is employed in order to reduce the complexity of operations by suitably switching between the logarithmic number system (LNS) and the binary number system (BNS). In this work, a reduction in area and a higher operating frequency are achieved with almost the same power consumption as existing implementations.
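The abstract does not detail the SFU datapath, but the motivation for an LNS/BNS hybrid can be illustrated with a minimal Python sketch (all function names here are illustrative, not from the paper): in the log domain, multiplication becomes addition and square root becomes a halving of the exponent, which is why switching into LNS for such operations reduces hardware complexity.

```python
import math

def to_lns(x):
    """Convert a positive BNS (linear) value to its LNS representation (log base 2)."""
    return math.log2(x)

def from_lns(l):
    """Convert back from the log domain to the linear (BNS) domain."""
    return 2.0 ** l

def lns_multiply(a, b):
    """Multiplication in BNS becomes a cheap addition in LNS."""
    return from_lns(to_lns(a) + to_lns(b))

def lns_sqrt(x):
    """Square root becomes a division of the log value by 2 (a shift in hardware)."""
    return from_lns(to_lns(x) / 2)
```

In actual hardware the conversions themselves are the costly step, which is why a hybrid design only switches into LNS when the operation mix justifies it.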

2015 ◽  
Vol 2015 ◽  
pp. 1-16
Author(s):  
Muhammad Asif ◽  
Imtiaz A. Taj ◽  
S. M. Ziauddin ◽  
Maaz Bin Ahmad ◽  
M. Tahir

One of the key requirements for mobile devices is to provide high-performance computing at low power consumption. The processors used in these devices provide specific hardware resources to handle computationally intensive video processing and interactive graphical applications. Moreover, processors designed for low-power applications may impose limitations on the availability and usage of resources, which presents additional challenges to system designers. Owing to the specific design of the JZ47x series of mobile application processors, a hybrid software-hardware implementation scheme for the H.264/AVC encoder is proposed in this work. The proposed scheme distributes the encoding tasks among hardware and software modules. A series of optimization techniques is developed to speed up memory access and data transfer among memories. Moreover, an efficient data-reuse design is proposed for the deblock-filter video processing unit to reduce memory accesses. Furthermore, fine-grained macroblock (MB) level parallelism is effectively exploited, and a pipelined approach is proposed for efficient utilization of the hardware processing cores. Finally, based on the parallelism in the proposed design, encoding tasks are distributed between two processing cores. Experiments show that, owing to the proposed techniques, the hybrid encoder is 12 times faster than a highly optimized sequential encoder.
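The pipelined, MB-level parallelism described above can be sketched in a few lines of Python (the stage functions are stand-ins, not the paper's actual encoder stages): one worker produces per-macroblock results while a second consumes them through a queue, so the two stages overlap instead of running back to back.

```python
from queue import Queue
from threading import Thread

def encode_stage(mb_indices, out_q):
    # Hypothetical first pipeline stage: per-macroblock "encoding" work.
    for mb in mb_indices:
        out_q.put(mb * mb)  # stand-in for an encoded macroblock
    out_q.put(None)         # sentinel: stage finished

def filter_stage(in_q, results):
    # Hypothetical second stage: consumes encoded MBs as they arrive,
    # overlapping with the first stage instead of waiting for it to finish.
    while True:
        item = in_q.get()
        if item is None:
            break
        results.append(item + 1)  # stand-in for deblock filtering

q, results = Queue(), []
t1 = Thread(target=encode_stage, args=(range(4), q))
t2 = Thread(target=filter_stage, args=(q, results))
t1.start(); t2.start()
t1.join(); t2.join()
```

In the paper's setting the two "workers" are the JZ47x's two processing cores and its dedicated video hardware, but the producer-consumer structure is the same.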


2012 ◽  
Vol 463-464 ◽  
pp. 1073-1076
Author(s):  
Helmar Alvares ◽  
Eliana Prado Lopes Aude ◽  
Ernesto Prado Lopes

This work proposes a web-based laboratory where researchers share the facilities of a simulation environment for parallel algorithms that solve the scheduling problem known as the Job Shop Problem (JSP). The environment supports multi-language platforms and uses a low-cost, high-performance graphics processing unit (GPU) connected to a Java application server to help design more efficient solutions for the JSP. Within a single web environment, one can analyze and compare different methods and meta-heuristics. Each newly developed method is stored in an environment library and made available to all other users of the environment. This growing collection of openly accessible solution methods should allow rapid convergence towards optimal solutions for the JSP. The algorithm uses the parallel architecture of the system to handle threads. Each thread represents a job operation, and the number of threads scales with the problem's size. The threads exchange information in order to find the best solution. This cooperation decreases response times by one to two orders of magnitude.
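For readers unfamiliar with the JSP, a minimal Python sketch of the objective being optimized may help (this is a generic schedule evaluator, not the paper's GPU algorithm): each job is a sequence of (machine, duration) operations, an operation can start only when both its job and its machine are free, and the makespan is the time at which the last job finishes.

```python
def makespan(jobs):
    """Makespan of a JSP instance under a naive round-robin dispatch rule:
    operations are released one per job per round, each starting as soon
    as both its job and its machine are free.
    jobs: list of jobs, each a list of (machine, duration) operations.
    """
    machine_free = {}           # machine id -> time it becomes free
    job_free = [0] * len(jobs)  # job id -> time its previous op finished
    pending = [list(ops) for ops in jobs]
    while any(pending):
        for j, ops in enumerate(pending):
            if not ops:
                continue
            machine, dur = ops.pop(0)
            start = max(job_free[j], machine_free.get(machine, 0))
            job_free[j] = machine_free[machine] = start + dur
    return max(job_free)
```

A meta-heuristic such as those hosted in the proposed environment searches over operation orderings to minimize exactly this quantity; the GPU version evaluates many candidate orderings concurrently, one thread per job operation.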


2021 ◽  
Author(s):  
Vishesh Kumar Tanwar ◽  
Balasubramanian Raman ◽  
Amitesh Singh Rajput ◽  
Rama Bhargava

<div>The key benefits of cloud services, such as low cost, access flexibility, and mobility, have attracted users worldwide to utilize deep learning algorithms for computer vision tasks. Untrusted third parties maintain these cloud servers, and users are always concerned about sharing their confidential data with them. In this paper, we address these concerns by developing SecureDL, a privacy-preserving image recognition model for encrypted data over the cloud. Additionally, we propose a block-based image encryption scheme to protect images' visual information. The scheme combines an order-preserving permutation ordered binary number system with pseudo-random matrices. The encryption scheme is shown to be secure from a probabilistic viewpoint and against various cryptographic attacks. Experiments are performed on several image recognition datasets, and the recognition accuracy achieved on encrypted data is close to that on non-encrypted data. SecureDL avoids the storage and computational overheads incurred by fully-homomorphic and multi-party-computation-based secure recognition schemes.</div>
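The abstract does not specify the scheme's construction, so the following Python sketch illustrates only the generic block-permutation idea behind block-based image encryption (the function names and the use of a key-seeded PRNG shuffle are assumptions for illustration; SecureDL additionally uses an order-preserving permutation ordered binary number system and pseudo-random matrices, which are not reproduced here).

```python
import random

def encrypt_blocks(image, block, key):
    """Toy sketch: split a flattened image into fixed-size blocks and
    permute them with a key-seeded PRNG, hiding spatial structure.
    This is only the permutation step, not the full SecureDL scheme."""
    blocks = [image[i:i + block] for i in range(0, len(image), block)]
    perm = list(range(len(blocks)))
    random.Random(key).shuffle(perm)
    return [blocks[p] for p in perm], perm

def decrypt_blocks(cipher_blocks, perm):
    """Invert the permutation: cipher block i came from plain position perm[i]."""
    plain = [None] * len(perm)
    for cipher_pos, plain_pos in enumerate(perm):
        plain[plain_pos] = cipher_blocks[cipher_pos]
    return [px for b in plain for px in b]
```

In a real deployment the permutation would come from a proper cipher rather than `random`, and the block contents themselves would also be transformed so that recognition can still run on the encrypted data.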


2020 ◽  
Vol 12 (3) ◽  
pp. 415 ◽  
Author(s):  
Qiang Yin ◽  
You Wu ◽  
Fan Zhang ◽  
Yongsheng Zhou

With the development of polarimetric synthetic aperture radar (PolSAR), quantitative parameter inversion has seen great progress, especially in the field of soil parameter inversion, where it has achieved good results in applications. However, PolSAR datasets are often many terabytes in size, and this huge amount of data directly affects the efficiency of the inversion. Therefore, the efficiency of soil moisture and roughness inversion has become a problem in the application of this PolSAR technique. A parallel realization of multiple inversion models for PolSAR data, based on a graphics processing unit (GPU), is proposed in this paper. The method utilizes the high-performance parallel computing capability of a GPU to optimize the realization of the surface inversion models for polarimetric SAR data. Three classical forward scattering models and their corresponding inversion algorithms are analyzed. They differ in their polarimetric data requirements, application situations, and inversion performance. Specifically, the inversion process for PolSAR data is mainly improved through the highly concurrent threads of the GPU. According to the inversion process, various optimization strategies are applied, such as parallel task allocation, instruction-level optimization, optimized data storage, and optimized data transmission between the CPU and GPU. The advantages of a GPU in processing computationally intensive data are shown in the experiments, where the efficiency of soil roughness and moisture inversion is increased by one to two orders of magnitude.
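The property that makes this inversion GPU-friendly is that each pixel is processed independently. A minimal Python sketch (the per-pixel model below is a made-up stand-in, since the abstract does not give the actual forward models) shows the structure: a pure per-pixel function mapped over the image in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

def invert_pixel(sigma0):
    """Hypothetical stand-in for a per-pixel inversion model, mapping a
    backscatter value to a soil-parameter estimate. The real forward
    scattering models and their inversions are not given in the abstract."""
    return 0.5 * sigma0 + 0.1

def invert_image(pixels, workers=4):
    # Each pixel is independent, so the inversion is embarrassingly
    # parallel -- the same property the GPU implementation exploits with
    # thousands of concurrent threads instead of a few pool workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(invert_pixel, pixels))
```

On a GPU the pool is replaced by a kernel launch with one thread per pixel, and the remaining engineering effort goes into the memory-layout and CPU-GPU transfer optimizations the paper describes.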


Author(s):  
K. Bhargavi ◽  
Sathish Babu B.

GPUs (graphics processing units) were mainly used to speed up computationally intensive high-performance computing applications. Several tools and technologies are available for developing general-purpose, computationally intensive applications. This chapter primarily discusses GPU parallelism, applications, and probable challenges, and also highlights some of the GPU computing platforms, including CUDA, OpenCL (Open Computing Language), OpenMPC (OpenMP extended for CUDA), MPI (Message Passing Interface), OpenACC (Open Accelerator), DirectCompute, and C++ AMP (C++ Accelerated Massive Parallelism). Each of these platforms is discussed briefly, along with its advantages and disadvantages.

