scholarly journals A Novel Low-Area Point Multiplication Architecture for Elliptic-Curve Cryptography

Electronics ◽  
2021 ◽  
Vol 10 (21) ◽  
pp. 2698
Author(s):  
Muhammad Rashid ◽  
Mohammad Mazyad Hazzazi  ◽  
Sikandar Zulqarnain Khan ◽  
Adel R. Alharbi  ◽  
Asher Sajid  ◽  
...  

This paper presents a Point Multiplication (PM) architecture of Elliptic-Curve Cryptography (ECC) over GF(2163) with a focus on the optimization of hardware resources and latency at the same time. The hardware resources are reduced with the use of a bit-serial (traditional schoolbook) multiplication method. Similarly, the latency is optimized with the reduction in a critical path using pipeline registers. To cope with the pipelining, we propose to reschedule point addition and double instructions, required for the computation of a PM operation in ECC. Subsequently, the proposed architecture over GF(2163) is modeled in Verilog Hardware Description Language (HDL) using Vivado Design Suite. To provide a fair performance evaluation, we synthesize our design on various FPGA (field-programmable gate array) devices. These FPGA devices are Virtex-4, Virtex-5, Virtex-6, Virtex-7, Spartan-7, Artix-7, and Kintex-7. The lowest area (433 FPGA slices) is achieved on Spartan-7. The highest speed is realized on Virtex-7, where our design achieves 391 MHz clock frequency and requires 416 μs for one PM computation (latency). For power, the lowest values are achieved on the Artix-7 (56 μW) and Kintex-7 (61 μW) devices. A ratio of throughput over area value of 4.89 is reached for Virtex-7. Our design outperforms most recent state-of-the-art solutions (in terms of area) with an overhead of latency.

2021 ◽  
Vol 11 (15) ◽  
pp. 7079
Author(s):  
Muhammad Rashid ◽  
Sajjad Shaukat Jamal ◽  
Sikandar Zulqarnain Khan ◽  
Adel R. Alharbi ◽  
Amer Aljaedi ◽  
...  

This work presents an Elliptic-curve Point Multiplication (ECP) architecture with a focus on low latency and low area for radio-frequency-identification (RFID) applications over GF(2163). To achieve low latency, we have reduced the clock cycles by using: (i) three-shift buffers in the datapath to load Elliptic-curve parameters as well as an initial point, (ii) the identical size of input/output interfaces in all building blocks of the architecture. The low area is preserved by using the same hardware resources of squaring and multiplication for inversion computation. Finally, an efficient controller is used to control the inferred logic. The proposed ECP architecture is modeled in Verilog and the synthesis results are given on three different 7-series FPGA (Field Programmable Gate Array) devices, i.e., Kintex-7, Artix-7, and Virtex-7. The performance of the architecture is provided with the integration of a schoolbook multiplier (implemented with two different logic styles, i.e., combinational and sequential). On Kintex-7, the combinational implementation style of a schoolbook multiplier results in power-optimized, i.e., 161 μW, values with an expense of (i) hardware resources, i.e., 3561 look-up-tables and 1527 flip-flops, (ii) clock frequency, i.e., 227 MHz, and (iii) latency, i.e., 11.57 μs. On the same Kintex-7 device, the sequential implementation style of a schoolbook multiplier provides, (i) 2.88 μs latency, (ii) 1786 look-up-tables and 1855 flip-flops, (iii) 647 μW power, and (iv) 909 MHz clock frequency. Therefore, the reported area, latency and power results make the proposed ECP architecture well-suited for RFID applications.


Author(s):  
Mohan Rao Thokala

Elliptic curve cryptography processor implemented for point multiplication on field programmable gate array. Segmented pipelined full-precision multiplier is used to reduce the latency and also data dependency can be avoided by modifying Lopez-Dahab Montgomery PM Algorithm, results in drastic reduction in the number of clock cycles required. The proposed ECC processor is implemented on Xilinx FPGA families i.e. virtex-4, vitrtex-5, virtex-7.single and three multiplier based designs show the fastest performance compared with reported work individually. Our three multiplier based ECC processor implementation is taking the lowest number of clock cycles on FPGA based design processor.


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1490
Author(s):  
Asher Sajid ◽  
Muhammad Rashid ◽  
Sajjad Shaukat Jamal ◽  
Malik Imran ◽  
Saud S. Alotaibi ◽  
...  

Elliptic curve cryptography is the most widely employed class of asymmetric cryptography algorithm. However, it is exposed to simple power analysis attacks due to the lack of unifiedness over point doubling and addition operations. The unified crypto systems such as Binary Edward, Hessian and Huff curves provide resistance against power analysis attacks. Furthermore, Huff curves are more secure than Edward and Hessian curves but require more computational resources. Therefore, this article has provided a low area hardware architecture for point multiplication computation of Binary Huff curves over GF(2163) and GF(2233). To achieve this, a segmented least significant digit multiplier for polynomial multiplications is proposed. In order to provide a realistic and reasonable comparison with state of the art solutions, the proposed architecture is modeled in Verilog and synthesized for different field programmable gate arrays. For Virtex-4, Virtex-5, Virtex-6, and Virtex-7 devices, the utilized hardware resources in terms of hardware slices over GF(2163) are 5302, 2412, 2982 and 3508, respectively. The corresponding achieved values over GF(2233) are 11,557, 10,065, 4370 and 4261, respectively. The reported low area values provide the acceptability of this work in area-constrained applications.


2014 ◽  
pp. 27-33
Author(s):  
Mounir Bouhedda ◽  
Mokhtar Attari

The aim of this paper is to introduce a new architecture using Artificial Neural Networks (ANN) in designing a 6-bit nonlinear Analog to Digital Converter (ADC). A study was conducted to synthesise an optimal ANN in view to FPGA (Field Programmable Gate Array) implementation using Very High-speed Integrated Circuit Hardware Description Language (VHDL). Simulation and tests results are carried out to show the efficiency of the designed ANN.


2019 ◽  
Vol 28 (09) ◽  
pp. 1950149
Author(s):  
Bahram Rashidi ◽  
Mohammad Abedini

This paper presents efficient lightweight hardware implementations of the complete point multiplication on binary Edwards curves (BECs). The implementations are based on general and special cases of binary Edwards curves. The complete differential addition formulas have the cost of [Formula: see text] and [Formula: see text] for general and special cases of BECs, respectively, where [Formula: see text] and [Formula: see text] denote the costs of a field multiplication, a field squaring and a field multiplication by a constant, respectively. In the general case of BECs, the structure is implemented based on 3 concurrent multipliers. Also in the special case of BECs, two structures by employing 3 and 2 field multipliers are proposed for achieving the highest degree of parallelization and utilization of resources, respectively. The field multipliers are implemented based on the proposed efficient digit–digit polynomial basis multiplier. Two input operands of the multiplier proceed in digit level. This property leads to reduce hardware consumption and critical path delay. Also, in the structure, based on the change of input digit size from low digit size to high digit size the number of clock cycles and input words are different. Therefore, the multiplier can be flexible for different cryptographic considerations such as low-area and high-speed implementations. The point multiplication computation requires field inversion, therefore, we use a low-cost Extended Euclidean Algorithm (EEA) based inversion for implementation of this field operation. Implementation results of the proposed architectures based on Virtex-5 XC5VLX110 FPGA for two fields [Formula: see text] and [Formula: see text] are achieved. The results show improvements in terms of area and efficiency for the proposed structures compared to previous works.


2020 ◽  
Vol 29 (11) ◽  
pp. 2050179 ◽  
Author(s):  
Muhammad Rashid ◽  
Malik Imran ◽  
Atif Raza Jafri ◽  
Zahid Mehmood

This work has proposed a 4-stage pipelined architecture to achieve an optimized throughput over area ratio for point multiplication (PM) computation in binary huff curves (BHC) cryptography. The original mathematical formulation of BHC is revisited with an objective to reduce the required area. Consequently, a simplified formulation of BHC is obtained with 43% reduction in the hardware resources. As far as the throughput is concerned, it is improved first by reducing the critical path and second by minimizing the number of clock cycles (CCs) required to compute one PM. The critical path is reduced through the placement of pipeline registers, whereas the number of required CCs are minimized through an efficient scheduling of computations. These two factors i.e., the area reduction and throughput optimizations, have resulted in maximizing the throughput over area ratio. The proposed pipelined architecture is implemented over [Formula: see text] field, using standard NIST curve parameters. The architecture is modeled in Verilog and synthesized using Xilinx (ISE 14.7) design tool on Virtex 7 FPGA. The implementation results show that 17% improvement in clock frequency, 13% reduction in the time required to compute one PM and 2.6% improvement in throughput/area are achieved when compared with the most recent state of the art solutions.


2019 ◽  
Vol 29 (01) ◽  
pp. 2050002
Author(s):  
Khaled Ben Khalifa ◽  
Ahmed Ghazi Blaiech ◽  
Mehdi Abadi ◽  
Mohamed Hedi Bedoui

In this paper, we present a new generic architectural approach of a Self-Organizing Map (SOM). The proposed architecture, called the Diagonal-SOM (D-SOM), is described as an Hardware–Description-Language as an intellectual property kernel with easily adjustable parameters.The D-SOM architecture is based on a generic formalism that exploits two levels of the nested parallelism of neurons and connections. This solution is therefore considered as a system based on the cooperation of a distributed set of independent computations. The organization and structure of these calculations process an oriented data flow in order to find a better treatment distribution between different neuroprocessors. To validate the D-SOM architecture, we evaluate the performance of several SOM network architectures after their integration on a Xilinx Virtex-7 Field Programmable Gate Array support. The proposed solution allows the easy adaptation of learning to a large number of SOM topologies without any considerable design effort. [Formula: see text] SOM hardware is validated through FPGA implementation, where temporal performance is almost twice as fast as that obtained in the recent literature. The suggested D-SOM architecture is also validated through simulation on variable-sized SOM networks applied to color vector quantization.


Author(s):  
B Murali Krishna ◽  
◽  
B.T. Krishna ◽  
K Babulu ◽  
◽  
...  

A comparison of linear and quadratic transform implementation on field programmable gate array (FPGA) is presented. Popular linear transform namely Stockwell Transform and Smoothed Pseudo Wigner Ville Distribution (SPWVD) transform from Quadratic transforms is considered for the implementation on FPGA. Both the transforms are coded in Verilog hardware description language (Verilog HDL). Complex calculations of transformation are performed by using CORDIC algorithm. From FPGA family, Spartan-6 is chosen as hardware device to implement. Synthetic chirp signal is taken as input to test the both designed transforms. Summary of hardware resource utilization on Spartan-6 for both the transforms is presented. Finally, it is observed that both the transforms S-Transform and SPWVD are computed with low elapsed time with respect to MATLAB simulation.


2021 ◽  
Vol 13 (0) ◽  
pp. 1-5
Author(s):  
Kęstutis Bartnykas

Field-programmable logic arrays are often used in courses on computer architecture. The student must describe the processor with the external components necessary for its operation in the specified HDL (hardware description language) language according to the provided specification during a certain number of projects. The weakness of this approach is that the basis of such projects is a processor of one specific architecture, so the lecturer faces the issue of individualization of projects. This article proposes a solution based on dedicated processors instead of one programmable processor of a specific architecture. It’s shown here that the issue of project individualization is easier solvable in the proposed way, and it does not deviate from the theory of computer architecture, because the programmable processor is a generalization of a dedicated processor. The article describes project design ideas based on dedicated processors and gives some examples. Represented different instance than was applied during practical sessions of Computer Architecture that are held at the Department of Electronic Systems within VILNIUS TECH, i.e. certain modifications, and additions were applied.


2022 ◽  
Vol 12 (2) ◽  
pp. 655
Author(s):  
Baligh Naji ◽  
Chokri Abdelmoula ◽  
Mohamed Masmoudi

This paper presents the design and development of a technique for an Autonomous and Versatile mode Parking System (AVPS) that combines a various number of parking modes. The proposed approach is different from that of many developed parking systems. Previous research has focused on choosing only a parking lot starting from two parking modes (which are parallel and perpendicular). This research aims at developing a parking system that automatically chooses a parking lot starting from four parking modes. The automatic AVPS was proposed for the car-parking control problem, and could be potentially exploited for future vehicle generation. A specific mode can be easily computed using the proposed strategy. A variety of candidate modes could be generated using one developed real time VHDL (VHSIC Hardware Description Language) algorithm providing optimal solutions with performance measures. Based on simulation and experimental results, the AVPS is able to find and recognize in advance which parking mode to select. This combination describes full implementation on a mobile robot, such as a car, based on a specific FPGA (Field-Programmable Gate Array) card. To prove the effectiveness of the proposed innovation, an evaluation process comparing the proposed technique with existing techniques was conducted and outlined.


Sign in / Sign up

Export Citation Format

Share Document