A High-speed 32-bit Signed/Unsigned Pipelined Multiplier

In this paper, we present a high-speed, unified elliptic curve cryptography (ECC) processor for arbitrary Weierstrass curves over GF(p), which, to the best of our knowledge, outperforms other similar works in terms of execution time. Our approach combines the schoolbook long and Karatsuba multiplication algorithms for the elliptic curve point multiplication (ECPM) to achieve better parallelization while retaining low complexity. In the hardware implementation, the substantial gain in speed is also attributable to our n-bit pipelined Montgomery Modular Multiplier (pMMM), which is constructed from our n-bit pipelined multiplier-accumulators that utilize digital signal processor (DSP) primitives as digit multipliers. Additionally, we introduce our unified, pipelined modular adder/subtractor (pMAS) for the underlying field arithmetic, and leverage a more efficient yet compact scheduling of the Montgomery ladder algorithm. The implementation for a 256-bit modulus size on 7-series FPGAs (Virtex-7, Kintex-7, and XC7Z020) yields execution times of 0.139, 0.138, and 0.206 ms, respectively. Furthermore, since our pMMM module is generic for any curve in Weierstrass form, we support multi-curve parameters, resulting in a unified ECC architecture. Lastly, our method also works in constant time, making it suitable for applications requiring high speed and resistance to side-channel attacks (SCA).

High-Speed and Unified ECC Processor for Generic Weierstrass Curves over GF(p) on FPGA

10.20944/preprints202101.0250.v1 ◽

2021 ◽

Author(s):

Asep Muhamad Awaludin ◽

Harashta Tatimma Larasati ◽

Howon Kim

Keyword(s):

Elliptic Curve ◽

Elliptic Curve Cryptography ◽

Execution Time ◽

High Speed ◽

Hardware Implementation ◽

Low Complexity ◽

Multiplication Algorithm ◽

Montgomery Ladder ◽

Modular Multiplier ◽

Pipelined Multiplier

In this paper, we present a high-speed, unified elliptic curve cryptography (ECC) processor for arbitrary Weierstrass curves over GF(p), which, to the best of our knowledge, outperforms other similar works in terms of execution time. Our approach combines the schoolbook long and Karatsuba multiplication algorithms for the elliptic curve point multiplication (ECPM) to achieve better parallelization while retaining low complexity. In the hardware implementation, the substantial gain in speed is also attributable to our n-bit pipelined Montgomery Modular Multiplier (pMMM), which is constructed from our n-bit pipelined multiplier-accumulators that utilize DSP primitives as digit multipliers. Additionally, we introduce our unified, pipelined modular adder/subtractor (pMAS) for the underlying field arithmetic, and leverage a more efficient yet compact scheduling of the Montgomery ladder algorithm. The implementation on 7-series FPGAs (Virtex-7, Kintex-7, and XC7Z020) yields execution times of 0.139, 0.138, and 0.206 ms, respectively. Furthermore, since our pMMM module is generic for any curve in Weierstrass form, we support multi-curve parameters, resulting in a unified ECC architecture. Lastly, our method also works in constant time, making it suitable for applications requiring high speed and resistance to side-channel attacks (SCA).
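
As a rough software illustration of the arithmetic that a Montgomery modular multiplier performs, the following Python sketch implements textbook Montgomery reduction on big integers. It is not the authors' pipelined hardware design. The function names, the choice R = 2^256, and the use of the NIST P-256 prime as an example modulus are assumptions made here for illustration only.

```python
def montgomery_setup(p: int, n_bits: int):
    """Precompute constants for an odd modulus p with R = 2**n_bits > p.

    Hypothetical helper for illustration; requires Python 3.8+ for pow(p, -1, r).
    """
    r = 1 << n_bits
    # p_inv_neg satisfies p * p_inv_neg ≡ -1 (mod R)
    p_inv_neg = (-pow(p, -1, r)) % r
    # R^2 mod p converts an ordinary residue into Montgomery form
    r2 = (r * r) % p
    return r, p_inv_neg, r2


def mont_mul(a: int, b: int, p: int, n_bits: int, p_inv_neg: int) -> int:
    """Return a * b * R^-1 mod p for 0 <= a, b < p (textbook REDC)."""
    mask = (1 << n_bits) - 1
    t = a * b
    # m is chosen so that t + m * p is divisible by R
    m = ((t & mask) * p_inv_neg) & mask
    u = (t + m * p) >> n_bits
    # Final conditional subtraction; this Python branch is NOT constant time
    return u - p if u >= p else u


if __name__ == "__main__":
    # Example modulus (assumption): the NIST P-256 field prime
    p = 2**256 - 2**224 + 2**192 + 2**96 - 1
    n_bits = 256
    r, p_inv_neg, r2 = montgomery_setup(p, n_bits)

    a, b = 1234567, 7654321
    a_m = mont_mul(a, r2, p, n_bits, p_inv_neg)   # a in Montgomery form
    b_m = mont_mul(b, r2, p, n_bits, p_inv_neg)   # b in Montgomery form
    c_m = mont_mul(a_m, b_m, p, n_bits, p_inv_neg)
    c = mont_mul(c_m, 1, p, n_bits, p_inv_neg)    # back to ordinary form
    print(c == (a * b) % p)
```

Because the reduction uses only a mask and a shift by n bits instead of a division by p, the same routine works for any odd modulus, which is consistent with the abstract's point that a generic multiplier can serve multiple curves.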

Implementation and modeling of parametrizable high-speed Reed Solomon decoders on FPGAs

Advances in Radio Science ◽

10.5194/ars-3-271-2005 ◽

2005 ◽

Vol 3 ◽

pp. 271-276 ◽

Cited By ~ 1

Author(s):

A. Flocke ◽

H. Blume ◽

T. G. Noll

Keyword(s):

Resource Sharing ◽

High Speed ◽

Critical Path ◽

Digital Signal ◽

Error Correction Codes ◽

Description Language ◽

Reed Solomon Code ◽

Hardware Description ◽

A Chain ◽

Pipelined Multiplier

Abstract. One of the most important error correction codes in digital signal processing is the Reed Solomon code. Many VLSI implementations have been described in the literature. This paper introduces a highly parametrizable RS-decoder for FPGAs. By implementing resource sharing and by using a fully pipelined multiplier/adder unit in GF(2^m), it was possible to achieve high throughput rates of up to 1.3 Gbit/s on a standard FPGA, while using only an attractively small number of logic elements (LE). The implementation, written in a hardware description language (HDL), is based on an inversionless Berlekamp Algorithm (iBA), whose structure leads to a chain of identical processing elements (PE). The critical path of one PE runs through only one adder and one multiplier. A detailed description of a resource-sharing methodology for this Berlekamp Algorithm and the achievable gain are presented in this paper. The design was benchmarked for different 8-bit codes against state-of-the-art FPGA solutions and showed a gain of up to a factor of six in the AT-product compared to other implementations.
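
To make the multiplier/adder unit in GF(2^m) more concrete, here is a minimal Python sketch of shift-and-add multiplication and a multiply-accumulate step in such a field. The abstract does not give the field polynomial, so the default m = 8 with the polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1, often used for 8-bit Reed Solomon codes) is an assumption, as are the function names.

```python
def gf_mul(a: int, b: int, m: int = 8, poly: int = 0x11D) -> int:
    """Multiply a and b in GF(2^m), reducing by the given field polynomial.

    The defaults (m = 8, poly = 0x11D) are illustrative assumptions.
    """
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a          # addition in GF(2^m) is bitwise XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & (1 << m):
            a ^= poly            # reduce when the degree reaches m
    return result


def gf_mac(acc: int, a: int, b: int, m: int = 8, poly: int = 0x11D) -> int:
    """Multiply-accumulate: acc + a * b in GF(2^m) (one adder, one multiplier)."""
    return acc ^ gf_mul(a, b, m, poly)


if __name__ == "__main__":
    print(hex(gf_mul(0x02, 0x80)))   # x * x^7 = x^8, reduced by the polynomial
    print(hex(gf_mac(0x05, 0x02, 0x80)))
```

The multiply-accumulate step mirrors the abstract's remark that the critical path of one processing element contains one adder and one multiplier; in hardware the loop would be unrolled into combinational or pipelined logic rather than iterated.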

A novel pipelined multiplier for high-speed DSP applications

International Symposium on Signals, Circuits and Systems, 2005. ISSCS 2005. ◽

10.1109/isscs.2005.1509862 ◽

2006 ◽

Author(s):

A. Khatibzadeh ◽

K. Raahemifar

Keyword(s):

High Speed ◽

Pipelined Multiplier ◽

DSP Applications

A pipelined multiplier-accumulator using a high-speed, low-power static and dynamic full adder design

IEEE Journal of Solid-State Circuits ◽

10.1109/4.553190 ◽

1997 ◽

Vol 32 (1) ◽

pp. 114-118 ◽

Cited By ~ 18

Author(s):

Shyh-Jye Jou ◽

Chang-Yu Chen ◽

En-Chung Yang ◽

Chau-Chin Su

Keyword(s):

Low Power ◽

High Speed ◽

Full Adder ◽

Pipelined Multiplier

Microfabrication research at the National Submicron Facility

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100074331 ◽

1983 ◽

Vol 41 ◽

pp. 86-89

Author(s):

E.D. Wolf

Keyword(s):

Integrated Circuits ◽

Particle Physics ◽

High Speed ◽

New Technology ◽

High Energy ◽

High Energy Particle ◽

New Science ◽

Wide Range ◽

Intermediate Domain ◽

Circuit Function

Most microelectronics devices and circuits operate faster, consume less power, execute more functions and cost less per circuit function when the feature-sizes internal to the devices and circuits are made smaller. This is part of the stimulus for the Very High-Speed Integrated Circuits (VHSIC) program. There is also a need for smaller, more sensitive sensors in a wide range of disciplines that includes electrochemistry, neurophysiology and ultra-high pressure solid state research. There is often fundamental new science (and sometimes new technology) to be revealed (and used) when a basic parameter such as size is extended to new dimensions, as is evident at the two extremes of smallness and largeness, high energy particle physics and cosmology, respectively. However, there is also a very important intermediate domain of size that spans from the diameter of a small cluster of atoms up to near one micrometer which may also have just as profound effects on society as “big” physics.

Vacuum System to Minimize the Specimen Contamination of High-Performance EM

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100077967 ◽

1977 ◽

Vol 35 ◽

pp. 68-69

Author(s):

N. Yoshimura ◽

K. Shirota ◽

T. Etoh

Keyword(s):

Electron Microscope ◽

High Speed ◽

High Performance ◽

High Vacuum ◽

Vacuum System ◽

Pump System ◽

Pumping System ◽

Diffusion Pump ◽

Almost All ◽

Cascade Type

One of the most important requirements for a high-performance EM, especially an analytical EM using a fine beam probe, is to prevent specimen contamination by providing a clean high vacuum in the vicinity of the specimen. However, in almost all commercial EMs, the pressure in the vicinity of the specimen under observation is usually more than ten times higher than the pressure measured at the pumping line. The EM column inevitably requires the use of greased Viton O-rings for fine movement, specimens and films need to be exchanged frequently, and several attachments may also be exchanged. For these reasons, a high-speed pumping system, as well as a clean vacuum system, is now required. A newly developed electron microscope, the JEM-100CX, features a clean high vacuum in the vicinity of the specimen, realized by the use of a CASCADE type diffusion pump system which has been essentially improved over its predecessor employed on the JEM-100C.

The on-line use of histogram data for high-speed correction and alignment of electron microscope images

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100112713 ◽

1984 ◽

Vol 42 ◽

pp. 556-557

Author(s):

William Krakow

Keyword(s):

Power Spectrum ◽

High Speed ◽

Direct Memory Access ◽

Digital Television ◽

Contrast Transfer Function ◽

The Past ◽

High Resolution TEM ◽

On Line ◽

Correction Of Astigmatism ◽

The Fourier Transform

In the past few years on-line digital television frame store devices coupled to computers have been employed to attempt to measure the microscope parameters of defocus and astigmatism. The ultimate goal of such tasks is to fully adjust the operating parameters of the microscope and obtain an optimum image for viewing in terms of its information content. The initial approach to this problem, for high resolution TEM imaging, was to obtain the power spectrum from the Fourier transform of an image, find the contrast transfer function oscillation maxima, and subsequently correct the image. This technique requires a fast computer, a direct memory access device and even an array processor to accomplish these tasks on limited size arrays in a few seconds per image. It is not clear that the power spectrum could be used for more than defocus correction since the correction of astigmatism is a formidable problem of pattern recognition.
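
The first step described above, obtaining the power spectrum from the Fourier transform of an image, can be sketched in plain Python as follows. This is a deliberately naive direct 2-D DFT on a tiny synthetic image, meant only to show the computation; the function names and the test image are assumptions, and a practical system would use a fast Fourier transform.

```python
import cmath
import math


def dft2(image):
    """Direct 2-D discrete Fourier transform of a list-of-rows image."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * math.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def power_spectrum(image):
    """Squared magnitude of each Fourier coefficient."""
    return [[abs(c) ** 2 for c in row] for row in dft2(image)]


def strongest_frequency(spectrum):
    """Index (u, v) of the largest non-DC component of the spectrum."""
    candidates = [
        (value, (u, v))
        for u, row in enumerate(spectrum)
        for v, value in enumerate(row)
        if (u, v) != (0, 0)
    ]
    return max(candidates)[1]


if __name__ == "__main__":
    n = 8
    # Synthetic image (assumption): two horizontal cosine periods across the width
    image = [[math.cos(2 * math.pi * 2 * x / n) for x in range(n)] for _ in range(n)]
    spectrum = power_spectrum(image)
    # The peak lands at (0, 2) or its mirror (0, 6), up to rounding
    print(strongest_frequency(spectrum))
```

Locating maxima in the spectrum is the part of the procedure that the abstract ties to defocus measurement; the sketch stops at finding a single peak and does not attempt contrast transfer function fitting or astigmatism correction.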
