High-Performance Symmetric Block Ciphers on CUDA

Heuristic method for bitsliced representation of randomly generated 8×8 cryptographic S-Box

Ukrainian Journal of Information Technology ◽

10.23939/ujit2021.02.058 ◽

2021 ◽

Vol 3 (2) ◽

pp. 58-65

Author(s):

Ya. R. Sovyn ◽

V. V. Khoma ◽

Keyword(s):

High Speed ◽

High Performance ◽

Heuristic Method ◽

Block Ciphers ◽

Truth Table ◽

Software Implementation ◽

Logical Operations ◽

Resistance To Power ◽

Symmetric Block ◽

Cache Attacks

The article addresses the security and efficiency of software implementations of symmetric block ciphers. On low-end CPUs (8/16/32-bit microcontrollers), implementations of cryptographic algorithms need increased resistance to power analysis attacks. On high-end CPUs (x86, ARM Cortex-A), the priority is to eliminate vulnerability to timing and cache attacks. The authors use a bitslice approach to implement block ciphers securely; its potential advantages include high speed and low demand on computing resources. However, known bitslicing methods have a significant limitation: they work either with deterministic S-Boxes or with arbitrary S-Boxes of smaller sizes. The paper proposes a new heuristic method for the bitsliced representation of cryptographic 8×8 S-Boxes containing randomly generated values, which cannot be described by algebraic expressions. The method decomposes the truth table that describes the S-Box into two parts. One part of the table forms logical masks, and the other is split into bit vectors. An exhaustive search is used to find a logical description of these vectors. Once all vectors have been described, the two parts of the table are combined into one using logical operations. The method targets software implementation in the logical basis {AND, OR, XOR, NOT} and minimizes arbitrary 8×8 S-Boxes. It can be implemented using standard logical instructions on any 8/16/32/64-bit processor. On x86-64 processors, logical SIMD instructions from the SSE, AVX, and AVX-512 extensions can also be used, which gives high performance thanks to long registers.
Software has been developed that implements the search for bitsliced representations of a given S-Box and automatically generates C++ code for it based on SSE, AVX, and AVX-512 instructions. The effectiveness of the method has been investigated on the S-Boxes of known block ciphers, in particular the Ukrainian encryption standard "Kalyna". The developed algorithm was found to require almost half as many gates for the bitsliced description of an arbitrary S-Box as the best known algorithm (370 gates versus 680, respectively). For ciphers that use two or four S-Box tables, joint minimization can reduce this to 330 or 300 gates per table, respectively. Keywords: bitslicing; S-Box; logical minimization; SIMD; x86-64 CPU; software implementation; block ciphers.
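
To make the bitslice idea in the abstract above concrete, the following is a minimal sketch in Python and not the authors' heuristic. It assumes a plain multiplexer tree (Shannon expansion) over the truth table, which makes no attempt at gate minimization. Python integers stand in for wide SIMD registers, and all names (`LANES`, `to_bitplanes`, `mux_tree`, `bitsliced_sbox`) are hypothetical.

```python
import random

LANES = 64                     # blocks processed in parallel (one per bit of a plane)
ALL_ONES = (1 << LANES) - 1    # bit-plane holding the constant 1 in every lane


def to_bitplanes(values, width):
    """Transpose a list of `width`-bit values into `width` bit-planes."""
    planes = [0] * width
    for lane, v in enumerate(values):
        for bit in range(width):
            planes[bit] |= ((v >> bit) & 1) << lane
    return planes


def from_bitplanes(planes, lanes):
    """Inverse transpose: rebuild one value per lane from the bit-planes."""
    return [sum(((p >> lane) & 1) << bit for bit, p in enumerate(planes))
            for lane in range(lanes)]


def mux_tree(column, x, top):
    """Evaluate one truth-table output column on the bit-planes x[0..top]."""
    if top < 0:
        return ALL_ONES if column[0] else 0
    half = len(column) // 2
    lo = mux_tree(column[:half], x, top - 1)   # table rows where input bit `top` is 0
    hi = mux_tree(column[half:], x, top - 1)   # table rows where input bit `top` is 1
    return lo ^ (x[top] & (lo ^ hi))           # branch-free select using AND/XOR only


def bitsliced_sbox(sbox, x, width):
    """Apply `sbox` to every lane at once; no data-dependent table lookup."""
    return [mux_tree([(s >> bit) & 1 for s in sbox], x, width - 1)
            for bit in range(width)]


if __name__ == "__main__":
    rng = random.Random(1)
    sbox = list(range(256))
    rng.shuffle(sbox)                          # randomly generated 8x8 S-Box
    inputs = [rng.randrange(256) for _ in range(LANES)]
    planes = bitsliced_sbox(sbox, to_bitplanes(inputs, 8), 8)
    print(from_bitplanes(planes, LANES) == [sbox[v] for v in inputs])
```

Because every lane executes the same sequence of logical operations regardless of the data, memory access patterns and timing do not depend on the S-Box input, which is the property that makes bitslicing attractive against cache and timing attacks. A minimizing method such as the one described in the abstract would aim to replace the naive tree with far fewer gates.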

High-Performance Symmetric Block Ciphers on Multicore CPU and GPUs

International Journal of Networking and Computing ◽

10.15803/ijnc.2.2_251 ◽

2012 ◽

Vol 2 (2) ◽

pp. 251-268 ◽

Cited By ~ 19

Author(s):

Naoki Nishikawa ◽

Keisuke Iwai ◽

Takakazu Kurokawa

Keyword(s):

High Performance ◽

Block Ciphers ◽

Multicore CPU ◽

Symmetric Block

Analysis of Software Implemented Low Entropy Masking Schemes

Security and Communication Networks ◽

10.1155/2018/7206835 ◽

2018 ◽

Vol 2018 ◽

pp. 1-8 ◽

Cited By ~ 1

Author(s):

Dan Li ◽

Jiazhe Chen ◽

An Wang ◽

Xiaoyun Wang

Keyword(s):

High Performance ◽

Selection Criterion ◽

Block Ciphers ◽

Absolute Difference ◽

Hardware Implementations ◽

Software Implementations ◽

Low Entropy ◽

Symmetric Block

Low Entropy Masking Schemes (LEMS) are countermeasures that mitigate the high performance overhead of masked hardware and software implementations of symmetric block ciphers by reducing the entropy of the mask sets. The security of LEMS depends on the choice of the mask sets. Previous research mainly focused on searching for balanced mask sets for hardware implementations. In this paper, we find that those balanced mask sets may have vulnerabilities in terms of absolute difference when applied in software-implemented LEMS. The experiments verify that such vulnerabilities make the software LEMS implementations insecure. To fix the vulnerabilities, we present a selection criterion for choosing the mask sets. When some feasible mask sets have already been picked out by certain searching algorithms, our selection criterion can serve as a reference factor to help decide on a more secure one for software LEMS.
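
As a rough illustration of what a reduced mask set means in software, the sketch below precomputes one masked S-Box table per mask and draws masks from a small set instead of from all 256 byte values. It is a toy under stated assumptions: the S-Box is a random permutation, the 16-element mask set is picked at random (it is neither balanced nor chosen by the paper's selection criterion), and the function names are hypothetical.

```python
import random


def build_masked_tables(sbox, mask_set):
    """For each mask m, precompute T_m such that T_m[x ^ m] == sbox[x] ^ m."""
    tables = {}
    for m in mask_set:
        table = [0] * len(sbox)
        for x, y in enumerate(sbox):
            table[x ^ m] = y ^ m
        tables[m] = table
    return tables


def masked_sbox(tables, masked_in, m):
    """S-Box lookup on a masked input; the result stays masked with m."""
    return tables[m][masked_in]


if __name__ == "__main__":
    rng = random.Random(2018)
    sbox = list(range(256))
    rng.shuffle(sbox)                           # stand-in 8-bit S-Box (assumption)
    mask_set = rng.sample(range(256), 16)       # 4 bits of mask entropy instead of 8
    tables = build_masked_tables(sbox, mask_set)

    x = 0x3C
    m = rng.choice(mask_set)                    # fresh mask for this lookup
    masked_out = masked_sbox(tables, x ^ m, m)
    print(masked_out ^ m == sbox[x])            # unmasking recovers the true output
```

Only 16 tables are stored rather than 256, which is where the performance saving comes from. The abstract's point is that which 16 masks are chosen matters for side-channel security, something this sketch does not model.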

Differential Properties of Symmetric Block Ciphers with Round Key Modular Operations Other than XOR

Radio Electronics Computer Science Control ◽

10.15588/1607-3274-2012-2-18 ◽

2013 ◽

Vol 0 (2) ◽

Author(s):

I.V. Lysytska ◽

A.A. Nastenko

Keyword(s):

Block Ciphers ◽

Differential Properties ◽

Symmetric Block

Concurrent error detection schemes for fault-based side-channel cryptanalysis of symmetric block ciphers

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems ◽

10.1109/tcad.2002.804378 ◽

2002 ◽

Vol 21 (12) ◽

pp. 1509-1517 ◽

Cited By ~ 114

Author(s):

R. Karri ◽

K. Wu ◽

P. Mishra ◽

Yongkook Kim

Keyword(s):

Error Detection ◽

Block Ciphers ◽

Side Channel ◽

Concurrent Error Detection ◽

Detection Schemes ◽

Symmetric Block

A fault-tolerant pipelined architecture for symmetric block ciphers

Computers & Electrical Engineering ◽

10.1016/j.compeleceng.2005.07.003 ◽

2005 ◽

Vol 31 (6) ◽

pp. 380-390

Author(s):

Min-Kyu Joo ◽

Yoon-Hwa Choi

Keyword(s):

Fault Tolerant ◽

Block Ciphers ◽

Pipelined Architecture ◽

Symmetric Block

VLSI implementation of high performance burst mode for 128-bit block ciphers

Proceedings 14th Annual IEEE International ASIC/SOC Conference (IEEE Cat. No.01TH8558) ◽

10.1109/asic.2001.954663 ◽

2002 ◽

Cited By ~ 1

Author(s):

Y. Mitsuyama ◽

Z. Andales ◽

T. Onoye ◽

I. Shirakawa

Keyword(s):

High Performance ◽

Block Ciphers ◽

VLSI Implementation ◽

Burst Mode

Evaluation of Framework for the Comparative Analysis of Symmetric Block Ciphers

2009 2nd International Conference on Computer Science and its Applications ◽

10.1109/csa.2009.5404195 ◽

2009 ◽

Author(s):

Muhammad Junaid ◽

Mukhtar Hussain ◽

Ashraf Masood ◽

Firdous Kausar ◽

Ayesha Noreen ◽

...

Keyword(s):

Comparative Analysis ◽

Block Ciphers ◽

Symmetric Block

On the semantic security of cellular automata based pseudo-random permutation using results from the Luby-Rackoff construction

Annales Universitatis Mariae Curie-Sklodowska sectio AI – Informatica ◽

10.17951/ai.2015.15.1.21-31 ◽

2015 ◽

Vol 15 (1) ◽

pp. 21

Author(s):

Kamel Mohammed Faraoun

Keyword(s):

Cellular Automata ◽

Block Cipher ◽

Random Permutation ◽

Block Ciphers ◽

Random Permutations ◽

Semantic Security ◽

Transition Rules ◽

Reversible Cellular Automata ◽

Symmetric Block ◽

Number Of Iterations

This paper proposes a semantically secure construction of pseudo-random permutations using second-order reversible cellular automata. We show that the proposed construction is equivalent to the Luby-Rackoff model if it is built using non-uniform transition rules, and we prove that the construction is strongly secure if an adequate number of iterations is performed. Moreover, a corresponding symmetric block cipher is constructed and analysed experimentally in comparison with popular ciphers. The results confirm the robustness and efficacy of the construction, and its performance exceeds that of some existing block ciphers.
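
The link between second-order reversible cellular automata and the Luby-Rackoff (Feistel) structure can be sketched as follows. This is a minimal illustration, not the paper's construction: it assumes a single uniform local rule (elementary rule 30 on a ring) with no key material, whereas the abstract relies on non-uniform transition rules. The function names are hypothetical.

```python
def rule30(cells, width):
    """One update of elementary rule 30 on a ring of `width` cells packed in an int.

    Convention (an assumption): bit i is cell i and its left neighbour is cell i+1,
    so the row reads left to right when printed most significant bit first.
    """
    mask = (1 << width) - 1
    left = ((cells >> 1) | (cells << (width - 1))) & mask    # value of cell i+1
    right = ((cells << 1) | (cells >> (width - 1))) & mask   # value of cell i-1
    return left ^ (cells | right)


def forward(prev, cur, width, steps):
    """Second-order CA: the next configuration is rule(cur) XOR prev."""
    for _ in range(steps):
        prev, cur = cur, rule30(cur, width) ^ prev
    return prev, cur


def backward(prev, cur, width, steps):
    """Undo `forward` by solving the same relation for the older configuration."""
    for _ in range(steps):
        prev, cur = rule30(prev, width) ^ cur, prev
    return prev, cur


if __name__ == "__main__":
    width, steps = 32, 8
    start = (0x12345678, 0x9ABCDEF0)
    end = forward(*start, width, steps)
    print(backward(*end, width, steps) == start)
```

Each step maps the pair (prev, cur) to (cur, rule(cur) XOR prev), which has the same shape as a Feistel round (L, R) -> (R, L XOR F(R)). This is why the step is invertible even though rule 30 on its own is not.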

Fully Automated Differential Fault Analysis on Software Implementations of Block Ciphers

IACR Transactions on Cryptographic Hardware and Embedded Systems ◽

10.46586/tches.v2019.i3.1-29 ◽

2019 ◽

pp. 1-29 ◽

Cited By ~ 1

Author(s):

Xiaolu Hou ◽

Jakub Breier ◽

Fuyuan Zhang ◽

Yang Liu

Keyword(s):

Block Ciphers ◽

Fault Analysis ◽

SMT Solver ◽

Multiplication Operation ◽

Analysis Methodology ◽

Differential Fault Analysis ◽

Software Implementations ◽

Symmetric Block ◽

Assembly Analysis ◽

Effective Description

Differential Fault Analysis (DFA) is considered the most popular fault analysis method. While there are techniques that automate fault analysis at the cipher level to some degree, software implementations introduce new vulnerabilities that cannot be found by examining the cipher design specification. This work bridges the gap by providing a fully automated way to carry out DFA on assembly implementations of symmetric block ciphers. We use a customized data flow graph to represent the program and develop a novel fault analysis methodology to capture the program behavior under faults. We establish an effective description of DFA as constraints that are passed to an SMT solver. We create a tool that takes assembly code as input, analyzes the dependencies among instructions, automatically attacks vulnerable instructions using an SMT solver, and outputs the attack details that recover the last round key (and possibly the earlier keys). We support our design with evaluations on the lightweight ciphers SIMON, SPECK, and PRIDE, and on a current NIST standard, AES. By automated assembly analysis, we were able to find new efficient DFA attacks on SPECK and PRIDE that exploit implementation-specific vulnerabilities, as well as previously published DFA attacks on SIMON and AES. Moreover, we present a novel DFA on the multiplication operation that has never been shown for symmetric block ciphers before. Our experimental evaluation also shows reasonable execution times that scale to current cipher designs and can easily outclass manual analysis. In addition, we present a method for checking countermeasure-protected implementations in a way that helps implementers decide how many rounds should be protected. We note that this is the first work that automatically carries out DFA on cipher implementations without any plaintext or ciphertext information and can therefore be applied generally to any input data to the cipher.
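
The principle behind recovering a last round key by DFA can be shown on a toy example. The sketch below is not the tool described in the abstract: it assumes a hypothetical 4-bit last round `c = SBOX[x] ^ key`, a single-bit fault on the S-Box input, and a brute-force search over the 16 key candidates in place of an SMT solver. The S-Box is an arbitrary 4-bit permutation chosen for illustration.

```python
# Arbitrary 4-bit permutation used as a stand-in S-Box (assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV = [SBOX.index(y) for y in range(16)]
SINGLE_BIT = {1, 2, 4, 8}       # fault model: exactly one input bit flipped


def last_round(x, key, fault=0):
    """Toy last round; `fault` is XORed into the state before the S-Box."""
    return SBOX[x ^ fault] ^ key


def collect_pairs(key, states, faults):
    """Simulate the attacker's data: (correct, faulty) ciphertext pairs."""
    return [(last_round(x, key), last_round(x, key, e))
            for x, e in zip(states, faults)]


def candidate_keys(pairs):
    """Keep every key guess that explains all pairs as single-bit input faults."""
    return [k for k in range(16)
            if all((INV[c ^ k] ^ INV[f ^ k]) in SINGLE_BIT for c, f in pairs)]


if __name__ == "__main__":
    secret = 0xA
    pairs = collect_pairs(secret, states=[3, 7, 12, 9], faults=[1, 4, 2, 8])
    survivors = candidate_keys(pairs)
    print([hex(k) for k in survivors], secret in survivors)
```

The attacker never sees `x` or the key, only ciphertext pairs. Each pair rules out key guesses for which the partially decrypted difference does not match the fault model, and the correct key always survives. Expressing the same consistency condition as solver constraints is what allows the approach in the abstract to scale beyond a 4-bit toy.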
