Formal Verification of Fault-Tolerant and Recovery Mechanisms for Safe Node Sequence Protocol

Author(s):  
Rui Zhou ◽  
Rong Min ◽  
Qi Yu ◽  
Chanjuan Li ◽  
Yong Sheng ◽  
...  
2021 ◽  
Vol 93 ◽  
pp. 01006
Author(s):  
Vladimir Kukharenko ◽  
Kirill Ziborov ◽  
Rafael Sadykov ◽  
Ruslan Rezin

The extent of formal verification methods applied in industrial projects has always been limited. The proliferation of distributed ledger systems (DLS), also known as blockchain, is rapidly changing the situation. Since the main area of DLSs’ application is the automation of financial transactions, the properties of predictability and reliability are critical for implementing such systems. The actual behavior of the DLS is largely determined by the chosen consensus protocol, which properties require strict specification and formal verification. Formal specification and verification of the consensus protocol is necessary but not sufficient. It is also required to ensure that the software implementation of the DLS nodes complies with this protocol. Finally, the verified software implementation of the protocol must run on a fairly reliable operating system. The financial focus of DLS application has also led to the emergence of the so-called smart contracts, which are an important part of the applied implementations of specific business processes based on DLSs. Therefore, the verifiability of smart contracts is also a critical requirement for industrial DLSs. In this paper, we describe an ongoing industrial project between a large Russian airline and three universities – Innopolis University (IU), Moscow Institute of Physics and Technology (MIPT) and Lomonosov Moscow State University (MSU). The main expected project result is a DLS for more flexible refueling of aircrafts, verified at least at the four technological levels described above. After brief project overview, we focus on our experience with the formal specification and verification of HotStuff, a leader-based fault-tolerant protocol that ensures reaching distributed consensus in the presence of Byzantine processes. The formal specification of the protocol is performed in the TLA+ language and then verified with a specialized TLC tool to verify models based on TLA+ specifications.


2020 ◽  
Vol 27 (4) ◽  
pp. 454-471
Author(s):  
Vladimir Aleksandrovich Kukharenko ◽  
Kirill Viktorovich Ziborov ◽  
Rafael Faritovich Sadykov ◽  
Alexandr Vladimirovich Naumchev ◽  
Ruslan Maratovich Rezin ◽  
...  

The extent of formal verification methods applied to industrial projects has always been limited. The proliferation of distributed ledger systems (DLS), also known as blockchain, is rapidly changing the situation. Since the main area of DLSs' application is the automation of financial transactions, the properties of predictability and reliability are critical for implementing such systems. The actual behavior of the DLS is determined by the chosen consensus protocol, which properties require strict specification and formal verification. Formal specification and verification of the consensus protocol is necessary but not sufficient. It is required to ensure that the software implementation of the DLS nodes complies with this protocol. The verified software implementation of the protocol must run on a fairly reliable operating system. The so-called “smart contracts”, which are an important part of the applied implementations of specific business processes based on DLSs, must be verifiable as well. In this paper, we describe an ongoing industrial project that will result in a DLS verified at least at the four technological levels described above. We then share our experience with the formal specification and verification of HotStuff, a leader-based fault-tolerant protocol that ensures reaching distributed consensus in the presence of Byzantine processes.


2021 ◽  
Author(s):  
Lukas Hübner ◽  
Alexey M. Kozlov ◽  
Demian Hespe ◽  
Peter Sanders ◽  
Alexandros Stamatakis

Phylogenetic trees are now routinely inferred on large scale HPC systems with thousands of cores as the parallel scalability of phylogenetic inference tools has improved over the past years to cope with the molecular data avalanche. Thus, the parallel fault tolerance of phylogenetic inference tools has become a relevant challenge. To this end, we explore parallel fault tolerance mechanisms and algorithms, the software modifications required, and the performance penalties induced via enabling parallel fault tolerance by example of RAxML-NG, the successor of the widely used RAxML tool for maximum likelihood based phylogenetic tree inference. We find that the slowdown induced by the necessary additional recovery mechanisms in RAxML-NG is on average 2%. The overall slowdown by using these recovery mechanisms in conjunction with a fault tolerant MPI implementation amounts to 8% on average for large empirical datasets. Via failure simulations, we show that RAxML-NG can successfully recover from multiple simultaneous failures, subsequent failures, failures during recovery, and failures during checkpointing. Recoveries are automatic and transparent to the user. The modified fault tolerant RAxML-NG code is available under GNU GPL at https://github.com/lukashuebner/ft-raxml-ng Contact: lukas.huebner@{kit.edu,h-its.org};, [email protected], [email protected], [email protected], [email protected] Supplementary information: Supplementary data are available at bioRχiv.


Sign in / Sign up

Export Citation Format

Share Document