Much ADO about failures: a fault-aware model for compositional verification of strongly consistent distributed systems

2021 ◽  
Vol 5 (OOPSLA) ◽  
pp. 1-31
Author(s):  
Wolf Honoré ◽  
Jieung Kim ◽  
Ji-Yong Shin ◽  
Zhong Shao

Despite recent advances, guaranteeing the correctness of large-scale distributed applications without compromising performance remains a challenging problem. Network and node failures are inevitable and, for some applications, careful control over how they are handled is essential. Unfortunately, existing approaches either completely hide these failures behind an atomic state machine replication (SMR) interface, or expose all of the network-level details, sacrificing atomicity. We propose a novel, compositional, atomic distributed object (ADO) model for strongly consistent distributed systems that combines the best of both options. The object-oriented API abstracts over protocol-specific details and decouples high-level correctness reasoning from implementation choices. At the same time, it intentionally exposes an abstract view of certain key distributed failure cases, thus allowing for more fine-grained control over them than SMR-like models. We demonstrate that proving properties even of composite distributed systems can be straightforward with our Coq verification framework, Advert, thanks to the ADO model. We also show that a variety of common protocols including multi-Paxos and Chain Replication refine the ADO semantics, which allows one to freely choose among them for an application's implementation without modifying ADO-level correctness proofs.
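The failure-aware interface the abstract describes can be pictured with a toy sketch. The Python below is purely illustrative (the class and method names are hypothetical, and a real ADO is backed by a consensus protocol such as multi-Paxos rather than a single in-memory object): the point is that an operation either commits or fails visibly, instead of being hidden behind an always-succeeding SMR call.

```python
from enum import Enum, auto

class Outcome(Enum):
    COMMITTED = auto()   # update durably applied
    FAILED = auto()      # definitely not applied (e.g., lost leadership)

class AtomicDistributedObject:
    """Toy in-memory stand-in for a replicated object; names and
    semantics are an illustrative assumption, not the paper's API."""
    def __init__(self, state):
        self._state = state
        self._leader = None

    def pull(self, client):
        # Acquire the right to propose updates; may fail if contended.
        if self._leader is not None and self._leader != client:
            return Outcome.FAILED
        self._leader = client
        return Outcome.COMMITTED

    def invoke(self, client, method):
        # Apply an update; failure is exposed rather than silently retried.
        if self._leader != client:
            return Outcome.FAILED, None
        self._state = method(self._state)
        return Outcome.COMMITTED, self._state
```

A client that loses a contended `pull` sees `FAILED` and can decide how to react, which is the kind of fine-grained failure control the abstract contrasts with a fully opaque SMR interface.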

2012 ◽  
pp. 201-222
Author(s):  
Yujian Fu ◽  
Zhijiang Dong ◽  
Xudong He

The approach aims to solve the above problems by including analysis and verification at two different levels of the software development process (the design level and the implementation level) and by bridging the gap between software architecture analysis and verification and the software product. At the architecture design level, to ensure design correctness and to cope with the large scale of complex systems, compositional verification is used: each component is verified individually and the results are then synthesized. For those properties that cannot be verified at the design level, the design model is translated to an implementation and runtime verification techniques are applied to the program. This approach can greatly reduce the work of design verification and avoid the state-explosion problem of model checking. Moreover, it can ensure both design and implementation correctness, and can thus deliver a final software product with high confidence. The approach is based on the Software Architecture Model (SAM), proposed by Florida International University in 1999. SAM is a formal specification framework built on component-connector pairs with two formalisms: Petri nets and temporal logic. The ACV approach places strong demands on an organization to articulate the quality attributes of primary importance. It also requires selecting benchmark combination points with which to verify integrated properties. The purpose of ACV is not to commend particular architectures, but to provide a method for verification and analysis of large-scale software systems at the architecture level. Future research falls into two directions. First, in the compositional verification of a SAM model, there may be circular waiting for certain data among different components and connectors; this problem was not discussed in the current work.
Second, the translation of SAM to an implementation is based on restricted Petri nets, owing to the undecidability issues of high-level Petri nets. In the runtime analysis of the implementation, extracting the program's execution trace is still needed to obtain a white-box view, and further analysis of the execution can provide more information about the product's correctness.
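As a flavor of the runtime-verification step, here is a minimal Python monitor. It is not from the chapter (SAM properties are expressed in temporal logic, which this toy safety check only approximates): it scans an execution trace for a violation of a hypothetical property, "a `release` event must be preceded by a matching `acquire`".

```python
def check_safety(trace):
    """Monitor the safety property G(release -> previously acquire).
    Returns (True, None) if the trace satisfies it, otherwise
    (False, i) where i is the position of the first violation."""
    held = False
    for i, event in enumerate(trace):
        if event == "acquire":
            held = True
        elif event == "release":
            if not held:
                return False, i   # violation: release without acquire
            held = False
    return True, None
```

In a real setting the trace would be extracted from the running program, as the paragraph above notes, rather than supplied as a list.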


Author(s):  
Valentin Cristea ◽  
Ciprian Dobre ◽  
Corina Stratan ◽  
Florin Pop

Security in distributed systems is a combination of the confidentiality, integrity, and availability of their components. It mainly targets the communication channels between users and/or processes located on different computers, the access control of users/processes to resources and services, and the management of keys, users, and user groups. Distributed systems are more vulnerable to security threats due to several characteristics, such as their large scale, the distributed nature of control, and the remote nature of access. In addition, an increasing number of distributed applications (such as Internet banking) manipulate sensitive information and have special security requirements. After discussing important security concepts in the Background section, this chapter addresses several important problems that are the focus of current research in the security of large-scale distributed systems: security models (which represent the theoretical foundation for solving security problems), access control (more specifically, access control in distributed multi-organizational platforms), secure communication (with emphasis on secure group communication, a hot topic in security research today), security management (especially key management for collaborative environments), secure distributed architectures (the blueprints for designing and building secure systems), and security environments/frameworks.


2010 ◽  
Vol 638-642 ◽  
pp. 3123-3127
Author(s):  
V.A. Malyshevsky ◽  
E.I. Khlusova ◽  
V.V. Orlov

The metallurgical industry can be considered the field best suited to the adoption of nanotechnologies, which in the near future will be able to provide large-scale production and a high return on investment. Of special note are the physical and mechanical properties of nano-structured steels and alloys (strength, plasticity, toughness, and so on), which will radically surpass the characteristics of corresponding materials developed using conventional technologies. Investigations have shown that basic principles for tailoring structure down to the nano-level in low-carbon low-alloy steels can be put forward, namely: 1) morphological similarity of structural components, with predominance of globular-type structures achieved through reduced carbon content and rational alloying; 2) formation of a finely dispersed carbide phase of globular morphology; 3) elimination of extended interphase boundaries; 4) formation of a fragmented structure with boundaries close to high-angle ones, inheriting the structure of fine-grained deformed austenite.


Author(s):  
Valentin Cristea ◽  
Ciprian Dobre ◽  
Corina Stratan ◽  
Florin Pop

The domains of usage of large-scale distributed systems have been expanding in recent years from scientific to commercial applications. Together with the extension of the application domains, new requirements have emerged for large-scale distributed systems. Among these requirements, fault tolerance is needed by more and more modern distributed applications, not only by the critical ones. In this chapter we analyze existing work on enabling fault tolerance in large-scale distributed systems, presenting specific problems and existing solutions, as well as several future trends. The characteristics of these systems pose problems for ensuring fault tolerance, especially because of their complexity, involving many geographically distributed resources and users; because of the volatility of resources, which are available only for limited amounts of time; and because of the constraints imposed by applications and resource owners. A general fault-tolerant architecture should comprise, at a minimum, a mechanism to detect failures and a component capable of recovering from and handling the detected failures, usually using some form of replication. We analyze existing fault-tolerance implementations, as well as solutions adopted in real-world large-scale distributed systems, and the fault-tolerance architectures proposed for particular distributed architectures, such as Grid and P2P systems.
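The minimal architecture described above (a failure detector plus a recovery component built on replication) can be sketched in a few lines of Python. This is a hypothetical illustration, not a design from the chapter; the heartbeat threshold and failover rule are illustrative assumptions.

```python
class HeartbeatDetector:
    """Minimal failure-detector sketch: a node is suspected once it
    misses `threshold` consecutive heartbeat intervals."""
    def __init__(self, nodes, threshold=3):
        self.missed = {n: 0 for n in nodes}
        self.threshold = threshold

    def heartbeat(self, node):
        self.missed[node] = 0          # fresh heartbeat received

    def tick(self):
        # One monitoring interval elapses with no heartbeat recorded.
        for n in self.missed:
            self.missed[n] += 1

    def suspected(self):
        return {n for n, m in self.missed.items() if m >= self.threshold}

def failover(primary, replicas, detector):
    """Recovery component: promote the first healthy replica if the
    primary is suspected, otherwise keep the current primary."""
    if primary in detector.suspected():
        for r in replicas:
            if r not in detector.suspected():
                return r
    return primary
```

Real detectors must cope with the fact that a slow node and a crashed node are indistinguishable over an asynchronous network, which is why the threshold only yields a *suspicion*, not certainty.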


2007 ◽  
Vol 08 (02) ◽  
pp. 163-178 ◽  
Author(s):  
FATOS XHAFA ◽  
JAVIER CARRETERO ◽  
LEONARD BAROLLI ◽  
ARJAN DURRESI

In this paper we present a study of the requirements for the design and implementation of simulation packages for Grid systems. Grids are emerging as new distributed computing systems whose main objective is to manage and allocate geographically distributed computing resources to applications and users in an efficient and transparent manner. Grid systems are at present very difficult and complex to use for experimental studies of large-scale distributed applications. Although the field of simulation of distributed computing systems is mature, recent developments in large-scale distributed systems are raising needs not present in the simulation of traditional distributed systems. Motivated by this, we present in this work a set of basic requirements that any simulation package for Grid computing should offer. This set of functionalities was obtained after a careful review of the most important existing Grid simulation packages and includes new requirements not considered in those packages. Based on the identified set of requirements, a Grid simulator is developed and exemplified on the Grid scheduling problem.
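As a flavor of what such a simulator computes for the Grid scheduling problem, here is a deliberately tiny discrete-event sketch in Python. The greedy earliest-free-machine policy and all names are illustrative assumptions, far simpler than any real Grid simulation package.

```python
import heapq

def simulate(jobs, machines):
    """Tiny discrete-event sketch of Grid scheduling: each job
    (name, length) is dispatched to the machine that becomes free
    earliest. Returns per-job (machine, completion time) and the
    overall makespan."""
    free = [(0.0, m) for m in machines]   # event queue: (time_free, machine)
    heapq.heapify(free)
    completion = {}
    for name, length in jobs:
        t, m = heapq.heappop(free)        # next machine to become idle
        done = t + length
        completion[name] = (m, done)
        heapq.heappush(free, (done, m))   # machine busy until `done`
    makespan = max(d for _, d in completion.values())
    return completion, makespan
```

A full simulator would also model network transfers, resource volatility, and background load, which is exactly where the new requirements identified in the paper go beyond traditional distributed-system simulation.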


Author(s):  
Florin Pop

This chapter presents a fault-tolerant framework for application scheduling in large-scale distributed systems (LSDS). Due to the specific characteristics and requirements of distributed systems, a good scheduling model should be dynamic. More specifically, it should adapt scheduling decisions to resource state changes, which are commonly captured through monitoring. The scheduler and the monitor are two important middleware pieces that correlate their actions to ensure the high-performance execution of distributed applications. The chapter presents and analyses an agent-based architecture for scheduling in large-scale distributed systems. User and resource management are then presented. Optimization schemes for scheduling consider near-optimal algorithms for distributed scheduling, and the chapter presents a solution for scheduling optimization. Finally, the chapter covers and explains fault-tolerance cases for Grid environments and describes two possible scenarios for the scheduling system.
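The scheduler-monitor correlation described above can be caricatured in a short Python sketch. It is a hypothetical illustration: the least-loaded placement policy and the re-dispatch-on-failure rule stand in for the near-optimal algorithms and fault-tolerance scenarios the chapter actually discusses.

```python
class Monitor:
    """Toy resource monitor: tracks which resources are currently up."""
    def __init__(self, resources):
        self.up = set(resources)
    def report_down(self, r):
        self.up.discard(r)
    def report_up(self, r):
        self.up.add(r)

class Scheduler:
    """Dynamic scheduler that consults the monitor before each
    dispatch and re-dispatches tasks whose resource has failed."""
    def __init__(self, monitor):
        self.monitor = monitor
        self.placement = {}               # task -> resource
    def dispatch(self, task):
        live = sorted(self.monitor.up)
        if not live:
            raise RuntimeError("no live resources")
        load = {r: 0 for r in live}
        for r in self.placement.values():
            if r in load:
                load[r] += 1
        target = min(live, key=lambda r: (load[r], r))  # least loaded
        self.placement[task] = target
        return target
    def handle_failure(self, r):
        # Fault tolerance: move tasks off the failed resource.
        self.monitor.report_down(r)
        for t, res in list(self.placement.items()):
            if res == r:
                self.dispatch(t)
```

The key point the sketch preserves is that scheduling decisions are a function of monitored state, so a state change (here, a failure report) immediately changes placement.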


2010 ◽  
Vol 20 (5-6) ◽  
pp. 537-576 ◽  
Author(s):  
MATTHEW FLUET ◽  
MIKE RAINEY ◽  
JOHN REPPY ◽  
ADAM SHAW

The increasing availability of commodity multicore processors is making parallel computing ever more widespread. In order to exploit its potential, programmers need languages that make the benefits of parallelism accessible and understandable. Previous parallel languages have traditionally been intended for large-scale scientific computing, and they tend not to be well suited to programming the applications one typically finds on a desktop system. Thus, we need new parallel-language designs that address a broader spectrum of applications. The Manticore project is our effort to address this need. At its core is Parallel ML, a high-level functional language for programming parallel applications on commodity multicore hardware. Parallel ML provides a diverse collection of parallel constructs for different granularities of work. In this paper, we focus on the implicitly threaded parallel constructs of the language, which support fine-grained parallelism. We concentrate on those elements that distinguish our design from related ones, namely, a novel parallel binding form, a nondeterministic parallel case form, and the treatment of exceptions in the presence of data parallelism. These features differentiate the present work from related work on functional data-parallel language designs, which have focused largely on parallel problems with regular structure and the compiler transformations—most notably, flattening—that make such designs feasible. We present detailed examples utilizing various mechanisms of the language and give a formal description of our implementation.
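For readers unfamiliar with parallel binding forms, the effect can be approximated with explicit futures. This Python analogy is only a rough sketch under stated assumptions: Parallel ML's binding is implicit, integrated with exception handling, and scheduled by the Manticore runtime, none of which appears here.

```python
from concurrent.futures import ThreadPoolExecutor

def fib(n):
    # Deliberately naive recursive Fibonacci as a unit of work.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

with ThreadPoolExecutor() as pool:
    # Roughly "pval x = fib 20": the bound computation may run in
    # parallel with the rest of the body...
    x = pool.submit(fib, 20)
    y = fib(10)                # ...which proceeds without waiting on x
    total = x.result() + y     # the first *use* of x forces its value
```

In Parallel ML the demand-on-use behavior is implicit in the language semantics rather than spelled out with `submit`/`result`, and an exception raised by the bound computation is handled as the paper's design prescribes rather than at the explicit `result()` call.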


Author(s):  
Ciprian Dobre

The field of modeling and simulation has long been seen as a viable way to develop new algorithms and technologies and to enable the development of large-scale distributed systems, where analytical validation is precluded by the nature of the problems encountered. The use of discrete-event simulators in the design and development of large-scale distributed systems is appealing due to their efficiency and scalability. In this chapter we focus on the challenge of enabling scalable, high-level, online simulation of applications, middleware, resources, and networks to support the scientific and systematic study of Grid and P2P applications and environments. We describe alternatives for designing and implementing simulators to be used in the validation of distributed systems, particularly Grids and P2P systems.

