Multi-path Coverage of all Final States for Model-Based Testing Theory using Spark In-memory Design

2020 ◽  
Author(s):  
Wilfried Yves Hamilton Adoni ◽  
Moez Krichen ◽  
Tarik Nahhal ◽  
Abdeltif Elbyed

This paper presents an efficient and robust distributed framework for finite state machine coverage in the field of model-based testing theory. Covering all final states in a large-scale automaton is inherently compute-intensive and memory-exhausting, with impractical time complexity due to the explosion of the number of states. It is therefore important to propose a faster solution that reduces the time complexity by exploiting big data concepts based on Spark RDD computation. To cope with this situation, we propose a parallel and distributed approach based on the Spark in-memory design, which exploits the A* algorithm for optimal coverage. Experiments performed on a multi-node cluster show that the proposed framework achieves a significant gain in computation time.
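The abstract above combines A* search with state coverage of a finite state machine. As a rough illustration of that building block, here is a minimal A* sketch that finds a shortest labelled path from an initial state to any final state; the unit transition cost, the zero default heuristic, and all names are illustrative assumptions, not the paper's actual Spark-based implementation:

```python
import heapq

def astar_cover(transitions, start, finals, h=lambda s: 0):
    """A* search for a shortest transition path from `start` to any
    final state of a finite state machine.
    transitions: dict state -> list of (label, next_state)
    h: admissible heuristic estimate of remaining steps (0 = Dijkstra).
    Returns the list of transition labels along an optimal path, or None."""
    frontier = [(h(start), 0, start, [])]      # (f, g, state, path)
    best = {start: 0}                          # cheapest known cost per state
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state in finals:
            return path
        for label, nxt in transitions.get(state, []):
            ng = g + 1                         # unit cost per transition
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt, path + [label]))
    return None
```

A distributed variant would partition the automaton across Spark workers, but this sequential core conveys the optimal-coverage idea.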


Author(s):  
Linwei Li ◽  
Liangchen Guo ◽  
Zhenying He ◽  
Yinan Jing ◽  
X. Sean Wang

Text clustering is a widely studied problem in the text mining domain. Clustering algorithms based on the Dirichlet Multinomial Mixture (DMM) model have shown good performance in coping with high-dimensional sparse text data, obtaining reasonable results in both clustering accuracy and computational efficiency. However, the time complexity of DMM model training is proportional to the average document length and the number of clusters, making it inefficient to scale up to long texts and large corpora, which are common in real-world applications such as document organization, retrieval, and recommendation. In this paper, we leverage a symmetric prior setting for the Dirichlet distribution and build indices to decrease the time complexity of the sampling-based training for DMM from O(K∗L) to O(K∗U), where K is the number of clusters, L the average length of a document, and U the average number of unique words in each document. We introduce a Metropolis-Hastings sampling algorithm, which further reduces the sampling time complexity from O(K∗U) to O(U) in the nearly-converged training stages. Moreover, we also parallelize the DMM model training to obtain further acceleration by using an uncollapsed Gibbs sampler. We combine all these optimizations into a highly efficient implementation, called X-DMM, which enables the DMM model to scale up for long and large-scale text clustering. We evaluate the performance of X-DMM on several real-world datasets, and the experimental results show that X-DMM achieves a substantial speedup compared with existing state-of-the-art algorithms without degrading clustering accuracy.
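The O(K∗L) → O(K∗U) reduction mentioned above comes from scoring each document against the clusters via its unique-word counts rather than token by token. The following sketch shows the per-document scoring step of a standard collapsed DMM Gibbs sampler with symmetric Dirichlet priors; the data-structure names (`m_z`, `n_z`, `n_zw`) are illustrative assumptions, not the X-DMM code:

```python
import math
from collections import Counter

def dmm_doc_scores(doc, K, alpha, beta, V, m_z, n_z, n_zw):
    """Unnormalized log-probabilities of assigning one document to each of
    K clusters under a collapsed DMM sampler with symmetric Dirichlet priors.
    m_z[k]  -- number of documents currently in cluster k
    n_z[k]  -- total word tokens currently in cluster k
    n_zw[k] -- dict mapping word -> count in cluster k
    V       -- vocabulary size
    Iterating over the document's unique words with their counts (via
    log-gamma ratios) makes the cost O(K*U) instead of O(K*L)."""
    counts = Counter(doc)                       # U unique words
    L = len(doc)
    scores = []
    for k in range(K):
        s = math.log(m_z[k] + alpha)            # cluster-popularity term
        for w, c in counts.items():             # O(U) per cluster
            nkw = n_zw[k].get(w, 0)
            s += math.lgamma(nkw + beta + c) - math.lgamma(nkw + beta)
        s -= math.lgamma(n_z[k] + V * beta + L) - math.lgamma(n_z[k] + V * beta)
        scores.append(s)
    return scores
```

A full sampler would normalize these scores, draw a cluster, and update the counts; the Metropolis-Hastings variant in the paper avoids even the O(K) loop near convergence.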


2013 ◽  
Vol 2013 ◽  
pp. 1-7 ◽  
Author(s):  
DongSheng Yin ◽  
DeBo Liu

Generally, the time complexity of algorithms for content-based image retrieval is extremely high. In order to retrieve images from large-scale databases efficiently, a new retrieval method based on the Hadoop distributed framework is proposed. First, a database of image features is built using the Speeded Up Robust Features (SURF) algorithm and Locality-Sensitive Hashing; the search is then performed on the Hadoop platform in a specially designed parallel way. Considerable experimental results show that the method can effectively retrieve images by content on large-scale clusters and image sets.
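The feature index described above pairs descriptors with Locality-Sensitive Hashing so that similar images land in the same bucket. A minimal sketch of one common LSH family, random-hyperplane (sign-of-projection) hashing, is shown below; the 64-dimensional descriptors and 8-plane setup are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def lsh_keys(features, planes):
    """Random-hyperplane LSH: project each feature vector onto a fixed set
    of random hyperplanes and keep only the signs, producing a bit-string
    bucket key. Vectors with a small angle between them tend to collide."""
    bits = features @ planes.T > 0             # (n_vectors, n_planes) booleans
    return ["".join("1" if b else "0" for b in row) for row in bits]

# hypothetical 64-dim SURF-like descriptors hashed with 8 hyperplanes
rng = np.random.default_rng(0)
planes = rng.standard_normal((8, 64))
db = rng.standard_normal((1000, 64))
keys = lsh_keys(db, planes)                    # one bucket key per image feature
```

At query time, only the database entries sharing the query's bucket key (or a few neighboring keys) need to be compared exactly, which is what makes the parallel Hadoop search tractable.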


2014 ◽  
Vol 56 (7) ◽  
pp. 749-762 ◽  
Author(s):  
Gerson Sunyé ◽  
Eduardo Cunha de Almeida ◽  
Yves Le Traon ◽  
Benoit Baudry ◽  
Jean-Marc Jézéquel

2020 ◽  
Vol 140 (4) ◽  
pp. 272-280
Author(s):  
Wataru Ohnishi ◽  
Hiroshi Fujimoto ◽  
Koichi Sakata

2018 ◽  
Author(s):  
Pavel Pokhilko ◽  
Evgeny Epifanovsky ◽  
Anna I. Krylov

Using a single-precision floating-point representation reduces the size of the data and the computation time by a factor of two relative to the double precision conventionally used in electronic structure programs. For large-scale calculations, such as those encountered in many-body theories, the reduced memory footprint alleviates memory and input/output bottlenecks. The reduced data size can lead to additional gains due to improved parallel performance on CPUs and various accelerators. However, using single precision can potentially reduce the accuracy of computed observables. Here we report an implementation of coupled-cluster and equation-of-motion coupled-cluster methods with single and double excitations in single precision. We consider both the standard implementation and one using Cholesky decomposition or resolution-of-the-identity representations of the electron-repulsion integrals. Numerical tests illustrate that when single precision is used in correlated calculations, the loss of accuracy is insignificant, and a pure single-precision implementation can be used for computing energies, analytic gradients, excited states, and molecular properties. In addition to pure single-precision calculations, our implementation allows one to follow a single-precision calculation by clean-up iterations, fully recovering double-precision results while retaining significant savings.
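The "clean-up iterations" idea above, doing the bulk of the work in single precision and then recovering double-precision accuracy with a few cheap correction steps, is the same pattern as classic mixed-precision iterative refinement for linear systems. The sketch below illustrates that general pattern on Ax = b; it is an analogy, not the authors' coupled-cluster solver, and for simplicity it refactorizes instead of reusing a stored single-precision factorization:

```python
import numpy as np

def refine_solve(A, b, iters=3):
    """Solve Ax = b mostly in single precision, then run double-precision
    'clean-up' (iterative refinement) steps: compute the residual in double
    precision, solve for a correction in single precision, and update x."""
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                           # residual in double precision
        dx = np.linalg.solve(A32, r.astype(np.float32))
        x = x + dx.astype(np.float64)           # correction from cheap solve
    return x
```

For well-conditioned problems, each clean-up step shrinks the error by roughly the single-precision rounding level, so a handful of steps recovers double-precision results while the expensive solves stay in float32.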


2011 ◽  
Vol 34 (6) ◽  
pp. 1012-1028 ◽  
Author(s):  
Huai-Kou MIAO ◽  
Sheng-Bo CHEN ◽  
Hong-Wei ZENG
