Using machine learning to refine Category-Partition test specifications and test suites

2009 ◽  
Vol 51 (11) ◽  
pp. 1551-1564 ◽  
Author(s):  
Lionel C. Briand ◽  
Yvan Labiche ◽  
Zaheer Bawar ◽  
Nadia Traldi Spido


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Navneet Kaur ◽  
Lakhwinder Kaur ◽  
Sikander Singh Cheema

Abstract. Swarm intelligence techniques have a vast range of real-world applications. Some of these lie in the domain of medical data mining, where the main focus is on building models for the classification and prediction of numerous diseases. These biomedical applications have attracted many researchers because such diseases are among the most serious and prevalent causes of death worldwide, with breast cancer being one of the most critical. Mammography is the initial screening assessment for breast cancer. In this study, an enhanced version of the Harris Hawks Optimization (HHO) approach, called DLHO, has been developed for biomedical databases. The approach integrates the merits of the dimension learning-based hunting (DLH) search strategy with HHO. The main objective is to alleviate HHO's lack of crowd diversity and premature convergence, and the imbalance between exploration and exploitation. The DLH search strategy uses a different method to construct a neighborhood for each search member, within which neighboring information can be shared among the search agents. This strategy helps maintain diversity and the balance between global and local search. To evaluate DLHO, a range of experiments were conducted: (i) the performance of the optimizers was analysed on the 29 CEC-2017 test suites; (ii) to demonstrate its effectiveness, DLHO was tested on different biomedical databases, using two breast-cancer databases: MIAS and a second taken from the University of California at Irvine (UCI) Machine Learning Repository. To test the robustness of the proposed method, it was also evaluated on two further databases, Balloon and Heart, taken from the UCI Machine Learning Repository. All results favour the proposed technique.
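The dimension-learning step at the core of DLHO can be illustrated with a small sketch. This is not the authors' implementation: the radius heuristic, function names, and update rule are simplifying assumptions, used only to show how a per-dimension neighborhood update can preserve population diversity.

```python
import math
import random

def dlh_candidate(population, fitness, i, rng):
    """Build a DLH-style candidate for search agent i.

    The neighborhood radius here is the distance from agent i to the
    current best agent (a simplifying stand-in for the radius used in
    the paper, which is based on the conventionally updated position).
    Each dimension of the candidate then learns from a random neighbor.
    """
    best = min(range(len(population)), key=lambda k: fitness[k])
    xi = population[i]
    radius = math.dist(xi, population[best])
    # Neighbors of agent i: all agents within the radius.
    neighbors = [p for p in population if math.dist(xi, p) <= radius]
    if not neighbors:
        neighbors = [population[best]]
    # Per-dimension learning: each coordinate moves by a random fraction
    # of the difference between a neighbor and a random population member,
    # so neighboring information is shared dimension by dimension.
    cand = []
    for d in range(len(xi)):
        n = rng.choice(neighbors)
        r = rng.choice(population)
        cand.append(xi[d] + rng.random() * (n[d] - r[d]))
    return cand
```

In a full optimizer, such a candidate would compete greedily with the HHO-updated position, keeping whichever has the better fitness.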


Author(s):  
Nilofar Mulla ◽  
Naveenkumar Jayakumar

This study provides an overview of the use of artificial intelligence (AI) and machine learning (ML) techniques in the field of software testing. The use of AI in software testing is still in its early stages, and the level of automation is lower than in more mature areas of work. AI and ML can help reduce tedium and automate tasks in software testing, making testing smarter and more efficient. Researchers recognize the potential of AI to bridge the gap between human- and machine-driven testing capabilities. A number of challenges remain before AI and ML techniques can be fully utilized in testing, but they will enhance the entire testing process, extend the skills of testers, and contribute to business growth. Machine learning research is a subset of overall AI research. The software life cycle is becoming ever shorter and more complicated, and software development struggles between the competing pressures of building software and meeting deadlines. AI-powered automated testing makes it feasible to run full test suites on every change in a timely manner. This article gives a detailed overview of the various applications of AI in software testing, discusses the implementation of machine learning in software testing in detail, and explains the use of different machine learning techniques.
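As one concrete illustration of ML applied to testing (not drawn from this article), a minimal test-prioritization sketch can rank test cases by a learned failure score, so that full suites run the most failure-prone tests first on every change. The perceptron, the feature choice, and all names below are illustrative assumptions.

```python
def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Train a single-layer perceptron that predicts whether a test is
    likely to fail from simple features (e.g. recent failure count,
    overlap with changed files); the features are illustrative."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred  # 0 when correct; +/-1 drives the update
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def prioritize(tests, w, b):
    """Order (name, features) pairs by descending predicted failure
    score, so the most failure-prone tests run first."""
    score = lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b
    return sorted(tests, key=lambda t: score(t[1]), reverse=True)
```

Real systems would use richer features and models, but the pipeline shape (featurize history, learn a score, reorder the suite) is the same.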


2020 ◽  
Author(s):  
Gerardo Raggi ◽  
Ignacio Fernández Galván ◽  
Christian L. Ritterhoff ◽  
Morgane Vacher ◽  
Roland Lindh

Machine learning techniques, specifically gradient-enhanced Kriging (GEK), have been implemented for molecular geometry optimization. GEK-based optimization has many advantages compared to conventional, step-restricted second-order truncated-expansion molecular optimization methods. In particular, the surrogate model given by GEK can have multiple stationary points, will smoothly converge to the exact model as the number of sample points increases, and contains an explicit expression for the expected error of the model function at an arbitrary point. Machine learning is, however, associated with an abundance of data, contrary to the situation desired for efficient geometry optimizations. In this paper we demonstrate how the GEK procedure can be utilized such that, in the presence of few data points, the surrogate surface robustly guides the optimization to a minimum of a potential energy surface. In this respect the GEK procedure is used to mimic the behavior of a conventional second-order scheme while retaining the flexibility of the superior machine learning approach. Moreover, the expected error is used in the optimization to facilitate restricted-variance optimizations (RVO). A procedure that relates the eigenvalues of the approximate guessed Hessian to the individual characteristic lengths used in the GEK model reduces the number of empirical parameters to optimize to two: the value of the trend function and the maximum allowed variance. These parameters are determined using the extended Baker (e-Baker) test suite and part of the Baker transition-state (Baker-TS) test suite as a training set. The resulting optimization procedure is tested using the e-Baker, the full Baker-TS, and the S22 test suites, at the density-functional-theory and second-order Møller-Plesset levels of approximation.
The results show that the new method generally performs similarly to or better than a state-of-the-art conventional method, even for cases where no significant improvement was expected.
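The restricted-variance idea can be sketched with a toy one-dimensional Kriging surrogate: the posterior variance bounds how far a step may stray from the sampled points. This is an illustrative sketch, not the authors' code; for brevity it uses ordinary (gradient-free) Kriging with a zero trend function and a fixed characteristic length.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential correlation with a fixed characteristic length."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, y):
    """Naive Gauss-Jordan elimination for the small kernel system."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def kriging(xs, ys, x):
    """Posterior mean and variance of a zero-trend Kriging surrogate."""
    K = [[rbf(a, b) + (1e-10 if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    k = [rbf(a, x) for a in xs]
    mean = sum(ai * ki for ai, ki in zip(solve(K, ys), k))
    var = rbf(x, x) - sum(bi * ki for bi, ki in zip(solve(K, k), k))
    return mean, var

def rv_step(xs, ys, x0, vmax=0.05, span=2.0, n=200):
    """Pick the candidate with the lowest surrogate mean whose predicted
    variance stays below vmax; vmax plays the role of the trust region."""
    best = x0
    bm, _ = kriging(xs, ys, x0)
    for i in range(n + 1):
        x = x0 - span + 2 * span * i / n
        m, v = kriging(xs, ys, x)
        if v <= vmax and m < bm:
            best, bm = x, m
    return best
```

The variance is exactly zero at sampled points (the surrogate interpolates) and grows away from them, which is why it can serve as a step restriction without a separate trust-radius parameter.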


2020 ◽  
Author(s):  
Gerardo Raggi ◽  
Christian L. Ritterhoff ◽  
Ignacio Fernández Galván ◽  
Morgane Vacher ◽  
Roland Lindh

Machine learning techniques, specifically gradient-enhanced Kriging (GEK), have been implemented for molecular geometry optimization. GEK has many advantages compared to conventional, step-restricted second-order truncated molecular optimization methods. In particular, the surrogate model associated with GEK can have multiple stationary points, will smoothly converge to the exact model as the size of the data set increases, and contains an explicit expression for the expected average error of the model function at an arbitrary point in space. In this respect GEK can be of interest for methods used in molecular geometry optimizations. GEK is usually, however, associated with an abundance of data, contrary to the situation desired for efficient geometry optimizations. In this paper we demonstrate how the GEK procedure can be utilized such that, in the presence of few data points, the surrogate surface robustly guides the optimization to a minimum of a molecular structure. In this respect the GEK procedure is used to mimic the behavior of a conventional second-order scheme while retaining the flexibility of the superior machine learning approach: GEK is an exact interpolator. Moreover, the expected variance is used in the optimization to facilitate restricted-variance rational function optimizations (RV-RFO). A procedure that relates the eigenvalues of the Hessian-model-function Hessian to the individual characteristic lengths used in the GEK reduces the number of empirical parameters to two: the value of the trend function and the maximum allowed variance. These parameters are determined using the extended Baker (e-Baker) test suite, at the Hartree-Fock level of approximation, and a single reaction of the Baker transition-state (Baker-TS) test suite as a training set.
The resulting optimization procedure, RV-RFO-GEK, is tested using the e-Baker, the full Baker-TS, and the S22 test suites, at the density-functional-theory level for the two Baker test suites and at the second-order Møller-Plesset level of approximation for the S22 test suite, respectively. The tests show that the new method is generally on par with a state-of-the-art conventional method, while for difficult cases it exhibits a definite advantage.


2021 ◽  
Author(s):  
Mark Jessell ◽  
Jiateng Guo ◽  
Yunqiang Li ◽  
Mark Lindsay ◽  
Richard Scalzo ◽  
...  

Abstract. Unlike some other well-known challenges such as facial recognition, where machine learning and inversion algorithms are widely developed, the geosciences suffer from a lack of large, labelled datasets that can be used to validate or train robust machine learning and inversion schemes. Publicly available 3D geological models are far too restricted in both number and the range of geological scenarios to serve these purposes. With reference to inverting geophysical data, this problem is further exacerbated because, in most cases, real geophysical observations result from unknown 3D geology, and synthetic test datasets are often neither particularly geological nor geologically diverse. To overcome these limitations, we have used the Noddy modelling platform to generate one million models, which represent the first publicly accessible massive training set of 3D geology and the resulting gravity and magnetic datasets. This model suite can be used to train machine learning systems and to provide comprehensive test suites for geophysical inversion. We describe the methodology for producing the model suite, and discuss the opportunities such a model suite affords, as well as its limitations, and how we can grow and access this resource.


2020 ◽  
Vol 8 ◽  
pp. 4:1 - 4:15 ◽  
Author(s):  
Thomaz Diniz ◽  
Everton L G Alves ◽  
Anderson G F Silva ◽  
Wilkerson L Andrade

Model-Based Testing (MBT) is used for generating test suites from system models. However, as software evolves, its models tend to be updated, which may lead to obsolete test cases that are often discarded. Discarding test cases can be very costly, since essential data, such as execution history, are lost. In this paper, we investigate the use of distance functions and machine learning to help reduce the discard of MBT tests. First, we assess the problem of managing MBT suites in the context of agile industrial projects. Then, we propose two strategies to cope with this problem: (i) a purely distance-function-based strategy. An empirical study using industrial data and ten different distance functions showed that distance functions can be effective for identifying low-impact edits that lead to test cases that can be updated with little effort. We also found the optimal configuration for each function. Moreover, we showed that, by using this strategy, one could reduce the discard of test cases by 9.53%; (ii) a strategy that combines machine learning with distance values. This strategy can classify the impact of edits in use case documents with accuracy above 80%; it was able to reduce the discard of test cases by 10.4% and to identify test cases that should, in fact, be discarded.
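The first strategy can be illustrated with a common string distance applied to use-case steps. The paper evaluated ten distance functions with tuned configurations; the specific function, normalization, and threshold below are illustrative assumptions only.

```python
def levenshtein(a, b):
    """Edit distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def edit_impact(old_step, new_step, threshold=0.3):
    """Classify a use-case edit as low impact (test updatable with
    little effort) or high impact (candidate for discard), based on
    normalized edit distance; the threshold is illustrative."""
    d = levenshtein(old_step, new_step) / max(len(old_step), len(new_step), 1)
    return "low" if d <= threshold else "high"
```

The second strategy in the paper goes one step further: instead of a fixed threshold, a classifier is trained on distance values to predict the impact class.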

