Provenance- and machine learning-based recommendation of parameter values in scientific workflows

2021 ◽  
Vol 7 ◽  
pp. e606
Author(s):  
Daniel Silva Junior ◽  
Esther Pacitti ◽  
Aline Paes ◽  
Daniel de Oliveira

Scientific Workflows (SWfs) have revolutionized how scientists in various domains of science conduct their experiments. The management of SWfs is performed by complex tools that provide support for workflow composition, monitoring, execution, and the capture and storage of the data generated during execution. In some cases, they also provide components to ease the visualization and analysis of the generated data. During the workflow’s composition phase, programs must be selected to perform the activities defined in the workflow specification. These programs often require additional parameters that serve to adjust the program’s behavior according to the experiment’s goals. Consequently, workflows commonly have many parameters to be manually configured, often more than one hundred. Choosing parameter values incorrectly can lead to crashed workflow executions or undesired results. As the execution of data- and compute-intensive workflows is commonly performed in a high-performance computing environment (e.g., a cluster, a supercomputer, or a public cloud), an unsuccessful execution represents a waste of time and resources. In this article, we present FReeP—Feature Recommender from Preferences, a parameter value recommendation method that is designed to suggest values for workflow parameters, taking into account past user preferences. FReeP is based on Machine Learning techniques, particularly Preference Learning. FReeP is composed of three algorithms, where two of them aim at recommending the value for one parameter at a time, and the third makes recommendations for n parameters at once. The experimental results obtained with provenance data from two broadly used workflows showed FReeP’s usefulness in the recommendation of values for one parameter. Furthermore, the results indicate the potential of FReeP to recommend values for n parameters in scientific workflows.
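The core idea of preference-aware parameter recommendation can be illustrated with a minimal sketch: given provenance records of past executions and the parameter values the user has already fixed (their preferences), recommend the most frequent value for the target parameter among the matching runs. This majority-vote baseline is purely illustrative (it is not one of FReeP's actual three algorithms), and the parameter names and provenance data below are hypothetical.

```python
from collections import Counter

def recommend_value(provenance, preferences, target):
    """Recommend a value for `target` by majority vote over past
    executions that agree with the user's fixed preferences.
    (Illustrative baseline only, not the actual FReeP algorithms.)"""
    matching = [run for run in provenance
                if all(run.get(k) == v for k, v in preferences.items())]
    if not matching:            # fall back to the full provenance set
        matching = provenance
    votes = Counter(run[target] for run in matching if target in run)
    return votes.most_common(1)[0][0]

# Hypothetical provenance records from past workflow executions:
provenance = [
    {"aligner": "bwa",    "threads": 8,  "evalue": 1e-5},
    {"aligner": "bwa",    "threads": 8,  "evalue": 1e-5},
    {"aligner": "bwa",    "threads": 16, "evalue": 1e-3},
    {"aligner": "bowtie", "threads": 8,  "evalue": 1e-5},
]
print(recommend_value(provenance, {"aligner": "bwa"}, "threads"))  # → 8
```

A real preference-learning approach would generalize beyond exact matches (e.g., by learning a model over the provenance), but the filter-then-aggregate shape is the same.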

10.6036/10007 ◽  
2021 ◽  
Vol 96 (5) ◽  
pp. 528-533
Author(s):  
XAVIER LARRIVA NOVO ◽  
MARIO VEGA BARBAS ◽  
VICTOR VILLAGRA ◽  
JULIO BERROCAL

Cybersecurity has stood out in recent years with the aim of protecting information systems. Different methods, techniques, and tools have been used to exploit the existing vulnerabilities in these systems. Therefore, it is essential to develop and improve new technologies, as well as intrusion detection systems that allow detecting possible threats. However, the use of these technologies requires highly qualified cybersecurity personnel to analyze the results and reduce the large number of false positives that these technologies present in their results. This generates the need to research and develop new high-performance cybersecurity systems that allow efficient analysis and resolution of these results. This research presents the application of machine learning techniques to classify real traffic, in order to identify possible attacks. The study has been carried out using machine learning tools applying deep learning algorithms such as multi-layer perceptron and long short-term memory (LSTM). Additionally, this document presents a comparison between the results obtained by applying the aforementioned algorithms and non-deep-learning algorithms such as random forest and decision tree. Finally, the results obtained are presented, showing that the LSTM algorithm is the one that provides the best results in relation to precision and logarithmic loss.
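The two evaluation metrics named in the abstract, precision and logarithmic loss, can be stated concretely. The sketch below is a minimal stdlib implementation of both; the traffic labels and predicted probabilities are made-up numbers, not data from the study.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary logarithmic loss: lower is better.
    y_prob holds predicted probabilities of the positive (attack) class."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

def precision(y_true, y_pred):
    """Fraction of predicted attacks that really are attacks."""
    tp = sum(1 for y, yp in zip(y_true, y_pred) if y == 1 and yp == 1)
    fp = sum(1 for y, yp in zip(y_true, y_pred) if y == 0 and yp == 1)
    return tp / (tp + fp) if tp + fp else 0.0

# Hypothetical ground truth and model probabilities:
y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.6, 0.3]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]
print(precision(y_true, y_pred))  # → 1.0
print(round(log_loss(y_true, y_prob), 4))  # → 0.2838
```

Log loss rewards well-calibrated probabilities rather than just correct hard labels, which is why it complements precision when ranking models such as LSTM against random forest.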


Proceedings ◽  
2020 ◽  
Vol 54 (1) ◽  
pp. 8
Author(s):  
Julio J. Estévez-Pereira ◽  
Diego Fernández ◽  
Francisco J. Novoa

While traditional network security methods have proven useful until now, the flexibility of machine learning techniques makes them a solid candidate in the current networking landscape. In this paper, we assess how well the latter are capable of detecting security threats in a corporate network. To that end, we configure and compare several models to find the one that best fits our needs. Furthermore, we distribute the computational load and storage so we can handle extensive volumes of data. The algorithms that we use to create our models, Random Forest, Naive Bayes, and Deep Neural Networks (DNN), are both diverse and tested in other papers, in order to make our comparison richer. For the distribution phase, we operate with Apache Structured Streaming, PySpark, and MLlib. As for the results, it is relevant to mention that our dataset has been found to be effectively modelable with just a reduced number of features. Finally, given the outcomes obtained, we find this line of research encouraging and, therefore, this approach worth pursuing.
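The distribution step the authors delegate to PySpark follows a classic map-then-reduce shape: each worker computes a partial summary of its partition, and the partials are merged on the driver. As a minimal sketch, plain Python stands in for the Spark machinery below; the flow records (bytes, packets, label) are hypothetical.

```python
from functools import reduce

# Hypothetical flow records: (bytes, packets, label), label 1 = attack.
# In the paper, PySpark/MLlib does the heavy lifting; plain Python here
# just shows the map -> reduce shape of the distributed computation.
partitions = [
    [(500, 4, 0), (1200, 9, 1)],
    [(300, 2, 0), (7000, 50, 1), (450, 3, 0)],
]

def summarize(part):
    """Per-partition partial sums (what each worker would compute)."""
    n = len(part)
    attacks = sum(label for _, _, label in part)
    return n, attacks

def merge(a, b):
    """Combine partial results (the reduce step on the driver)."""
    return a[0] + b[0], a[1] + b[1]

n, attacks = reduce(merge, map(summarize, partitions))
print(n, attacks)  # → 5 2
```

Because `merge` is associative, the partials can be combined in any order, which is exactly the property Spark exploits to aggregate across many machines.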


Machine learning techniques with high performance computing technologies can create various new opportunities in the agriculture domain. This paper provides a comprehensive review of various papers concentrating on machine learning (ML) and deep learning applications in agriculture. The paper is categorized into three sections: (a) yield prediction using machine learning techniques, (b) price prediction, and (c) leaf disease detection using neural networks. In this paper we study the comparison of neural network models with existing models. The findings of this survey indicate that deep learning models give high accuracy and outperform traditional image processing techniques, and that ML techniques outperform various traditional techniques in prediction.


2017 ◽  
Vol 2017 ◽  
pp. 1-12 ◽  
Author(s):  
Yang Liu ◽  
Youbo Liu ◽  
Junyong Liu ◽  
Maozhen Li ◽  
Tingjian Liu ◽  
...  

Transient stability assessment plays a vital role in modern power systems. For this purpose, machine learning techniques have been widely employed to find critical conditions and recognize transient behaviors based on massive data analysis. However, the ever-increasing volume of data generated from power systems poses a number of challenges to traditional machine learning techniques, which are computationally intensive when running on standalone computers. This paper presents a MapReduce-based high-performance neural network to enable fast stability assessment of power systems. Hadoop, an open-source implementation of the MapReduce model, is first employed to parallelize the neural network. The parallel neural network is further enhanced with HaLoop to reduce the computation overhead incurred in the iteration process of the neural network. In addition, ensemble techniques are employed to accommodate the accuracy loss of the parallelized neural network in classification. The parallelized neural network is evaluated with both the IEEE 68-node system and a real power system in terms of computation speedup and stability assessment.
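The ensemble step used to recover classification accuracy can be illustrated with a majority-vote sketch: each sub-model (in the paper, a neural network trained in parallel on a data partition) votes on every case, and the ensemble emits the majority label. The sub-model predictions below are hypothetical, and the voting combiner is only one of several possible ensemble rules.

```python
from collections import Counter

def ensemble_predict(member_predictions):
    """Majority vote across sub-models trained on data partitions.
    (Sketch of the ensemble step only; the paper's sub-models are
    neural networks parallelized with Hadoop/HaLoop.)"""
    results = []
    for votes in zip(*member_predictions):
        results.append(Counter(votes).most_common(1)[0][0])
    return results

# Three sub-models classify five operating conditions as
# stable (1) or unstable (0); each model errs on a different case.
m1 = [1, 0, 1, 1, 0]
m2 = [1, 0, 0, 1, 0]
m3 = [1, 1, 1, 1, 0]
print(ensemble_predict([m1, m2, m3]))  # → [1, 0, 1, 1, 0]
```

No single sub-model above is perfect, yet the vote is, which is the intuition behind using an ensemble to offset the accuracy each partitioned model loses by seeing only part of the data.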


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1186
Author(s):  
Hochong Park ◽  
Joo-Hiuk Son

Terahertz imaging and time-domain spectroscopy have been widely used to characterize the properties of test samples in various biomedical and engineering fields. Many of these tasks require the analysis of acquired terahertz signals to extract embedded information, which can be achieved using machine learning. Recently, machine learning techniques have developed rapidly, and many new learning models and learning algorithms have been investigated. Therefore, combined with state-of-the-art machine learning techniques, terahertz applications can achieve levels of performance that modeling techniques predating the machine learning era cannot. In this review, we introduce the concept of machine learning and basic machine learning techniques, and examine methods for performance evaluation. We then summarize representative examples of terahertz imaging and time-domain spectroscopy that are conducted using machine learning.


Materials ◽  
2021 ◽  
Vol 14 (22) ◽  
pp. 7034
Author(s):  
Yue Xu ◽  
Waqas Ahmad ◽  
Ayaz Ahmad ◽  
Krzysztof Adam Ostrowski ◽  
Marta Dudek ◽  
...  

The current trend in modern research revolves around novel techniques that can predict the characteristics of materials without consuming time, effort, and experimental costs. The adaptation of machine learning techniques to compute the various properties of materials is gaining more attention. This study aims to use both standalone and ensemble machine learning techniques to forecast the 28-day compressive strength of high-performance concrete. One standalone technique (support vector regression (SVR)) and two ensemble techniques (AdaBoost and random forest) were applied for this purpose. To validate the performance of each technique, coefficient of determination (R2), statistical, and k-fold cross-validation checks were used. Additionally, the contribution of input parameters towards the prediction of results was determined by applying sensitivity analysis. It was proven that all the techniques employed showed improved performance in predicting the outcomes. The random forest model was the most accurate, with an R2 value of 0.93, compared to the support vector regression and AdaBoost models, with R2 values of 0.83 and 0.90, respectively. In addition, statistical and k-fold cross-validation checks validated the random forest model as the best performer based on lower error values. However, the prediction performance of the support vector regression and AdaBoost models was also within an acceptable range. This shows that novel machine learning techniques can be used to predict the mechanical properties of high-performance concrete.
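The two validation tools named in the abstract, the coefficient of determination and k-fold cross-validation, can be sketched concisely. Below is a minimal stdlib implementation of R² and of a contiguous k-fold index splitter; the strength values are made-up numbers, not the study's data, and real practice would shuffle before splitting.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((y - yp) ** 2 for y, yp in zip(y_true, y_pred))
    ss_tot = sum((y - mean) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

def k_fold_indices(n, k):
    """Split indices 0..n-1 into k contiguous validation folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# Hypothetical measured vs. predicted 28-day strengths (MPa):
y_true = [40.0, 55.0, 62.0, 48.0, 70.0]
y_pred = [42.0, 53.0, 60.0, 50.0, 68.0]
print(round(r_squared(y_true, y_pred), 3))  # → 0.964
print(k_fold_indices(5, 2))  # → [[0, 1, 2], [3, 4]]
```

An R² of 0.93, as reported for the random forest model, means the model explains 93% of the variance in measured compressive strength; cross-validation checks that this holds across held-out folds rather than on one lucky split.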


Author(s):  
Lisanne V. van Dijk ◽  
Clifton D. Fuller

The advent of large-scale high-performance computing has allowed the development of machine-learning techniques in oncologic applications. Among these, there has been substantial growth in radiomics (machine-learning texture analysis of images) and artificial intelligence (which uses deep-learning techniques for “learning algorithms”); however, clinical implementation has yet to be realized at scale. To improve implementation, opportunities, mechanics, and challenges, models of imaging-enabled artificial intelligence approaches need to be understood by clinicians who make the treatment decisions. This article aims to convey the basic conceptual premises of radiomics and artificial intelligence using head and neck cancer as a use case. This educational overview focuses on approaches for head and neck oncology imaging, detailing current research efforts and challenges to implementation.

