Machine Learning Approaches for Outdoor Air Quality Modelling: A Systematic Review

Yves Rybarczyk; Rasa Zalakeviciute

doi:10.3390/app8122570

Machine Learning Approaches for Outdoor Air Quality Modelling: A Systematic Review

Applied Sciences ◽

10.3390/app8122570 ◽

2018 ◽

Vol 8 (12) ◽

pp. 2570 ◽

Cited By ~ 23

Author(s):

Yves Rybarczyk ◽

Rasa Zalakeviciute

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Deterministic Models ◽

Learning Techniques ◽

Vector Machines ◽

Estimation Problems ◽

Selection Of

Current studies show that traditional deterministic models tend to struggle to capture the non-linear relationship between the concentration of air pollutants and their sources of emission and dispersion. To tackle such a limitation, the most promising approach is to use statistical models based on machine learning techniques. Nevertheless, it is puzzling why a certain algorithm is chosen over another for a given task. This systematic review intends to clarify this question by providing the reader with a comprehensive description of the principles underlying these algorithms and how they are applied to enhance prediction accuracy. A rigorous search that conforms to the PRISMA guideline is performed and results in the selection of the 46 most relevant journal papers in the area. Through a factorial analysis method these studies are synthetized and linked to each other. The main findings of this literature review show that: (i) machine learning is mainly applied in Eurasian and North American continents and (ii) estimation problems tend to implement Ensemble Learning and Regressions, whereas forecasting make use of Neural Networks and Support Vector Machines. The next challenges of this approach are to improve the prediction of pollution peaks and contaminants recently put in the spotlights (e.g., nanoparticles).

Download Full-text

Prediction of the Hardness of Cu-Ti-Co Alloy Using Machine Learning Techniques

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.777.372 ◽

2018 ◽

Vol 777 ◽

pp. 372-376 ◽

Cited By ~ 1

Author(s):

Shan Feng Fang

Keyword(s):

Machine Learning ◽

Copper Alloys ◽

Least Square ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Forecasting Accuracy ◽

Learning Techniques ◽

Vector Machines ◽

Forecasting Performance

Diverse machine learning approaches were employed to build regression models for predicting mechanical property of Cu-Ti-Co alloy. The forecasting performance of the least-square support vector machines (LSSVM) model has been compared with other artificial intelligence methods such as GRNN, RBF-PLS and RBFNN. The models were developed and validated utilizing a cross-validation (CV) procedure to improve the forecasting accuracy and generalization ability. The result demonstrates that the generalization performance of the new LSSVM is slightly better or superior to those acquired using GRNN, RBF-PLS and RBFNN. In future, it would be expected that the relatively new model based on machine learning is used as an especially helpful implement to accelerate materials design of copper alloys.

Download Full-text

Predictive modeling for peri-implantitis by using machine learning techniques

Scientific Reports ◽

10.1038/s41598-021-90642-4 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Tomoaki Mameno ◽

Masahiro Wada ◽

Kazunori Nozaki ◽

Toshihito Takahashi ◽

Yoshitaka Tsujioka ◽

...

Keyword(s):

Machine Learning ◽

Demographic Data ◽

Risk Indicators ◽

Machine Learning Techniques ◽

Support Vector ◽

Machine Learning Methods ◽

Complex Interactions ◽

Learning Techniques ◽

Increased Risk ◽

Vector Machines

AbstractThe purpose of this retrospective cohort study was to create a model for predicting the onset of peri-implantitis by using machine learning methods and to clarify interactions between risk indicators. This study evaluated 254 implants, 127 with and 127 without peri-implantitis, from among 1408 implants with at least 4 years in function. Demographic data and parameters known to be risk factors for the development of peri-implantitis were analyzed with three models: logistic regression, support vector machines, and random forests (RF). As the results, RF had the highest performance in predicting the onset of peri-implantitis (AUC: 0.71, accuracy: 0.70, precision: 0.72, recall: 0.66, and f1-score: 0.69). The factor that had the most influence on prediction was implant functional time, followed by oral hygiene. In addition, PCR of more than 50% to 60%, smoking more than 3 cigarettes/day, KMW less than 2 mm, and the presence of less than two occlusal supports tended to be associated with an increased risk of peri-implantitis. Moreover, these risk indicators were not independent and had complex effects on each other. The results of this study suggest that peri-implantitis onset was predicted in 70% of cases, by RF which allows consideration of nonlinear relational data with complex interactions.

Download Full-text

Heart disease prediction using machine learning techniques : a survey

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.8.10557 ◽

2018 ◽

Vol 7 (2.8) ◽

pp. 684 ◽

Cited By ~ 12

Author(s):

V V. Ramalingam ◽

Ayantan Dandapath ◽

M Karthik Raja

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Complex Data ◽

Learning Techniques ◽

Vector Machines ◽

Supervised Learning Algorithms ◽

Life Threatening

Heart related diseases or Cardiovascular Diseases (CVDs) are the main reason for a huge number of death in the world over the last few decades and has emerged as the most life-threatening disease, not only in India but in the whole world. So, there is a need of reliable, accurate and feasible system to diagnose such diseases in time for proper treatment. Machine Learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. Many researchers, in recent times, have been using several machine learning techniques to help the health care industry and the professionals in the diagnosis of heart related diseases. This paper presents a survey of various models based on such algorithms and techniques andanalyze their performance. Models based on supervised learning algorithms such as Support Vector Machines (SVM), K-Nearest Neighbour (KNN), NaïveBayes, Decision Trees (DT), Random Forest (RF) and ensemble models are found very popular among the researchers.

Download Full-text

Machine Learning Frameworks in Cancer Detection

E3S Web of Conferences ◽

10.1051/e3sconf/202129701073 ◽

2021 ◽

Vol 297 ◽

pp. 01073

Author(s):

Sabyasachi Pramanik ◽

K. Martin Sagayam ◽

Om Prakash Jena

Keyword(s):

Machine Learning ◽

Prediction Models ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Cancer Development ◽

Support Vector ◽

Learning Approaches ◽

Learning Techniques ◽

Fact Finding ◽

Risk Of Cancer

Cancer has been described as a diverse illness with several distinct subtypes that may occur simultaneously. As a result, early detection and forecast of cancer types have graced essentially in cancer fact-finding methods since they may help to improve the clinical treatment of cancer survivors. The significance of categorizing cancer suffers into higher or lower-threat categories has prompted numerous fact-finding associates from the bioscience and genomics field to investigate the utilization of machine learning (ML) algorithms in cancer diagnosis and treatment. Because of this, these methods have been used with the goal of simulating the development and treatment of malignant diseases in humans. Furthermore, the capacity of machine learning techniques to identify important characteristics from complicated datasets demonstrates the significance of these technologies. These technologies include Bayesian networks and artificial neural networks, along with a number of other approaches. Decision Trees and Support Vector Machines which have already been extensively used in cancer research for the creation of predictive models, also lead to accurate decision making. The application of machine learning techniques may undoubtedly enhance our knowledge of cancer development; nevertheless, a sufficient degree of validation is required before these approaches can be considered for use in daily clinical practice. An overview of current machine learning approaches utilized in the simulation of cancer development is presented in this paper. All of the supervised machine learning approaches described here, along with a variety of input characteristics and data samples, are used to build the prediction models. In light of the increasing trend towards the use of machine learning methods in biomedical research, we offer the most current papers that have used these approaches to predict risk of cancer or patient outcomes in order to better understand cancer.

Download Full-text

A Review of Machine Learning Techniques for Anomaly Detection in Static Graphs

Implementing Computational Intelligence Techniques for Security Systems Design - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-2418-3.ch007 ◽

2020 ◽

pp. 146-162

Author(s):

Hesham M. Al-Ammal

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Anomaly Detection ◽

Real Life ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Methods ◽

Data Set ◽

Learning Techniques ◽

Vector Machines

Detection of anomalies in a given data set is a vital step in several applications in cybersecurity; including intrusion detection, fraud, and social network analysis. Many of these techniques detect anomalies by examining graph-based data. Analyzing graphs makes it possible to capture relationships, communities, as well as anomalies. The advantage of using graphs is that many real-life situations can be easily modeled by a graph that captures their structure and inter-dependencies. Although anomaly detection in graphs dates back to the 1990s, recent advances in research utilized machine learning methods for anomaly detection over graphs. This chapter will concentrate on static graphs (both labeled and unlabeled), and the chapter summarizes some of these recent studies in machine learning for anomaly detection in graphs. This includes methods such as support vector machines, neural networks, generative neural networks, and deep learning methods. The chapter will reflect the success and challenges of using these methods in the context of graph-based anomaly detection.

Download Full-text

A Machine Learning View on Momentum and Reversal Trading

Algorithms ◽

10.3390/a11110170 ◽

2018 ◽

Vol 11 (11) ◽

pp. 170 ◽

Cited By ~ 2

Author(s):

Zhixi Li ◽

Vincent Tam

Keyword(s):

Neural Network ◽

Machine Learning ◽

Stock Market ◽

Short Term Memory ◽

Predictive Ability ◽

Trading Strategies ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Learning Techniques

Momentum and reversal effects are important phenomena in stock markets. In academia, relevant studies have been conducted for years. Researchers have attempted to analyze these phenomena using statistical methods and to give some plausible explanations. However, those explanations are sometimes unconvincing. Furthermore, it is very difficult to transfer the findings of these studies to real-world investment trading strategies due to the lack of predictive ability. This paper represents the first attempt to adopt machine learning techniques for investigating the momentum and reversal effects occurring in any stock market. In the study, various machine learning techniques, including the Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron Neural Network (MLP), and Long Short-Term Memory Neural Network (LSTM) were explored and compared carefully. Several models built on these machine learning approaches were used to predict the momentum or reversal effect on the stock market of mainland China, thus allowing investors to build corresponding trading strategies. The experimental results demonstrated that these machine learning approaches, especially the SVM, are beneficial for capturing the relevant momentum and reversal effects, and possibly building profitable trading strategies. Moreover, we propose the corresponding trading strategies in terms of market states to acquire the best investment returns.

Download Full-text

A Systematic Review of Machine Learning Techniques in Hematopoietic Stem Cell Transplantation (HSCT)

Sensors ◽

10.3390/s20216100 ◽

2020 ◽

Vol 20 (21) ◽

pp. 6100

Author(s):

Vibhuti Gupta ◽

Thomas M. Braun ◽

Mosharaf Chowdhury ◽

Muneesh Tewari ◽

Sung Won Choi

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Cell Transplantation ◽

Hematopoietic Cell Transplantation ◽

Hematopoietic Cell ◽

Machine Learning Techniques ◽

Support Vector ◽

Hematopoietic Stem ◽

Curative Therapy ◽

Learning Techniques

Machine learning techniques are widely used nowadays in the healthcare domain for the diagnosis, prognosis, and treatment of diseases. These techniques have applications in the field of hematopoietic cell transplantation (HCT), which is a potentially curative therapy for hematological malignancies. Herein, a systematic review of the application of machine learning (ML) techniques in the HCT setting was conducted. We examined the type of data streams included, specific ML techniques used, and type of clinical outcomes measured. A systematic review of English articles using PubMed, Scopus, Web of Science, and IEEE Xplore databases was performed. Search terms included “hematopoietic cell transplantation (HCT),” “autologous HCT,” “allogeneic HCT,” “machine learning,” and “artificial intelligence.” Only full-text studies reported between January 2015 and July 2020 were included. Data were extracted by two authors using predefined data fields. Following PRISMA guidelines, a total of 242 studies were identified, of which 27 studies met the inclusion criteria. These studies were sub-categorized into three broad topics and the type of ML techniques used included ensemble learning (63%), regression (44%), Bayesian learning (30%), and support vector machine (30%). The majority of studies examined models to predict HCT outcomes (e.g., survival, relapse, graft-versus-host disease). Clinical and genetic data were the most commonly used predictors in the modeling process. Overall, this review provided a systematic review of ML techniques applied in the context of HCT. The evidence is not sufficiently robust to determine the optimal ML technique to use in the HCT setting and/or what minimal data variables are required.

Download Full-text

Twitter sentiment analysis for the estimation of voting intention in the 2017 Chilean elections

Intelligent Data Analysis ◽

10.3233/ida-194768 ◽

2020 ◽

Vol 24 (5) ◽

pp. 1141-1160

Author(s):

Tomás Alegre Sepúlveda ◽

Brian Keith Norambuena

Keyword(s):

Machine Learning ◽

Support Vector Machines ◽

Sentiment Analysis ◽

Classification Model ◽

Machine Learning Techniques ◽

Support Vector ◽

Traditional Methods ◽

Actual Result ◽

Learning Techniques ◽

Vector Machines

In this paper, we apply sentiment analysis methods in the context of the first round of the 2017 Chilean elections. The purpose of this work is to estimate the voting intention associated with each candidate in order to contrast this with the results from classical methods (e.g., polls and surveys). The data are collected from Twitter, because of its high usage in Chile and in the sentiment analysis literature. We obtained tweets associated with the three main candidates: Sebastián Piñera (SP), Alejandro Guillier (AG) and Beatriz Sánchez (BS). For each candidate, we estimated the voting intention and compared it to the traditional methods. To do this, we first acquired the data and labeled the tweets as positive or negative. Afterward, we built a model using machine learning techniques. The classification model had an accuracy of 76.45% using support vector machines, which yielded the best model for our case. Finally, we use a formula to estimate the voting intention from the number of positive and negative tweets for each candidate. For the last period, we obtained a voting intention of 35.84% for SP, compared to a range of 34–44% according to traditional polls and 36% in the actual elections. For AG we obtained an estimate of 37%, compared with a range of 15.40% to 30.00% for traditional polls and 20.27% in the elections. For BS we obtained an estimate of 27.77%, compared with the range of 8.50% to 11.00% given by traditional polls and an actual result of 22.70% in the elections. These results are promising, in some cases providing an estimate closer to reality than traditional polls. Some differences can be explained due to the fact that some candidates have been omitted, even though they held a significant number of votes.

Download Full-text

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

Computers ◽

10.3390/computers8010004 ◽

2019 ◽

Vol 8 (1) ◽

pp. 4 ◽

Cited By ~ 4

Author(s):

Jurgita Kapočiūtė-Dzikienė ◽

Robertas Damaševičius ◽

Marcin Woźniak

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Full Dataset ◽

Learning Techniques ◽

Long Short Term Memory

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial—NBM and Support Vector Machine—SVM) and deep learning (Long Short-Term Memory—LSTM and Convolutional Neural Network—CNN) approaches. The traditional machine learning techniques were used with the features based on the lexical, morphological, and character information. The deep learning approaches were applied on the top of two types of word embeddings (Vord2Vec continuous bag-of-words with negative sampling and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (reaching 0.706 of accuracy) were achieved on the full dataset with CNN applied on top of the FastText embeddings, replaced emoticons, and eliminated diacritics. The traditional machine learning approaches demonstrated the best performance (0.735 of accuracy) on the full dataset with the NBM method, replaced emoticons, restored diacritics, and lemma unigrams as features. Although traditional machine learning approaches were superior when compared to the deep learning methods; deep learning demonstrated good results when applied on the small datasets.

Download Full-text

Classification of volcanic and tectonic earthquakes in Kamchatka (Russia) with different machine learning techniques

10.5194/egusphere-egu2020-755 ◽

2020 ◽

Author(s):

Natalia Galina ◽

Nikolai Shapiro ◽

Leonard Seydoux ◽

Dmitry Droznin

Keyword(s):

Machine Learning ◽

Subduction Zone ◽

Aftershock Sequence ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Shiveluch Volcano ◽

Agglomerative Clustering ◽

Learning Techniques ◽

Single Station

<p>Kamchatka is an active subduction zone that exhibits intense seismic and volcanic activities. As a consequence, tectonic and volcanic earthquakes are often nearly simultaneously recorded at the same station. In this work, we consider seismograms recorded between December 2018 and April 2019. During this time period when the M=7.3 earthquake followed by an aftershock sequence occurred nearly simultaneously with a strong eruption of Shiveluch volcano. As a result, stations of the Kamchatka seismic monitoring network recorded up to several hundreds of earthquakes per day. In total, we detected almost 7000 events of different origin using a simple automatic detection algorithm based on signal envelope amplitudes. Then, for each detection different features have been extracted. We started from simple signal parameters (amplitude, duration, peak frequency, etc.), unsmoothed and smoothed spectra and finally used a multi-dimensional signal decomposition (scattering coefficients). For events classification both unsupervised (K-means, agglomerative clustering) and supervised (Support Vector Classification, Random Forest) classic machine learning techniques were performed on all types of extracted features. Obtained results are quite stable and do not vary significantly depending on features and method choice. As a result, the machine learning approaches allow us to clearly separate tectonic subduction-zone earthquakes and those associated with the Shiveluch volcano eruptions based on data of a single station.</p>

Download Full-text