A Survey on Distributed Fibre Optic Sensor Data Modelling Techniques and Machine Learning Algorithms for Multiphase Fluid Flow Estimation

Hasan Asy’ari Arief; Tomasz Wiktorski; Peter James Thomas

doi:10.3390/s21082801

Sensors ◽

10.3390/s21082801 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2801

Author(s):

Hasan Asy’ari Arief ◽

Tomasz Wiktorski ◽

Peter James Thomas

Keyword(s):

Machine Learning ◽

Fluid Flow ◽

Data Analysis ◽

Real Time ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Support Vector ◽

Measurement Technology ◽

Fibre Optic ◽

Multiphase Fluid

Real-time monitoring of multiphase fluid flows with distributed fibre optic sensing has the potential to play a major role in industrial flow measurement applications. One such application is the optimization of hydrocarbon production to maximize short-term income and prolong the operational lifetime of production wells and the reservoir. While the measurement technology itself is well understood and developed, a key remaining challenge is the establishment of robust data analysis tools that are capable of providing real-time conversion of enormous data quantities into actionable process indicators. This paper provides a comprehensive technical review of the data analysis techniques for distributed fibre optic technologies, with a particular focus on characterizing fluid flow in pipes. The review encompasses classical methods, such as speed of sound estimation and the Joule-Thomson coefficient, as well as their data-driven machine learning counterparts, such as Convolutional Neural Network (CNN), Support Vector Machine (SVM), and Ensemble Kalman Filter (EnKF) algorithms. The study aims to help end-users establish reliable, robust, and accurate solutions that can be deployed in a timely and effective way, and to pave the way for future developments in the field.

Classification of Children’s Sitting Postures Using Machine Learning Algorithms

Applied Sciences ◽

10.3390/app8081280 ◽

2018 ◽

Vol 8 (8) ◽

pp. 1280 ◽

Cited By ~ 14

Author(s):

Yong Kim ◽

Youngdoo Son ◽

Wonjoon Kim ◽

Byungki Jin ◽

Myung Yun

Keyword(s):

Neural Network ◽

Machine Learning ◽

Monitoring System ◽

Multinomial Logistic Regression ◽

Learning Algorithms ◽

Feedback System ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Future Research ◽

Support Vector

Sitting on a chair in an awkward posture or sitting for a long period of time is a risk factor for musculoskeletal disorders. A postural habit that has been formed cannot be changed easily. It is important to form a proper postural habit from childhood, as lumbar disease caused by improper posture during childhood is very likely to recur. Thus, there is a need for a monitoring system that classifies children’s sitting postures. The purpose of this paper is to develop a system for classifying children’s sitting postures using machine learning algorithms. The convolutional neural network (CNN) algorithm was used in addition to the conventional algorithms: Naïve Bayes classifier (NB), decision tree (DT), neural network (NN), multinomial logistic regression (MLR), and support vector machine (SVM). To collect data for classifying sitting postures, a sensing cushion was developed by mounting a pressure sensor mat (8 × 8) inside the seat cushion of a children’s chair. Ten children participated, and sensor data were collected while each child held a static posture for each of the five prescribed postures. The accuracy of CNN was found to be the highest compared with those of the other algorithms. It is expected that a comprehensive posture monitoring system will be established through future research on enhancing the classification algorithm and providing an effective feedback system.

A Proposal of Implementation of Sitting Posture Monitoring System for Wheelchair Utilizing Machine Learning Methods

Sensors ◽

10.3390/s21196349 ◽

2021 ◽

Vol 21 (19) ◽

pp. 6349

Author(s):

Jawad Ahmad ◽

Johan Sidén ◽

Henrik Andersson

Keyword(s):

Machine Learning ◽

Pressure Distribution ◽

Real Time ◽

Monitoring System ◽

Pressure Ulcers ◽

Machine Learning Algorithms ◽

Raspberry Pi ◽

Support Vector ◽

Processing Unit ◽

Posture Recognition

This paper presents a posture recognition system aimed at detecting the sitting postures of a wheelchair user. The main goals of the proposed system are to identify irregular and improper posture and to inform the user of it, in order to prevent sitting-related health issues such as pressure ulcers, with the potential that it could also be used for individuals without mobility issues. In the proposed monitoring system, an array of 16 screen-printed pressure sensor units was employed to obtain pressure data, which are sampled and processed in real time using read-out electronics. Posture recognition was performed for four sitting positions (right-, left-, forward-, and backward-leaning) based on k-nearest neighbors (k-NN), support vector machines (SVM), random forest (RF), decision tree (DT), and LightGBM machine learning algorithms. As a result, a posture classification accuracy of up to 99.03 percent can be achieved. Experimental studies illustrate that the system can provide real-time pressure distribution values in the form of a pressure map on a standard PC and also on a Raspberry Pi system equipped with a touchscreen monitor. The stored pressure distribution data can later be shared with healthcare professionals so that abnormalities in sitting patterns can be identified by employing a post-processing unit. The proposed system could be used for risk assessments related to pressure ulcers. It may also serve as a benchmark by recording and identifying individuals’ sitting patterns, and it could be realized as a lightweight portable health monitoring device.

An Overview of Supervised Machine Learning Methods and Data Analysis for COVID-19 Detection

Journal of Healthcare Engineering ◽

10.1155/2021/4733167 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Aurelle Tchagna Kouanou ◽

Thomas Mih Attia ◽

Cyrille Feudjio ◽

Anges Fleurio Djeumo ◽

Adèle Ngo Mouelas ◽

...

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

High Rate ◽

Supervised Machine Learning ◽

Polymerase Chain Reaction Test ◽

Support Vector ◽

Machine Learning Algorithm ◽

Test Results

Background and Objective. To mitigate the spread of the virus responsible for COVID-19, known as SARS-CoV-2, there is an urgent need for massive population testing. Due to the constant shortage of PCR (polymerase chain reaction) test reagents, PCR being the test for COVID-19 par excellence, several medical centers have opted for immunological tests to look for the presence of antibodies produced against this virus. However, these tests have a high rate of false positives (positive but actually negative test results) and false negatives (negative but actually positive test results) and are therefore not always reliable. In this paper, we propose a solution based on Data Analysis and Machine Learning to detect COVID-19 infections. Methods. Our analysis and machine learning algorithm is based on the two most cited clinical datasets from the literature: one from San Raffaele Hospital, Milan, Italy, and the other from Hospital Israelita Albert Einstein, São Paulo, Brazil. The datasets were processed to select the features that most influence the target, and it turned out that almost all of them are blood parameters. EDA (Exploratory Data Analysis) methods were applied to the datasets, and a comparative study of supervised machine learning models was done, after which the support vector machine (SVM) was selected as the one with the best performance. Results. SVM, being the best performer, is used as our proposed supervised machine learning algorithm. An accuracy of 99.29%, sensitivity of 92.79%, and specificity of 100% were obtained with the dataset from Kaggle (https://www.kaggle.com/einsteindata4u/covid19) after applying optimization to SVM. The same procedure and work were performed with the dataset taken from San Raffaele Hospital (https://zenodo.org/record/3886927#.YIluB5AzbMV). Once more, the SVM presented the best performance among the machine learning algorithms, and 92.86%, 93.55%, and 90.91% for accuracy, sensitivity, and specificity, respectively, were obtained. Conclusion. The obtained results, when compared with others from the literature based on these same datasets, are superior, leading us to conclude that our proposed solution is reliable for COVID-19 diagnosis.
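
The accuracy, sensitivity, and specificity figures quoted in the abstract above are all derived from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic only; the function name `binary_metrics` and the toy labels are illustrative assumptions and are not taken from the paper or its datasets.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for 0/1 labels.

    Hypothetical helper for illustration; 1 = infected, 0 = not infected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Sensitivity: share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of actual negatives that were correctly cleared.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Made-up labels, purely to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 0, 1, 1]
print(binary_metrics(y_true, y_pred))
```

A specificity of 100%, as reported for the Kaggle dataset, corresponds to zero false positives in this calculation.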

Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System

Complexity ◽

10.1155/2020/6688912 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

Xiongwei Zhang ◽

Hager Saleh ◽

Eman M. G. Younis ◽

Radhya Sahal ◽

Abdelmgeid A. Ali

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Sentiment Analysis ◽

Real Time ◽

Machine Learning Algorithms ◽

Streaming Data ◽

Support Vector ◽

Real Time System ◽

Online Prediction ◽

Analysis Prediction

Twitter is a virtual social network where people share their posts and opinions about the current situation, such as the coronavirus pandemic. It is considered the most significant streaming data source for machine learning research in terms of analysis, prediction, knowledge extraction, and opinions. Sentiment analysis is a text analysis method that has gained further significance due to social networks’ emergence. Therefore, this paper introduces a real-time system for sentiment prediction on Twitter streaming data for tweets about the coronavirus pandemic. The proposed system aims to find the optimal machine learning model that obtains the best performance for coronavirus sentiment analysis prediction and then uses it in real time. The proposed system has been developed in two components: an offline sentiment analysis component and an online prediction pipeline. For the offline component of the system, the historical tweets dataset was collected between 23/01/2020 and 01/06/2020 and filtered by the #COVID-19 and #Coronavirus hashtags. Two feature extraction methods for textual data analysis, n-gram and TF-IDF, were used to extract the essential features of the dataset. Then, five regular machine learning algorithms were trained and compared (decision tree, logistic regression, k-nearest neighbors, random forest, and support vector machine) to select the best model for the online prediction component. The online prediction pipeline was developed using Twitter Streaming API, Apache Kafka, and Apache Spark. The experimental results indicate that the random forest (RF) model using the unigram feature extraction method achieved the best performance, and it is used for sentiment prediction on Twitter streaming data for coronavirus.
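
The two feature extraction methods named in the abstract above, n-grams and TF-IDF weighting, can be illustrated with a short sketch. It assumes whitespace tokenization and the textbook weighting tf × log(N / df); the function names and the toy tweets are hypothetical, and library implementations commonly use smoothed variants of the IDF term, so the paper's exact numbers would likely differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Return one {term: weight} dict per document.

    Uses the unsmoothed textbook form: tf = count / terms in document,
    idf = log(number of documents / documents containing the term).
    """
    term_lists = [ngrams(doc.lower().split(), n) for doc in docs]

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty docs
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Made-up example texts, not drawn from the paper's dataset.
tweets = ["stay home stay safe", "vaccines are safe", "stay positive"]
unigram_vectors = tfidf(tweets, n=1)
bigram_vectors = tfidf(tweets, n=2)
print(unigram_vectors[0])
```

With `n=1` the features are unigrams, the setting the abstract reports as best for the random forest model; a term that occurs in every document receives weight zero under this formula.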

Comparison of common machine learning algorithms trained with multi-zone models for identifying the location and strength of indoor pollutant sources

Indoor and Built Environment ◽

10.1177/1420326x20931576 ◽

2020 ◽

pp. 1420326X2093157

Author(s):

Yu Huang ◽

Zhi Gao ◽

Hongguang Zhang

Keyword(s):

Machine Learning ◽

Meteorological Parameters ◽

Human Life ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Identification Accuracy ◽

Sensor Data ◽

Support Vector ◽

Accurate Identification ◽

Pollutant Sources

The accurate identification of the characteristics of pollutant sources can effectively prevent the loss of human life and property damage caused by the sudden release of harmful chemicals in emergency situations. Four machine learning algorithms, namely artificial neural network (ANN), support vector machine (SVM), k-nearest neighbour (KNN) and naive Bayesian (NB) classification, can be used to identify the location of pollutant sources with limited sensor data inputs. In this study, the identification accuracy of these four machine learning algorithms was investigated and compared, considering different sensor layouts, eigenvector inputs, meteorological parameters and numbers of samples. The results show that the collection of pollutant concentrations over an extended period of time could improve identification accuracy. Additional sensors were required to reach the same identification accuracy after the introduction of distributed meteorological parameters. Increasing the number of training samples by a factor of five improved the identification accuracy of KNN by 22% and that of SVM by 1.7%; however, the accuracy of ANN and NB classification remained basically unchanged. When identifying the release mass of the pollutant source, multiple linear, ANN and SVM regression models were adopted. Results show that ANN performs best, whereas SVM performs worst.

Invisible experience to real-time assessment in elite tennis athlete training: Sport-specific movement classification based on wearable MEMS sensor data

Proceedings of the Institution of Mechanical Engineers Part P Journal of Sports Engineering and Technology ◽

10.1177/17543371211050312 ◽

2021 ◽

pp. 175433712110503

Author(s):

Mingyue Wu ◽

Ran Wang ◽

Yang Hu ◽

Mengjiao Fan ◽

Yufan Wang ◽

...

Keyword(s):

Machine Learning ◽

Real Time ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Test Accuracy ◽

Z Score ◽

Mems Sensor ◽

Score Normalization

This study examined the reliability of a tennis stroke classification and assessment platform consisting of a single low-cost MEMS sensor in a wrist-worn wearable device, a smartphone, and a computer. The collected data were transmitted via Bluetooth and analyzed by machine learning algorithms. Twelve right-handed male elite tennis athletes participated in the study, and each athlete performed 150 strokes. The results from three machine learning algorithms regarding their recognition and classification of the real-time data stream were compared. Stroke recognition and classification went through pre-processing, segmentation, feature extraction, and classification with Support Vector Machine (SVM), including SVM without normalization, SVM with Min–Max normalization, and SVM with Z-score normalization, as well as K-nearest neighbor (K-NN) and Naive Bayes (NB) machine learning algorithms. During the data training process, 10-fold cross-validation was used to avoid overfitting, and suitable parameters were found for the SVM classifiers. The best classifier was achieved when C = 1 using the RBF kernel function. The algorithms’ classification of unique stroke types yielded highly reliable clusters within each stroke type, with the highest test accuracy of 99% achieved by SVM with Min–Max normalization and 98.4% achieved by SVM with Z-score normalization.
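
The two normalization schemes compared in the abstract above rescale each feature before it reaches the classifier, which matters for a distance-based kernel such as RBF because features with large numeric ranges would otherwise dominate. The sketch below shows both transforms on a single feature column; the function names and sample values are illustrative assumptions, and it uses the population standard deviation, which may differ from the convention the authors used.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    # A constant feature carries no information; map it to 0.0.
    return [(v - lo) / span if span else 0.0 for v in values]


def z_score(values):
    """Centre values on zero mean and scale to unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


# Made-up sensor readings for one feature (e.g. a peak acceleration).
feature = [2.0, 4.0, 6.0, 8.0]
scaled = min_max(feature)
standardized = z_score(feature)
print(scaled)
print(standardized)
```

When normalization is combined with cross-validation, the minimum, maximum, mean, and standard deviation are normally estimated on the training folds only and then applied unchanged to the held-out fold, so that no information from the test data leaks into training.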

Application of Bayesian Learning Mechanism in Power System Transient Stability Assessment

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.108-111.765 ◽

2010 ◽

Vol 108-111 ◽

pp. 765-770

Author(s):

Lin Niu ◽

Jian Guo Zhao ◽

Ke Jun Li ◽

Zhen Yu Zhou

Keyword(s):

Machine Learning ◽

Power System ◽

Real Time ◽

Bayesian Learning ◽

Transient Stability ◽

Decision Function ◽

Machine Learning Algorithms ◽

Support Vector ◽

Svm Classifier ◽

Stability Assessment

One of the most challenging problems in the real-time operation of power systems is the prediction of transient stability. Fast and accurate techniques are imperative to achieve on-line transient stability assessment (TSA). This problem has been approached with various machine learning algorithms; however, they produce a class decision estimate rather than a probabilistic confidence of the class distribution. To counter this shortcoming of common machine learning methods, a novel machine learning technique, the ‘relevance vector machine’ (RVM), is presented for TSA in this paper. RVM is based on a probabilistic Bayesian learning framework, and as a feature it can yield a decision function that depends on only a very small number of so-called relevance vectors. The proposed method is tested on the New England power system and compared with a state-of-the-art ‘support vector machine’ (SVM) classifier. The classification performance is evaluated using false discriminate rate (FDR). It is demonstrated that the RVM classifier can yield a decision function that is much sparser than that of the SVM classifier while providing higher classification accuracy. Consequently, the RVM classifier greatly reduces the computational complexity, making it more suitable for real-time implementation.

Sustainable Irrigation System for Farming Supported by Machine Learning and Real-Time Sensor Data

Sensors ◽

10.3390/s21093079 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3079

Author(s):

André Glória ◽

João Cardoso ◽

Pedro Sebastião

Keyword(s):

Machine Learning ◽

Random Forest ◽

Real Time ◽

New Technologies ◽

Irrigation System ◽

Low Cost ◽

Time Of Day ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Sensors And Actuators

Presently, saving natural resources is increasingly a concern, and water scarcity is occurring in more and more areas of the globe. One of the main strategies used to counter this trend is the use of new technologies. In this area, the Internet of Things has been highlighted, with its solutions being characterized by robustness and simplicity while being low cost. This paper presents the study and development of an automatic irrigation control system for agricultural fields. The developed solution comprises a wireless sensor and actuator network and a mobile application that lets the user consult not only the data collected in real time but also their history, and act in accordance with the data it analyses. To adapt the water management, Machine Learning algorithms were studied to predict the best time of day for water administration. Of the studied algorithms (Decision Trees, Random Forest, Neural Networks, and Support Vector Machines), the one that obtained the best results was Random Forest, with an accuracy of 84.6%. Besides the ML solution, a method was also developed to calculate the amount of water needed to manage the fields under analysis. The implementation of the system showed that the developed solution is effective and can achieve up to 60% water savings.

Machine Learning Based IoT Geriatric Fall Intelligent System in Pandemic (Preprint)

10.2196/preprints.34538 ◽

2021 ◽

Author(s):

Gowri R ◽

Rathipriya R

Keyword(s):

Machine Learning ◽

Blood Pressure ◽

Heart Rate ◽

Real Time ◽

Old Age ◽

Sensor Data ◽

Stochastic Gradient Descent ◽

Support Vector ◽

The Real ◽

Early Fall

In the current pandemic, there is a lack of medical caretakers and physicians in hospitals and health centers. Patients other than those infected with COVID are also affected by this scenario. Besides, hospitals are not admitting elderly people, who are also scared to approach hospitals even for their basic health checkups. However, they have to be cared for and monitored to avoid risk factors such as fall incidence, which may cause fatal injury. This paper therefore focuses on a cloud-based IoT gadget for early fall incidence prediction. It is a machine learning based fall incidence prediction system for elderly patients. Approaches such as Logistic Regression, Naive Bayes, Stochastic Gradient Descent, Decision Tree, Random Forest, Support Vector Machines, K-Nearest Neighbor, and an ensemble learning boosting technique, i.e., XGBoost, are used for fall incidence prediction. The proposed approach is first tested on benchmark activity sensor data with different features for training purposes. Real-time vital signs such as heart rate and blood pressure are recorded and stored in the cloud, and the machine learning approaches are applied to them. The approach is then tested on real-time sensor data, such as heart rate and blood pressure data of geriatric patients, to predict early falls.

Machine Learning based Improved Gaussian Mixture Model for IoT Real-Time Data Analysis

Ingeniería solidaria ◽

10.16925/2357-6014.2020.01.02 ◽

2020 ◽

Vol 16 (1) ◽

Author(s):

Sivadi Sivadi ◽

Moorthy Moorthy ◽

Vijender Solanki

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Real Time ◽

Gaussian Mixture Model ◽

Gaussian Mixture ◽

Sensor Data ◽

Cloud Platform ◽

Time Data ◽

Huge Amount ◽

Real Time Data

Introduction: The article is the product of the research “Due to the increase in popularity of Internet of Things (IoT), a huge amount of sensor data is being generated from various smart city applications”, developed at Pondicherry University in the year 2019. Problem: Acquiring and analyzing the huge amount of sensor-generated data effectively is a significant problem when processing the data. Objective: To propose a novel framework for IoT sensor data analysis using a machine learning based improved Gaussian Mixture Model (GMM) on acquired real-time data. Methodology: In this paper, clustering-based GMM models are used to find the density patterns on a daily or weekly basis according to user requirements. The ThingSpeak cloud platform is used for performing analysis and visualizations. Results: An analysis has been performed on the proposed mechanism implemented on real-time traffic data with Accuracy, Precision, Recall, and F-Score as measures. Conclusions: The results indicate that the proposed mechanism is efficient when compared with the state-of-the-art schemes. Originality: Applying GMM and the ThingSpeak cloud platform to perform analysis on IoT real-time data is the first approach to find traffic density patterns on busy roads. Restrictions: There is a need to develop an application for mobile users to find the optimal traffic routes based on density patterns. The authors could not concentrate on the security aspect of finding density patterns.
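
To make the clustering step in the abstract above more concrete, the following sketch fits a standard two-component, one-dimensional Gaussian mixture with the expectation-maximization (EM) algorithm. It illustrates the plain GMM only, not the improved variant the authors propose, whose details the abstract does not give; the function names, the initialization, and the made-up traffic counts are all assumptions.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by plain EM.

    Returns (weights, means, variances). Initialization is a simple
    assumption: means at the data extremes, shared overall variance.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)
    weights = [0.5, 0.5]
    means = [min(data), max(data)]
    variances = [overall_var, overall_var]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])

        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid collapse
    return weights, means, variances


# Made-up vehicle counts per interval: a quiet and a busy regime.
counts = [10, 12, 11, 9, 13, 48, 52, 50, 49, 51]
weights, means, variances = fit_gmm_1d(counts)
print(weights, means, variances)
```

On this toy data the two fitted components settle on the quiet and busy regimes, which is the sense in which a mixture model can expose density patterns; real traffic data would typically need more components and multivariate features.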
