Bipartite Network of Interest (BNOI): Extending Co-Word Network with Interest of Researchers Using Sensor Data and Corresponding Applications as an Example

Zongming Dai; Kai Hu; Jie Xie; Shengyu Shen; Jie Zheng; Huayi Wu; Ya Guo

doi:10.3390/s21051668

Sensors ◽

10.3390/s21051668 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1668

Author(s):

Zongming Dai ◽

Kai Hu ◽

Jie Xie ◽

Shengyu Shen ◽

Jie Zheng ◽

...

Keyword(s):

Feature Fusion ◽

Extraction Methods ◽

Knowledge Network ◽

Sensor Data ◽

Support Vector ◽

Bipartite Network ◽

Classification Models ◽

Text Data ◽

Domain Experts ◽

Problems And Solutions

Traditional co-word networks do not distinguish keywords of interest to researchers from general keywords. Co-word networks are therefore often too general to provide knowledge of interest to domain experts. Inspired by recent work that uses an automatic method to identify questions of interest to researchers, such as “problems” and “solutions”, we try to answer a similar question, “what sensors can be used for what kind of applications”, which is of great interest in sensor-related fields. By generalizing the specific questions as “questions of interest”, we built a knowledge network that takes researcher interest into account, called the bipartite network of interest (BNOI). Unlike co-word approaches, which use exact keywords from a list, BNOI uses classification models to find possible entities of interest. A total of nine feature extraction methods, including N-grams, Word2Vec, and BERT, were used to extract features to train the classification models, including naïve Bayes (NB), support vector machines (SVM) and logistic regression (LR). In addition, a multi-feature fusion strategy and a voting principle (VP) method are applied to combine the capabilities of the features and the classification models. Using the abstract text data of 350 remote sensing articles, features are extracted and the models trained. The experimental results show that, after removing the biased words and using ten-fold cross-validation, the F-measures for “sensors” and “applications” are 93.2% and 85.5%, respectively. It is thus demonstrated that researcher questions of interest can be better answered by the BNOI constructed from the classification results than by the traditional co-word network approach.
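As a rough illustration of the bipartite idea in this abstract, the sketch below links terms labelled as sensors to terms labelled as applications whenever both appear in the same abstract. The function name `build_bnoi`, the input layout, and the example terms are hypothetical, and the labels are assumed to come from an upstream classifier; this is not the authors' implementation.

```python
from collections import Counter
from itertools import product


def build_bnoi(documents):
    """Count sensor-application co-occurrences across documents.

    `documents` is an iterable of (sensors, applications) pairs, one pair per
    abstract, holding the terms a classifier labelled for that abstract.
    Returns a Counter mapping (sensor, application) edges to edge weights.
    """
    edges = Counter()
    for sensors, applications in documents:
        # Deduplicate within a document so each abstract adds at most 1 per edge.
        for edge in product(sorted(set(sensors)), sorted(set(applications))):
            edges[edge] += 1
    return edges


# Made-up labels for illustration only (not taken from the paper's data).
docs = [
    (["MODIS", "Landsat"], ["land cover mapping"]),
    (["MODIS"], ["land cover mapping", "fire detection"]),
]
network = build_bnoi(docs)
```

Edges only ever connect a sensor term to an application term, which is what makes the network bipartite; the edge weight here is a plain document count, chosen for simplicity.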

Physical Activity Classification with Wearable Sensor Data in an eCoach Recommendation System (Preprint)

10.2196/preprints.35938 ◽

2021 ◽

Author(s):

Ayan Chatterjee

Keyword(s):

Physical Activity ◽

Transfer Learning ◽

Sedentary Lifestyle ◽

The Other ◽

Sensor Data ◽

Support Vector ◽

Accuracy Score ◽

Classification Models ◽

Linear Kernel ◽

Single Feature

Leading a sedentary lifestyle may cause numerous health problems. Therefore, sedentary lifestyle changes should be given priority to avoid severe damage. Research in eHealth can provide methods to enrich personal healthcare with Information and Communication Technologies (ICTs). An eCoach system may allow people to manage a healthy lifestyle with health state monitoring and personalized recommendations. Using machine learning (ML) techniques, this study investigated the possibility of classifying daily physical activity for adults into the following classes: sedentary, low active, active, highly active, and rigorous active. The daily total step count, total daily minutes of sedentary time, low physical activity (LPA), medium physical activity (MPA), and vigorous physical activity (VPA) served as input for the classification models. We first used publicly available Fitbit data to build the classification models. Second, using the transfer learning approach, we re-used the top five best-performing models on a real dataset collected from the MOX2-5 wearable medical-grade activity sensor. For the public Fitbit datasets, we found that the ensemble ExtraTreesClassifier with an estimator value of 150 outperformed other classifiers with a mean accuracy score of 99.72% for a single feature, and the support vector classifier (SVC) with a “linear” kernel outperformed other classifiers with a mean accuracy score of 99.14% for five features. To demonstrate the practical usefulness of the classifiers, we conceptualized how the classifier model can be used in an eCoach prototype system to attain personalized activity goals (e.g., stay active for the entire week). After transfer learning, K-Nearest-Neighbor (KNN) outperformed the other four classifiers for a single feature, and SVC with a “linear” kernel outperformed the other four classifiers for multiple features.

Prediction of Relative Physical Activity Intensity Using Multimodal Sensing of Physiological Data

Sensors ◽

10.3390/s19204509 ◽

2019 ◽

Vol 19 (20) ◽

pp. 4509 ◽

Cited By ~ 2

Author(s):

Alok Kumar Chowdhury ◽

Dian Tjondronegoro ◽

Vinod Chandran ◽

Jinglan Zhang ◽

Stewart G. Trost

Keyword(s):

Physical Activity ◽

Machine Learning ◽

Perceived Exertion ◽

Feature Fusion ◽

Electrodermal Activity ◽

Decision Fusion ◽

Sensor Data ◽

Physiological Data ◽

Support Vector ◽

Improve Performance

This study examined the feasibility of a non-laboratory approach that uses machine learning on multimodal sensor data to predict relative physical activity (PA) intensity. A total of 22 participants completed up to 7 PA sessions, where each session comprised 5 trials (sitting and standing, comfortable walk, brisk walk, jogging, running). Participants wore a wrist-strapped sensor that recorded heart rate (HR), electrodermal activity (EDA) and skin temperature (Temp). After each trial, participants provided ratings of perceived exertion (RPE). Three classifiers, random forest (RF), neural network (NN) and support vector machine (SVM), were applied independently to each feature set to predict relative PA intensity as low (RPE ≤ 11), moderate (RPE 12–14), or high (RPE ≥ 15). Then, both feature fusion and decision fusion of all combinations of sensor modalities were carried out to identify the best combination. Among the single-modality feature sets, HR provided the best performance. The combination of modalities using feature fusion provided a small improvement in performance. Decision fusion did not improve performance over HR features alone. A machine learning approach using features from HR provided acceptable predictions of relative PA intensity. Adding features from other sensing modalities did not significantly improve performance.
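The three intensity classes in this abstract are defined by RPE thresholds, which can be written down directly. The sketch below also shows one simple form of decision fusion, a majority vote over per-modality predictions. The abstract does not say which fusion rule was used, so the vote (and its tie-breaking rule) is an assumption, as are the function names; ratings are assumed to be integers.

```python
from collections import Counter


def rpe_to_intensity(rpe):
    """Map an integer rating of perceived exertion to low / moderate / high."""
    if rpe <= 11:
        return "low"
    if rpe <= 14:
        return "moderate"
    return "high"


def majority_vote(predictions):
    """Assumed decision-fusion rule: most frequent label wins.

    Ties go to the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical per-modality predictions (HR, EDA, Temp) for one trial.
fused = majority_vote(["high", "moderate", "high"])
```

Feature fusion, by contrast, would concatenate the HR, EDA and Temp features into one vector before a single classifier is trained, so there is nothing to vote on afterwards.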

A Transfer Learning Method for Meteorological Visibility Estimation Based on Feature Fusion Method

Applied Sciences ◽

10.3390/app11030997 ◽

2021 ◽

Vol 11 (3) ◽

pp. 997

Author(s):

Jiaping Li ◽

Wai Lun Lo ◽

Hong Fu ◽

Henry Shu Hung Chung

Keyword(s):

Feature Extraction ◽

Transfer Learning ◽

Feature Fusion ◽

Extraction Methods ◽

Features Extraction ◽

Image Feature ◽

Estimation Accuracy ◽

Support Vector ◽

Learning Method ◽

Visibility Estimation

Meteorological visibility is an important meteorological observation indicator for measuring weather transparency, which is important for transport safety. Estimating visibility accurately from image characteristics is a challenging problem. This paper proposes a transfer learning method for meteorological visibility estimation based on image feature fusion. Unlike existing methods, the proposed method estimates visibility from data processing and feature extraction in selected subregions of the whole image, and therefore has a lower computational load and higher efficiency. All the database images were first gray-averaged for the selection of effective subregions and feature extraction. Effective subregions are extracted for static landmark objects, which can provide useful information for visibility estimation. Four different feature extraction methods (DenseNet, ResNet50, VGG16, and VGG19) were used for the feature extraction of the subregions. The features extracted by the neural network were then fed into the proposed support vector regression (SVR) model, which derives the estimated visibilities of the subregions. Finally, based on the weight fusion of the visibility estimates from the subregion models, an overall visibility was estimated for the whole image. Experimental results show that the visibility estimation accuracy is more than 90%. The method can estimate the visibility of an image with high robustness and effectiveness.
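The final fusion step can be pictured as a weighted combination of the per-subregion estimates. The abstract does not state how the weights are obtained, so the sketch below simply assumes a normalized weighted average with caller-supplied weights; `fuse_visibility` is a hypothetical name and the numbers are invented.

```python
def fuse_visibility(estimates, weights):
    """Weighted average of per-subregion visibility estimates.

    Assumption: weights are non-negative and supplied by the caller;
    they are normalized here so they need not sum to 1.
    """
    if not estimates or len(estimates) != len(weights):
        raise ValueError("need one weight per estimate and at least one estimate")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(estimates, weights)) / total


# Invented values: two subregion estimates (same unit), the first trusted more.
overall = fuse_visibility([10.0, 20.0], [3.0, 1.0])
```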

Using Sensor Data to Detect Lameness and Mastitis Treatment Events in Dairy Cows: A Comparison of Classification Models

Sensors ◽

10.3390/s20143863 ◽

2020 ◽

Vol 20 (14) ◽

pp. 3863 ◽

Cited By ~ 1

Author(s):

Christian Post ◽

Christian Rietz ◽

Wolfgang Büscher ◽

Ute Müller

Keyword(s):

Machine Learning ◽

Random Forest ◽

Dairy Cows ◽

Sensitivity And Specificity ◽

Training Data ◽

Sensor Data ◽

Support Vector ◽

Classification Models ◽

Sampled Data ◽

Learning Methods

The aim of this study was to develop classification models for mastitis and lameness treatments in Holstein dairy cows as the target variables, based on continuous data from herd management software, using modern machine learning methods. Data were collected over a period of 40 months from a total of 167 different cows, with daily individual sensor information containing milking parameters, pedometer activity, feed and water intake, and body weight (in the form of differently aggregated data), as well as the entered treatment data. To identify the most important predictors for mastitis and lameness treatments, respectively, Random Forest feature importance, Pearson’s correlation and sequential forward feature selection were applied. With the selected predictors, various machine learning models such as Logistic Regression (LR), Support Vector Machine (SVM), K-nearest neighbors (KNN), Gaussian Naïve Bayes (GNB), Extra Trees Classifier (ET) and different ensemble methods such as Random Forest (RF) were trained. Their performance was compared using the receiver operating characteristic (ROC) area under the curve (AUC), as well as sensitivity, block sensitivity and specificity. In addition, sampling methods were compared: over- and undersampling as compensation for the expected unbalanced training data had a high impact on the ratio of sensitivity and specificity in the classification of the test data, but with regard to AUC, random oversampling and SMOTE (Synthetic Minority Over-sampling Technique) even showed significantly lower values than non-sampled data. The best model, ET, obtained a mean AUC of 0.79 for mastitis and 0.71 for lameness, based on testing data from practical conditions, and is recommended by us for this type of data, although GNB, LR and RF were only marginally worse. We recommend the use of these models as a benchmark for similar self-learning classification tasks. The classification models presented here retain their interpretability, with the ability to present feature importances to the farmer, in contrast to the “black box” models of Deep Learning methods.
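Sensitivity and specificity, two of the metrics named in this abstract, follow directly from the counts of true and false positives and negatives. The sketch below is a plain-Python illustration with binary labels (1 for a treatment event, 0 otherwise); the function name and the example labels are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = treatment event."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # events caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # events missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct non-events
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    # Each ratio is undefined (NaN) when its class is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Invented labels for five cow-days.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

With rare treatment events, a model can reach high specificity while missing most events, which is why the two figures are reported together rather than as a single accuracy.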

Fusion of smartphone sensor data for classification of daily user activities

Multimedia Tools and Applications ◽

10.1007/s11042-021-11105-6 ◽

2021 ◽

Author(s):

Gökhan Şengül ◽

Erol Ozcelik ◽

Sanjay Misra ◽

Robertas Damaševičius ◽

Rytis Maskeliūnas

Keyword(s):

Nearest Neighbor ◽

Feature Fusion ◽

Classification Performance ◽

Sensor Data ◽

Stochastic Gradient Descent ◽

Optimal Decision ◽

Support Vector ◽

K Nearest Neighbor ◽

Gradient Descent Algorithm ◽

User Activities

New mobile applications need to estimate user activities by using sensor data provided by smart wearable devices and deliver context-aware solutions to users living in smart environments. We propose a novel hybrid data fusion method to estimate three types of daily user activities (being in a meeting, walking, and driving a motorized vehicle) using the accelerometer and gyroscope data acquired from a smart watch using a mobile phone. The approach is based on the matrix time series method for feature fusion, and the modified Better-than-the-Best Fusion (BB-Fus) method with a stochastic gradient descent algorithm for construction of optimal decision trees for classification. For the estimation of user activities, we adopted a statistical pattern recognition approach and used the k-Nearest Neighbor (kNN) and Support Vector Machine (SVM) classifiers. We acquired and used our own dataset of 354 min of data from 20 subjects for this study. We report a classification performance of 98.32% for SVM and 97.42% for kNN.

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Ablation Analysis to Select Wearable Sensors for Classifying Standing, Walking, and Running

Sensors ◽

10.3390/s21010194 ◽

2020 ◽

Vol 21 (1) ◽

pp. 194

Author(s):

Sarah Gonzalez ◽

Paul Stegall ◽

Harvey Edwards ◽

Leia Stirling ◽

Ho Chit Siu

Keyword(s):

Activity Recognition ◽

Principal Components ◽

Classification Accuracy ◽

Wearable Sensors ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Techniques ◽

Measurement Units ◽

The Difference

The field of human activity recognition (HAR) often utilizes wearable sensors and machine learning techniques in order to identify the actions of the subject. This paper considers the activity recognition of walking and running using a support vector machine (SVM) that was trained on principal components derived from wearable sensor data. An ablation analysis is performed in order to select the subset of sensors that yields the highest classification accuracy. The paper also compares principal components across trials to assess the similarity of the trials. Five subjects were instructed to perform standing, walking, running, and sprinting on a self-paced treadmill, and the data were recorded using surface electromyography sensors (sEMGs), inertial measurement units (IMUs), and force plates. When all of the sensors were included, the SVM had over 90% classification accuracy using only the first three principal components of the data with the classes of stand, walk, and run/sprint (combined run and sprint class). It was found that sensors placed only on the lower leg produced higher accuracies than sensors placed on the upper leg. There was a small decrease in accuracy when the force plates were ablated, but the difference may not be operationally relevant. Using only accelerometers without sEMGs was shown to decrease the accuracy of the SVM.
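An ablation analysis of this kind can be sketched as a loop that removes one sensor at a time and records how much the accuracy drops. The leave-one-out scheme is an assumption (the abstract does not say whether sensors were ablated singly or in groups), `ablation_ranking` is a hypothetical name, and `toy_score` is a stand-in with invented numbers where a real study would retrain and evaluate the classifier.

```python
def ablation_ranking(sensors, score):
    """Score the full sensor set, then each set with one sensor left out.

    `score` is a caller-supplied function mapping a tuple of sensor names to
    a classification accuracy. Returns (baseline, drops), where `drops` maps
    each sensor to the accuracy lost when that sensor is removed.
    """
    sensors = tuple(sensors)
    baseline = score(sensors)
    drops = {}
    for removed in sensors:
        remaining = tuple(s for s in sensors if s != removed)
        drops[removed] = baseline - score(remaining)
    return baseline, drops


def toy_score(subset):
    # Invented additive contributions, for demonstration only.
    contribution = {"IMU": 0.30, "sEMG": 0.15, "force_plate": 0.05}
    return 0.5 + sum(contribution[s] for s in subset)


baseline, drops = ablation_ranking(["IMU", "sEMG", "force_plate"], toy_score)
most_important = max(drops, key=drops.get)
```

A sensor whose removal barely changes the score is a candidate for dropping from the wearable setup, which is the practical point of the analysis.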

Interpretable deep learning for the remote characterisation of ambulation in multiple sclerosis using smartphones

Scientific Reports ◽

10.1038/s41598-021-92776-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Andrew P. Creagh ◽

Florian Lipsmeier ◽

Michael Lindemann ◽

Maarten De Vos

Keyword(s):

Multiple Sclerosis ◽

Deep Learning ◽

Inertial Sensor ◽

Heterogeneous Data ◽

Fine Tuning ◽

Sensor Data ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Healthcare Applications ◽

Feature Based

The emergence of digital technologies such as smartphones in healthcare applications has demonstrated the possibility of developing rich, continuous, and objective measures of multiple sclerosis (MS) disability that can be administered remotely and out-of-clinic. Deep Convolutional Neural Networks (DCNN) may capture a richer representation of healthy and MS-related ambulatory characteristics from the raw smartphone-based inertial sensor data than standard feature-based methodologies. To overcome the typical limitations associated with remotely generated health data, such as low subject numbers, sparsity, and heterogeneous data, a transfer learning (TL) model from similar large open-source datasets was proposed. Our TL framework leveraged the ambulatory information learned on human activity recognition (HAR) tasks collected from wearable smartphone sensor data. It was demonstrated that fine-tuning TL DCNN HAR models towards MS disease recognition tasks outperformed previous Support Vector Machine (SVM) feature-based methods, as well as DCNN models trained end-to-end, by upwards of 8–15%. A lack of transparency of “black-box” deep networks remains one of the largest stumbling blocks to the wider acceptance of deep learning for clinical applications. Ensuing work therefore aimed to visualise DCNN decisions attributed by relevance heatmaps using Layer-Wise Relevance Propagation (LRP). Through the LRP framework, the patterns captured from smartphone-based inertial sensor data that were reflective of those who are healthy versus people with MS (PwMS) could begin to be established and understood. Interpretations suggested that cadence-based measures, gait speed, and ambulation-related signal perturbations were distinct characteristics that distinguished MS disability from healthy participants. Robust and interpretable outcomes, generated from high-frequency out-of-clinic assessments, could greatly augment the current in-clinic assessment picture for PwMS, inform better disease management techniques, and enable the development of better therapeutic interventions.

An Adversarial Generative Network for Crop Classification from Remote Sensing Timeseries Images

Remote Sensing ◽

10.3390/rs13010065 ◽

2020 ◽

Vol 13 (1) ◽

pp. 65

Author(s):

Jingtao Li ◽

Yonglin Shen ◽

Chao Yang

Keyword(s):

Remote Sensing ◽

Neural Networks ◽

Short Term Memory ◽

Complete Classification ◽

Support Vector ◽

Classification Models ◽

Training Samples ◽

Agricultural Applications ◽

Crop Classification ◽

Increasing Demand

Due to the increasing demand for the monitoring of crop conditions and food production, it is a challenging and meaningful task to identify crops from remote sensing images. The state-of-the-art crop classification models are mostly built on supervised classification models such as support vector machines (SVM), convolutional neural networks (CNN), and long short-term memory neural networks (LSTM). Meanwhile, as an unsupervised generative model, the generative adversarial network (GAN) is rarely used to complete classification tasks for agricultural applications. In this work, we propose a new method that combines GAN, CNN, and LSTM models to classify crops of corn and soybeans from remote sensing time-series images, in which the GAN’s discriminator was used as the final classifier. The method is feasible when the training set is small, and it takes full advantage of the spectral, spatial, and phenology features of crops from satellite data. The classification experiments were conducted on crops of corn, soybeans, and others. To verify the effectiveness of the proposed method, comparisons with SVM, SegNet, CNN, LSTM, and different combinations of these models were also conducted. The results show that our method achieved the best classification results, with a Kappa coefficient of 0.7933 and an overall accuracy of 0.86. Experiments in other study areas also demonstrate the extensibility of the proposed method.
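The two figures reported in this abstract, overall accuracy and the Kappa coefficient, can both be derived from a confusion matrix. The sketch below assumes the Kappa coefficient is Cohen's kappa, as is usual in remote sensing accuracy assessment; the function name and the example matrix are hypothetical and unrelated to the paper's data.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    `confusion[i][j]` is the number of samples of true class i predicted as j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    if expected == 1:
        return observed, float("nan")  # kappa is undefined in this case
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Invented two-class matrix: rows are true classes, columns are predictions.
accuracy, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
```

Kappa discounts the agreement a classifier would reach by chance given the class proportions, so it is typically lower than the overall accuracy on the same matrix.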

A Machine Learning Decision Support System (DSS) for Neuroendocrine Tumor Patients Treated with Somatostatin Analog (SSA) Therapy

Diagnostics ◽

10.3390/diagnostics11050804 ◽

2021 ◽

Vol 11 (5) ◽

pp. 804

Author(s):

Jasminka Hasic Telalovic ◽

Serena Pillozzi ◽

Rachele Fabbri ◽

Alice Laffi ◽

Daniele Lavacchi ◽

...

Keyword(s):

Machine Learning ◽

Single Agent ◽

Predictive Biomarkers ◽

Progression Free Survival ◽

Primary Site ◽

Somatostatin Analog ◽

Support Vector ◽

Classification Models ◽

First Line Therapy ◽

Bayes Algorithm

The application of machine learning (ML) techniques could facilitate the identification of predictive biomarkers of somatostatin analog (SSA) efficacy in patients with neuroendocrine tumors (NETs). We collected data from 74 patients with a pancreatic or gastrointestinal NET who received SSA as first-line therapy. We developed three classification models to predict whether the patient would experience a progressive disease (PD) after 12 or 18 months based on clinic-pathological factors at the baseline. The dataset included 70 samples and 15 features. We initially developed three classification models with accuracy ranging from 55% to 70%. We then compared ten different ML algorithms. In all but one case, the performance of the Multinomial Naïve Bayes algorithm (80%) was the highest. The support vector machine classifier (SVC) had a higher performance for the recall metric of the progression-free outcome (97% vs. 94%). Overall, for the first time, we documented that the factors that mainly influenced progression-free survival (PFS) included age, the number of metastatic sites and the primary site. In addition, the following factors were also isolated as important: adverse events G3–G4, sex, Ki67, metastatic site (liver), functioning NET, the primary site and the stage. In patients with advanced NETs, ML provides a predictive model that could potentially be used to differentiate prognostic groups and to identify patients for whom SSA therapy as a single agent may not be sufficient to achieve a long-lasting PFS.
