Adaptive Diagnosis of Lung Cancer by Deep Learning Classification Using Wilcoxon Gain and Generator

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
O. Obulesu ◽  
Suresh Kallam ◽  
Gaurav Dhiman ◽  
Rizwan Patan ◽  
Ramana Kadiyala ◽  
...  

Cancer is a complicated worldwide health issue whose death rate has risen in recent years. With the rapid growth of high-throughput technology and the many machine learning methods that have emerged in recent years, progress has been made in cancer diagnosis based on subset features, enabling efficient and precise disease diagnosis. Machine learning techniques that can reliably differentiate lung cancer patients from healthy persons are therefore of great interest. This paper proposes a novel Wilcoxon Signed-Rank Gain preprocessing step combined with generative deep learning, called the Wilcoxon Signed Generative Deep Learning (WS-GDL) method, for lung cancer diagnosis. First, significance testing and information gain eliminate redundant and irrelevant attributes and extract informative and significant attributes. Then, using a generator function, the generative deep learning method learns deep features. Finally, a minimax game (i.e., minimizing error while maximizing accuracy) is proposed to diagnose the disease. Numerical experiments on the Thoracic Surgery Data Set are used to test the WS-GDL method's diagnostic performance. The WS-GDL approach can derive relevant and significant attributes and adaptively diagnose the disease by selecting optimal learning-model parameters. Quantitative experimental results show that the WS-GDL method achieves better diagnostic performance and higher computational efficiency in terms of computation time, computational complexity, and false-positive rate than state-of-the-art approaches.
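The two-stage attribute screening described in this abstract (a Wilcoxon significance filter followed by information-gain ranking) can be sketched as below. This is an illustrative reading, not the authors' code: the independent-sample rank-sum test stands in for the signed-rank variant, labels are assumed binary 0/1, and all names and thresholds are made up for the example.

```python
import numpy as np
from scipy.stats import ranksums
from sklearn.feature_selection import mutual_info_classif

def screen_features(X, y, alpha=0.05, top_k=10, seed=0):
    """Keep features whose values differ significantly between the two
    classes (Wilcoxon rank-sum test), then rank the survivors by
    information gain (estimated via mutual information)."""
    X, y = np.asarray(X, float), np.asarray(y)
    # 1) Significance filter: per-feature two-sample test across classes.
    pvals = np.array([ranksums(X[y == 0, j], X[y == 1, j]).pvalue
                      for j in range(X.shape[1])])
    keep = np.where(pvals < alpha)[0]
    # 2) Information-gain ranking on the surviving features.
    gains = mutual_info_classif(X[:, keep], y, random_state=seed)
    order = keep[np.argsort(gains)[::-1]]
    return order[:top_k]
```

The generative stage and minimax objective would sit downstream of this filter; only the preprocessing is sketched here.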

2020 ◽  
Vol 21 (15) ◽  
pp. 5280
Author(s):  
Irini Furxhi ◽  
Finbarr Murphy

Non-testing approaches to nanoparticle hazard assessment are needed to identify and classify potential risks in a cost-effective and timely manner. Machine learning techniques have been applied in the field of nanotoxicology with encouraging results. This study presents a neurotoxicity classification model for diverse nanoparticles. A data set compiled from multiple literature sources, consisting of nanoparticle physicochemical properties, exposure conditions, and in vitro characteristics, is used to predict cell viability. Pre-processing techniques were applied, including normalization and two supervised instance methods: a synthetic minority over-sampling technique to address biased predictions, and production of subsamples via bootstrapping. The classification model was developed using random forest, and goodness-of-fit together with additional robustness and predictability metrics was used to evaluate performance. Information gain analysis identified exposure dose and duration, toxicological assay, cell type, and zeta potential as the five most important attributes for predicting neurotoxicity in vitro. This is the first tissue-specific machine learning tool for predicting nanoparticle-induced neurotoxicity in in vitro systems, and the model performs better than non-tissue-specific models.
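The class-balancing-plus-random-forest pipeline described above can be sketched as follows. This is a hedged approximation: simple random oversampling via bootstrap resampling stands in for SMOTE (which synthesizes new minority points rather than copying them), and the function and parameter names are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.utils import resample

def balanced_bootstrap_forest(X, y, seed=0):
    """Oversample the minority class to parity with bootstrap resampling
    (a stand-in for SMOTE), then fit a random forest classifier."""
    X, y = np.asarray(X, float), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    Xm, ym = X[y == minority], y[y == minority]
    # Draw bootstrap copies of the minority class up to the majority count.
    Xm_up, ym_up = resample(Xm, ym, replace=True,
                            n_samples=int(counts.max()), random_state=seed)
    Xb = np.vstack([X[y != minority], Xm_up])
    yb = np.concatenate([y[y != minority], ym_up])
    return RandomForestClassifier(n_estimators=100, random_state=seed).fit(Xb, yb)
```

A real SMOTE implementation (e.g., from imbalanced-learn) would interpolate between minority neighbors instead of duplicating rows.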


2019 ◽  
Author(s):  
Max Wang ◽  
Wenbo Ge ◽  
Deborah Apthorp ◽  
Hanna Suominen

BACKGROUND Parkinson disease (PD) is a common neurodegenerative disorder that affects between 7 and 10 million people worldwide. No objective test for PD currently exists, and studies suggest misdiagnosis rates of up to 34%. Machine learning (ML) presents an opportunity to improve diagnosis; however, the size and nature of data sets make it difficult to generalize the performance of ML models to real-world applications. OBJECTIVE This study aims to consolidate prior work and introduce new techniques in feature engineering and ML for diagnosis based on vowel phonation. Additional features and ML techniques were introduced, showing major performance improvements on the large mPower vocal phonation data set. METHODS We used 1600 randomly selected /aa/ phonation samples from the entire data set to derive rules for filtering out faulty samples from the data set. The application of these rules, along with a joint age-gender balancing filter, results in a data set of 511 PD patients and 511 controls. We calculated features on a 1.5-second window of audio, beginning at the 1-second mark, for a support vector machine. This was evaluated with 10-fold cross-validation (CV), with stratification for balancing the number of patients and controls for each CV fold. RESULTS We showed that the features used in prior literature do not perform well when extrapolated to the much larger mPower data set. Owing to the natural variation in speech, the separation of patients and controls is not as simple as previously believed. We presented significant performance improvements using additional novel features (with 88.6% certainty, derived from a Bayesian correlated t test) in separating patients and controls, with accuracy exceeding 58%. CONCLUSIONS The results are promising, showing the potential for ML in detecting symptoms imperceptible to a neurologist.
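The evaluation protocol in this abstract (a support vector machine scored with stratified 10-fold cross-validation on per-window features) can be sketched as below. The feature extraction from the 1.5-second audio window is assumed to have happened already; the kernel choice and scaling step are illustrative, not the paper's exact configuration.

```python
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def evaluate_phonation_features(X, y, seed=0):
    """Mean accuracy of an RBF-kernel SVM under stratified 10-fold CV,
    where each row of X holds features computed on one phonation window
    and y marks patient (1) vs control (0)."""
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    # Stratification keeps the patient/control ratio equal in every fold.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=cv).mean()
```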


2019 ◽  
Author(s):  
Lu Liu ◽  
Ahmed Elazab ◽  
Baiying Lei ◽  
Tianfu Wang

BACKGROUND Echocardiography plays a pivotal role in the diagnosis and management of cardiovascular diseases because it is real-time, cost-effective, and non-invasive. The development of artificial intelligence (AI) techniques has led to more intelligent and automatic computer-aided diagnosis (CAD) systems in echocardiography over the past few years. Automatic CAD mainly includes classification, detection of anatomical structures, tissue segmentation, and disease diagnosis, tasks completed by machine learning techniques and, more recently, deep learning techniques. OBJECTIVE This review aims to provide a guide for researchers and clinicians on relevant aspects of AI, machine learning, and deep learning. In addition, we review recent applications of these methods in echocardiography and identify how echocardiography could incorporate AI in the future. METHODS This paper first gives an overview of machine learning and deep learning. Second, it reviews the current use of AI in echocardiography by searching the literature in the main databases for the past 10 years, and finally it discusses potential limitations and challenges for the future. RESULTS AI has shown promising improvements in the analysis and interpretation of echocardiography in the fields of standard-view detection, automated analysis of chamber size and function, and assessment of cardiovascular diseases. CONCLUSIONS Compared with machine learning, deep learning methods have achieved state-of-the-art performance across different applications in echocardiography. Although there are challenges, such as the need for large datasets, AI can provide satisfactory results by devising appropriate strategies. We believe AI has the potential to improve diagnostic accuracy, reduce time consumption, and decrease the workload of cardiologists.


2021 ◽  
pp. 1-20
Author(s):  
Kashif Ayyub ◽  
Saqib Iqbal ◽  
Muhammad Wasif Nisar ◽  
Saima Gulzar Ahmad ◽  
Ehsan Ullah Munir

Sentiment analysis is the field that analyzes the sentiments and opinions of people about entities such as products, businesses, and events. Because opinions influence people's behavior, it has numerous real-life applications in areas such as marketing, politics, and social media. Stance detection is a sub-field of sentiment analysis. Stance classification aims to automatically identify from a source text whether the source is in favor of, neutral toward, or opposed to the target. This study proposes a framework to explore the performance of conventional (NB, DT, SVM), ensemble learning (RF, AdaBoost), and deep learning-based (DBN, CNN-LSTM, and RNN) machine learning techniques. The proposed method is feature-centric and extracts sentiment, content, tweet-specific, and part-of-speech features from both the SemEval-2016 and SemEval-2017 datasets. The study also explores the role of deep features such as GloVe and Word2Vec for stance classification, which had not previously received attention in stance detection. Baseline features such as bag-of-words, n-grams, and TF-IDF are also extracted from both datasets to compare against the proposed and deep features. The proposed features are ranked using feature-ranking methods: information gain, gain ratio, and relief-f. The results are evaluated using standard performance measures for stance classification and compared with existing studies. The results show that the proposed feature sets (sentiment, part-of-speech, content, and tweet-specific) are helpful for stance classification when applied with SVM, and that GloVe, a deep feature, gives the best results when applied with the deep learning method RNN.
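One of the baselines named above, word n-gram TF-IDF features fed to an SVM, can be sketched in a few lines. This is a generic baseline illustration, not the study's implementation; the label strings and n-gram range are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def tfidf_stance_baseline():
    """TF-IDF (unigram + bigram) features piped into a linear SVM,
    matching the bag-of-words/n-gram/TF-IDF baselines described above.
    Labels such as 'favor'/'against' follow the SemEval convention."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), sublinear_tf=True),
        LinearSVC())
```

The proposed sentiment, content, tweet-specific, and part-of-speech features would replace or augment the TF-IDF step in this pipeline.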


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Demeke Endalie ◽  
Getamesay Haile

For decades, machine learning techniques have been used to process Amharic texts. The potential of deep learning for Amharic document classification has not been exploited due to a lack of language resources. In this paper, we present a deep learning model for Amharic news document classification. The proposed model uses fastText to generate text vectors that represent the semantic meaning of texts, overcoming the sparsity limitations of traditional representations. The text-vector matrix is then fed into the embedding layer of a convolutional neural network (CNN), which automatically extracts features. We conduct experiments on a data set with six news categories, and our approach produces a classification accuracy of 93.79%. We compared our method to well-known machine learning algorithms such as support vector machine (SVM), multilayer perceptron (MLP), decision tree (DT), XGBoost (XGB), and random forest (RF) and achieved good results.
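The classic-ML comparison described above can be sketched as a shared-feature benchmark. This is an illustrative harness only: the paper's CNN uses fastText embeddings and a deep learning framework, whereas this sketch runs the named scikit-learn baselines (SVM, MLP, DT, RF; XGBoost omitted to stay within scikit-learn) over TF-IDF features.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

def compare_baselines(texts, labels):
    """Cross-validated accuracy of several classic classifiers over a
    shared TF-IDF representation of the documents."""
    models = {
        "SVM": LinearSVC(),
        "DT": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(n_estimators=50, random_state=0),
        "MLP": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                             random_state=0),
    }
    scores = {}
    for name, clf in models.items():
        pipe = make_pipeline(TfidfVectorizer(), clf)
        scores[name] = cross_val_score(pipe, texts, labels, cv=3).mean()
    return scores
```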


2021 ◽  
Vol 15 ◽  
Author(s):  
Diu K. Luu ◽  
Anh T. Nguyen ◽  
Ming Jiang ◽  
Jian Xu ◽  
Markus W. Drealan ◽  
...  

Previous literature shows that deep learning is an effective tool for decoding motor intent from neural signals obtained from different parts of the nervous system. However, deep neural networks are often computationally complex and not feasible to run in real time. Here we investigate the advantages and disadvantages of different approaches to improving the efficiency of the deep learning-based motor decoding paradigm and informing its future real-time implementation. Our data are recorded from an amputee's residual peripheral nerves. While the primary analysis is offline, the nerve data are cut using a sliding window to create a “pseudo-online” dataset that resembles the conditions of a real-time paradigm. First, a comprehensive collection of feature extraction techniques is applied to reduce the dimensionality of the input data, which substantially lowers the motor decoder's complexity and makes it feasible to translate to a real-time paradigm. Next, we investigate two strategies for deploying deep learning models: a one-step (1S) approach when large input data are available and a two-step (2S) approach when input data are limited. This research predicts five individual finger movements and four combinations of the fingers. The 1S approach, using a recurrent neural network (RNN) to concurrently predict all fingers' trajectories, generally gives better prediction results than all the machine learning algorithms performing the same task. This result reaffirms that deep learning is more advantageous than classic machine learning methods for handling large datasets. However, when training on a smaller input data set in the 2S approach, which includes a classification stage to identify active fingers before predicting their trajectories, machine learning techniques offer a simpler implementation while ensuring decoding outcomes comparable to those of deep learning.
In the classification step, both machine learning and deep learning models achieve an accuracy and F1 score of 0.99. Thanks to the classification step, in the regression step both types of models achieve mean squared error (MSE) and variance accounted for (VAF) scores comparable to those of the 1S approach. Our study outlines these trade-offs to inform the future implementation of real-time, low-latency, high-accuracy deep learning-based motor decoders for clinical applications.
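The two-step (2S) scheme above — classify whether a finger is active, then regress its trajectory only when the classifier fires — can be sketched with generic classic-ML components. This is a structural illustration, not the study's models: the SVC/Ridge pairing and all names are assumptions, and the real system predicts multiple fingers rather than one output.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.svm import SVC

class TwoStepDecoder:
    """Step 1: classify active vs inactive from nerve features.
    Step 2: regress the trajectory, trained only on active samples."""
    def __init__(self):
        self.clf = SVC()
        self.reg = Ridge()

    def fit(self, X, active, trajectory):
        self.clf.fit(X, active)
        mask = active == 1
        self.reg.fit(X[mask], trajectory[mask])
        return self

    def predict(self, X):
        active = self.clf.predict(X)
        out = np.zeros(len(X))  # inactive fingers stay at rest (zero)
        if active.any():
            out[active == 1] = self.reg.predict(X[active == 1])
        return out
```

The gating step is what lets the downstream regressor stay small: it never has to model the resting state.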


2021 ◽  
Author(s):  
Shuaizhou Hu ◽  
Xinyao Zhang ◽  
Hao-yu Liao ◽  
Xiao Liang ◽  
Minghui Zheng ◽  
...  

Remanufacturing sites often receive products with different brands, models, conditions, and quality levels. Proper sorting and classification of the waste stream is a primary step in efficiently recovering and handling used products. Correct classification is particularly crucial in future electronic waste (e-waste) management sites equipped with Artificial Intelligence (AI) and robotic technologies. Robots should be equipped with proper algorithms to recognize and classify products with different features and prepare them for assembly and disassembly tasks. In this study, two categories of Machine Learning (ML) and Deep Learning (DL) techniques are used to classify consumer electronics. The ML models include Naïve Bayes with Bernoulli, Gaussian, and Multinomial distributions, and Support Vector Machine (SVM) algorithms with four kernels: Linear, Radial Basis Function (RBF), Polynomial, and Sigmoid. The DL models include VGG-16, GoogLeNet, Inception-v3, Inception-v4, and ResNet-50. The above-mentioned models are used to classify three laptop brands: Apple, HP, and ThinkPad. First, the Edge Histogram Descriptor (EHD) and Scale-Invariant Feature Transform (SIFT) are used to extract features as inputs to the ML models for classification. The DL models use laptop images directly, without feature-extraction pre-processing. The trained models are slightly overfitted due to the limited dataset and the complexity of the model parameters. Despite the slight overfitting, the models can identify each brand. The findings show that the DL models outperform the ML models. Among the DL models, GoogLeNet has the highest performance in identifying the laptop brands.
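The four-kernel SVM comparison described above can be sketched as below. This is an illustrative harness: it accepts any numeric feature matrix (EHD/SIFT descriptors in the study; scikit-learn's digits dataset serves as a stand-in in the usage test), and the scaling step is an assumption rather than the paper's pipeline.

```python
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def compare_svm_kernels(X, y):
    """Cross-validated accuracy of an SVM under each of the four kernels
    named above (linear, RBF, polynomial, sigmoid) on shared features."""
    scores = {}
    for kernel in ("linear", "rbf", "poly", "sigmoid"):
        pipe = make_pipeline(StandardScaler(), SVC(kernel=kernel))
        scores[kernel] = cross_val_score(pipe, X, y, cv=3).mean()
    return scores
```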


2021 ◽  
Vol 5 (2) ◽  
pp. 66-74
Author(s):  
Hezha M.Tareq Abdulhadi ◽  
Hardi Sabah Talabani

The thoracic surgery dataset contains information gathered on patients who underwent surgery for lung cancer. Various machine learning techniques have been employed to predict the post-operative life expectancy of lung cancer patients. In this study, we used well-known and influential supervised machine learning algorithms: J48, Naïve Bayes (NB), Multilayer Perceptron (MLP), and Random Forest (RF). Two ranker feature selections, information gain and gain ratio, were then applied to the thoracic surgery dataset to examine their effect on the machine learning classifiers. The dataset was collected at Wroclaw University and is available in the UCI repository. We conducted two experiments to show the performance of the supervised classifiers on the dataset with and without the ranker feature selections. The results obtained with the ranker feature selections showed that the accuracy of J48, NB, and MLP improved, whereas RF accuracy decreased and the support vector machine remained stable.
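The two ranker criteria used above, information gain and gain ratio, differ only in that gain ratio normalizes the gain by the split's intrinsic information. A minimal sketch for a single discrete feature (function names are illustrative):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy in bits of a discrete label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def info_gain_and_ratio(feature, labels):
    """Information gain H(Y) - H(Y|F) of one discrete feature, plus the
    gain ratio, which divides the gain by the feature's own entropy."""
    h = entropy(labels)
    values, counts = np.unique(feature, return_counts=True)
    weights = counts / counts.sum()
    cond = sum(w * entropy(labels[feature == v])
               for v, w in zip(values, weights))
    gain = h - cond
    split_info = entropy(feature)  # intrinsic information of the split
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio
```

Ranking features by each criterion and retraining the classifiers on the top-ranked subset reproduces the shape of the experiment described above.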


2021 ◽  
Author(s):  
Hany Gamal ◽  
Ahmed Abdelaal ◽  
Salaheldin Elkatatny

Precise control of the equivalent circulating density (ECD) helps avoid well-control issues such as loss of circulation, formation fracturing, underground blowout, and surface blowout. Predicting ECD from drilling parameters is a new horizon in drilling engineering practice, motivated by the cost of downhole ECD tools and the low accuracy of mathematical models. Machine learning methods can offer better prediction accuracy than traditional and statistical models thanks to advanced computing capacity. Hence, the objective of this paper is to use artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) techniques to develop ECD prediction models. The novel contribution of this study is predicting downhole ECD without any downhole measurements, using only the available surface drilling parameters. The data cover a horizontal drilling section with 3,570 readings for each input after preprocessing, comprising mud rate, rate of penetration, drill string speed, standpipe pressure, weight on bit, and drilling torque. The data were used to build the models with a 77:23 training-to-testing ratio. Another data set (1,150 data points) from the same field was used for model validation. Several sensitivity analyses were performed to optimize the ANN and ANFIS model parameters. The developed machine learning models provided high performance and accuracy, with a correlation coefficient (R) of 0.99 for the models' training and testing data sets and an average absolute percentage error (AAPE) of less than 0.24%. The validation results showed R of 0.98 and 0.96 and AAPE of 0.30% and 0.69% for the ANN and ANFIS models, respectively. In addition, a mathematical correlation was developed for estimating ECD from the inputs as a white-box model.
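The ANN setup above — six surface drilling parameters in, ECD out, with a 77:23 train/test split — can be sketched with a generic multilayer perceptron. The layer sizes and solver are illustrative stand-ins, not the paper's tuned configuration, and a scikit-learn MLP replaces whatever framework the authors used.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_ecd_model(X, y, seed=0):
    """Fit an ANN regressor on surface drilling parameters (rows of X:
    mud rate, ROP, drill string speed, standpipe pressure, WOB, torque)
    with the 77:23 split described above; return the model and held-out R^2."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.23, random_state=seed)
    model = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=2000,
                     random_state=seed))
    model.fit(X_tr, y_tr)
    return model, model.score(X_te, y_te)  # R^2 on the held-out 23%
```

The reported metrics (R and AAPE) would be computed from the held-out predictions; R^2 is used here as the readily available scikit-learn score.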

