Gene Mutation Classification through Text Evidence Facilitating Cancer Tumour Detection

Journal of Healthcare Engineering, DOI 10.1155/2021/8689873, 2021, Vol 2021, pp. 1-16

Author(s): Meenu Gupta, Hao Wu, Simrann Arora, Akash Gupta, Gopal Chaudhary, ...

Keyword(s): Deep Learning, Language Processing, Clinical Evidence, Sparse Matrix, Confusion Matrix, Genetic Mutation, Cancer Center, Genetic Mutations, Accuracy Score, Machine Learning Classification

A cancer tumour consists of thousands of genetic mutations. Despite advances in technology, the task of distinguishing the genetic mutations that drive tumour growth (drivers) from neutral genetic mutations (passengers) is still done manually. This is a time-consuming process in which pathologists interpret every genetic mutation from the clinical evidence by hand. The clinical evidence falls into a total of nine classes, but the classification criterion is still unknown. The main aim of this research is to propose a multiclass classifier that classifies genetic mutations based on clinical evidence (i.e., the text description of these genetic mutations) using Natural Language Processing (NLP) techniques. The dataset for this research is taken from Kaggle and is provided by the Memorial Sloan Kettering Cancer Center (MSKCC); world-class researchers and oncologists contributed to it. Three text transformation models, namely, CountVectorizer, TfidfVectorizer, and Word2Vec, are utilized to convert the text to a matrix of token counts. Three machine learning classification models, namely, Logistic Regression (LR), Random Forest (RF), and XGBoost (XGB), along with the Recurrent Neural Network (RNN) model of deep learning, are applied to the sparse matrix (keyword count representation) of the text descriptions. The accuracy score of each proposed classifier is evaluated using the confusion matrix. The empirical results show that the RNN model performed better than the other proposed classifiers, with the highest accuracy of 70%.
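The "matrix of token counts" mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function `count_matrix`, the whitespace tokenizer, and the two example sentences are assumptions made for illustration, and the result is a dense list of lists rather than a sparse matrix.

```python
from collections import Counter


def count_matrix(docs):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    # Assumed tokenizer: lowercase, then split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    # The vocabulary is every distinct token, in sorted order.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[tok] for tok in vocabulary])
    return vocabulary, rows


# Made-up text descriptions, not taken from the MSKCC dataset.
docs = [
    "BRCA1 mutation truncates protein",
    "neutral mutation mutation observed",
]
vocabulary, rows = count_matrix(docs)
for doc, row in zip(docs, rows):
    print(row, "<-", doc)
```

Each row is one document and each column one vocabulary term; most entries are zero for realistic vocabularies, which is why such matrices are normally stored in sparse form.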

Deep Learning-Based Sentiment Analysis of COVID-19 Vaccination Responses from Twitter Data

Computational and Mathematical Methods in Medicine, DOI 10.1155/2021/4321131, 2021, Vol 2021, pp. 1-15

Author(s): Kazi Nabiul Alam, Md Shakib Khan, Abdur Rab Dhruba, Mohammad Monirujjaman Khan, Jehad F. Al-Amri, ...

Keyword(s): Deep Learning, Language Processing, Performance Metrics, Short Term Memory, Confusion Matrix, Short Term, Learning Techniques, The World, Long Short Term Memory, Severe Anxiety

The COVID-19 pandemic has had a devastating effect on many people, creating severe anxiety, fear, and complicated feelings or emotions. After the initiation of vaccinations against coronavirus, people’s feelings have become more diverse and complex. Our aim is to understand and unravel their sentiments in this research using deep learning techniques. Social media is currently the best way to express feelings and emotions, and with the help of Twitter, one can have a better idea of what is trending and going on in people’s minds. Our motivation for this research was to understand the diverse sentiments of people regarding the vaccination process. In this research, the timeline of the collected tweets was from December 21 to July 21. The tweets contained information about the most common vaccines available recently from across the world. The sentiments of people regarding vaccines of all sorts were assessed using the natural language processing (NLP) tool, Valence Aware Dictionary for sEntiment Reasoner (VADER). Initializing the polarities of the obtained sentiments into three groups (positive, negative, and neutral) helped us visualize the overall scenario; our findings included 33.96% positive, 17.55% negative, and 48.49% neutral responses. In addition, we included our analysis of the timeline of the tweets in this research, as sentiments fluctuated over time. A recurrent neural network- (RNN-) oriented architecture, including long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM), was used to assess the performance of the predictive models, with LSTM achieving an accuracy of 90.59% and Bi-LSTM achieving 90.83%. Other performance metrics such as precision, F1-score, and a confusion matrix were also used to validate our models and findings more effectively. This study improves understanding of the public’s opinion on COVID-19 vaccines and supports the aim of eradicating coronavirus from the world.
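Grouping sentiment scores into positive, negative, and neutral can be sketched as follows. The abstract does not state the thresholds used; the cut-off of ±0.05 on a compound score in [-1, 1] is the convention commonly used with VADER and is an assumption here, as are the function names and the made-up scores. The sketch does not call the VADER library itself.

```python
def polarity_label(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to one of three polarity groups."""
    # Assumed thresholds: >= +0.05 positive, <= -0.05 negative, else neutral.
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def polarity_shares(scores):
    """Return the percentage of scores falling in each polarity group."""
    labels = [polarity_label(s) for s in scores]
    return {
        name: 100.0 * labels.count(name) / len(labels)
        for name in ("positive", "negative", "neutral")
    }


# Made-up compound scores standing in for per-tweet sentiment output.
scores = [0.62, -0.41, 0.0, 0.03, 0.27, -0.02, 0.0, -0.7]
shares = polarity_shares(scores)
print(shares)
```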

A Tweet Sentiment Classification Approach Using a Hybrid Stacked Ensemble Technique

Information, DOI 10.3390/info12090374, 2021, Vol 12 (9), pp. 374

Author(s): Babacar Gaye, Dezheng Zhang, Aziguli Wulamu

Keyword(s): Machine Learning, Logistic Regression, Deep Learning, Sentiment Analysis, Language Processing, Short Term Memory, State Of The Art, Accuracy Score, Learning Models, Proposed Model

With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of people’s views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) networks as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We also compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest, and with state-of-the-art deep learning models. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 score. Empirical results showed that our proposed approach achieved state-of-the-art results, with an accuracy score of 99%.
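The stacking idea (base classifiers whose outputs feed a logistic-regression meta classifier) can be sketched in a few lines. This is only an illustration under stated assumptions: the three LSTM base classifiers are replaced by hard-coded, made-up probabilities, and the meta classifier is a plain gradient-descent logistic regression written here for the example, not the authors' implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_meta_classifier(base_probs, labels, lr=0.5, epochs=2000):
    """Fit logistic regression on base-classifier outputs by gradient descent."""
    n_features = len(base_probs[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(base_probs, labels):
            # Prediction error of the meta classifier on this sample.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the meta classifier's probability is at least 0.5."""
    return int(sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5)


# Each row holds the positive-class probabilities that three hypothetical
# base classifiers assign to one tweet (made-up values, not model output).
base_probs = [
    [0.9, 0.8, 0.7], [0.8, 0.9, 0.6], [0.7, 0.6, 0.9],
    [0.2, 0.1, 0.3], [0.1, 0.3, 0.2], [0.3, 0.2, 0.1],
]
labels = [1, 1, 1, 0, 0, 0]

weights, bias = fit_meta_classifier(base_probs, labels)
predictions = [predict(weights, bias, x) for x in base_probs]
print(predictions)
```

In practice the meta classifier is usually fitted on base-model predictions for held-out data, so that it does not simply learn the base models' overfitting; the sketch skips that step for brevity.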

Simple Extensible Deep Learning Model for Automatic Arabic Diacritization

ACM Transactions on Asian and Low-Resource Language Information Processing, DOI 10.1145/3480938, 2022, Vol 21 (2), pp. 1-16

Author(s): Hamza Abbad, Shengwu Xiong

Keyword(s): Deep Learning, Language Processing, Error Rate, Short Term Memory, Confusion Matrix, Learning Model, Model Errors, Sequence Elements, Arabic Natural Language Processing, Deep Learning Model

Automatic diacritization is an Arabic natural language processing topic based on the sequence labeling task where the labels are the diacritics and the letters are the sequence elements. A letter can have from zero up to two diacritics. The dataset used was a subset of the preprocessed version of the Tashkeela corpus. We developed a deep learning model composed of a stack of four bidirectional long short-term memory hidden layers of the same size and an output layer at every level. The levels correspond to the groups that we classified the diacritics into (short vowels, double case-endings, Shadda, and Sukoon). Before training, the data were divided into input vectors containing letter indexes and output vectors containing the indexes of diacritics according to their groups. Both input and output vectors are concatenated, then a sliding window operation with overlapping is performed to generate continuous and fixed-size data. Such data is used for both training and evaluation. Finally, we run tests using the standard metrics with all of their variations and compare our results with two recent state-of-the-art works. Our model achieved a 3% diacritization error rate and an 8.99% word error rate when including all letters. We have also generated the confusion matrix to show the performance per output and analyzed the mismatches of the first 500 lines to classify the model errors according to their linguistic nature.
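The sliding window operation with overlapping can be sketched as below. The window size and stride are not given in the abstract, so the values used here, the function name `sliding_windows`, and the integer stand-ins for letter indexes are all assumptions for illustration.

```python
def sliding_windows(sequence, size, stride):
    """Cut a sequence into fixed-size windows; windows overlap when stride < size."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    # The last start index is the last position where a full window still fits.
    for start in range(0, max(len(sequence) - size, 0) + 1, stride):
        windows.append(sequence[start:start + size])
    return windows


# Integers stand in for letter indexes of one concatenated text.
letters = list(range(10))
windows = sliding_windows(letters, size=4, stride=2)
for window in windows:
    print(window)
```

With a stride of half the window size, every element except those at the ends appears in two windows. Trailing elements that do not fill a complete window are dropped in this sketch; padding them would be an alternative.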

Sarcasm detection of tweets without #sarcasm: data science approach

Indonesian Journal of Electrical Engineering and Computer Science, DOI 10.11591/ijeecs.v23.i2.pp993-1001, 2021, Vol 23 (2), pp. 993

Author(s): Rupali Amit Bagate, R. Suguna

Keyword(s): Machine Learning, Language Processing, Data Science, Short Term Memory, Confusion Matrix, Research Work, Gradient Boosting, Specific Context, Machine Learning Classification, Light Gradient

Identifying sarcasm in text can be challenging. In sarcasm, a negative word can flip the polarity of a positive sentence. Sentences can be classified as sarcastic or non-sarcastic. It is easier to identify sarcasm from facial expression or tone of voice than to detect it from plain text. Thus, sarcasm detection using natural language processing is a major challenge when no specific context or clue, such as #sarcasm, is present in a tweet. This research therefore tries to solve this classification problem using various optimized models. The proposed model analyzes whether a given tweet is sarcastic or not without the presence of the sarcasm hashtag or any other specific context in the text. To achieve better results, we used different machine learning classification methodologies along with deep learning embedding techniques. Our optimized model uses a stacking technique that combines the results of logistic regression and a long short-term memory (LSTM) recurrent neural network and feeds them to a light gradient boosting technique, which generates better results compared to existing machine learning and neural network algorithms. The key difference of our research work is that sarcasm detection is done without #sarcasm, which has not been much explored by earlier researchers. The metrics used for evaluation are F1-score and confusion matrix.

Development of LSTM&CNN Based Hybrid Deep Learning Model to Classify Motor Imagery Tasks

DOI 10.1101/2020.09.20.305300, 2020, Cited By ~ 1

Author(s): Caglar Uyulan

Keyword(s): Deep Learning, Motor Imagery, Language Processing, Roc Curve, Short Term Memory, Confusion Matrix, Learning Model, Classification Model, Performance Criteria, Deep Learning Model

Recent studies underline the contribution of brain-computer interface (BCI) applications to enhancing the quality of life of physically impaired subjects. In this context, to design an effective stroke rehabilitation or assistance system, the classification of motor imagery (MI) tasks is performed through deep learning (DL) algorithms. Although the utilization of DL in the BCI field remains relatively premature as compared to the fields related to natural language processing, object detection, etc., DL has proven its effectiveness in carrying out this task. In this paper, a hybrid method, which fuses the one-dimensional convolutional neural network (1D CNN) with the long short-term memory (LSTM), was used to classify four different MI tasks, i.e. left hand, right hand, tongue, and feet movements. The time representation of MI tasks is extracted through the hybrid deep learning model training after a principal component analysis (PCA)-based artefact removal process. The performance criteria given in the BCI Competition IV dataset A are estimated. 10-fold cross-validation (CV) results show that the proposed method outperforms state-of-the-art methods in classifying electroencephalogram (EEG)-electrooculogram (EOG) combined motor imagery tasks and is robust against data variations. The CNN-LSTM classification model reached 95.62 % (±1.2290742) accuracy and a 0.9462 (±0.01216265) kappa value for datasets with four MI-based classes, validated using 10-fold CV. Also, the receiver operator characteristic (ROC) curve, the area under the ROC curve (AUC) score, and the confusion matrix are evaluated for further interpretation.
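Accuracy and the kappa value reported above can both be computed from a confusion matrix. The sketch below uses the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The 4x4 matrix is made up for illustration and is not the paper's result.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal (rows: true class, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    return sum(confusion[i][i] for i in range(len(confusion))) / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    total = sum(sum(row) for row in confusion)
    observed = accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: sum over classes of P(true = k) * P(predicted = k).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    return (observed - expected) / (1 - expected)


# Made-up counts for four classes (left hand, right hand, tongue, feet).
confusion = [
    [20, 2, 2, 1],
    [1, 21, 2, 1],
    [2, 1, 20, 2],
    [1, 2, 2, 20],
]
acc = accuracy(confusion)
kappa = cohen_kappa(confusion)
print(f"accuracy={acc:.4f} kappa={kappa:.4f}")
```

For four balanced classes the chance agreement is 0.25, which is why kappa is lower than accuracy for the same matrix.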

Deep Learning Based High-Resolution Remote Sensing Image classification

International Journal of Advanced Research in Computer Science and Software Engineering, DOI 10.23956/ijarcsse.v7i10.384, 2017, Vol 7 (10), pp. 22

Author(s): Sumit Kaur

Keyword(s): Machine Learning, Remote Sensing, Deep Learning, Image Classification, Language Processing, Object Perception, Remote Sensing Image, Research Area, Remote Sensing Image Classification, Unsupervised Algorithms

Deep learning is an emerging research area in the machine learning and pattern recognition field, presented with the goal of drawing machine learning nearer to one of its unique objectives, Artificial Intelligence. It tries to mimic the human brain, which is capable of processing and learning from complex input data and solving different kinds of complicated tasks well. Deep learning (DL) is based on a set of supervised and unsupervised algorithms that attempt to model higher-level abstractions in data and learn hierarchical representations for classification. In recent years, it has attracted much attention due to its state-of-the-art performance in diverse areas like object perception, speech recognition, computer vision, collaborative filtering and natural language processing. This paper presents a survey of different deep learning techniques for remote sensing image classification.

Hierarchical Phoneme Classification for Improved Speech Recognition

Applied Sciences, DOI 10.3390/app11010428, 2021, Vol 11 (1), pp. 428

Author(s): Donghoon Oh, Jeong-Sik Park, Ji-Hwan Kim, Gil-Jin Jang

Keyword(s): Speech Recognition, Language Processing, Confusion Matrix, Critical Factor, Recognition System, Classification Performance, Language Models, Successful Implementation, Phoneme Classification, Improved Performance

Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and the classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method that applies more suitable recognition models to different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. According to the results of a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% and 71.7% for the baseline and proposed hierarchical models, showing a 2.2% overall improvement.
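One simple way to derive phoneme groups from a confusion matrix is to merge phonemes that are frequently mistaken for each other. The abstract does not describe the clustering algorithm actually used, so the threshold rule, the function `group_by_confusion`, and the made-up counts below are assumptions chosen only to illustrate the idea.

```python
def group_by_confusion(labels, confusion, threshold):
    """Merge labels whose mutual confusion rate exceeds the threshold."""
    parent = list(range(len(labels)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    row_totals = [sum(row) for row in confusion]
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            # Share of samples of either class assigned to the other class.
            mutual = confusion[i][j] + confusion[j][i]
            rate = mutual / (row_totals[i] + row_totals[j])
            if rate > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(find(i), []).append(label)
    return sorted(groups.values())


# Made-up counts (rows: true phoneme, columns: predicted), not TIMIT results.
labels = ["s", "z", "m", "n"]
confusion = [
    [80, 15, 3, 2],
    [18, 78, 2, 2],
    [2, 1, 75, 22],
    [1, 2, 20, 77],
]
groups = group_by_confusion(labels, confusion, threshold=0.1)
print(groups)
```

Each resulting group could then be given its own specialised classifier, with a first-stage model deciding which group an input belongs to.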

Development and Verification of a Deep Learning Algorithm to Evaluate Small-Bowel Preparation Quality

Diagnostics, DOI 10.3390/diagnostics11061127, 2021, Vol 11 (6), pp. 1127

Author(s): Ji Hyung Nam, Dong Jun Oh, Sumin Lee, Hyun Joo Song, Yun Jeong Lim

Keyword(s): Deep Learning, Small Bowel, Scoring System, Operating Characteristic, Clinical Evidence, Learning Algorithm, Characteristic Curve, External Validation, Test Results, Deep Learning Algorithm

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE case exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
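Choosing a score cut-off from a receiver operating characteristic analysis can be sketched as follows. The abstract does not say which criterion produced the 2.95 cut-off; the sketch assumes Youden's J statistic (sensitivity + specificity - 1), a common choice, and uses made-up scores and labels rather than the study's data.

```python
def best_cutoff(scores, adequate):
    """Return (cutoff, J) maximising Youden's J; 'adequate' is predicted when score >= cutoff."""
    positives = sum(adequate)
    negatives = len(adequate) - positives
    best = None
    # Every distinct observed score is a candidate cut-off.
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, a in zip(scores, adequate) if a and s >= cutoff)
        true_neg = sum(1 for s, a in zip(scores, adequate) if not a and s < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if best is None or j > best[1]:
            best = (cutoff, j)
    return best


# Made-up average cleansing scores and clinician judgements (1 = adequate).
scores = [4.2, 3.9, 3.5, 3.1, 3.0, 2.8, 2.6, 2.4, 2.1, 1.8]
adequate = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]

cutoff, youden_j = best_cutoff(scores, adequate)
print(f"cut-off={cutoff} J={youden_j:.3f}")
```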

Artificial Intelligence and Human Rights: Four Realms of Discussion: Summary of Remarks

Proceedings of the ASIL Annual Meeting, DOI 10.1017/amp.2021.47, 2020, Vol 114, pp. 242-245

Author(s): Jootaek Lee

Keyword(s): Artificial Intelligence, Machine Learning, Decision Making, Deep Learning, Language Processing, Human Reasoning, The Past, Making Choices, Data Capturing, Intensive Procedures

The term Artificial Intelligence (AI) has changed since it was first coined by John McCarthy in 1956. AI, believed to have been created with Kurt Gödel's unprovable computational statements in 1931, is now called deep learning or machine learning. AI is defined as a computer machine with the ability to make predictions about the future and solve complex tasks, using algorithms. The AI algorithms are enhanced and become effective with big data capturing the present and the past while still necessarily reflecting human biases into models and equations. AI is also capable of making choices like humans, mirroring human reasoning. AI can help robots to efficiently repeat the same labor-intensive procedures in factories and can analyze historic and present data efficiently through deep learning, natural language processing, and anomaly detection. Thus, AI covers a spectrum of augmented intelligence relating to prediction, autonomous intelligence relating to decision making, automated intelligence for labor robots, and assisted intelligence for data analysis.

Daily estimates of individual discharge likelihood with deep learning natural language processing in general medicine: a prospective and external validation study

Internal and Emergency Medicine, DOI 10.1007/s11739-021-02816-7, 2021

Author(s): Stephen Bacchi, Toby Gilbert, Samuel Gluck, Joy Cheng, Yiran Tan, ...

Keyword(s): Deep Learning, Natural Language Processing, Natural Language, Language Processing, Validation Study, External Validation, General Medicine, External Validation Study
