Classical and Deep Learning Paradigms for Detection and Validation of Key Genes of Risky Outcomes of HCV

Algorithms ◽  
2020 ◽  
Vol 13 (3) ◽  
pp. 73
Author(s):  
Nagwan M. Abdel Samee

Hepatitis C virus (HCV) is one of the most dangerous viruses worldwide. It is the foremost cause of hepatic cirrhosis and hepatocellular carcinoma (HCC). Detecting new key genes that play a role in the growth of HCC in HCV patients using machine learning techniques paves the way for producing accurate antivirals. This work has two phases: detecting the up/downregulated genes using classical univariate and multivariate feature selection methods, and validating the retrieved list of genes using in silico classifiers. However, classification algorithms in the medical domain frequently suffer from a shortage of training cases. Therefore, a deep neural network approach is proposed here to validate the significance of the retrieved genes in distinguishing HCV-infected samples from uninfected ones. The validation model is based on the artificial generation of new examples from the retrieved genes’ expressions using sparse autoencoders. Subsequently, the generated gene expression data are used to train conventional classifiers. In the first phase, Principal Component Analysis (PCA), a multivariate approach, yielded a better retrieval of significant genes: the list of genes retrieved using PCA contained more HCC biomarkers than the lists retrieved by the univariate methods. In the second phase, the classification accuracy reveals the relevance of the extracted key genes in classifying the HCV-infected and uninfected samples.
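The abstract does not say how PCA was turned into a gene ranking, so the following is only a minimal sketch of one common option: rank features by the absolute size of their loading on the first principal component. The toy matrix, the gene names, and the helper names `first_component` and `rank_genes` are hypothetical, and power iteration is used here simply to stay within the Python standard library.

```python
import math

def first_component(rows, iterations=200):
    """Approximate the first principal axis of `rows` (samples x features)
    by power iteration on the sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # data with no variance: keep the current vector
            break
        v = [x / norm for x in w]
    return v

def rank_genes(rows, names):
    """Order feature names by absolute loading on the first component."""
    loadings = first_component(rows)
    order = sorted(range(len(names)), key=lambda j: abs(loadings[j]), reverse=True)
    return [names[j] for j in order]

# Hypothetical toy expression matrix: 4 samples x 3 genes.
toy = [
    [1.0, 5.0, 2.0],
    [9.0, 5.1, 2.1],
    [2.0, 4.9, 1.9],
    [10.0, 5.0, 2.0],
]
ranking = rank_genes(toy, ["geneA", "geneB", "geneC"])
```

In this toy data only `geneA` varies strongly across samples, so it dominates the first component and is ranked first.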


2020 ◽  
Vol 12 (1) ◽  
pp. 21-38 ◽  
Author(s):  
Ankit Kumar Jain ◽  
Sumit Kumar Yadav ◽  
Neelam Choudhary

A smishing attack is generally performed by sending a fake short message service (SMS) message that contains a link to a malicious webpage or application. Smishing messages are a subclass of spam SMS and are more harmful than ordinary spam messages. Various solutions are available to detect spam messages. However, no existing solution filters smishing messages from spam messages. Therefore, this article presents a novel method to filter smishing messages from spam messages. The proposed approach is divided into two phases. The first phase separates spam messages from ham messages. The second phase filters smishing messages from the spam messages. The performance of the proposed method is evaluated on various machine learning classifiers using a dataset of ham and spam messages. The simulation results indicate that the proposed approach can detect spam messages with an accuracy of 94.9% and can filter smishing messages with an accuracy of 96% using a neural network classifier.
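The two-phase structure described above can be sketched as a simple cascade. This is only an illustration of the control flow: the keyword list, the URL pattern, and the function names are hypothetical stand-ins, whereas the article evaluates trained machine learning classifiers in each phase.

```python
import re

# Hypothetical keyword list standing in for a trained spam classifier.
SPAM_WORDS = {"win", "free", "prize", "offer", "claim"}
# Hypothetical link detector standing in for a trained smishing classifier.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)

def is_spam(message):
    """Phase 1 stand-in: flag a message that contains any spam keyword."""
    words = set(re.findall(r"[a-z]+", message.lower()))
    return bool(words & SPAM_WORDS)

def is_smishing(message):
    """Phase 2 stand-in: flag a spam message that carries a link."""
    return URL_PATTERN.search(message) is not None

def classify(message):
    """Run phase 1, and only pass spam messages on to phase 2."""
    if not is_spam(message):
        return "ham"
    return "smishing" if is_smishing(message) else "spam"
```

Because phase 2 only sees messages that phase 1 labelled as spam, a ham message that happens to contain a link is never reported as smishing in this sketch.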


Software engineering is an important area that deals with the development and maintenance of software. After developing software, it is always important to track its performance and to check whether it functions according to customer requirements. To ensure this, faulty and non-faulty modules must be identified. For this purpose, one can use a model for binary classification of faults. The outputs of different techniques differ with respect to the fault dataset used, the complexity, the classification algorithm implemented, and so on. Various machine learning techniques can be used for this purpose, but this paper deals with the best classification algorithms available to date: decision tree, random forest, naive Bayes and logistic regression (tree-based techniques and Bayesian-based techniques). The motive behind developing such a project is to identify the faulty modules within a piece of software before the actual software testing takes place. As a result, the time consumed by testers, or their workload, can be reduced to an extent. This work is useful to those working in the software industry and to those carrying out research in software engineering where the software development lifecycle is discussed.


2021 ◽  
Author(s):  
Cao Truong Tran

<p>Classification is a major task in machine learning and data mining. Many real-world datasets suffer from the unavoidable issue of missing values. Classification with incomplete data has to be carefully handled because inadequate treatment of missing values will cause large classification errors. Most existing research on classification with incomplete data has focused on improving effectiveness, but has not adequately addressed the efficiency of applying the classifiers to classify unseen instances, which is much more important than the act of creating classifiers. A common approach to classification with incomplete data is to use imputation methods to replace missing values with plausible values before building classifiers and classifying unseen instances. This approach provides complete data which can then be used by any classification algorithm, but sophisticated imputation methods are usually computationally intensive, especially for the application process of classification. Another approach to classification with incomplete data is to build a classifier that can directly work with missing values. This approach does not require time for estimating missing values, but it often generates inaccurate and complex classifiers when faced with numerous missing values. A recent approach to classification with incomplete data, which also avoids estimating missing values, is to build a set of classifiers from which applicable classifiers are then selected for classifying unseen instances. However, this approach is also often inaccurate and takes a long time to find applicable classifiers when faced with numerous missing values. The overall goal of the thesis is to simultaneously improve the effectiveness and efficiency of classification with incomplete data by using evolutionary machine learning techniques for feature selection, clustering, ensemble learning, feature construction and constructing classifiers.
The thesis develops approaches for improving imputation for classification with incomplete data by integrating clustering and feature selection with imputation. The approaches improve both the effectiveness and the efficiency of using imputation for classification with incomplete data. The thesis develops wrapper-based feature selection methods to improve the input space for classification algorithms that are able to work directly with incomplete data. The methods not only improve the classification accuracy, but also reduce the complexity of classifiers able to work directly with incomplete data. The thesis develops a feature construction method to improve the input space for classification algorithms with incomplete data by proposing interval genetic programming, that is, genetic programming with a set of interval functions. The method improves the classification accuracy and reduces the complexity of classifiers. The thesis develops an ensemble approach to classification with incomplete data by integrating imputation, feature selection, and ensemble learning. The results show that the approach is more accurate and faster than previous common methods for classification with incomplete data. The thesis develops interval genetic programming to directly evolve classifiers for incomplete data. The results show that classifiers generated by interval genetic programming can be more effective and efficient than classifiers generated by the combination of imputation and traditional genetic programming. Interval genetic programming is also more effective than common classification algorithms able to work directly with incomplete data. In summary, the thesis develops a range of approaches for simultaneously improving the effectiveness and efficiency of classification with incomplete data by using a range of evolutionary machine learning techniques.</p>
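The "common approach" mentioned in the abstract above, imputing missing values before any classifier sees the data, can be illustrated with the simplest imputation rule. The sketch below uses column-mean imputation, which is an assumption chosen for brevity and not one of the thesis's proposed methods; the function name and the toy data are hypothetical, and missing values are represented as `None`.

```python
def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 for a column with no observed values (an arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

# Hypothetical incomplete dataset: 3 instances x 2 features.
incomplete = [
    [1.0, None],
    [3.0, 4.0],
    [None, 8.0],
]
complete = impute_column_means(incomplete)
```

The resulting `complete` table has no missing entries, so it can be passed to any standard classification algorithm. More sophisticated imputation methods follow the same pattern but cost more per unseen instance, which is the efficiency concern the abstract raises.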


2014 ◽  
Vol 2014 ◽  
pp. 1-21 ◽  
Author(s):  
Lise Safatly ◽  
Mario Bkassiny ◽  
Mohammed Al-Husseini ◽  
Ali El-Hajj

A cognitive transceiver is required to opportunistically use vacant spectrum resources licensed to primary users. Thus, it relies on a fully adaptive behavior composed of reconfigurable radio frequency (RF) parts, enhanced spectrum sensing algorithms, and sophisticated machine learning techniques. In this paper, we present a review of the recent advances in cognitive radio (CR) transceiver hardware design and algorithms. For the RF part, three types of antennas are presented: UWB antennas, frequency-reconfigurable/tunable antennas, and UWB antennas with reconfigurable band notches. The main challenges faced in the design of the other RF blocks are also discussed. Sophisticated spectrum sensing algorithms that overcome the main sensing challenges, such as model uncertainty, hardware impairments, and wideband sensing, are highlighted. The cognitive engine features are discussed. Moreover, we study unsupervised classification algorithms and a reinforcement learning (RL) algorithm that has been proposed to perform decision-making in CR networks.


Author(s):  
Roya Nasimi ◽  
Fernando Moreu ◽  
John Stormont

Rockfalls are a hazard for the safety of infrastructure as well as people. Identifying loose rocks by inspecting slopes adjacent to roadways and other infrastructure, and removing them in advance, can be an effective way to prevent unexpected rockfall incidents. This paper proposes a system for the automated inspection of potential rockfalls. A robot is used to repeatedly strike or tap on the rock surface. The sound from the tapping is collected by the robot and subsequently classified with the intent of identifying rocks that are broken and prone to fall. Principal Component Analysis (PCA) of the collected acoustic data is used to recognize patterns associated with rocks in various conditions, including intact rock as well as rock with different types and locations of cracks. The PCA classification was first demonstrated on simulated sounds of different characteristics, which were automatically trained and tested. Secondly, a laboratory test was conducted by tapping rock specimens with three different levels of discontinuity in depth and shape. A microphone mounted on the robot recorded the sound, and the data were classified into three clusters within 2D space. A model was created using the training data to classify the remainder of the data (the test data). The performance of the method is evaluated with a confusion matrix.
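The confusion-matrix evaluation mentioned above can be sketched in a few lines. The class names and the label lists below are hypothetical and do not reproduce the paper's data; rows correspond to true classes and columns to predicted classes.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical class names for three levels of discontinuity.
classes = ["intact", "shallow_crack", "deep_crack"]
y_true = ["intact", "intact", "shallow_crack", "deep_crack", "deep_crack", "shallow_crack"]
y_pred = ["intact", "shallow_crack", "shallow_crack", "deep_crack", "intact", "shallow_crack"]
cm = confusion_matrix(y_true, y_pred, classes)
```

Off-diagonal entries show which conditions are confused with each other, which matters here because mistaking a cracked rock for an intact one is the costly error.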


2020 ◽  
Vol 8 (5) ◽  
pp. 4624-4627

In recent years, a lot of data has been generated about students, which can be used to decide a student's career path. This paper discusses some of the machine learning techniques that can be used to predict the performance of a student and help decide his or her career path. The key Machine Learning (ML) algorithms applied in our research work are Linear Regression, Logistic Regression, Support Vector Machine, Naïve Bayes Classifier and K-means Clustering. The aim of this paper is to predict the student's career path using Machine Learning algorithms. We compare the efficiencies of different ML classification algorithms on a real dataset obtained from university students.


Author(s):  
Nurul Farhana Hamzah ◽  
Nazri Mohd Nawi ◽  
Abdulkareem A. Hezam ◽  
...  

Heart failure, often referred to as congestive heart failure, means that the heart muscle does not pump blood as well as it should. Strictly, congestive heart failure is a form of heart failure that requires timely medical care, although the two terms are sometimes used interchangeably. Some disorders, such as narrowed arteries in the heart (coronary artery disease) or high blood pressure, eventually make the heart too weak or rigid to fill and pump effectively. Early detection of heart failure using data mining techniques has gained popularity among researchers. This research applies classification techniques to heart failure classification from medical data. It analyzes the performance of three classification algorithms, namely Support Vector Machine (SVM), Decision Forest (DF), and Boosted Decision Tree (BDT), in accurately classifying heart failure risk data. The best of the three algorithms for heart failure classification is identified at the end of this research.


Advancement in medical science has always been one of the most vital concerns of the human race. As technology progresses, modern techniques and equipment are increasingly applied to treatment. Nowadays, machine learning techniques are widely used in medical science to improve accuracy. In this work, we have constructed computational models for accurate liver disease prediction. We used several efficient classification algorithms for predicting liver diseases: Random Forest, Perceptron, Decision Tree, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Our work provides an implementation of hybrid model construction and a comparative analysis for improving prediction performance. First, the classification algorithms are applied to the original liver patient datasets collected from the UCI repository. We then analyzed and tuned the features to improve the performance of our predictor and made a comparative analysis among the classifiers. We observed that the KNN algorithm outperformed all other techniques when feature selection was applied.
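For readers unfamiliar with the KNN classifier named above, a minimal sketch follows. It uses Euclidean distance and majority voting, which is the textbook form of the algorithm; the value of `k`, the two-feature toy records, and the labels are hypothetical and are not taken from the UCI liver patient dataset or from the paper's configuration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature records (already restricted to selected features).
train_x = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_y = ["healthy", "healthy", "healthy", "disease", "disease", "disease"]

prediction = knn_predict(train_x, train_y, (1.1, 1.0))
```

Because KNN relies directly on distances between records, dropping uninformative features changes which neighbors are nearest, which is one plausible reason feature selection helps this classifier in particular.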

