Severity Assessment of COVID-19 Using a CT-Based Radiomics Model

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Zhigao Xu ◽  
Lili Zhao ◽  
Guoqiang Yang ◽  
Ying Ren ◽  
Jinlong Wu ◽  
...  

The coronavirus disease of 2019 (COVID-19) has evolved into a worldwide pandemic. Although CT is sensitive in detecting lesions and assessing their severity, these tasks mainly depend on radiologists’ subjective judgment, which is inefficient in the case of a large-scale outbreak. This work focuses on developing a CT-based radiomics model to assess whether COVID-19 patients are in the early, progressive, severe, or absorption stage of the disease. We retrospectively analyzed the CT images of 284 COVID-19 patients. All of the patients were divided into four groups (0-3): early (n = 75), progressive (n = 58), severe (n = 75), and absorption (n = 76), according to the progression of the disease and the CT features. They were also split randomly into training and test datasets at a fixed ratio of 7:3 within each category. Thirty-eight radiomic features were selected from 1688 radiomic features using the select-K-best method and the ElasticNet algorithm. On this basis, a support vector machine (SVM) classifier was trained to build the model. Receiver operating characteristic (ROC) curves were generated to determine the diagnostic performance of various models. The precision, recall, and F1-score of the classification model of macro- and microaverage were 0.82, 0.82, 0.81, 0.81, 0.81, and 0.81 for the training dataset and 0.75, 0.73, 0.73, 0.72, 0.72, and 0.72 for the test dataset. The AUCs for groups 0, 1, 2, and 3 on the training dataset were 0.99, 0.97, 0.96, and 0.93, and the microaverage AUC was 0.97 with a macroaverage AUC of 0.97. On the test dataset, AUCs for each group were 0.97, 0.86, 0.83, and 0.89 and the microaverage AUC was 0.89 with a macroaverage AUC of 0.90. The CT-based radiomics model proved efficacious in assessing the severity of COVID-19.
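
The abstract reports both macro- and micro-averaged precision, recall, and F1-score for a four-class problem. The following is a minimal sketch of how the two averages differ, written in plain Python rather than with the authors' tooling. The function name `averaged_scores` and the toy labels are assumptions for illustration only and are not data from the study.

```python
from collections import Counter

def averaged_scores(y_true, y_pred):
    """Return (macro, micro), each a (precision, recall, F1) tuple."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    def prf(tp_, fp_, fn_):
        precision = tp_ / (tp_ + fp_) if tp_ + fp_ else 0.0
        recall = tp_ / (tp_ + fn_) if tp_ + fn_ else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        return precision, recall, f1

    # Macro: score each class separately, then take the unweighted mean.
    per_class = [prf(tp[c], fp[c], fn[c]) for c in labels]
    macro = tuple(sum(s[i] for s in per_class) / len(labels) for i in range(3))
    # Micro: pool the counts over all classes, then score once.
    micro = prf(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro

# Toy labels for groups 0-3 (illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 0, 1, 1, 1, 2, 3, 3, 3]
macro, micro = averaged_scores(y_true, y_pred)
print("macro:", macro)
print("micro:", micro)
```

For single-label multiclass predictions, the micro-averaged precision, recall, and F1-score all equal the overall accuracy, whereas the macro average weights every class equally regardless of its size.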

2019 ◽  
Vol 143 (8) ◽  
pp. 990-998 ◽  
Author(s):  
Min Yu ◽  
Lindsay A. L. Bazydlo ◽  
David E. Bruns ◽  
James H. Harrison

Context.— Turnaround time and productivity of clinical mass spectrometric (MS) testing are hampered by time-consuming manual review of the analytical quality of MS data before release of patient results. Objective.— To determine whether a classification model created by using standard machine learning algorithms can verify analytically acceptable MS results and thereby reduce manual review requirements. Design.— We obtained retrospective data from gas chromatography–MS analyses of 11-nor-9-carboxy-delta-9-tetrahydrocannabinol (THC-COOH) in 1267 urine samples. The data for each sample had been labeled previously as either analytically unacceptable or acceptable by manual review. The dataset was randomly split into training and test sets (848 and 419 samples, respectively), maintaining equal proportions of acceptable (90%) and unacceptable (10%) results in each set. We used stratified 10-fold cross-validation in assessing the abilities of 6 supervised machine learning algorithms to distinguish unacceptable from acceptable assay results in the training dataset. The classifier with the highest recall was used to build a final model, and its performance was evaluated against the test dataset. Results.— In comparison testing of the 6 classifiers, a model based on the Support Vector Machines algorithm yielded the highest recall and acceptable precision. After optimization, this model correctly identified all unacceptable results in the test dataset (100% recall) with a precision of 81%. Conclusions.— Automated data review identified all analytically unacceptable assays in the test dataset, while reducing the manual review requirement by about 87%. This automation strategy can focus manual review only on assays likely to be problematic, allowing improved throughput and turnaround time without reducing quality.
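
The reported recall, precision, and reduction in manual review can be cross-checked with simple arithmetic. The sketch below assumes that exactly 10% of the 419 test samples are unacceptable and that only results flagged by the classifier go to manual review; the function name `review_workload` is hypothetical.

```python
def review_workload(n_samples, unacceptable_rate, recall, precision):
    """Estimate how many results a classifier flags for manual review."""
    n_unacceptable = n_samples * unacceptable_rate
    true_positives = recall * n_unacceptable
    flagged = true_positives / precision  # precision = TP / flagged
    reduction = 1 - flagged / n_samples   # share of results not reviewed
    return flagged, reduction

flagged, reduction = review_workload(419, 0.10, 1.00, 0.81)
print(f"flagged for review: {flagged:.1f} of 419")
print(f"manual review reduction: {reduction:.1%}")
```

Under these assumptions roughly 52 of the 419 results are flagged, which leaves a reduction between 87% and 88% and is consistent with the "about 87%" stated in the abstract.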


2020 ◽  
pp. 3397-3407
Author(s):  
Nur Syafiqah Mohd Nafis ◽  
Suryanti Awang

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. Thus, this paper proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high dimensional text classification. This technique has the ability to measure a feature’s importance in a high-dimensional text document. In addition, it aims to increase the efficiency of the feature selection and hence to obtain a promising text classification accuracy. TF-IDF acts as a filter approach that measures the importance of the features of the text documents at the first stage. SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets at the second stage. This research executes sets of experiments using text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing steps are applied to extract relevant features. After that, the pre-processed features are divided into training and testing datasets. Next, feature selection is implemented on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is applied for feature ranking as the next feature selection step. Only top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique is able to achieve 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high dimensional text documents.
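
The first (filter) stage described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact TF-IDF variant or how per-document scores are aggregated, so the sketch uses relative term frequency, a natural-log IDF, and a sum across documents. The function names and toy documents are hypothetical, and the second stage (SVM-RFE) is not shown.

```python
import math
from collections import Counter

def tfidf_scores(docs):
    """Return {term: TF-IDF summed over all tokenized documents}."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)                   # relative term frequency
            idf = math.log(n_docs / doc_freq[term])  # 0 for terms in every doc
            scores[term] += tf * idf
    return scores

def top_k_terms(docs, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(tfidf_scores(docs).items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

# Toy pre-processed posts (illustrative only).
docs = [["good", "movie", "good"], ["bad", "movie"], ["good", "plot", "movie"]]
print(top_k_terms(docs, 2))
```

A term that occurs in every document, such as "movie" here, receives an IDF of zero and is filtered out, which is the behavior a filter stage relies on before a wrapper method ranks the remaining features.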


2009 ◽  
Vol 07 (05) ◽  
pp. 773-788 ◽  
Author(s):  
PENG CHEN ◽  
CHUNMEI LIU ◽  
LEGAND BURGE ◽  
MOHAMMAD MAHMOOD ◽  
WILLIAM SOUTHERLAND ◽  
...  

Protein fold classification is a key step to predicting protein tertiary structures. This paper proposes a novel approach based on genetic algorithms and feature selection to classifying protein folds. Our dataset is divided into a training dataset and a test dataset. Each individual for the genetic algorithms represents a selection function of the feature vectors of the training dataset. A support vector machine is applied to each individual to evaluate the fitness value (fold classification rate) of each individual. The aim of the genetic algorithms is to search for the best individual that produces the highest fold classification rate. The best individual is then applied to the feature vectors of the test dataset and a support vector machine is built to classify protein folds based on selected features. Our experimental results on Ding and Dubchak's benchmark dataset of 27-class folds show that our approach achieves an accuracy of 71.28%, which outperforms current state-of-the-art protein fold predictors.
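
The search described above, in which each individual encodes a feature selection and its fitness is a classification rate, can be sketched with a small genetic algorithm. This is only an illustration under assumptions: the abstract does not specify the selection, crossover, or mutation operators, so tournament selection, one-point crossover, bit-flip mutation, and elitism are choices made here. The fitness function `toy_fitness` is a hypothetical stand-in for the SVM fold classification rate, which in the paper is evaluated on the training dataset.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20, generations=30,
                              mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)

    def tournament(pop):
        # Draw two individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best

# Hypothetical fitness: reward informative features, penalize the rest.
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.1 * extras

best_mask = genetic_feature_selection(8, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

In a real setting the fitness call would train and score a classifier on the selected columns, so caching fitness values per mask is worth considering because each evaluation is expensive.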


2020 ◽  
Author(s):  
Saki Ishino ◽  
Takuya Itaki

The Eucampia Index, which is calculated from the valve ratio of varieties of the Antarctic diatom Eucampia antarctica, is expected to be a useful indicator of sea ice coverage and/or sea surface temperature variation in the Southern Ocean. To verify the relationship between the index value and the environmental factors, considerable effort is needed to classify and count valves of E. antarctica in a very large number of samples. In this study, to realize automated detection of the Eucampia Index, we constructed deep-learning-based models (deep learning being one of the learning methods of artificial intelligence) for identifying Eucampia valves among the various particles in a diatom slide. The microfossil Classification and Rapid Accumulation Device (miCRAD) system, which can scan a slide and crop images of particles automatically, was employed to collect images for the training dataset of the model and for the test dataset used in model verification. When the initial model "Eant_1000px_200616" classified the particle images in the test dataset, the accuracy was 78.8%. The Eucampia Index value prepared in the test dataset was 0.80, and the value predicted by the developed model from the same dataset was 0.76. The predicted value was within the range of the manual counting error. These results suggest that the classification performance of the model is similar to that of a human expert. This study shows for the first time that a model capable of detecting the ratio of two diatom species can be constructed using the miCRAD system. The miCRAD system connected with the model developed in this study can classify particle images automatically while capturing them, so the system can be applied to a large-scale analysis of the Eucampia Index in the Southern Ocean. Depending on the setting of the classification categories, a similar method is relevant to investigators who have to process a large number of diatom samples, for example to detect specific species in biostratigraphic and paleoenvironmental studies.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Xinke Zhan ◽  
Zhuhong You ◽  
Changqing Yu ◽  
Liping Li ◽  
Jie Pan

Identifying drug-target interactions (DTIs) plays an essential role in new drug development. However, knowledge of DTIs is still limited, and a significant number of DTI pairs remain unknown. Moreover, traditional experimental methods have inevitable disadvantages: they are costly and time-consuming. Therefore, developing computational methods for predicting DTIs is attracting more and more attention. In this study, we report a novel computational approach for predicting DTIs using the GIST feature, the position-specific scoring matrix (PSSM), and rotation forest (RF). Specifically, each target protein is first converted into a PSSM to retain evolutionary information. Then, the GIST feature is extracted from the PSSM, and substructure fingerprint information is adopted to extract the feature of the drug. Finally, the protein and drug features are combined to form a new drug-target pair, which is employed as the input feature for the RF classifier. In the experiment, the proposed method achieves high average accuracies of 89.25%, 85.93%, 82.36%, and 73.89% on enzyme, ion channel, G protein-coupled receptors (GPCRs), and nuclear receptor, respectively. To further evaluate the prediction performance of the proposed method, we compare it with the state-of-the-art support vector machine (SVM) classifier on the same gold standard dataset. These promising results illustrate that the proposed method is more effective and stable than other methods. We expect the proposed method to be a useful tool for predicting large-scale DTIs.


2020 ◽  
Vol 10 (16) ◽  
pp. 5686
Author(s):  
Ines A. Cruz-Guerrero ◽  
Raquel Leon ◽  
Daniel U. Campos-Delgado ◽  
Samuel Ortega ◽  
Himar Fabelo ◽  
...  

Hyperspectral imaging is a multidimensional optical technique with the potential of providing fast and accurate tissue classification. The main challenge is the adequate processing of the multidimensional information usually linked to long processing times and significant computational costs, which require expensive hardware. In this study, we address the problem of tissue classification for intraoperative hyperspectral images of in vivo brain tissue. For this goal, two methodologies are introduced that rely on a blind linear unmixing (BLU) scheme for practical tissue classification. Both methodologies identify the characteristic end-members related to the studied tissue classes by BLU from a training dataset and classify the pixels by a minimum distance approach. The proposed methodologies are compared with a machine learning method based on a supervised support vector machine (SVM) classifier. The methodologies based on BLU achieve speedup factors of ~459× and ~429× compared to the SVM scheme, while keeping constant and even slightly improving the classification performance.
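
The final step described above, assigning each pixel to the class of its closest end-member, can be sketched as follows. The blind linear unmixing step that estimates the end-members is not shown, and the abstract does not state which distance is used, so Euclidean distance is an assumption here. The function name, class labels, and spectra are hypothetical.

```python
import math

def classify_min_distance(pixel, endmembers):
    """Return the label of the end-member spectrum closest to the pixel."""
    return min(endmembers, key=lambda label: math.dist(pixel, endmembers[label]))

# Hypothetical end-member spectra over three bands, one per tissue class.
endmembers = {
    "class_a": [0.1, 0.4, 0.7],
    "class_b": [0.8, 0.5, 0.2],
}
print(classify_min_distance([0.2, 0.4, 0.6], endmembers))
```

Each pixel costs one distance evaluation per class, which gives a sense of why a minimum distance rule can run much faster than evaluating a trained SVM on every pixel.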


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 133 ◽  
Author(s):  
Yang Li ◽  
Ying Lv ◽  
Suge Wang ◽  
Jiye Liang ◽  
Juanzi Li ◽  
...  

A large-scale and high-quality training dataset is an important guarantee to learn an ideal classifier for text sentiment classification. However, manually constructing such a training dataset with sentiment labels is a labor-intensive and time-consuming task. Therefore, based on the idea of effectively utilizing unlabeled samples, a synthetical framework that covers the whole process of semi-supervised learning from seed selection, iterative modification of the training text set, to the co-training strategy of the classifier is proposed in this paper for text sentiment classification. To provide an important basis for selecting the seed texts and modifying the training text set, three kinds of measures—the cluster similarity degree of an unlabeled text, the cluster uncertainty degree of a pseudo-label text to a learner, and the reliability degree of a pseudo-label text to a learner—are defined. With these measures, a seed selection method based on Random Swap clustering, a hybrid modification method of the training text set based on active learning and self-learning, and an alternately co-training strategy of the ensemble classifier of the Maximum Entropy and Support Vector Machine are proposed and combined into our framework. The experimental results on three Chinese datasets (COAE2014, COAE2015, and a Hotel review, respectively) and five English datasets (Books, DVD, Electronics, Kitchen, and MR, respectively) in the real world verify the effectiveness of the proposed framework.


Proceedings ◽  
2020 ◽  
Vol 66 (1) ◽  
pp. 6
Author(s):  
Ehdieh Khaledian ◽  
Shira L. Broschat

Antimicrobial resistance is driving pharmaceutical companies to investigate different therapeutic approaches. One approach that has garnered growing consideration in drug development is the use of antimicrobial peptides (AMPs). Antibacterial peptides (ABPs), which occur naturally as part of the immune response, can serve as powerful, broad-spectrum antibiotics. However, conventional laboratory procedures for screening and discovering ABPs are expensive and time-consuming. Identification of ABPs can be significantly improved using computational methods. In this paper, we introduce a machine learning method for the fast and accurate prediction of ABPs. We gathered more than 6000 peptides from publicly available datasets and extracted 1209 features (peptide characteristics) from these sequences. We selected the set of optimal features by applying correlation-based and random forest feature selection techniques. Finally, we designed an ensemble gradient boosting model (GBM) to predict putative ABPs. We evaluated our model using receiver operating characteristic (ROC) curves, calculating the area under the curve (AUC) for several different models for comparison, including a recurrent neural network, a support vector machine, and iAMPpred. The AUC for the GBM was ~0.98, more than 3% better than any of the other models.
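
The models above are compared by the area under the ROC curve. As a reminder of what that number measures, the sketch below computes the AUC as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. It is a plain Python illustration, not the evaluation code used in the paper, and the labels and scores are made up.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy predictions: 1 = antibacterial peptide, 0 = not (illustrative only).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))
```

This pairwise form is quadratic in the number of examples, so for datasets of thousands of peptides a rank-based implementation would be the more practical choice.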


2021 ◽  
Vol 7 ◽  
pp. e680
Author(s):  
Muhammad Amirul Abdullah ◽  
Muhammad Ar Rahim Ibrahim ◽  
Muhammad Nur Aiman Shapiee ◽  
Muhammad Aizzat Zakaria ◽  
Mohd Azraai Mohd Razman ◽  
...  

This study aims to classify flat ground tricks, namely Ollie, Kickflip, Shove-it, Nollie, and Frontside 180, by identifying significant input image transformations on different transfer learning models with an optimized Support Vector Machine (SVM) classifier. A total of six amateur skateboarders (20 ± 7 years of age with at least 5.0 years of experience) executed each type of trick five times on a customized ORY skateboard (IMU sensor fused) on cemented ground. From the IMU data, a total of six raw signals were extracted. Two input image types, namely raw data (RAW) and Continuous Wavelet Transform (CWT), as well as six transfer learning models from three different families combined with a grid-search-optimized SVM, were investigated for their efficacy in classifying the skateboarding tricks. The study showed that RAW and CWT input images on the MobileNet, MobileNetV2, and ResNet101 transfer learning models achieved the best accuracy, 100%, on the test dataset. Nonetheless, a comparison of the computational time of the best models established that the CWT-MobileNet-Optimized SVM pipeline was the best. It can be concluded that the proposed method is able to assist judges as well as coaches in identifying the execution of skateboarding tricks.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Jie Wan ◽  
Xue Cao ◽  
Kun Yao ◽  
Donghui Yang ◽  
E. Peng ◽  
...  

False information on the Internet is increasingly regarded as a serious social harm. To recognize false text information, this paper proposes an effective method for mining text features in the field of false drug advertisements. Firstly, data on false and real drug advertisements were collected from official websites to build a database of false and real drug advertisements. Secondly, by performing feature extraction on the text of the drug advertisements, this work built a characteristic matrix based on the effective features and assigned a positive or negative label to each feature vector of the matrix according to whether or not it came from a false drug advertisement. Thirdly, this study trained and tested several different classifiers, selected the classification model with the best performance in identifying false drug advertisements, and found the key characteristics that determine the classification. Finally, the model with the best performance was used to predict new false drug advertisements collected from Sina Weibo. In identifying false drug advertisements, the support vector machine (SVM) classifier built on the feature set obtained after feature selection was the most effective. The findings of this study can provide an effective method for the government to identify and combat false advertisements. This study also has reference value in demonstrating the use of text data mining technology to identify and detect information fraud.

