Co-evolution networks of HIV/HCV are modular with direct association to structure and function

Mapping Intimacies ◽

10.1101/307033 ◽

2018 ◽

Author(s):

Ahmed Abdul Quadeer ◽

David Morales-Jimenez ◽

Matthew R. McKay

Keyword(s):

Statistical Method ◽

Immune Escape ◽

Sequence Data ◽

Principal Component ◽

Population Level ◽

Viral Fitness ◽

Sparse Principal Component Analysis ◽

Phenotypic Properties ◽

Structural And Functional Properties ◽

Robust Statistical Method

Abstract: Mutational correlation patterns found in population-level sequence data for the Human Immunodeficiency Virus (HIV) and the Hepatitis C Virus (HCV) have been demonstrated to be informative of viral fitness. Such patterns can be seen as footprints of the intrinsic functional constraints placed on viral evolution under diverse selective pressures. Here, considering multiple HIV and HCV proteins, we demonstrate that these mutational correlations encode a modular co-evolutionary structure that is tightly linked to the structural and functional properties of the respective proteins. Specifically, by introducing a robust statistical method based on sparse principal component analysis, we identify near-disjoint sets of collectively-correlated residues (sectors) having mostly a one-to-one association to largely distinct structural or functional domains. This suggests that the distinct phenotypic properties of HIV/HCV proteins often give rise to quasi-independent modes of evolution, with each mode involving a sparse and localized network of mutational interactions. Moreover, individual inferred sectors of HIV are shown to carry immunological significance, providing insight for guiding targeted vaccine strategies.

Author summary: HIV and HCV cause devastating infectious diseases for which no functional vaccine exists. A key problem is that while immune cells may induce individual mutations that compromise viral fitness, this is typically restored through other “compensatory” mutations, leading to immune escape. These compensatory pathways are complicated and remain poorly understood. They do, however, leave co-evolutionary markers which may be inferred from measured sequence data. Here, by introducing a new robust statistical method, we demonstrated that the compensatory networks employed by both viruses exhibit a remarkably simple decomposition involving small and near-distinct groups of protein residues, with most groups having a clear association to biological function or structure. This provides insights that can be harnessed for the purpose of vaccine design.
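To make the idea of a sparse principal component defining a "sector" concrete, the following is a minimal sketch, not the authors' method. It assumes a generic truncated power iteration (keep only the `k` largest loadings at each step) applied to a made-up 6-residue correlation matrix. The function names, the value of `k`, and the toy matrix are all hypothetical.

```python
import math


def sparse_leading_component(corr, k, iters=100):
    """Truncated power iteration: leading eigenvector with at most k non-zeros."""
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n  # uniform start (an arbitrary choice)
    for _ in range(iters):
        # Power step: w = corr @ v
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        # Sparsity step: keep only the k largest-magnitude loadings
        keep = set(sorted(range(n), key=lambda i: abs(w[i]), reverse=True)[:k])
        w = [w[i] if i in keep else 0.0 for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def sector(loadings, tol=1e-12):
    """Residue indices carrying non-zero loading on the sparse component."""
    return [i for i, x in enumerate(loadings) if abs(x) > tol]


# Toy correlation matrix (made-up values): residues 0-2 co-vary strongly,
# residues 3-4 weakly, residue 5 is independent.
TOY_CORR = [
    [1.0, 0.8, 0.8, 0.0, 0.0, 0.0],
    [0.8, 1.0, 0.8, 0.0, 0.0, 0.0],
    [0.8, 0.8, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.3, 0.0],
    [0.0, 0.0, 0.0, 0.3, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]

loadings = sparse_leading_component(TOY_CORR, k=3)
print(sector(loadings))
```

On this toy input the non-zero loadings fall on the strongly correlated block, which is the sense in which a sparse component picks out a small, near-disjoint group of residues.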


Fitness estimation for viral variants in the context of cellular coinfection

10.1101/2021.04.26.441479 ◽

2021 ◽

Author(s):

Huisheng Zhu ◽

Brent E Allman ◽

Katia Koelle

Keyword(s):

Evolutionary Dynamics ◽

Sequence Data ◽

Population Level ◽

Viral Fitness ◽

Fitness Effects ◽

Fitness Advantage ◽

Viral Adaptation ◽

Zoonotic Viruses ◽

Frequency Changes

Abstract: Animal models are frequently used to characterize the within-host dynamics of emerging zoonotic viruses. More recent studies have also deep-sequenced longitudinal viral samples originating from experimental challenges to gain a better understanding of how these viruses may evolve in vivo and between transmission events. These studies have often identified nucleotide variants that can replicate more efficiently within hosts and also transmit more effectively between hosts. Quantifying the degree to which a mutation impacts viral fitness within a host can improve identification of variants that are of particular epidemiological concern and our ability to anticipate viral adaptation at the population level. While methods have been developed to quantify the fitness effects of mutations using observed changes in allele frequencies over the course of a host’s infection, none of the existing methods account for the possibility of cellular coinfection. Here, we develop mathematical models to project variant allele frequency changes in the context of cellular coinfection and, further, integrate these models with statistical inference approaches to demonstrate how variant fitness can be estimated alongside cellular multiplicity of infection. We apply our approaches to empirical longitudinally-sampled H5N1 sequence data from ferrets. Our results indicate that previous studies may have significantly underestimated the within-host fitness advantage of viral variants. These findings underscore the importance of considering the process of cellular coinfection when studying within-host viral evolutionary dynamics.
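For orientation, here is a minimal sketch of the kind of baseline the abstract contrasts itself with: estimating a variant's relative fitness from allele frequency change while ignoring cellular coinfection. It is not the authors' model. It assumes a textbook haploid selection scheme in which the variant's odds are multiplied by a relative fitness `w` each generation. The function names and the numbers are hypothetical.

```python
import math


def project_frequency(p0, w, generations):
    """Project variant frequency forward: odds grow by a factor w per generation."""
    odds = p0 / (1.0 - p0) * w ** generations
    return odds / (1.0 + odds)


def estimate_relative_fitness(p0, pt, generations):
    """Invert the projection: w = exp((logit(pt) - logit(p0)) / generations)."""
    def logit(p):
        return math.log(p / (1.0 - p))

    return math.exp((logit(pt) - logit(p0)) / generations)


# Hypothetical example: a variant starting at 10% with relative fitness 1.5.
p_later = project_frequency(0.10, 1.5, generations=4)
w_hat = estimate_relative_fitness(0.10, p_later, generations=4)
print(p_later, w_hat)
```

Because this baseline attributes every frequency change to the variant's own fitness, it has no way to represent coinfection, which is the process the abstract argues must be modeled.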


Fitness Estimation for Viral Variants in the Context of Cellular Coinfection

Viruses ◽

10.3390/v13071216 ◽

2021 ◽

Vol 13 (7) ◽

pp. 1216

Author(s):

Huisheng Zhu ◽

Brent Allman ◽

Katia Koelle

Keyword(s):

Evolutionary Dynamics ◽

Sequence Data ◽

Population Level ◽

Viral Fitness ◽

Fitness Effects ◽

Fitness Advantage ◽

Viral Adaptation ◽

Zoonotic Viruses ◽

Frequency Changes

Animal models are frequently used to characterize the within-host dynamics of emerging zoonotic viruses. More recent studies have also deep-sequenced longitudinal viral samples originating from experimental challenges to gain a better understanding of how these viruses may evolve in vivo and between transmission events. These studies have often identified nucleotide variants that can replicate more efficiently within hosts and also transmit more effectively between hosts. Quantifying the degree to which a mutation impacts viral fitness within a host can improve identification of variants that are of particular epidemiological concern and our ability to anticipate viral adaptation at the population level. While methods have been developed to quantify the fitness effects of mutations using observed changes in allele frequencies over the course of a host’s infection, none of the existing methods account for the possibility of cellular coinfection. Here, we develop mathematical models to project variant allele frequency changes in the context of cellular coinfection and, further, integrate these models with statistical inference approaches to demonstrate how variant fitness can be estimated alongside cellular multiplicity of infection. We apply our approaches to empirical longitudinally sampled H5N1 sequence data from ferrets. Our results indicate that previous studies may have significantly underestimated the within-host fitness advantage of viral variants. These findings underscore the importance of considering the process of cellular coinfection when studying within-host viral evolutionary dynamics.


Lymphocyte-Monocyte-Neutrophil Index: A Predictor of Severity of Coronavirus Disease 2019 Patients Produced by Sparse Principal Component Analysis

SSRN Electronic Journal ◽

10.2139/ssrn.3576895 ◽

2020 ◽

Author(s):

Jian-an Jia ◽

Yingjie Qi ◽

Huiming Li ◽

Nagen Wan ◽

Shuqin Zhang ◽

...

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Sparse Principal Component Analysis


A novel super-resolution image and video reconstruction approach based on Newton-Thiele’s rational kernel in sparse principal component analysis

Multimedia Tools and Applications ◽

10.1007/s11042-016-3557-1 ◽

2016 ◽

Vol 76 (7) ◽

pp. 9463-9483

Author(s):

Lei He ◽

Jieqing Tan ◽

Xing Huo ◽

Chengjun Xie

Keyword(s):

Principal Component Analysis ◽

Super Resolution ◽

Principal Component ◽

Component Analysis ◽

Sparse Principal Component Analysis ◽

Resolution Image ◽

Video Reconstruction


Sparse Principal Component Analysis via Rotation and Truncation

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2015.2427451 ◽

2016 ◽

Vol 27 (4) ◽

pp. 875-890 ◽

Cited By ~ 17

Author(s):

Zhenfang Hu ◽

Gang Pan ◽

Yueming Wang ◽

Zhaohui Wu

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Sparse Principal Component Analysis


Reef development and Sea level changes drive Acanthaster Population Expansion in the Indo-Pacific region

10.1101/2020.11.18.388207 ◽

2020 ◽

Author(s):

P.C. Pretorius ◽

T.B. Hoareau

Keyword(s):

Population Size ◽

Sea Level ◽

Environmental Changes ◽

Sequence Data ◽

Demographic History ◽

Population Expansion ◽

Population Level ◽

Sea Level Changes ◽

Reef Development ◽

Curve Fitting Method

Abstract: Molecular clock calibration is central in population genetics as it provides an accurate inference of demographic history, thereby helping with the identification of driving factors of population changes in an ecosystem. This is particularly important for coral reef species that are seriously threatened globally and in need of conservation. Biogeographic events and fossils are the main source of calibration, but these are known to overestimate timing and parameters at population level, which leads to a disconnection between environmental changes and inferred reconstructions. Here, we propose the Last Glacial Maximum (LGM) calibration, which is based on the assumption that reef species went through a bottleneck during the LGM, followed by an early yet marginal increase in population size. We validated the LGM calibration using simulations and genetic inferences based on Extended Bayesian Skyline Plots. Applying it to mitochondrial sequence data of crown-of-thorns starfish Acanthaster spp., we obtained mutation rates that were higher than phylogenetically based calibrations and varied among populations. The timing of the greatest increase in population size differed slightly among populations, but all started between 10 and 20 kya. Using a curve-fitting method, we showed that Acanthaster populations were more influenced by sea-level changes in the Indian Ocean and by reef development in the Pacific Ocean. Our results illustrate that the LGM calibration is robust and can probably provide accurate demographic inferences in many reef species. Application of this calibration has the potential to help identify population drivers that are central for the conservation and management of these threatened ecosystems.


Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA)

Journal of Computational and Graphical Statistics ◽

10.1080/10618600.2021.1987254 ◽

2021 ◽

pp. 1-33

Author(s):

H. Robert Frost

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Sparse Principal Component Analysis


Comparison of dimensionality reduction and clustering methods for SARS-CoV-2 genome

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i4.2803 ◽

2021 ◽

Vol 10 (4) ◽

pp. 2170-2180

Author(s):

Untari N. Wisesty ◽

Tati Rajab Mengko

Keyword(s):

Dimensionality Reduction ◽

Dimensional Reduction ◽

Clustering Algorithm ◽

Sequence Data ◽

Clustering Algorithms ◽

Gaussian Mixture Models ◽

Reduction Process ◽

Principal Component ◽

Gaussian Mixture ◽

Clustering Methods

This paper analyzes SARS-CoV-2 genome variation by comparing the results of genome clustering using several clustering algorithms and the distribution of sequences in each cluster. The clustering algorithms used are K-means, Gaussian mixture models, agglomerative hierarchical clustering, mean-shift clustering, and DBSCAN. However, clustering algorithms have difficulty grouping very high-dimensional data such as genome data, so a dimensionality reduction step is needed. In this research, dimensionality reduction was carried out using principal component analysis (PCA) and an autoencoder, with three models that produce 2, 10, and 50 features. The main contributions are the dimensionality reduction and clustering scheme for SARS-CoV-2 sequence data and the performance analysis of each experiment for each scheme and the hyperparameters of each method. Based on the experimental results, PCA combined with DBSCAN achieves the highest silhouette score, 0.8770, with three clusters when using two features. However, dimensionality reduction using the autoencoder needs more iterations to converge. In testing with Indonesian sequence data, more than half of the sequences fall into one cluster and the rest are distributed across the other two clusters.
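The silhouette score used above to rank the schemes can be illustrated with a small sketch. It uses the standard definition: for each point, `a` is the mean distance to its own cluster, `b` is the lowest mean distance to any other cluster, and the score is `(b - a) / max(a, b)` averaged over all points. The sketch assumes Euclidean distance and made-up 2-D points standing in for two reduced features. It does not reproduce the paper's data or its reported 0.8770.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            continue  # singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: lowest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0.0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical 2-D embedding: two tight, well-separated groups.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
print(silhouette_score(toy_points, toy_labels))
```

Scores near 1 indicate compact, well-separated clusters, which is why a higher value is read as a better reduction-plus-clustering scheme.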


Integrative sparse principal component analysis

Journal of Multivariate Analysis ◽

10.1016/j.jmva.2018.02.002 ◽

2018 ◽

Vol 166 ◽

pp. 1-16 ◽

Cited By ~ 4

Author(s):

Kuangnan Fang ◽

Xinyan Fan ◽

Qingzhao Zhang ◽

Shuangge Ma

Keyword(s):

Principal Component Analysis ◽

Principal Component ◽

Component Analysis ◽

Sparse Principal Component Analysis


A COMPUTATIONAL FRAMEWORK TO DISCRIMINATE DIFFERENT ANESTHESIA STATES FROM EEG SIGNAL

Biomedical Engineering Applications Basis and Communications ◽

10.4015/s1016237218500205 ◽

2018 ◽

Vol 30 (03) ◽

pp. 1850020 ◽

Cited By ~ 1

Author(s):

Seyyed Abed Hosseini

Keyword(s):

General Anesthesia ◽

Rbf Neural Network ◽

Principal Component ◽

Approximate Entropy ◽

Largest Lyapunov Exponent ◽

Eeg Signal ◽

Wavelet Coefficients ◽

Sparse Principal Component Analysis ◽

Beta Band ◽

Computational Framework

This paper develops a computational framework to classify different anesthesia states, including awake, moderate anesthesia, and general anesthesia, using electroencephalography (EEG) signals. The proposed framework comprises data gathering; preprocessing; appropriate selection of window length by genetic algorithm (GA); feature extraction by approximate entropy (ApEn), Petrosian fractal dimension (PFD), Hurst exponent (HE), largest Lyapunov exponent (LLE), Lempel-Ziv complexity (LZC), correlation dimension (CD), and Daubechies wavelet coefficients; feature normalization; feature selection by non-negative sparse principal component analysis (NSPCA); and classification by radial basis function (RBF) neural network. Because of the small number of samples, a five-fold cross-validation approach is used to validate the results. The GA selects an observation interval of 2.7 s for further assessment. The superior features identified include LZC, ApEn, PFD, HE, the mean value of wavelet coefficients for the beta band, and LLE. The results indicate that the proposed framework can classify different anesthesia states, including awake, moderate anesthesia, and general anesthesia, with an accuracy of 92.07%, 96.18%, and 93.42%, respectively. Therefore, the proposed framework can discriminate different anesthesia states with an average accuracy of 93.89%. Finally, the proposed framework provided a facilitative representation of the brain’s behavior in different states of anesthesia.
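Of the features listed, approximate entropy (ApEn) is simple enough to sketch. The version below follows the standard definition: compare all length-`m` templates under the Chebyshev distance with tolerance `r`, then take the difference of the averaged log match rates for `m` and `m + 1`. The defaults `m=2` and an absolute tolerance `r` are assumptions for illustration. The paper's actual parameters and preprocessing are not given in the abstract.

```python
import math


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a 1-D sequence (self-matches counted)."""
    n = len(series)

    def phi(dim):
        templates = [series[i:i + dim] for i in range(n - dim + 1)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of template a
            matches = sum(
                1
                for b in templates
                if max(abs(x - y) for x, y in zip(a, b)) <= r
            )
            total += math.log(matches / len(templates))
        return total / len(templates)

    return phi(m) - phi(m + 1)


# Hypothetical signals: a flat line and a strictly alternating sequence.
flat = [1.0] * 20
alternating = [1.0, 2.0] * 10
print(approximate_entropy(flat), approximate_entropy(alternating, m=2, r=0.5))
```

Both toy signals are fully predictable, so their ApEn is at or near zero; irregular signals yield larger values, which is what makes the measure usable as a feature for separating EEG states.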
