Significance of Single-Nucleotide Variants in Long Intergenic Non-protein Coding RNAs

dbMTS: a comprehensive database of putative human microRNA target site SNVs and their functional predictions

10.1101/554485 ◽

2019 ◽

Cited By ~ 2

Author(s):

Chang Li ◽

Michael D. Swartz ◽

Bing Yu ◽

Yongsheng Bai ◽

Xiaoming Liu

Keyword(s):

Target Site ◽

Genetic Mutations ◽

Messenger RNAs ◽

MicroRNA Target ◽

Functional Importance ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Protein Coding ◽

Functional Annotations ◽

Non Coding RNAs

MicroRNAs (miRNAs) are short non-coding RNAs that can repress the expression of protein-coding messenger RNAs (mRNAs) by binding to the 3’UTR of the target. Genetic mutations such as single nucleotide variants (SNVs) in the 3’UTR of the mRNAs can disrupt this regulatory effect. In this study, we present dbMTS, the database for miRNA target site (MTS) SNVs, which includes all potential MTS SNVs in the 3’UTR of the human genome along with hundreds of functional annotations. This database can help studies easily identify putative SNVs that affect miRNA targeting and facilitate the prioritization of their functional importance. dbMTS is freely available at: https://sites.google.com/site/jpopgen/dbNSFP.

Tumour mutations in long noncoding RNAs that enhance cell fitness

10.1101/2021.11.06.467555 ◽

2021 ◽

Author(s):

Roberta Esposito ◽

Andres Lanzos ◽

Taisia Polidori ◽

Hugo Guillen-Ramirez ◽

Bernard Merlin ◽

...

Keyword(s):

Noncoding RNAs ◽

Long Noncoding RNAs ◽

Driver Mutations ◽

Cancer Genes ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Protein Coding ◽

Genomic Features ◽

Coding Regions ◽

LncRNA NEAT1

Tumour DNA contains thousands of single nucleotide variants (SNVs) in non-protein-coding regions, yet it remains unclear which are driver mutations that promote cell fitness. Amongst the most highly mutated non-coding elements are long noncoding RNAs (lncRNAs), which can promote cancer and may be targeted therapeutically. We here searched for evidence that driver mutations may act through alteration of lncRNA function. Using an integrative driver discovery algorithm, we analysed single nucleotide variants (SNVs) from 2583 primary tumours and 3527 metastases to reveal 54 candidate driver lncRNAs (FDR<0.1). Their relevance is supported by enrichment for previously-reported cancer genes and by clinical and genomic features. Using knockdown and transgene overexpression, we show that tumour SNVs in two novel lncRNAs can boost cell fitness. Researchers have noted particularly high yet unexplained mutation rates in the iconic cancer lncRNA, NEAT1. We apply in cellulo mutagenesis by CRISPR-Cas9 to identify vulnerable regions of NEAT1 where SNVs reproducibly increase cell fitness in both transformed and normal backgrounds. In particular, mutations in the 5-prime region of NEAT1 alter ribonucleoprotein assembly and boost the population of subnuclear paraspeckles. Together, this work reveals function-altering somatic lncRNA mutations as a new route to enhanced cell fitness during transformation and metastasis.

ProTECT – Prediction of T-cell Epitopes for Cancer Therapy

10.1101/696526 ◽

2019 ◽

Author(s):

Arjun A. Rao ◽

Ada A. Madejska ◽

Jacob Pfeil ◽

Benedict Paten ◽

Sofie R. Salama ◽

...

Keyword(s):

Somatic Mutations ◽

Cytotoxic T Cells ◽

T Cell Epitopes ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Binding Prediction ◽

Local Cluster ◽

Single Nucleotide ◽

Protein Coding ◽

Cell Therapies

Somatic mutations in cancers affecting protein coding genes can give rise to potentially therapeutic neoepitopes. These neoepitopes can guide Adoptive Cell Therapies (ACTs) and Peptide Vaccines (PVs) to selectively target tumor cells using autologous patient cytotoxic T-cells. Currently, researchers have to independently align their data, call somatic mutations and haplotype the patient’s HLA to use existing neoepitope prediction tools. We present ProTECT, a fully automated, reproducible, scalable, and efficient end-to-end analysis pipeline to identify and rank therapeutically relevant tumor neoepitopes in terms of immunogenicity starting directly from raw patient sequencing data, or from pre-processed data. The ProTECT pipeline encompasses alignment, HLA haplotyping, mutation calling (single nucleotide variants, short insertions and deletions, and gene fusions), peptide:MHC (pMHC) binding prediction, and ranking of final candidates. We demonstrate ProTECT on 326 samples from the TCGA Prostate Adenocarcinoma cohort, and compare it with published tools. ProTECT can be run on a standalone computer, a local cluster, or on a compute cloud using a Mesos backend. ProTECT is highly scalable and can process TCGA data in under 30 minutes per sample when run in large batches. ProTECT is freely available at https://www.github.com/BD2KGenomics/protect.

Consistency of the Tools That Predict the Impact of Single Nucleotide Variants (SNVs) on Gene Functionality: The BRCA1 Gene

Biomolecules ◽

10.3390/biom10030475 ◽

2020 ◽

Vol 10 (3) ◽

pp. 475

Author(s):

Javier Murillo ◽

Flavio Spetale ◽

Serge Guillaume ◽

Pilar Bulacio ◽

Ignacio Garcia Labari ◽

...

Keyword(s):

BRCA1 Gene ◽

Prediction Performance ◽

Single Nucleotide Variants ◽

Consistency Analysis ◽

Single Nucleotide ◽

Protein Coding ◽

Consistency Results ◽

Standard Information ◽

The Impact ◽

Shed Light

Single nucleotide variants (SNVs) occurring in a protein coding gene may disrupt its function in multiple ways. Predicting this disruption has been recognized as an important problem in bioinformatics research. Many tools, hereafter p-tools, have been designed to perform these predictions and many of them are now in common use in scientific research, even in clinical applications. This highlights the importance of understanding the semantics of their outputs. To shed light on this issue, two questions are formulated: (i) do p-tools provide similar predictions? (inner consistency), and (ii) are these predictions consistent with the literature? (outer consistency). To answer these, six p-tools are evaluated with exhaustive SNV datasets from the BRCA1 gene. Two indices, called $K_{all}$ and $K_{strong}$, are proposed to quantify the inner consistency of pairs of p-tools, while the outer consistency is quantified by standard information retrieval metrics. While the inner consistency analysis reveals that most of the p-tools are not consistent with each other, the outer consistency analysis reveals they are characterized by a low prediction performance. Although this result highlights the need to improve the prediction performance of individual p-tools, the inner consistency results pave the way to the systematic design of truly diverse ensembles of p-tools that can overcome the limitations of individual members.
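
As a rough illustration of the outer-consistency idea in the abstract above, the following sketch scores one tool's calls against literature labels using precision, recall, and F1. The abstract only says "standard information retrieval metrics", so the choice of these three metrics, the function name, and the variant identifiers and labels are all assumptions made for illustration; this is not the paper's implementation, and it does not compute the paper's inner-consistency indices.

```python
def retrieval_metrics(predicted, reference):
    """Precision, recall and F1 for the 'deleterious' class.

    predicted, reference: dicts mapping a variant id to True (deleterious)
    or False (benign). Only variants present in both dicts are scored.
    """
    shared = predicted.keys() & reference.keys()
    tp = sum(1 for v in shared if predicted[v] and reference[v])
    fp = sum(1 for v in shared if predicted[v] and not reference[v])
    fn = sum(1 for v in shared if not predicted[v] and reference[v])

    # Guard against empty denominators (no positive calls or no positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical labels for four made-up variants (not real BRCA1 data).
tool_calls = {"v1": True, "v2": True, "v3": False, "v4": False}
literature = {"v1": True, "v2": False, "v3": True, "v4": False}

precision, recall, f1 = retrieval_metrics(tool_calls, literature)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Scoring only the variants shared by both label sets is one possible design choice; a tool that abstains on many variants would need its coverage reported separately.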

Estimation of allele-specific fitness effects across human protein-coding sequences and implications for disease

10.1101/441337 ◽

2018 ◽

Cited By ~ 2

Author(s):

Yi-Fei Huang ◽

Adam Siepel

Keyword(s):

Single Nucleotide Variants ◽

High Coverage ◽

Single Nucleotide ◽

Protein Coding ◽

Human Genomics ◽

Coding Sequences ◽

Genomic Features ◽

Fitness Effects ◽

Machine Learning Model ◽

Allele Specific

A central challenge in human genomics is to understand the cellular, evolutionary, and clinical significance of genetic variants. Here we introduce a unified population-genetic and machine-learning model, called Linear Allele-Specific Selection InferencE (LASSIE), for estimating the fitness effects of all potential single-nucleotide variants, based on polymorphism data and predictive genomic features. We applied LASSIE to 51 high-coverage genome sequences annotated with 33 genomic features, and constructed a map of allele-specific selection coefficients across all protein-coding sequences in the human genome. We show that this map is informative about both human evolution and disease.

Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes

Science ◽

10.1126/science.1219240 ◽

2012 ◽

Vol 337 (6090) ◽

pp. 64-69 ◽

Cited By ~ 1186

Author(s):

Jacob A. Tennessen ◽

Abigail W. Bigham ◽

Timothy D. O’Connor ◽

Wenqing Fu ◽

Eimear E. Kenny ◽

...

Keyword(s):

Protein Function ◽

Complex Traits ◽

Rare Variants ◽

Purifying Selection ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Protein Coding ◽

Protein Coding Genes ◽

Functional Variants ◽

A Minor

As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
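
The rarity criterion quoted in the abstract above (minor allele frequency below 0.5%) can be made concrete with a small sketch. It assumes a biallelic site in diploid individuals, and the genotype counts and function names are hypothetical; it is not the study's variant-calling or filtering pipeline.

```python
RARE_MAF_THRESHOLD = 0.005  # 0.5%, the cutoff quoted in the abstract


def minor_allele_frequency(n_hom_ref, n_het, n_hom_alt):
    """Minor allele frequency at a biallelic site in diploid individuals.

    Arguments are counts of individuals who are homozygous reference,
    heterozygous, and homozygous alternate, respectively.
    """
    total_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    alt_freq = (n_het + 2 * n_hom_alt) / total_alleles
    # The minor allele is whichever of the two alleles is less common.
    return min(alt_freq, 1.0 - alt_freq)


def is_rare(maf, threshold=RARE_MAF_THRESHOLD):
    """True if the minor allele frequency is strictly below the threshold."""
    return maf < threshold


# Hypothetical site: 10 heterozygous carriers among 2440 individuals.
maf = minor_allele_frequency(n_hom_ref=2430, n_het=10, n_hom_alt=0)
print(f"MAF = {maf:.5f}, rare = {is_rare(maf)}")
```

With 2440 individuals there are 4880 alleles, so a variant stays below the 0.5% cutoff only when fewer than about 24.4 copies of the minor allele are observed.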

Faculty Opinions recommendation of Phylogenetic and physicochemical analyses enhance the classification of rare nonsynonymous single nucleotide variants in type 1 and 2 long-QT syndrome.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.717960422.793463950 ◽

2012 ◽

Author(s):

Jeffrey Noebels ◽

Tara Klassen

Keyword(s):

Long QT Syndrome ◽

Single Nucleotide Variants ◽

Long QT ◽

Single Nucleotide ◽

QT Syndrome

Analysis of Inter-Chromosomal Distribution of Disease-Related Genes in Human Genome

Current Protein and Peptide Science ◽

10.2174/1389203721666200426233158 ◽

2020 ◽

Vol 21 (11) ◽

pp. 1068-1077

Author(s):

Xiaochao Sun ◽

Bin Yang ◽

Qunye Zhang

Keyword(s):

Spatial Distribution ◽

Model Organisms ◽

Nucleotide Polymorphisms ◽

Chromosomal Distribution ◽

Single Nucleotide ◽

Protein Coding ◽

Single Chromosome ◽

Deletion Mutations ◽

Protein Coding Genes ◽

Disease Related Genes

Many studies have shown that the spatial distribution of genes within a single chromosome exhibits distinct patterns. However, little is known about the characteristics of inter-chromosomal distribution of genes (including protein-coding genes, processed transcripts and pseudogenes) in different genomes. In this study, we explored these issues using the available genomic data of both human and model organisms. Moreover, we also analyzed the distribution pattern of protein-coding genes that have been associated with 14 common diseases and the insertion/deletion mutations and single nucleotide polymorphisms detected by whole genome sequencing in an acute promyelocytic leukemia patient. We obtained the following novel findings. Firstly, inter-chromosomal distribution of genes displays a nonstochastic pattern and the gene densities in different chromosomes are heterogeneous. This kind of heterogeneity is observed in genomes of both lower and higher species. Secondly, protein-coding genes involved in certain biological processes tend to be enriched in one or a few chromosomes. Our findings have added new insights into our understanding of the spatial distribution of genome and disease-related genes across chromosomes. These results could be useful in improving the efficiency of disease-associated gene screening studies by targeting specific chromosomes.

Single-Nucleotide Variants in microRNAs Sequences or in their Target Genes Might Influence the Risk of Epilepsy: A Review

Cellular and Molecular Neurobiology ◽

10.1007/s10571-021-01058-7 ◽

2021 ◽

Author(s):

Renata Parissi Buainain ◽

Matheus Negri Boschiero ◽

Bruno Camporeze ◽

Paulo Henrique Pires de Aguiar ◽

Fernando Augusto Lima Marson ◽

...

Keyword(s):

Target Genes ◽

Single Nucleotide Variants ◽

Single Nucleotide

Combination of Genome-Wide Polymorphisms and Copy Number Variations of Pharmacogenes in Koreans

Journal of Personalized Medicine ◽

10.3390/jpm11010033 ◽

2021 ◽

Vol 11 (1) ◽

pp. 33

Author(s):

Nayoung Han ◽

Jung Mi Oh ◽

In-Wha Kim

Keyword(s):

Copy Number ◽

Genome Wide Association Study ◽

Copy Number Gain ◽

Copy Number Variations ◽

Gene Gain ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Haplotype Blocks ◽

Genome Wide ◽

Control And Prevention

For predicting phenotypes and executing precision medicine, combination analysis of single nucleotide variants (SNVs) genotyping with copy number variations (CNVs) is required. The aim of this study was to discover SNVs or common copy CNVs and examine the combined frequencies of SNVs and CNVs in pharmacogenes using the Korean genome and epidemiology study (KoGES), a consortium project. The genotypes (N = 72,299) and CNV data (N = 1000) were provided by the Korean National Institute of Health, Korea Centers for Disease Control and Prevention. The allele frequencies of SNVs, CNVs, and combined SNVs with CNVs were calculated and haplotype analysis was performed. CYP2D6 rs1065852 (c.100C>T, p.P34S) was the most common variant allele (48.23%). A total of 8454 haplotype blocks in 18 pharmacogenes were estimated. DMD ranked the highest in frequency for gene gain (64.52%), while TPMT ranked the highest in frequency for gene loss (51.80%). Copy number gain of CYP4F2 was observed in 22 subjects; 13 of those subjects were carriers with CYP4F2*3 gain. In the case of TPMT, approximately one-half of the participants (N = 308) had loss of the TPMT*1*1 diplotype. The frequencies of SNVs and CNVs in pharmacogenes were determined using the Korean cohort-based genome-wide association study.
