CALINCA—A Novel Pipeline for the Identification of lncRNAs in Podocyte Disease

Sweta Talyan; Samantha Filipów; Michael Ignarski; Magdalena Smieszek; He Chen; Lucas Kühne; Linus Butt; Heike Göbel; K. Johanna R. Hoyer-Allo; Felix C. Koehler; Janine Altmüller; Paul Brinkkötter; Bernhard Schermer; Thomas Benzing; Martin Kann; Roman-Ulrich Müller; Christoph Dieterich

doi:10.3390/cells10030692

Cells; 2021; Vol 10 (3); pp. 692

Keyword(s): Cell Biology; Mammalian Cells; De Novo; Depth Information; Gene Products; Classical Analysis; Protein Coding; Bioinformatic Pipeline; Non Coding RNAs; Filtration Unit

Diseases of the renal filtration unit, the glomerulus, are the most common cause of chronic kidney disease. Podocytes are the pivotal cell type for the function of this filter, and focal-segmental glomerulosclerosis (FSGS) is a classic example of a podocytopathy leading to proteinuria and glomerular scarring. Currently, no targeted treatment of FSGS is available. This lack of therapeutic strategies is explained by a limited understanding of the defects in podocyte cell biology leading to FSGS. To date, most studies in the field have focused on protein-coding genes and their gene products. However, more than 80% of all transcripts produced by mammalian cells are non-coding. Among these, long non-coding RNAs (lncRNAs) are a relatively novel class of transcripts and have not been systematically studied in FSGS to date. Appropriate tools to facilitate lncRNA research for the renal scientific community are urgently required, because lncRNA analysis poses a number of challenges that classical pipelines optimized for coding RNA expression analysis do not address. Here, we present the bioinformatic pipeline CALINCA as a solution to this problem. CALINCA automatically analyzes datasets from murine FSGS models and quantifies both annotated and de novo assembled lncRNAs. In addition, the tool provides in-depth information on the podocyte specificity of these lncRNAs, as well as their evolutionary conservation and expression in human datasets, making this pipeline a crucial basis for lncRNA studies in FSGS.


Faculty Opinions recommendation of Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature; doi:10.3410/f.717960736.793463655; 2012

Author(s): François Cambien

Keyword(s): De Novo; Protein Coding; Protein Coding Genes; Non Coding RNAs


PHASIS: A computational suite for de novo discovery and characterization of phased, siRNA-generating loci and their miRNA triggers

doi:10.1101/158832; 2017; Cited By ~ 7

Author(s): Atul Kakrana; Pingchuan Li; Parth Patel; Reza Hammond; Deepti Anand; ...

Keyword(s): De Novo; Sequencing Data; Protein Coding; Secondary siRNAs; Integrated Methods; Non Coding RNAs

Phased, secondary siRNAs (phasiRNAs) are found widely in plants, from protein-coding transcripts and long, non-coding RNAs; animal piRNAs are also phased. Integrated methods characterizing “PHAS” loci are unavailable, and existing methods are quite limited and inefficient in handling large volumes of sequencing data. The PHASIS suite described here provides complete tools for the computational characterization of PHAS loci, with an emphasis on plants, in which these loci are numerous. Benchmarked comparisons demonstrate that PHASIS is sensitive, highly scalable and fast. Importantly, PHASIS eliminates the requirement of a sequenced genome and PARE/degradome data for discovery of phasiRNAs and their miRNA triggers.


Expanding the Chinese hamster ovary cell long non-coding RNA transcriptome using RNASeq

doi:10.1101/863241; 2019

Author(s): Krishna Motheramgari; Ricardo Valdés-Bango Curell; Ioanna Tzani; Clair Gallagher; Marina Castro Rivadeneyra; ...

Keyword(s): DNA Sequences; Chinese Hamster Ovary; CHO Cells; Cell Biology; Chinese Hamster; Ovary Cell; CHO Cell; Transcriptomic Response; Protein Coding; Non Coding RNAs

Our ability to study Chinese hamster ovary (CHO) cell biology has been revolutionised over the last decade by the development of next-generation sequencing and the publication of reference DNA sequences for CHO cells and the Chinese hamster. RNA sequencing has not only enabled the association of transcript expression with bioreactor conditions and desirable bioprocess phenotypes but has also played a key role in the characterisation of protein-coding and small non-coding RNAs. The annotation of long non-coding RNAs, and therefore our understanding of their role in CHO cell biology, has been limited to date. In this manuscript, we use high-resolution RNASeq data to more than double the number of annotated lncRNA transcripts for the CHOK1 genome. In addition, the use of strand-specific sequencing enabled the identification of more than 1,000 new lncRNAs located antisense to protein-coding genes. The utility of monitoring lncRNA expression is demonstrated through an analysis of the transcriptomic response to a reduction of cell culture temperature and the identification, for the first time in CHO cells, of simultaneous sense/antisense differential expression. To enable further studies of lncRNAs, the transcripts annotated in this study have been made available to the CHO cell biology community.


Androgen-Driven Fusion Genes and Chimeric Transcripts in Prostate Cancer

Frontiers in Cell and Developmental Biology; doi:10.3389/fcell.2021.623809; 2021; Vol 9

Author(s): Mauro Scaravilli; Sonja Koivukoski; Leena Latonen

Keyword(s): Prostate Cancer; Molecular Mechanisms; De Novo; Fusion Gene; Fusion Genes; Protein Coding; Leading Role; Number Of Patients; Androgen Regulation; Non Coding RNAs

Androgens are steroid hormones governing male reproductive development and function. As such, androgens and the key mediator of their effects, the androgen receptor (AR), have a leading role in many diseases. Prostate cancer is a major disease in which AR and its transcription factor function affect a significant number of patients worldwide. While disease-related AR-driven transcriptional programs are connected to the presence and activity of the receptor itself, cancer cells also exploit novel modes of transcriptional regulation by androgens. One of the most intriguing and ingenious mechanisms is to bring previously unconnected genes under the control of AR. Most often this occurs through genetic rearrangements resulting in fusion genes, in which an androgen-regulated promoter area is combined with a protein-coding area of a gene previously unaffected by androgens. These gene fusions are distinctly frequent in prostate cancer compared to other common solid tumors, a phenomenon still requiring an explanation. Interestingly, another mode of connecting androgen regulation to a previously unaffected gene product also exists, via transcriptional read-through mechanisms. Furthermore, androgen regulation of fusion genes and transcripts is not limited to protein-coding genes. Pseudogenes and non-coding RNAs (ncRNAs), including long non-coding RNAs (lncRNAs), can also be affected by androgens, producing de novo functions. In this review, we discuss the prevalence, molecular mechanisms, and functional evidence for androgen-regulated prostate cancer fusion genes and transcripts. We also discuss the clinical relevance of, in particular, the most common prostate cancer fusion gene, TMPRSS2-ERG, and present open questions about prostate cancer fusions that require further investigation.


Hominoid-Specific De Novo Protein-Coding Genes Originating from Long Non-Coding RNAs

PLoS Genetics; doi:10.1371/journal.pgen.1002942; 2012; Vol 8 (9); pp. e1002942; Cited By ~ 98

Author(s): Chen Xie; Yong E. Zhang; Jia-Yu Chen; Chu-Jun Liu; Wei-Zhen Zhou; ...

Keyword(s): De Novo; Protein Coding; Protein Coding Genes; Non Coding RNAs


Unique genomic features and deeply-conserved functions of long non-coding RNAs in the Cancer LncRNA Census (CLC)

doi:10.1101/152769; 2017; Cited By ~ 4

Author(s): Joana Carlevaro-Fita; Andrés Lanzós; Lars Feuerbach; Chen Hong; David Mas-Ponte; ...

Keyword(s): Cancer Progression; Cancer Genomics; De Novo; Cancer Type; Cancer Genes; Driver Genes; Protein Coding; Evidence Type; Non Coding RNAs; Functional Screens

Long non-coding RNAs (lncRNAs) that drive tumorigenesis are a growing focus of cancer genomics studies. To facilitate further discovery, we have created the “Cancer LncRNA Census” (CLC), a manually-curated and strictly-defined compilation of lncRNAs with causative roles in cancer. CLC has two principal applications: first, as a resource for training and benchmarking de novo identification methods; and second, as a dataset for studying the fundamental properties of these genes. CLC Version 1 comprises 122 lncRNAs implicated in 29 distinct cancers. LncRNAs are included based on functional or genetic evidence for causative roles in cancer progression. All belong to the GENCODE reference annotation, to enable integration across projects and datasets. For each entry, the evidence type, biological activity (oncogene or tumour suppressor), source reference and cancer type are recorded. Supporting its usefulness, CLC genes are significantly enriched amongst de novo predicted driver genes from PCAWG. CLC genes are distinguished from other lncRNAs by a series of features consistent with biological function, including gene length, high expression and sequence conservation of both exons and promoters. We identify a trend for CLC genes to be co-localised with known protein-coding cancer genes along the human genome. Finally, by integrating data from transposon-mutagenesis functional screens, we show that mouse orthologues of CLC genes tend also to be cancer genes. Thus CLC represents a valuable resource for research into long non-coding RNAs in cancer. Their evolutionary and genomic properties have implications for understanding disease mechanisms and point to conserved functions across ~80 million years of evolution.


Assembly and validation of conserved long non-coding RNAs in the ruminant transcriptome

doi:10.1101/253997; 2018

Author(s): Stephen J. Bush; Charity Muriuki; Mary E. B. McCulloch; Iseabail L. Farquhar; Emily L. Clark; ...

Keyword(s): Developmental Stages; De Novo; Expression Profiles; RNA Seq; Protein Coding; Stochastic Sampling; Single Dataset; Non Coding RNAs; Low Levels; Species Mapping

mRNA-like long non-coding RNAs (lncRNA) are a significant component of mammalian transcriptomes, although most are expressed only at low levels, with high tissue-specificity and/or at specific developmental stages. In many cases, therefore, lncRNA detection by RNA-sequencing (RNA-seq) is compromised by stochastic sampling. To account for this and create a catalogue of ruminant lncRNA, we compared de novo assembled lncRNA derived from large RNA-seq datasets in transcriptional atlas projects for sheep and goats with previous lncRNA assembled in cattle and human. Few lncRNA could be reproducibly assembled from a single dataset, even with deep sequencing of the same tissues from multiple animals. Furthermore, there was little sequence overlap between lncRNA assembled from pooled RNA-seq data. We combined positional conservation (synteny) with cross-species mapping of candidate lncRNA to identify a consensus set of ruminant lncRNA and then used the RNA-seq data to demonstrate detectable and reproducible expression in each species. The majority of lncRNA were encoded by single exons, and expressed at < 1 TPM. In sheep, 20-30% of lncRNA had expression profiles significantly correlated with neighbouring protein-coding genes, suggesting association with enhancers. Alongside substantially expanding the ruminant lncRNA repertoire, the outcomes of our analysis demonstrate that stochastic sampling can be partly overcome by combining RNA-seq datasets from related species. This has practical implications for the future discovery of lncRNA in other species.


High-quality genome assemblies uncover caste-specific long non-coding RNAs in ants

doi:10.1101/155119; 2017

Author(s): Emily J. Shields; Roberto Bonasio

Keyword(s): Single Molecule; Behavioral Plasticity; De Novo; High Quality; Camponotus Floridanus; Protein Coding; Long Reads; Non Coding RNAs; And Behavior; Genome Assemblies

Ants are an emerging model system for neuroepigenetics, as embryos with virtually identical genomes develop into different adult castes that display strikingly different physiology, morphology, and behavior. Although a number of ant genomes have been sequenced to date, their draft quality is an obstacle to sophisticated analyses of epigenetic gene regulation. Using long reads generated with Pacific Biosciences single-molecule real-time sequencing, we have reassembled de novo high-quality genomes for two ant species: Camponotus floridanus and Harpegnathos saltator. The long reads allowed us to span large repetitive regions and join sequences previously found in separate scaffolds, leading to comprehensive and accurate protein-coding annotations that facilitated the identification of a Gp-9-like gene as differentially expressed in Harpegnathos castes. The new assemblies also enabled us to annotate long non-coding RNAs for the first time in ants, revealing several that were specifically expressed during Harpegnathos development and in the brains of different castes. These upgraded genomes, along with the new coding and non-coding annotations, will aid future efforts to identify epigenetic mechanisms of phenotypic and behavioral plasticity in ants.


Regulatory Potential of Long Non-Coding RNAs (lncRNAs) in Boar Spermatozoa with Good and Poor Freezability

Life; doi:10.3390/life10110300; 2020; Vol 10 (11); pp. 300

Author(s): Leyland Fraser; Łukasz Paukszto; Anna Mańkowska; Paweł Brym; Przemysław Gilun; ...

Keyword(s): Target Genes; De Novo; Transcriptome Assembly; Expression Profiles; Differentially Expressed; Biological Processes; Protein Coding; Protein Coding Genes; Potential Target; Non Coding RNAs

Long non-coding RNAs (lncRNAs) are suggested to play an important role in sperm biological processes. We performed de novo transcriptome assembly to characterize lncRNAs in spermatozoa and to investigate the role of the potential target genes of the differentially expressed lncRNAs (DElncRNAs) in sperm freezability. We detected approximately 4007 DElncRNAs, which were differentially expressed in spermatozoa from boars classified as having good and poor semen freezability (GSF and PSF, respectively). Most of the DElncRNAs were upregulated in boars of the PSF group and appeared to significantly affect the sperm’s response to the cryopreservation conditions. Furthermore, we predicted that the potential target genes were regulated by DElncRNAs in cis or trans. DElncRNAs of both freezability groups had potential cis- and trans-regulatory effects on different protein-coding genes, such as COX7A2L, TXNDC8 and SOX-7. Gene Ontology (GO) enrichment revealed that the DElncRNA target genes are associated with numerous biological processes, including signal transduction, response to stress, cell death (apoptosis), motility and embryo development. Significant differences between the freezability groups in the expression profiles of the de novo assembled DElncRNAs were confirmed by quantitative real-time PCR analysis. This study reveals the potential effects of the protein-coding target genes of DElncRNAs on sperm functions, which could contribute to further research on their relevance in semen freezability.


De Novo Profiling of Long Non-Coding RNAs Involved in MC-LR-Induced Liver Injury in Whitefish: Discovery and Perspectives

International Journal of Molecular Sciences; doi:10.3390/ijms22020941; 2021; Vol 22 (2); pp. 941

Author(s): Maciej Florczyk; Paweł Brzuzan; Maciej Woźny

Keyword(s): Liver Injury; Molecular Mechanisms; Liver Toxicity; De Novo; Minimum Free Energy; Model Organisms; Coregonus Lavaretus; Protein Coding; Liver Transcriptome; Non Coding RNAs

Microcystin-LR (MC-LR) is a potent hepatotoxin for which a substantial gap in knowledge persists regarding the underlying molecular mechanisms of liver toxicity and injury. Although long non-coding RNAs (lncRNAs) have been extensively studied in model organisms, our knowledge concerning the role of lncRNAs in liver injury is limited. Given that lncRNAs show low levels of sequence conservation, their role becomes even more unclear in non-model organisms without an annotated genome, like whitefish (Coregonus lavaretus). The objective of this study was to discover and profile aberrantly expressed polyadenylated lncRNAs that are involved in MC-LR-induced liver injury in whitefish. Using RNA sequencing (RNA-Seq) data, we de novo assembled a high-quality whitefish liver transcriptome. This enabled us to find 94 differentially expressed (DE) putative evolutionarily conserved lncRNAs, such as MALAT1, HOTTIP, HOTAIR or HULC, and 4429 DE putative novel whitefish lncRNAs, which differed from annotated protein-coding transcripts (PCTs) in terms of minimum free energy, guanine-cytosine (GC) base-pair content and length. Additionally, we identified DE non-coding transcripts that might be 3′ autonomous untranslated regions (3′UTRs) of mRNAs. We found both evolutionarily conserved lncRNAs and novel whitefish lncRNAs that could serve as biomarkers of liver injury.
