The Co-regulation Data Harvester for Tetrahymena thermophila: automated high-throughput gene annotation and functional inference in a microbial eukaryote

Mapping Intimacies ◽

10.1101/115816 ◽

2017 ◽

Author(s):

Lev M. Tsypin ◽

Aaron P. Turkewitz

Keyword(s):

Membrane Trafficking ◽

Genome Annotation ◽

Tetrahymena Thermophila ◽

Gene Annotation ◽

Biological Pathways ◽

Genome Wide ◽

Transcriptome Database ◽

Functional Inference ◽

Experimental Findings ◽

Reciprocal Blast

Abstract: Identifying co-regulated genes can provide a useful approach for defining pathway-specific machinery in an organism. To be efficient, this approach relies on thorough genome annotation, which is not available for most organisms with sequenced genomes. Studies in Tetrahymena thermophila, the most experimentally accessible ciliate, have generated a rich transcriptomic database covering many well-defined physiological states. Genes that are involved in the same pathway show significant co-regulation, and screens based on gene co-regulation have identified novel factors in specific pathways, for example in membrane trafficking. However, a limitation has been the relatively sparse annotation of the Tetrahymena genome, making it impractical to approach genome-wide analyses. We have therefore developed an efficient approach to analyze both co-regulation and gene annotation, called the Co-regulation Data Harvester (CDH). The CDH automates identification of co-regulated genes by accessing the Tetrahymena transcriptome database, determines their orthologs in other organisms via reciprocal BLAST searches, and collates the annotations of those orthologs' functions. Inferences drawn from the CDH reproduce and expand upon experimental findings in Tetrahymena. The CDH, which is freely available, represents a powerful new tool for analyzing cell biological pathways in Tetrahymena. Moreover, to the extent that genes and pathways are conserved between organisms, the inferences obtained via the CDH should be relevant, and can be explored, in many other systems.
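The reciprocal BLAST step mentioned in this abstract can be illustrated with a minimal Python sketch of reciprocal-best-hit matching. This is not the CDH's actual code: the function names, the gene identifiers, and the bit scores are hypothetical, and the input is assumed to be already-parsed BLAST hits (query mapped to a list of subject and score pairs) rather than raw BLAST output.

```python
from typing import Dict, List, Tuple

# Parsed BLAST results: query ID -> list of (subject ID, bit score).
# This structure is an assumption made for illustration only.
Hits = Dict[str, List[Tuple[str, float]]]


def best_hit(hits: Hits) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    return {
        query: max(subjects, key=lambda pair: pair[1])[0]
        for query, subjects in hits.items()
        if subjects  # skip queries with no hits
    }


def reciprocal_best_hits(forward: Hits, reverse: Hits) -> List[Tuple[str, str]]:
    """Return (query, subject) pairs that are each other's best hit."""
    fwd = best_hit(forward)
    rev = best_hit(reverse)
    return sorted((q, s) for q, s in fwd.items() if rev.get(s) == q)


# Hypothetical gene IDs and scores, not real data.
forward = {
    "TTHERM_A": [("HsGene1", 250.0), ("HsGene2", 90.0)],
    "TTHERM_B": [("HsGene2", 120.0)],
}
reverse = {
    "HsGene1": [("TTHERM_A", 240.0)],
    "HsGene2": [("TTHERM_A", 95.0), ("TTHERM_B", 80.0)],
}

pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In this toy input only TTHERM_A and HsGene1 pick each other as best hit, so only that pair is kept as a candidate ortholog; TTHERM_B is dropped because its best hit, HsGene2, points back to a different gene.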


Genetic analysis of amyotrophic lateral sclerosis identifies contributing pathways and cell types

Science Advances ◽

10.1126/sciadv.abd9036 ◽

2021 ◽

Vol 7 (3) ◽

pp. eabd9036

Author(s):

Sara Saez-Atienzar ◽

Sara Bandres-Ciga ◽

Rebekah G. Langston ◽

Jonggeol J. Kim ◽

Shing Wan Choi ◽

...

Keyword(s):

Amyotrophic Lateral Sclerosis ◽

Membrane Trafficking ◽

Molecular Mechanisms ◽

Cell Types ◽

Polygenic Risk Score ◽

Genome Wide ◽

Genome Wide Data ◽

Data Driven Approach ◽

Single Nucleus ◽

Lateral Sclerosis

Despite the considerable progress in unraveling the genetic causes of amyotrophic lateral sclerosis (ALS), we do not fully understand the molecular mechanisms underlying the disease. We analyzed genome-wide data involving 78,500 individuals using a polygenic risk score approach to identify the biological pathways and cell types involved in ALS. This data-driven approach identified multiple aspects of the biology underlying the disease that resolved into broader themes, namely, neuron projection morphogenesis, membrane trafficking, and signal transduction mediated by ribonucleotides. We also found that genomic risk in ALS maps consistently to GABAergic interneurons and oligodendrocytes, as confirmed in human single-nucleus RNA-seq data. Using two-sample Mendelian randomization, we nominated six differentially expressed genes (ATG16L2, ACSL5, MAP1LC3A, MAPKAPK3, PLXNB2, and SCFD1) within the significant pathways as relevant to ALS. We conclude that the disparate genetic etiologies of this fatal neurological disease converge on a smaller number of final common pathways and cell types.


Uncovering transcriptional dark matter via gene annotation independent single-cell RNA sequencing analysis

Nature Communications ◽

10.1038/s41467-021-22496-3 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Michael F. Z. Wang ◽

Madhav Mantri ◽

Shao-Pei Chou ◽

Gaetano J. Scuderi ◽

David W. McKellar ◽

...

Keyword(s):

Single Cell ◽

Genome Annotation ◽

Gene Annotation ◽

Active Regions ◽

Sequencing Analysis ◽

Biologically Relevant ◽

Mole Rat ◽

Genome Annotations ◽

Cell Expression ◽

High Quality Genome

Abstract: Conventional scRNA-seq expression analyses rely on the availability of a high-quality genome annotation. Yet, as we show here with scRNA-seq experiments and analyses spanning human, mouse, chicken, mole rat, lemur and sea urchin, genome annotations are often incomplete, in particular for organisms that are not routinely studied. To overcome this hurdle, we created a scRNA-seq analysis routine that recovers biologically relevant transcriptional activity beyond the scope of the best available genome annotation by performing scRNA-seq analysis on any region in the genome for which transcriptional products are detected. Our tool generates a single-cell expression matrix for all transcriptionally active regions (TARs), performs single-cell TAR expression analysis to identify biologically significant TARs, and then annotates TARs using gene homology analysis. This procedure uses single-cell expression analyses as a filter to direct annotation efforts to biologically significant transcripts and thereby uncovers biology to which scRNA-seq would otherwise be in the dark.


Genome-wide association analysis of chickpea germplasms differing for salinity tolerance based on DArTseq markers

PLoS ONE ◽

10.1371/journal.pone.0260709 ◽

2021 ◽

Vol 16 (12) ◽

pp. e0260709

Author(s):

Shaimaa Mahmoud Ahmed ◽

Alsamman Mahmoud Alsamman ◽

Abdulqader Jighly ◽

Mohamed Hassan Mubarak ◽

Khaled Al-Shamaa ◽

...

Keyword(s):

Association Analysis ◽

Salinity Tolerance ◽

Field Experiments ◽

Agricultural Research ◽

Gene Annotation ◽

Genome Wide Association ◽

Grain Legume ◽

Genome Wide Association Analysis ◽

New Genes ◽

Genome Wide

Soil salinity is a significant abiotic stress that severely limits global crop production. Chickpea (Cicer arietinum L.) is an important grain legume that plays a substantial role in nutritional food security, especially in the developing world. This study used a chickpea population collected from the International Center for Agricultural Research in the Dry Areas (ICARDA) genebank using the focused identification of germplasm strategy. The germplasm included 186 genotypes with broad Asian and African origins and was genotyped with 1856 DArTseq markers. We conducted phenotyping for salinity in the field (Arish, Sinai, Egypt) and greenhouse hydroponic experiments at 100 mM NaCl concentration. Based on the performance in both hydroponic and field experiments, we identified seven genotypes from Azerbaijan and Pakistan (IGs: 70782, 70430, 70764, 117703, 6057, 8447, and 70249) as potential sources for high salinity tolerance. Multi-trait genome-wide association analysis (mtGWAS) detected one locus on chromosome Ca4 at 10618070 bp associated with salinity tolerance under hydroponic and field conditions. In addition, we located another locus specific to the hydroponic system on chromosome Ca2 at 30537619 bp. Gene annotation analysis revealed the location of rs5825813 within the Embryogenesis-associated protein (EMB8-like), while the location of rs5825939 is within the Ribosomal Protein Large P0 (RPLP0). Utilizing such markers in practical breeding programs can effectively improve the adaptability of current chickpea cultivars in saline soil. Moreover, researchers can use our markers to facilitate the incorporation of new genes into commercial cultivars.


Molecular Stratification of Chronic Kidney Disease

10.1101/2021.09.09.21263234 ◽

2021 ◽

Author(s):

Anna Reznichenko ◽

Viji Nair ◽

Sean Eddy ◽

Mark Tomilo ◽

Timothy Slidel ◽

...

Keyword(s):

Chronic Kidney Disease ◽

Kidney Disease ◽

Cell Types ◽

Biological Pathways ◽

Self Organizing Maps ◽

Molecular Stratification ◽

Indirect Measures ◽

Genome Wide ◽

Neural Network Algorithm

Current classification of chronic kidney disease (CKD) into stages based on the indirect measures of kidney functional state, estimated glomerular filtration rate and albuminuria, is agnostic to the heterogeneity of underlying etiologies, histopathology, and molecular processes. We used genome-wide transcriptomics from patients' kidney biopsies, directly reflecting kidney biological processes, to stratify patients from three independent CKD cohorts. Unsupervised Self-Organizing Maps (SOM), an artificial neural network algorithm, assembled CKD patients into four novel subgroups, molecular categories, based on the similarity of their kidney transcriptomics profiles. The unbiased molecular categories were present across CKD stages and histopathological diagnoses, highlighting heterogeneity of conventional clinical subgroups at the molecular level. CKD molecular categories were distinct in terms of biological pathways, transcriptional regulation and associated kidney cell types, indicating that the molecular categorization is founded on biologically meaningful mechanisms. Importantly, our results revealed that not all biological pathways are equally activated in all patients; instead, different pathways could be more dominant in different subgroups and thereby differentially influence disease progression and outcomes. This first kidney-centric unbiased categorization of CKD paves the way to an integrated clinical, morphological and molecular diagnosis. This is a key step towards enabling precision medicine for this heterogeneous condition with the potential to advance biological understanding, clinical management, and drug development, as well as establish a roadmap for molecular reclassification of CKD and other complex diseases.


A Tool to Build Up-To-Date Gene Annotations for Affymetrix Microarrays

Genomics and Computational Biology ◽

10.18547/gcb.2017.vol3.iss2.e38 ◽

2017 ◽

Vol 3 (2) ◽

pp. 38 ◽

Cited By ~ 1

Author(s):

Vladislava Milchevskaya ◽

Grischa Tödt ◽

Toby James Gibson

Keyword(s):

Statistical Power ◽

Gene Annotation ◽

Clinical Diagnostics ◽

Specific Gene ◽

Microarray Probe ◽

New Genes ◽

Affymetrix Microarrays ◽

Genome Wide ◽

Definition Of ◽

Genome Annotations

Genome-wide expression profiling and genotyping are widely applied in functional genomics research, ranging from stem cell studies to cancer, in drug response studies, and in clinical diagnostics. The Affymetrix GeneChip microarrays represent the most popular platform for such assays. Nevertheless, due to rapid and continuous improvement of the knowledge about the genome, the definitions of many genes and transcripts change, and new genes are discovered. Thus the original probe information is outdated for a number of Affymetrix platforms and needs to be redefined. It has been demonstrated that accurate probe set definition improves both the coverage of the gene expression analysis and its statistical power. Therefore we developed a method that incorporates the most recent genome annotations into the annotation of the microarray probe sets, using tools from next-generation sequencing. Additionally, our method allows project-specific gene annotation models to be built quickly and microarray data to be compared with RNA-seq data.


Identification and classification of cis-regulatory elements in the amphipod crustacean Parhyale hawaiensis

10.1101/2021.09.16.460328 ◽

2021 ◽

Author(s):

Dennis A Sun ◽

Nipam H Patel

Keyword(s):

Gene Expression ◽

Genome Annotation ◽

Time Course ◽

Regulatory Elements ◽

Rna Seq ◽

Genome Wide ◽

Long Read ◽

Amphipod Crustacean ◽

Accessible Chromatin

Abstract: Emerging research organisms enable the study of biology that cannot be addressed using classical “model” organisms. The development of novel data resources can accelerate research in such animals. Here, we present new functional genomic resources for the amphipod crustacean Parhyale hawaiensis, facilitating the exploration of gene regulatory evolution using this emerging research organism. We use Omni-ATAC-Seq, an improved form of the Assay for Transposase-Accessible Chromatin coupled with next-generation sequencing (ATAC-Seq), to identify accessible chromatin genome-wide across a broad time course of Parhyale embryonic development. This time course encompasses many major morphological events, including segmentation, body regionalization, gut morphogenesis, and limb development. In addition, we use short- and long-read RNA-Seq to generate an improved Parhyale genome annotation, enabling deeper classification of identified regulatory elements. We leverage a variety of bioinformatic tools to discover differential accessibility, predict nucleosome positioning, infer transcription factor binding, cluster peaks based on accessibility dynamics, classify biological functions, and correlate gene expression with accessibility. Using a Minos transposase reporter system, we demonstrate the potential to identify novel regulatory elements using this approach, including distal regulatory elements. This work provides a platform for the identification of novel developmental regulatory elements in Parhyale, and offers a framework for performing such experiments in other emerging research organisms.

Primary Findings:
- Omni-ATAC-Seq identifies cis-regulatory elements genome-wide during crustacean embryogenesis
- Combined short- and long-read RNA-Seq improves the Parhyale genome annotation
- ImpulseDE2 analysis identifies dynamically regulated candidate regulatory elements
- NucleoATAC and HINT-ATAC enable inference of nucleosome occupancy and transcription factor binding
- Fuzzy clustering reveals peaks with distinct accessibility and chromatin dynamics
- Integration of accessibility and gene expression reveals possible enhancers and repressors
- Omni-ATAC can identify known and novel regulatory elements


The Medicago truncatula Transcriptome Database MtExpress: Genome-Wide Expression Profiles at Your Fingertips

Plant and Cell Physiology ◽

10.1093/pcp/pcab144 ◽

2021 ◽

Author(s):

Helge Küster

Keyword(s):

Medicago Truncatula ◽

Expression Profiles ◽

Genome Wide ◽

Transcriptome Database ◽

Genome Wide Expression


A genome-wide pathway enrichment analysis identifies brain region related biological pathways associated with intelligence

Psychiatry Research ◽

10.1016/j.psychres.2018.07.029 ◽

2018 ◽

Vol 268 ◽

pp. 238-242 ◽

Cited By ~ 2

Author(s):

Yanan Du ◽

Yujie Ning ◽

Yan Wen ◽

Li Liu ◽

Xiao Liang ◽

...

Keyword(s):

Brain Region ◽

Enrichment Analysis ◽

Biological Pathways ◽

Pathway Enrichment Analysis ◽

Pathway Enrichment ◽

Genome Wide ◽

A Genome


COMP-04. MODELING PRECISION ONCOLOGY FOR GLIOBLASTOMA THROUGH INTEGRATION OF DESCRIPTIVE, FUNCTIONAL, AND NETWORK-BASED GENOMICS

Neuro-Oncology ◽

10.1093/neuonc/noz175.247 ◽

2019 ◽

Vol 21 (Supplement_6) ◽

pp. vi61-vi62

Author(s):

Pia Hoellerbauer ◽

Megan Kufeld ◽

Sonali Arora ◽

Emily Girard ◽

James Olson ◽

...

Keyword(s):

Membrane Trafficking ◽

Mitochondrial Protein ◽

Survival Rates ◽

Mitotic Catastrophe ◽

Patient Specific ◽

Specific Gene ◽

Precision Oncology ◽

Genome Wide ◽

F Box Protein ◽

Phosphatidylinositol 4

Abstract: Precision oncology is largely based on the notion that identification and targeting of oncogenic drivers will lead to improved clinical outcomes. However, the promise of precision oncology awaits to be fulfilled for many cancers, including Glioblastoma (GBM), where identification of oncogenic drivers has yet to improve survival rates. Here, we have attempted to systematically identify GBM vulnerabilities by performing genome-wide CRISPR-Cas9 lethality screens in patient-derived GBM stem-like cells (GSCs). In validation studies, we comprehensively retested GSC-specific hits in multiple GSC isolates, which were also genomically profiled (e.g. RNA-seq, exome-seq, CNV), and further integrated these data with CRISPR-Cas9 lethality screens from over 500 human cell lines from the Broad Institute’s CRISPR Avana dataset. As a result, we have begun making GBM dependency predictions and functional associations for top scoring hits, including: tumor developmental subtype; loss of functional redundancy with other genes/proteins; cancer-specific subnetworks of genes involved in mitochondrial protein turnover and membrane trafficking; and genes of unknown function essential for a subset of GBMs. A few examples of these categories include the following scenarios. We find ADAR (Adenosine Deaminase RNA Specific) gene dependency is associated with the mesenchymal GBM subtype. The EFR3A gene, which has roles in maintaining active pools of phosphatidylinositol 4-kinase, appears required when the expression of its paralog EFR3B is low or absent in tumor cells. The F-box protein-encoding gene FBXO42 appears non-essential to most human cell lines and neural stem cells, but when knocked out in sensitive GSCs causes mitotic arrest, mitotic catastrophe, and cell death. While still a work in progress, we hope to use these results as a foundation for exploring and illuminating patient-specific molecular vulnerabilities for brain tumors. The results also underscore the need for integration of functional genetic approaches, where gene activities are inhibited, into precision oncology paradigms.


An evaluation of noncoding genome annotation tools through enrichment analysis of 15 genome-wide association studies

Briefings in Bioinformatics ◽

10.1093/bib/bbx131 ◽

2017 ◽

Vol 20 (3) ◽

pp. 995-1003 ◽

Cited By ~ 2

Author(s):

Boyang Li ◽

Qiongshi Lu ◽

Hongyu Zhao

Keyword(s):

Genome Annotation ◽

Association Studies ◽

Enrichment Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide
