RNA-seq based analysis of population structure within the maize inbred B73

Mapping Intimacies ◽

10.1101/043513 ◽

2016 ◽

Author(s):

Zhikai Liang ◽

James C Schnable

Keyword(s):

Large Scale ◽

Reference Genome ◽

Data Sets ◽

Rna Seq ◽

Research Groups ◽

Independent Research ◽

Sequencing Project ◽

Maize Genetics ◽

Outcrossing Species ◽

The Relationship

B73 is a variety of maize (Zea mays ssp. mays) widely used in genetic, genomic, and phenotypic research around the world. B73 also served as the reference genotype for the original maize genome sequencing project. The advent of large-scale RNA-sequencing as a method of measuring gene expression presents a unique opportunity to assess the level of relatedness among individuals identified as variety B73. The levels of haplotype conservation and divergence across the genome were assessed using 27 RNA-seq data sets from 20 independent research groups in three countries. Several clearly distinct clades were identified among putatively B73 samples. A number of these clades were defined by the presence of clearly delimited genomic blocks containing a haplotype that did not match the published B73 reference genome. In a number of cases the relationship among B73 samples generated by different research groups recapitulated mentor/mentee relationships within the maize genetics community. A number of regions with distinct, dissimilar haplotypes were identified in our study. However, considering the age of the B73 accession (greater than 40 years) and the challenges of maintaining isogenic lines of a naturally outcrossing species, a strikingly high overall level of conservation was exhibited among B73 samples from around the globe.
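As a rough illustration of how relatedness among nominally identical samples might be scored, the sketch below computes pairwise haplotype divergence as the fraction of mismatching genotype calls at sites covered in both samples. The sample names, the toy calls, and the helper functions `mismatch_fraction` and `pairwise_distances` are hypothetical; this is not the pipeline used in the study, which the abstract does not describe.

```python
from itertools import combinations

def mismatch_fraction(a, b):
    """Fraction of sites with differing calls, skipping sites missing (None) in either sample."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")  # no overlapping coverage, distance undefined
    return sum(x != y for x, y in shared) / len(shared)

def pairwise_distances(calls):
    """Return {(sample_1, sample_2): mismatch fraction} for every unordered sample pair."""
    return {
        (s, t): mismatch_fraction(calls[s], calls[t])
        for s, t in combinations(sorted(calls), 2)
    }

# Hypothetical calls at six SNP sites:
# 0 = matches the reference allele, 1 = alternate allele, None = no read coverage
calls = {
    "lab_A": [0, 0, 0, 0, 0, 0],
    "lab_B": [0, 0, 1, 1, 0, None],
    "lab_C": [0, 0, 1, 1, 0, 0],
}

distances = pairwise_distances(calls)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

In this toy input, lab_B and lab_C share a non-reference block at sites 3 and 4, so they sit close to each other and further from lab_A. A distance matrix of this kind could then feed a clustering step to recover clades.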


Detection of Alu Exonization Events in Human Frontal Cortex From RNA-Seq Data

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2021.727537 ◽

2021 ◽

Vol 8 ◽

Author(s):

Liliana Florea ◽

Lindsay Payer ◽

Corina Antonescu ◽

Guangyu Yang ◽

Kathleen Burns

Keyword(s):

Frontal Cortex ◽

Large Scale ◽

Reference Genome ◽

Expression Patterns ◽

Data Sets ◽

Rna Seq ◽

Mrna Isoforms ◽

Specific Expression ◽

Human Frontal Cortex ◽

Alternatively Spliced

Alu exonization events functionally diversify the transcriptome, creating alternative mRNA isoforms and accounting for an estimated 5% of the alternatively spliced (skipped) exons in the human genome. We developed computational methods, implemented in a software tool called Alubaster, for detecting incorporation of Alu sequences in mRNA transcripts from large-scale RNA-seq data sets. The approach detects Alu sequences derived from both fixed and polymorphic Alu elements, including Alu insertions missing from the reference genome. We applied our methods to 117 GTEx human frontal cortex samples to build and characterize a collection of Alu-containing mRNAs. In particular, we detected and characterized Alu exonizations occurring at 870 fixed Alu loci, of which 237 were novel, as well as hundreds of putative events involving Alu elements that are polymorphic variants or rare alleles not present in the reference genome. These methods and annotations represent a unique and valuable resource that can be used to understand the characteristics of Alu-containing mRNAs and their tissue-specific expression patterns.


Fast and powerful statistical method for context-specific QTL mapping in multi-context genomic studies

10.1101/2021.06.17.448889 ◽

2021 ◽

Author(s):

Andrew Lu ◽

Mike Thompson ◽

M Grace Gordon ◽

Andy Dahl ◽

Chun Jimmie Ye ◽

...

Keyword(s):

Fold Increase ◽

Cell Types ◽

Specific Gene ◽

Data Sets ◽

Rna Seq ◽

Computationally Efficient ◽

Genomic Studies ◽

Cell Type Specific ◽

Context Specific ◽

The Relationship

Recent studies suggest that context-specific eQTLs underlie genetic risk factors for complex diseases. However, methods for identifying them are still nascent, limiting their comprehensive characterization and the downstream interpretation of disease-associated variants. Here, we introduce FastGxC, a method to efficiently and powerfully map context-specific eQTLs by leveraging the correlation structure of multi-context studies. We first show via simulations that FastGxC is orders of magnitude more powerful and computationally efficient than previous approaches, making previously year-long computations possible in minutes. We next apply FastGxC to bulk multi-tissue and single-cell RNA-seq data sets to produce the most comprehensive tissue- and cell-type-specific eQTL maps to date. We then validate these maps by establishing that context-specific eQTLs are enriched in corresponding functional genomic annotations. Finally, we examine the relationship between context-specific eQTLs and human disease and show that, compared with standard eQTLs, FastGxC context-specific eQTLs provide a three-fold increase in precision when identifying relevant tissues and cell types for GWAS variants. In summary, FastGxC enables the construction of context-specific eQTL maps that can be used to understand the context-specific gene regulatory mechanisms underlying complex human diseases.


Detecting Sources of Transcriptional Heterogeneity in Large-Scale RNA-Seq Data Sets

Genetics ◽

10.1534/genetics.116.193714 ◽

2016 ◽

Vol 204 (4) ◽

pp. 1391-1396 ◽

Cited By ~ 8

Author(s):

Brian C. Searle ◽

Rachel M. Gittelman ◽

Ohad Manor ◽

Joshua M. Akey

Keyword(s):

Large Scale ◽

Data Sets ◽

Rna Seq


Meta-megastudies

The Mental Lexicon ◽

10.1075/ml.11.3.01mye ◽

2016 ◽

Vol 11 (3) ◽

pp. 329-349 ◽

Cited By ~ 3

Author(s):

James Myers

Keyword(s):

Large Scale ◽

Mental Lexicon ◽

Random Variable ◽

Research Groups ◽

Independent Research ◽

Web Based ◽

Linguistic Data ◽

Technological Developments ◽

Regression Techniques

Cross-linguistic data have always been of interest to mental lexicon researchers, but only now are technological developments beginning to make it possible to treat language as a random variable, in an approach we dub meta-megastudies. A meta-megastudy uses regression techniques to tease apart not just factors that are partially confounded across items within languages, as in traditional megastudies, but also factors partially confounded across languages. While large-scale meta-megastudies will be logistically challenging, they promise great theoretical benefits and are becoming ever more feasible via Web-based coordination between independent research groups.


CREATIVITY AND EDUCATION: possibilities of a research field (original title: CRIATIVIDADE E EDUCAÇÃO: possibilidades de um campo de pesquisa)

Cadernos de Pesquisa ◽

10.18764/2178-2229.v25n4p129-146 ◽

2018 ◽

Vol 25 (4) ◽

pp. 129

Author(s):

Camila Nagem Marques Vieira ◽

Maria Vitória Campos Mamede Maia

Keyword(s):

Large Scale ◽

State Of The Art ◽

National Level ◽

Research Groups ◽

Creativity Research ◽

The Subject ◽

The Relationship ◽

Theses And Dissertations ◽

State Of Art ◽

Teacher Training

This article presents the state of the art of research on creativity and education. Using a documentary methodology, it surveys the academic research developed in Brazil over the last six years. The data were systematized from the Capes Directory of Research Groups, from Scielo (to identify the articles that deal with the subject), from the national banks of theses and dissertations (Capes and BDTD, the Digital Library of Theses and Dissertations), and from other sites and documents that emerged during this stage, in order to guarantee the rigor of the study. The following were identified as knowledge gaps still little explored by national research: large-scale mapping of creativity research in the country, and investment in the relationship between creativity and teacher training and in continuing education applied to creativity.

Keywords: Creativity and Education. State of the Art. Research.


Improvements to the Rice Genome Annotation Through Large-Scale Analysis of RNA-Seq and Proteomics Data Sets

Molecular & Cellular Proteomics ◽

10.1074/mcp.ra118.000832 ◽

2018 ◽

Vol 18 (1) ◽

pp. 86-98 ◽

Cited By ~ 5

Author(s):

Zhe Ren ◽

Da Qi ◽

Nina Pugh ◽

Kai Li ◽

Bo Wen ◽

...

Keyword(s):

Genome Annotation ◽

Large Scale ◽

Rice Genome ◽

Data Sets ◽

Scale Analysis ◽

Rna Seq ◽

Proteomics Data ◽

Large Scale Analysis


CAFU: a Galaxy framework for exploring unmapped RNA-Seq data

Briefings in Bioinformatics ◽

10.1093/bib/bbz018 ◽

2019 ◽

Vol 21 (2) ◽

pp. 676-686 ◽

Cited By ~ 5

Author(s):

Siyuan Chen ◽

Chengzhi Ren ◽

Jingjing Zhai ◽

Jiantao Yu ◽

Xuyang Zhao ◽

...

Keyword(s):

Large Scale ◽

Biological Information ◽

Machine Learning Techniques ◽

Data Sets ◽

Rna Seq ◽

Mixed Species ◽

Short Reads ◽

Comprehensive Collection ◽

Expression Characterization ◽

And Function

A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.


A Framework for Service Semantic Description Based on Knowledge Graph

Electronics ◽

10.3390/electronics10091017 ◽

2021 ◽

Vol 10 (9) ◽

pp. 1017

Author(s):

Qitong Sun ◽

Jun Han ◽

Dianfu Ma

Keyword(s):

Service Discovery ◽

Large Scale ◽

Semantic Information ◽

Knowledge Graph ◽

Data Sets ◽

Accuracy Rate ◽

Data Set ◽

File Storage ◽

Representation Method ◽

The Relationship

Constructing a large-scale service knowledge graph is necessary. We propose a method, namely semantic information extension, for service knowledge graphs. Based on the service information described in Web Services Description Language (WSDL), we design the ontology layer of a web service knowledge graph and construct the service graph; using the WSDL document data set, the generated service knowledge graph contains 3738 service entities. In particular, our method can be fully effective in service discovery. To evaluate our approach, we conducted two sets of experiments: one to explore the relationship between services and one to classify services by their service descriptions. We constructed two experimental data sets, then designed and trained two different deep neural networks for the two tasks to extract the semantics of the natural language used in the service discovery task. In the task of predicting the relationship between services, the prediction accuracy reached 95.1%, and in the service classification experiment, the top-5 accuracy reached 60.8%. Our experience shows that the service knowledge graph has advantages over traditional file storage when managing additional semantic information, and that the new service representation method is helpful for service discovery and composition tasks.
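The TOP5 figure reported for the classification experiment is conventionally read as top-k accuracy with k = 5: a prediction counts as correct when the true class appears among the k highest-scoring classes. The sketch below shows that metric on made-up scores; the function name `top_k_accuracy` and the toy data are assumptions for illustration and are not taken from the paper.

```python
def top_k_accuracy(scores, labels, k=5):
    """Share of samples whose true class index is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per sample
    labels: list of true class indices, one per sample
    """
    hits = 0
    for row, true_class in zip(scores, labels):
        # Class indices ordered from highest to lowest score
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Hypothetical classifier outputs for three services over four categories
scores = [
    [0.10, 0.50, 0.30, 0.10],
    [0.70, 0.15, 0.10, 0.05],
    [0.30, 0.10, 0.50, 0.10],
]
labels = [2, 3, 0]

print(top_k_accuracy(scores, labels, k=1))
print(top_k_accuracy(scores, labels, k=2))
```

Top-k accuracy is never lower than top-1 accuracy on the same predictions, which is worth keeping in mind when comparing the 60.8% figure with single-label accuracy numbers.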


RNA-Seq workflow: gene-level exploratory analysis and differential expression

F1000Research ◽

10.12688/f1000research.7035.2 ◽

2016 ◽

Vol 4 ◽

pp. 1070 ◽

Cited By ~ 24

Author(s):

Michael I. Love ◽

Simon Anders ◽

Vladislav Kim ◽

Wolfgang Huber

Keyword(s):

Differential Expression ◽

Gene Expression Analysis ◽

Exploratory Data Analysis ◽

Reference Genome ◽

Rna Seq ◽

Differential Gene Expression Analysis ◽

Gene Level ◽

Exploratory Data ◽

Differential Gene ◽

The Relationship

Here we walk through an end-to-end gene-level RNA-Seq differential expression workflow using Bioconductor packages. We will start from the FASTQ files, show how these were aligned to the reference genome, and prepare a count matrix which tallies the number of RNA-seq reads/fragments within each gene for each sample. We will perform exploratory data analysis (EDA) for quality assessment and to explore the relationship between samples, perform differential gene expression analysis, and visually explore the results.
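To make the count matrix concrete, here is a minimal sketch of the tallying step: given, for each sample, the gene each read or fragment was assigned to, it builds a genes-by-samples table of counts. The workflow itself uses Bioconductor packages in R; this Python version, the function `count_matrix`, and the toy gene and sample names are illustrative assumptions only.

```python
from collections import Counter

def count_matrix(assignments):
    """Build a genes x samples count table.

    assignments: dict mapping sample name -> iterable of gene IDs,
                 one entry per read/fragment assigned to that gene.
    Returns (genes, samples, matrix) where matrix[i][j] is the number of
    reads assigned to genes[i] in samples[j].
    """
    samples = sorted(assignments)
    counts = {s: Counter(assignments[s]) for s in samples}
    genes = sorted(set().union(*(c.keys() for c in counts.values())))
    # A Counter returns 0 for genes never seen in a sample
    matrix = [[counts[s][g] for s in samples] for g in genes]
    return genes, samples, matrix

# Hypothetical read-to-gene assignments for two samples
assignments = {
    "treated": ["geneA", "geneB", "geneA"],
    "control": ["geneB", "geneC"],
}

genes, samples, matrix = count_matrix(assignments)
print("gene", *samples, sep="\t")
for gene, row in zip(genes, matrix):
    print(gene, *row, sep="\t")
```

Raw integer counts of this form, rather than normalized values, are what count-based differential expression methods expect as input.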


Exploring transcriptional switches from pairwise, temporal and population RNA-Seq data using deepTS

Briefings in Bioinformatics ◽

10.1093/bib/bbaa137 ◽

2020 ◽

Cited By ~ 1

Author(s):

Zhixu Qiu ◽

Siyuan Chen ◽

Yuhong Qi ◽

Chunni Liu ◽

Jingjing Zhai ◽

...

Keyword(s):

High Throughput ◽

Stress Responses ◽

Large Scale ◽

Model Organisms ◽

Rna Seq ◽

Research Groups ◽

Web Based ◽

Rich Functionality ◽

Transcriptional Switch ◽

User Friendly

Transcriptional switch (TS) is a widely observed phenomenon caused by changes in the relative expression of transcripts from the same gene, in spatial, temporal or other dimensions. TS has been associated with human diseases, plant development and stress responses. Its investigation is often hampered by a lack of suitable tools allowing comprehensive and flexible TS analysis for high-throughput RNA sequencing (RNA-Seq) data. Here, we present deepTS, a user-friendly web-based implementation that enables a fully interactive, multifunctional identification, visualization and analysis of TS events for large-scale RNA-Seq datasets from pairwise, temporal and population experiments. deepTS offers rich functionality to streamline RNA-Seq-based TS analysis for both model and non-model organisms and for those with or without reference transcriptome. The presented case studies highlight the capabilities of deepTS and demonstrate its potential for the transcriptome-wide TS analysis of pairwise, temporal and population RNA-Seq data. We believe deepTS will help research groups, regardless of their informatics expertise, perform accessible, reproducible and collaborative TS analyses of large-scale RNA-Seq data.
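A small sketch may help pin down what a change in relative transcript expression looks like for a pairwise comparison. It converts transcript abundances of one gene into within-gene fractions and reports a switch when the dominant transcript differs between two conditions. This dominant-isoform criterion is a deliberate simplification chosen for illustration; it is not the algorithm implemented in deepTS, and the function names and values are hypothetical.

```python
def isoform_fractions(expr):
    """Convert transcript abundances of one gene into within-gene fractions."""
    total = sum(expr.values())
    if total == 0:
        return {t: 0.0 for t in expr}  # gene not expressed
    return {t: v / total for t, v in expr.items()}

def dominant_switch(cond1, cond2):
    """Return (dominant_in_cond1, dominant_in_cond2) if they differ, else None."""
    f1, f2 = isoform_fractions(cond1), isoform_fractions(cond2)
    d1 = max(f1, key=f1.get)
    d2 = max(f2, key=f2.get)
    return (d1, d2) if d1 != d2 else None

# Hypothetical abundances (e.g. TPM) for two transcripts of the same gene
control = {"tx1": 80.0, "tx2": 20.0}
stress = {"tx1": 30.0, "tx2": 70.0}
print(dominant_switch(control, stress))

# Overall expression rises tenfold but the relative usage is unchanged: no switch
low = {"tx1": 10.0, "tx2": 5.0}
high = {"tx1": 100.0, "tx2": 50.0}
print(dominant_switch(low, high))
```

The second case shows why fractions rather than raw abundances are compared: a gene-level expression change alone does not constitute a switch.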
