Cytokine-Related Genes Identified From the RIKEN Full-Length Mouse cDNA Data Set

V. Brusic

doi:10.1101/gr.1016503

Identification of Novel "Pathologs" (Human Disease-Related Gene Candidates) From the RIKEN Full-Length Mouse cDNA Data Set

Genome Research ◽

10.1101/gr.1461303 ◽

2003 ◽

Vol 13 (6) ◽

pp. 1559-1559

Author(s):

D. G. Silva

Keyword(s):

Human Disease ◽

Full Length ◽

Related Gene ◽

Data Set ◽

Mouse Cdna ◽

Disease Related Gene


Novel High-Rank Phylogenetic Lineages within a Sulfur Spring (Zodletone Spring, Oklahoma), Revealed Using a Combined Pyrosequencing-Sanger Approach

Applied and Environmental Microbiology ◽

10.1128/aem.00002-12 ◽

2012 ◽

Vol 78 (8) ◽

pp. 2677-2688 ◽

Cited By ~ 28

Author(s):

Noha Youssef ◽

Brandi L. Steidley ◽

Mostafa S. Elshahed

Keyword(s):

High Throughput ◽

Accurate Determination ◽

Full Length ◽

Read Length ◽

Rrna Gene ◽

Data Set ◽

Content Type ◽

Rare Biosphere ◽

Phylogenetic Affiliation ◽

Phylogenetic Lineages

ABSTRACT: The utilization of high-throughput sequencing technologies in 16S rRNA gene-based diversity surveys has indicated that within most ecosystems, a significant fraction of the community could not be assigned to known microbial phyla. Accurate determination of the phylogenetic affiliation of such sequences is difficult due to the short-read-length output of currently available high-throughput technologies. This fraction could harbor multiple novel phylogenetic lineages that have so far escaped detection. Here we describe our efforts in accurate assessment of the novelty and phylogenetic affiliation of selected unclassified lineages within a pyrosequencing data set generated from source sediments of Zodletone Spring, a sulfide- and sulfur-rich spring in southwestern Oklahoma. Lineage-specific forward primers were designed for 78 putatively novel lineages identified within the pyrosequencing data set, and representative nearly full-length small-subunit (SSU) rRNA gene sequences were obtained by pairing those primers with reverse universal bacterial primers. Of the 78 lineages tested, amplifiable products were obtained for 52, 32 of which had at least one nearly full-length sequence that was representative of the lineage targeted. Analysis of phylogenetic affiliation of the obtained Sanger sequences identified 5 novel candidate phyla and 10 novel candidate classes (within Fibrobacteres, Planctomycetes, and candidate phyla BRC1, GN12, TM6, TM7, LD1, WS2, and GN06) in the data set, in addition to multiple novel orders and families. The discovery of multiple novel phyla within a pilot study of a single ecosystem clearly shows the potential of the approach in identifying novel diversities within the rare biosphere.


Isolation and characterization of full-length mouse cDNA and genomic clones of 3-methylcholanthrene-inducible cytochrome P1-450 and P3-450

Gene ◽

10.1016/0378-1119(84)90057-x ◽

1984 ◽

Vol 29 (3) ◽

pp. 281-292 ◽

Cited By ~ 49

Author(s):

Frank J. Gonzalez ◽

Peter I. Mackenzie ◽

Shioko Kimura ◽

Daniel W. Nebert

Keyword(s):

Full Length ◽

Genomic Clones ◽

Mouse Cdna ◽

Isolation And Characterization


Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.242603899 ◽

2002 ◽

Vol 99 (26) ◽

pp. 16899-16903 ◽

Cited By ~ 1161

Author(s):

Keyword(s):

Full Length ◽

Cdna Sequences ◽

Initial Analysis ◽

Mouse Cdna ◽

Human And Mouse


Full-length de novo protein structure determination from cryo-EM maps using deep learning

10.1101/2020.08.28.271981 ◽

2020 ◽

Author(s):

Jiahua He ◽

Sheng-You Huang

Keyword(s):

Deep Learning ◽

Structure Determination ◽

De Novo ◽

Protein Structures ◽

Protein Structure Determination ◽

Full Length ◽

Supplementary Information ◽

General Applicability ◽

Data Set ◽

Convolutional Networks

Abstract: Advances in microscopy instruments and image processing algorithms have led to an increasing number of cryo-EM maps. However, building accurate models for the EM maps at 3-5 Å resolution remains a challenging and time-consuming process. With the rapid growth of deposited EM maps, there is an increasing gap between the maps and reconstructed/modeled 3-dimensional (3D) structures. Therefore, automatic reconstruction of atomic-accuracy full-atom structures from EM maps is pressingly needed. Here, we present a semi-automatic de novo structure determination method using a deep learning-based framework, named as DeepMM, which builds atomic-accuracy all-atom models from cryo-EM maps at near-atomic resolution. In our method, the main-chain and Cα positions as well as their amino acid and secondary structure types are predicted in the EM map using Densely Connected Convolutional Networks. DeepMM was extensively validated on 40 simulated maps at 5 Å resolution and 30 experimental maps at 2.6-4.8 Å resolution as well as an EMDB-wide data set of 2931 experimental maps at 2.6-4.9 Å resolution, and compared with state-of-the-art algorithms including RosettaES, MAINMAST, and Phenix. Overall, our DeepMM algorithm obtained a significant improvement over existing methods in terms of both accuracy and coverage in building full-length protein structures on all test sets, demonstrating the efficacy and general applicability of DeepMM. Availability: https://github.com/JiahuaHe/DeepMM. Supplementary information: Supplementary data are available.


Emu: Species-Level Microbial Community Profiling for Full-Length Nanopore 16S Reads

10.1101/2021.05.02.442339 ◽

2021 ◽

Author(s):

Kristen D. Curry ◽

Qi Wang ◽

Michael G. Nute ◽

Alona Tyshaieva ◽

Elizabeth Reeves ◽

...

Keyword(s):

Microbial Community ◽

16S Rrna ◽

Microbial Community Composition ◽

Species Level ◽

Full Length ◽

Alternative Methods ◽

Read Length ◽

Data Set ◽

Microbial Community Profiling ◽

Community Profiling

16S rRNA based analysis is the established standard for elucidating microbial community composition. While short read 16S analyses are largely confined to genus-level resolution at best since only a portion of the gene is sequenced, full-length 16S sequences have the potential to provide species-level accuracy. However, existing taxonomic identification algorithms are not optimized for the increased read length and error rate of long-read data. Here we present Emu, a novel approach that employs an expectation-maximization (EM) algorithm to generate taxonomic abundance profiles from full-length 16S rRNA reads. Results produced from one simulated data set and two mock communities prove Emu capable of accurate microbial community profiling while obtaining fewer false positives and false negatives than alternative methods. Additionally, we illustrate a real-world application of our new software by comparing clinical sample composition estimates generated by an established whole-genome shotgun sequencing workflow to those returned by full-length 16S sequences processed with Emu.
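As a rough illustration of the expectation-maximization idea mentioned in the abstract above, the following Python sketch estimates taxon abundances from a matrix of per-read likelihoods. It is not Emu's implementation: the function name `em_abundance`, the form of the likelihood matrix, and the convergence parameters are all assumptions made for illustration only.

```python
from typing import List


def em_abundance(likelihoods: List[List[float]],
                 n_iter: int = 100,
                 tol: float = 1e-9) -> List[float]:
    """Estimate relative taxon abundances with a basic EM loop.

    likelihoods[r][t] is an assumed, precomputed P(read r | taxon t).
    Returns one abundance per taxon; the values sum to 1.
    """
    n_taxa = len(likelihoods[0])
    # Start from a uniform abundance profile (an arbitrary choice).
    abundance = [1.0 / n_taxa] * n_taxa

    for _ in range(n_iter):
        totals = [0.0] * n_taxa
        # E-step: split each read across taxa in proportion to
        # abundance * likelihood.
        for row in likelihoods:
            weighted = [a * p for a, p in zip(abundance, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read not explained by any taxon; skip it
            for t, w in enumerate(weighted):
                totals[t] += w / norm

        assigned = sum(totals)
        if assigned == 0.0:
            return abundance  # nothing could be assigned; keep the prior

        # M-step: new abundances are the normalized expected read counts.
        updated = [x / assigned for x in totals]
        delta = max(abs(a - b) for a, b in zip(abundance, updated))
        abundance = updated
        if delta < tol:
            break

    return abundance


if __name__ == "__main__":
    # Hypothetical toy input: three reads match only taxon 0, one only taxon 1.
    reads = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(em_abundance(reads))
```

With the toy input shown, three of the four reads can only come from the first taxon, so the estimate settles at 0.75 and 0.25. Real long-read data would need likelihoods derived from alignments and an error model, which this sketch deliberately leaves out.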


A first full outer capsid protein sequence data-set in the Orbivirus genus (family Reoviridae): cloning, sequencing, expression and analysis of a complete set of full-length outer capsid VP2 genes of the nine African horsesickness virus serotypes

Journal of General Virology ◽

10.1099/vir.0.18919-0 ◽

2003 ◽

Vol 84 (5) ◽

pp. 1317-1326 ◽

Cited By ~ 15

Author(s):

A. C. Potgieter ◽

M. Cloete ◽

P. J. Pretorius ◽

A. A. van Dijk

Keyword(s):

Capsid Protein ◽

Protein Sequence ◽

Sequence Data ◽

Full Length ◽

Outer Capsid Protein ◽

Data Set ◽

Capsid Protein Sequence ◽

Protein Sequence Data ◽

Complete Set ◽

Outer Capsid


Unveiling the transcriptomic complexity of Miscanthus sinensis using a combination of PacBio long read- and Illumina short read sequencing platforms

10.21203/rs.3.rs-44130/v2 ◽

2020 ◽

Author(s):

Yongli Wang ◽

Xia Li ◽

Congsheng Wang ◽

Lu Gao ◽

Yanfang Wu ◽

...

Keyword(s):

Genetic Improvement ◽

Average Length ◽

Perennial Grass ◽

Full Length ◽

Miscanthus Sinensis ◽

Illumina Hiseq ◽

Data Set ◽

Long Read ◽

Sequencing Platforms ◽

Rich Data

Abstract Background: Miscanthus sinensis Andersson is a perennial grass that exhibits remarkable lignocellulose characteristics suitable for sustainable bioenergy production. However, knowledge of the genetic resources of this species is relatively limited, which considerably hampers further work on its biology and genetic improvement. Results: In this study, through analyzing the transcriptome of mixed samples of leaves and stems using the latest PacBio Iso-Seq sequencing technology combined with Illumina HiSeq, we report the first full-length transcriptome dataset of M. sinensis with a total of 58.21 Gb clean data. An average of 15.75 Gb clean reads of each sample were obtained from the PacBio Iso-Seq system, which doubled the data size (6.68 Gb) obtained from the Illumina HiSeq platform. The integrated analyses of PacBio- and Illumina-based transcriptomic data uncovered 408,801 non-redundant transcripts with an average length of 1,685 bp. Of those, 189,406 transcripts were commonly identified by both methods, 169,149 transcripts with an average length of 619 bp were uniquely identified by Illumina HiSeq, and 51,246 transcripts with an average length of 2,535 bp were uniquely identified by PacBio Iso-Seq. When comparing our data with genomes of four species of Andropogoneae, M. sinensis showed the closest relationship with sugarcane with up to 93% mapping ratios, followed by sorghum with up to 80% mapping ratios, indicating a high conservation of orthologs in these three genomes. Furthermore, 306,228 transcripts were successfully annotated against public databases including cell wall related genes and transcript factor families, thus providing many new insights into gene functions. The PacBio Iso-Seq data also helped identify 3,898 alternative splicing events and 2,963 annotated AS isoforms within 10 function categories. Conclusions: Taken together, the present study provides a rich data set of full-length transcripts that greatly enriches our understanding of M. sinensis transcriptomic resources, thus facilitating further genetic improvement and molecular studies of the Miscanthus species.


High crystallizability under air-exclusion conditions of the full-length LysR-type transcriptional regulator TsaR from Comamonas testosteroni T-2 and data-set analysis for a MIRAS structure-solution approach

Acta Crystallographica Section F Structural Biology and Crystallization Communications ◽

10.1107/s1744309108019738 ◽

2008 ◽

Vol 64 (8) ◽

pp. 764-769 ◽

Cited By ~ 1

Author(s):

Dominique Monferrer ◽

Tewes Tralau ◽

Michael A. Kertesz ◽

Santosh Panjikar ◽

Isabel Usón

Keyword(s):

Transcriptional Regulator ◽

Full Length ◽

Data Set ◽

Solution Approach ◽

Structure Solution


Cloning and expression analysis of full length mouse cDNA sequences encoding the transformation associated protein p53

Nucleic Acids Research ◽

10.1093/nar/12.14.5609 ◽

1984 ◽

Vol 12 (14) ◽

pp. 5609-5626 ◽

Cited By ~ 41

Author(s):

John R. Jenkins ◽

Karran Rudge ◽

Shelagh Redmond ◽

Alison Wade-Evans

Keyword(s):

Expression Analysis ◽

Full Length ◽

Cloning And Expression ◽

Cdna Sequences ◽

Mouse Cdna ◽

Protein P53
