Identification of Novel Conotoxin Precursors from the Cone Snail Conus spurius by High-Throughput RNA Sequencing

Marine Drugs ◽

10.3390/md19100547 ◽

2021 ◽

Vol 19 (10) ◽

pp. 547

Author(s):

Roberto Zamora-Bustillos ◽

Mario Alberto Martínez-Núñez ◽

Manuel B. Aguilar ◽

Reyna Cristina Collí-Dula ◽

Diego Alfredo Brito-Domínguez

Keyword(s):

RNA Sequencing ◽

High Throughput ◽

Sanger Sequencing ◽

Molecular Targets ◽

Cone Snail ◽

Limited Information ◽

Important Research ◽

Marine Gastropods ◽

Sequencing Technologies ◽

Generation Sequencing

Marine gastropods of the genus Conus, comprising more than 800 species, inject venom into worms and other prey. The conopeptide toxins in this venom are highly diverse in structure and action, and they are highly potent and specific for their molecular targets (ion channels, receptors, and transporters of the prey’s nervous system), which makes them important research tools and a source for drug discovery. Next-generation sequencing technologies are speeding up the discovery of novel conopeptides in many of these species, but only limited information is available for Conus spurius, which inhabits sandy mud. To search for new conopeptide precursors, we analyzed the transcriptome of the venom ducts of C. spurius and identified 55 putative conotoxins. Seven were selected for further study and confirmed by Sanger sequencing to belong to the M-superfamily (Sr3.M01 and Sr3.M02), A-superfamily (Sr1.A01 and Sr1.A02), O-superfamily (Sr15.O01), and Con-ikot-ikot (Sr21.CII01 and Sr22.CII02). Six of these have never been reported. To our knowledge, this report is the first to use high-throughput RNA sequencing to study the diversity of C. spurius conotoxins.

Non-coding RNA bioinformatics platform for full backing of the high-throughput sequencing experiments generated by next-generation sequencing technologies

EMBnet journal ◽

10.14806/ej.18.a.461 ◽

2012 ◽

Vol 18 (A) ◽

pp. 132

Author(s):

F Licciulli ◽

A Consiglio ◽

G De Caro ◽

A Gisel ◽

G Grillo ◽

...

Keyword(s):

Next Generation Sequencing ◽

High Throughput ◽

High Throughput Sequencing ◽

Next Generation ◽

Sequencing Technologies ◽

Non Coding RNA ◽

Generation Sequencing

Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein-sequence-based replicon distribution scores

10.1101/2020.04.21.053082 ◽

2020 ◽

Cited By ~ 2

Author(s):

Oliver Schwengers ◽

Patrick Barth ◽

Linda Falgenhauer ◽

Torsten Hain ◽

Trinad Chakraborty ◽

...

Keyword(s):

High Throughput ◽

Protein Sequence ◽

Scientific Community ◽

Vital Role ◽

Bacterial Genomes ◽

Short Read ◽

Link Type ◽

Sequencing Technologies ◽

Generation Sequencing

Abstract: Plasmids are extrachromosomal genetic elements that replicate independently of the chromosome and play a vital role in the environmental adaptation of bacteria. Due to potential mobilization or conjugation capabilities, plasmids are important genetic vehicles for antimicrobial resistance genes and virulence factors, with huge and increasing clinical implications. They are therefore the subject of large genomic studies within the scientific community worldwide. As a result of rapidly improving next-generation sequencing methods, the number of sequenced bacterial genomes is constantly increasing, in turn raising the need for specialized tools to (i) extract plasmid sequences from draft assemblies, (ii) derive their origin and distribution, and (iii) further investigate their genetic repertoire. Recently, several bioinformatic methods and tools have emerged to tackle this issue; however, a combination of both high sensitivity and specificity in plasmid sequence identification is rarely achieved in a taxon-independent manner. In addition, many software tools are not appropriate for large high-throughput analyses or cannot be included in existing software pipelines due to their technical design or software implementation. In this study, we investigated differences in the replicon distributions of protein-coding genes on a large scale as a new approach to distinguish plasmid-borne from chromosome-borne contigs. We defined and computed statistical discrimination thresholds for a new metric, the replicon distribution score (RDS), which achieved an accuracy of 96.6%. The final performance was further improved by combining the RDS metric with heuristics exploiting several plasmid-specific higher-level contig characterizations. We implemented this workflow in a new high-throughput, taxon-independent bioinformatics software tool called Platon for the recruitment and characterization of plasmid-borne contigs from short-read draft assemblies. Compared to PlasFlow, Platon achieved a higher accuracy (97.5%) and more balanced predictions (F1=82.6%) when tested on a broad range of bacterial taxa, and better or equal performance against the targeted tools PlasmidFinder and PlaScope on sequenced E. coli isolates. Platon is available at: platon.computational.bio

Data Summary: Platon was developed as a Python 3 command line application for Linux. The complete source code and documentation are available on GitHub under a GPL3 license: https://github.com/oschwengers/platon and platon.computational.bio. All database versions are hosted at Zenodo: DOI 10.5281/zenodo.3349651. Platon is available via the bioconda package platon and via the PyPI package cb-platon. Bacterial representative sequences for UniProt’s UniRef90 protein clusters, complete bacterial genome sequences from the NCBI RefSeq database, complete plasmid sequences from the NCBI genomes plasmid section, created artificial contigs, RDS threshold metrics and raw protein replicon hit counts used to create and evaluate the marker protein sequence database are hosted at Zenodo: DOI 10.5281/zenodo.3759169. The 24 Escherichia coli isolates sequenced with short-read (Illumina MiSeq) and long-read (Oxford Nanopore Technology GridION platform) sequencing technologies and used for real data benchmarks are available under the following NCBI BioProjects: PRJNA505407, PRJNA387731.

Impact Statement: Plasmids play a vital role in the spread of antibiotic resistance and pathogenicity genes. The increasing number of clinical outbreaks involving resistant pathogens worldwide has pushed the scientific community to increase its efforts to comprehensively investigate bacterial genomes. Due to the maturation of next-generation sequencing technologies, entire bacterial genomes including plasmids are now sequenced at huge scale. To analyze draft assemblies, a mandatory first step is to separate plasmid contigs from chromosome contigs. Recently, many bioinformatic tools have emerged to tackle this issue. Unfortunately, several are implemented only as interactive or web-based tools, which rules them out for the high-throughput analysis that large data sets require. Other tools that do provide a high-throughput implementation often come with certain drawbacks, e.g. providing only taxon-specific databases, not providing an actionable (i.e. true binary) classification, or achieving classification performance biased towards either sensitivity or specificity. Here, we introduce the tool Platon, which implements a new replicon distribution-based approach combined with higher-level contig characterizations to address the aforementioned issues. In addition to plasmid detection within draft assemblies, Platon provides the user with valuable information on certain higher-level contig characterizations. We show that Platon provides a balanced classification performance as well as a scalable implementation for high-throughput analyses. We therefore consider Platon to be a powerful, species-independent and flexible tool to scan large amounts of bacterial whole-genome sequencing data for their plasmid content.
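
The abstract above reports accuracy and F1 side by side, and the two can differ noticeably when plasmid contigs are a small minority of all contigs. The following is a minimal Python sketch of how both metrics follow from a confusion matrix. The function name `classification_metrics` and the contig counts are hypothetical and chosen only for illustration; they are not taken from the Platon benchmarks and this is not Platon's code.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts.

    Here "positive" means "contig predicted as plasmid-borne".
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts (an assumption, not data from the paper):
# 1000 contigs, of which 60 are truly plasmid-borne.
metrics = classification_metrics(tp=50, fp=10, tn=930, fn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts the many correctly rejected chromosome contigs lift accuracy well above F1, which depends only on the plasmid class. This is one reason a tool description may quote both figures.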

Assessment of Bioleaching Microbial Community Structure and Function Based on Next-Generation Sequencing Technologies

Minerals ◽

10.3390/min8120596 ◽

2018 ◽

Vol 8 (12) ◽

pp. 596 ◽

Cited By ~ 1

Author(s):

Shuang Zhou ◽

Min Gan ◽

Jianyu Zhu ◽

Xinxing Liu ◽

Guanzhou Qiu

Keyword(s):

Community Structure ◽

Next Generation Sequencing ◽

Microbial Ecology ◽

High Throughput ◽

High Throughput Sequencing ◽

Structure And Function ◽

Next Generation ◽

Sequencing Technologies ◽

And Function ◽

Generation Sequencing

It is widely known that bioleaching microorganisms have to cope with a complex, extreme environment, and that their microbial ecology, in terms of community structure and function, varies across environmental types. However, analysis of the microbial ecology of bioleaching bacteria remains a challenge. To address this challenge, numerous technologies have been developed. In recent years, high-throughput sequencing technologies, which bring comprehensive sequencing analysis of cellular RNA and DNA within the reach of most laboratories, have been added to the toolbox of microbial ecology. Next-generation sequencing of DNA can produce draft genomic sequences for more bioleaching bacteria, which provides the opportunity to build predictive models of their genetic and metabolic potential and ultimately deepens our understanding of bioleaching microorganisms. High-throughput sequencing targeting the phylogenetic marker 16S rRNA has been effectively applied to characterize community diversity in ore leaching environments. RNA-seq, another application of high-throughput sequencing that profiles RNA, can be used for both mapping and quantifying the transcriptome and has proved highly efficient at quantifying the changing expression level of each transcript under different conditions. It has been demonstrated to be a powerful tool for dissecting the relationship between genotype and phenotype, leading to the interpretation of functional elements of the genome and revealing molecular mechanisms of adaptation. This review aims to describe the high-throughput sequencing approach for bioleaching environmental microorganisms, focusing in particular on its applications and the associated challenges.

Uncovering the Genetic Diversity of Giardia Isolates from Outbreaks in New Zealand

10.21203/rs.3.rs-1157774/v1 ◽

2021 ◽

Author(s):

Paul Ogbuigwe ◽

Patrick J. Biggs ◽

Juan Carlos Garcia-Ramirez ◽

Matthew A. Knox ◽

Anthony Pita ◽

...

Keyword(s):

Genetic Diversity ◽

New Zealand ◽

Next Generation Sequencing ◽

Sanger Sequencing ◽

Molecular Techniques ◽

Notifiable Disease ◽

Sequencing Technologies ◽

Common Causes ◽

The World ◽

Generation Sequencing

Background: Giardia is one of the most common causes of diarrhoea in the world and is a notifiable disease in New Zealand. Recent advances in molecular techniques, such as PCR and Sanger sequencing, have greatly improved our understanding of the taxonomic classification and epidemiology of this parasite. However, it has not been possible to identify shared subtypes between samples from epidemiologically linked cases, because samples from the same outbreak show multiple dominant subtypes when characterised using Sanger sequencing. Methods: Here, NGS was employed to uncover the genetic diversity within samples from sporadic and outbreak cases of giardiasis that occurred in New Zealand between 2010 and 2018. Results: This strategy exposed the significant diversity of Giardia subtypes present in each sample. The use of NGS and metabarcoding at the glutamate dehydrogenase (gdh) locus enabled the identification of shared subtypes between samples from shared outbreaks, providing a better understanding of the epidemiology of giardiasis outbreaks in New Zealand. Conclusions: Next-generation sequencing technologies provide a superior tool, compared with consensus sequencing technologies, for capturing the genetic diversity of Giardia within hosts. This study showed that infections in humans are frequently mixed, with multiple subtypes present in each host.

HD Spot: Interpretable Deep Learning Classification of Single Cell Transcript Data

10.1101/822759 ◽

2019 ◽

Cited By ~ 1

Author(s):

Eric Prince ◽

Todd C. Hankinson

Keyword(s):

Deep Learning ◽

Single Cell ◽

High Throughput ◽

Ground Truth ◽

Sequencing Technologies ◽

Bioinformatic Tool ◽

Complex Relationships ◽

Insight Into ◽

Generation Sequencing

Abstract: High-throughput data are commonplace in biomedical research, as seen with technologies such as single-cell RNA sequencing (scRNA-seq) and other next-generation sequencing technologies. As these techniques are used more and more widely, it is critical to have analysis tools that can identify meaningful complex relationships between variables (i.e., in the case of scRNA-seq: genes) without human bias. Moreover, it is equally important that both linear and non-linear (i.e., one-to-many) variable relationships be considered when contrasting datasets. HD Spot is a deep learning-based framework that generates an optimal interpretable classifier for a given high-throughput dataset using a simple genetic algorithm as well as an autoencoder-to-classifier transfer learning approach. Using four unique publicly available scRNA-seq datasets with published ground truth, we demonstrate the robustness of HD Spot and its ability to identify ontologically accurate gene lists for a given data subset. HD Spot serves as a bioinformatic tool that allows novice and advanced analysts to gain complex insight into their respective datasets, enabling the development of novel hypotheses.

Unraveling the Heterogeneity and Ontogeny of Dendritic Cells Using Single-Cell RNA Sequencing

Frontiers in Immunology ◽

10.3389/fimmu.2021.711329 ◽

2021 ◽

Vol 12 ◽

Author(s):

Binyao Chen ◽

Lei Zhu ◽

Shizhao Yang ◽

Wenru Su

Keyword(s):

Dendritic Cells ◽

Single Cell ◽

RNA Sequencing ◽

Adaptive Immunity ◽

High Throughput ◽

High Throughput Sequencing ◽

Future Trends ◽

Innate And Adaptive Immunity ◽

Sequencing Technologies ◽

Single Cell RNA Sequencing

Dendritic cells (DCs) play essential roles in innate and adaptive immunity and show high heterogeneity and intricate ontogeny. Advances in high-throughput sequencing technologies, particularly single-cell RNA sequencing (scRNA-seq), have improved the understanding of DC subsets. In this review, we discuss in detail the remarkable perspectives in DC reclassification and ontogeny as revealed by scRNA-seq. Moreover, the heterogeneity and multifunction of DCs during diseases as determined by scRNA-seq are described. Finally, we provide insights into the challenges and future trends in scRNA-seq technologies and DC research.

Trans-NanoSim characterizes and simulates nanopore RNA-sequencing data

GigaScience ◽

10.1093/gigascience/giaa061 ◽

2020 ◽

Vol 9 (6) ◽

Cited By ~ 1

Author(s):

Saber Hafezqorani ◽

Chen Yang ◽

Theodora Lo ◽

Ka Ming Nip ◽

René L Warren ◽

...

Keyword(s):

RNA Sequencing ◽

Single Molecule ◽

Rapid Development ◽

Cost Effective ◽

Third Generation ◽

Sequencing Data ◽

Complementary Dna ◽

Sequencing Technologies ◽

Analytical Tools ◽

Generation Sequencing

Background: Compared with second-generation sequencing technologies, third-generation single-molecule RNA sequencing has unprecedented advantages; the long reads it generates facilitate isoform-level transcript characterization. In particular, the Oxford Nanopore Technology sequencing platforms have become more popular in recent years owing to their relatively high affordability and portability compared with other third-generation sequencing technologies. To aid the development of analytical tools that leverage the power of this technology, simulated data provide a cost-effective solution with ground truth. However, a nanopore sequence simulator targeting transcriptomic data is not yet available. Findings: We introduce Trans-NanoSim, a tool that simulates reads with technical and transcriptome-specific features learnt from nanopore RNA-sequencing data. We comprehensively benchmarked Trans-NanoSim on direct RNA and complementary DNA datasets describing human and mouse transcriptomes. Through comparison against other nanopore read simulators, we show the unique advantage and robustness of Trans-NanoSim in capturing the characteristics of nanopore complementary DNA and direct RNA reads. Conclusions: As a cost-effective alternative to sequencing real transcriptomes, Trans-NanoSim will facilitate the rapid development of analytical tools for nanopore RNA-sequencing data. Trans-NanoSim and its pre-trained models are freely accessible at https://github.com/bcgsc/NanoSim.

High-throughput next-generation sequencing technologies foster new cutting-edge computing techniques in bioinformatics

BMC Genomics ◽

10.1186/1471-2164-10-s1-i1 ◽

2009 ◽

Vol 10 (Suppl 1) ◽

pp. I1 ◽

Cited By ~ 25

Author(s):

Mary Yang ◽

Brian D Athey ◽

Hamid R Arabnia ◽

Andrew H Sung ◽

Qingzhong Liu ◽

...

Keyword(s):

Next Generation Sequencing ◽

High Throughput ◽

Edge Computing ◽

Cutting Edge ◽

Next Generation ◽

Sequencing Technologies ◽

Generation Sequencing

Advances in understanding the molecular pathology of gynecological malignancies: the role and potential of RNA sequencing

International Journal of Gynecological Cancer ◽

10.1136/ijgc-2021-002509 ◽

2021 ◽

pp. ijgc-2021-002509

Author(s):

Alba Southern ◽

Mona El-Bahrawy

Keyword(s):

RNA Sequencing ◽

Molecular Pathology ◽

Transcript Abundance ◽

Genetic Alterations ◽

Gynecological Cancers ◽

The Past ◽

Sequencing Technologies ◽

Technological Limitations ◽

Generation Sequencing

For many years, technological limitations restricted progress in identifying the underlying genetic causes of gynecological cancers. However, during the past decade, high-throughput next-generation sequencing technologies have revolutionized cancer research. RNA sequencing has emerged as a very useful technique for expanding our understanding of genome changes in cancer. Cancer is characterized by the accumulation of genetic alterations affecting genes, including substitutions, insertions, deletions, translocations, gene fusions, and alternative splicing. If these aberrant genes are transcribed, the aberrations can be detected by RNA sequencing, which also provides information on transcript abundance, revealing the expression levels of the aberrant genes. RNA sequencing is considered the technique of choice for studying gene expression and identifying new RNA species. This is due to the quantitative and qualitative improvement that it has brought to transcriptome analysis, offering a resolution that allows research into different layers of transcriptome complexity. It has also been successful in identifying biomarkers, fusion genes, and tumor suppressors, and in uncovering new targets responsible for drug resistance in gynecological cancers. To illustrate this, we review the role of RNA sequencing in studies that have enhanced our understanding of the molecular pathology of gynecological cancers.

High-Throughput Screening of Blood Donors for Twelve Human Platelet Antigen Systems Using Next-Generation Sequencing Reveals Detection of Rare Polymorphisms and Two Novel Protein-Changing Variants

Transfusion Medicine and Hemotherapy ◽

10.1159/000504894 ◽

2020 ◽

Vol 47 (1) ◽

pp. 33-44 ◽

Cited By ~ 2

Author(s):

Stephanie Maria Vorholt ◽

Nele Hamker ◽

Hagen Sparka ◽

Jürgen Enczmann ◽

Thomas Zeiler ◽

...

Keyword(s):

Next Generation Sequencing ◽

High Throughput ◽

Blood Donor ◽

Sanger Sequencing ◽

Blood Donors ◽

Human Platelet ◽

Migration Background ◽

Allele Frequencies ◽

TaqMan Assay ◽

Generation Sequencing

Background: Exposure to non-matching human platelet alloantigens (HPA) may result in alloimmunization. Antibodies to HPA can be responsible for post-transfusion purpura, refractoriness to donor platelets, and fetal and neonatal alloimmune thrombocytopenia. For the supply of compatible apheresis platelet concentrates, HPA genotypes are determined routinely. Methods: Here, we describe a novel method for genotyping twelve different HPA systems simultaneously, including HPA-1 to HPA-5, HPA-9w, HPA-10w, HPA-16w, HPA-19w, HPA-27w, and the novel HPA-34w, by means of amplicon-based next-generation sequencing (NGS). Blood donor samples from 757 individuals with a migration background and 547 individuals of Western European ancestry were genotyped in a mass-screening setup. In-house software was developed for fast and automatic analysis. TaqMan assay and Sanger sequencing results served to validate the NGS workflow. Finally, blood donors were divided into several groups based on their country of origin and the allele frequencies were compared. Results: NGS was successfully performed for 1,299 of 1,304 samples (99.6%). The concordance with TaqMan assay and Sanger sequencing results was 99.8%. Allele-calling dropouts observed for two samples with the TaqMan assay, caused by rare single nucleotide polymorphisms, were resolved by NGS. Additionally, twenty rare and two novel variants in the coding regions of the genes ITGB3, GPB1A, ITGBA2, and CD109 were detected. The determined allele frequencies were similar to those published in the gnomAD database. Conclusions: No significant differences were observed in the distribution of allele frequencies of HPA-1 through HPA-5 and HPA-15 across the analyzed groups, except for a lower frequency of the HPA-1b allele in the group of donors with Southern Asian ancestry. In contrast, other nucleotide variants that have not yet been phenotypically characterized occurred three times more often in blood donors with a migration background. High-throughput amplicon-based NGS is a reliable method for screening HPA genotypes in a large sample cohort simultaneously. It is easily upgradeable for genotyping additional targets without changing the setup or the analysis pipeline. Mass-screening methods will help build up blood donor registries to provide matched blood products.
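
The allele-frequency comparison described in this abstract rests on simple counting, which can be sketched in Python. This is a minimal illustration assuming a biallelic HPA system with alleles a and b. The function name `allele_frequencies` and the genotype counts are hypothetical and are not data from the study; only the 1,299 of 1,304 success figure comes from the abstract.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_a, freq_b) for a biallelic system from genotype counts.

    Each donor carries two alleles, so a group of n donors holds 2 * n alleles.
    """
    n_donors = n_aa + n_ab + n_bb
    if n_donors == 0:
        raise ValueError("at least one genotyped donor is required")
    # Heterozygotes contribute one b allele, bb homozygotes contribute two.
    freq_b = (n_ab + 2 * n_bb) / (2 * n_donors)
    return 1.0 - freq_b, freq_b


# Hypothetical genotype counts for one donor group (assumption, not study data).
freq_a, freq_b = allele_frequencies(n_aa=70, n_ab=26, n_bb=4)
print(f"a allele: {freq_a:.3f}, b allele: {freq_b:.3f}")

# Sequencing success rate as reported in the abstract: 1,299 of 1,304 samples.
success_rate = 1299 / 1304
print(f"NGS success rate: {success_rate:.1%}")
```

Comparing groups then amounts to computing these frequencies per group and applying a suitable statistical test, which is outside the scope of this sketch.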