Barcoding and demultiplexing Oxford Nanopore native RNA sequencing reads with deep residual learning

2019 ◽  
Author(s):  
Martin A. Smith ◽  
Tansel Ersavas ◽  
James M. Ferguson ◽  
Huanle Liu ◽  
Morghan C Lucas ◽  
...  

ABSTRACT Nanopore sequencing has enabled sequencing of native RNA molecules without conversion to cDNA, thus opening the gates to a new era for the unbiased study of RNA biology. However, a formal barcoding protocol for direct sequencing of native RNA molecules is currently lacking, limiting the efficient processing of multiple samples in the same flowcell. A major limitation for the development of barcoding protocols for direct RNA sequencing is the error rate introduced during the base-calling process, especially towards the 5’ and 3’ ends of reads, which complicates sequence-based barcode demultiplexing. Here, we propose a novel strategy to barcode and demultiplex direct RNA sequencing nanopore data, which does not rely on base-calling or additional library preparation steps. Specifically, custom DNA oligonucleotides are ligated to RNA transcripts during library preparation. Then, raw current signal corresponding to the DNA barcode is extracted and transformed into an array of pixels, which is used to determine the underlying barcode using a deep convolutional neural network classifier. Our method, DeePlexiCon, implements a 20-layer residual neural network model that can demultiplex 93% of the reads with 95.1% specificity, or 60% of reads with 99.9% specificity. The availability of an efficient and simple barcoding strategy for native RNA sequencing will enhance the use of direct RNA sequencing by making it more cost-effective to the entire community. Moreover, it will facilitate the applicability of direct RNA sequencing to samples where the RNA amounts are limited, such as patient-derived samples.
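The two operating points quoted above (more reads at lower specificity, fewer reads at higher specificity) are what one would expect from applying a confidence cutoff to a classifier's per-barcode probabilities. The following is a minimal sketch of that idea only, not the DeePlexiCon implementation: the function names, the toy probabilities, and the threshold values are all hypothetical.

```python
from typing import List, Optional, Sequence


def assign_barcodes(
    probs: Sequence[Sequence[float]], threshold: float
) -> List[Optional[int]]:
    """Assign each read to its most probable barcode, or None if unsure.

    probs holds one row per read with one probability per barcode
    (hypothetical classifier output). A read is left unclassified when
    its best probability falls below the threshold.
    """
    calls: List[Optional[int]] = []
    for row in probs:
        best = max(range(len(row)), key=lambda i: row[i])
        calls.append(best if row[best] >= threshold else None)
    return calls


def fraction_assigned(calls: Sequence[Optional[int]]) -> float:
    """Fraction of reads that received a barcode call."""
    return sum(c is not None for c in calls) / len(calls)


# Toy probabilities for four reads and four barcodes (illustrative only).
toy_probs = [
    [0.97, 0.01, 0.01, 0.01],
    [0.40, 0.35, 0.15, 0.10],
    [0.05, 0.90, 0.03, 0.02],
    [0.25, 0.25, 0.30, 0.20],
]

lenient = assign_barcodes(toy_probs, threshold=0.3)   # keeps every read
strict = assign_barcodes(toy_probs, threshold=0.95)   # keeps confident reads only
```

Raising the threshold discards ambiguous reads, which lowers the fraction demultiplexed while reducing the chance of a wrong assignment.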

2020 ◽  
Author(s):  
Alessia Del Piano ◽  
Ruggero Barbieri ◽  
Michael Schmid ◽  
Luciano Brocchieri ◽  
Silvia Tornaletti ◽  
...  

Abstract Accurate positional information concerning ribosomes and RNA binding proteins with respect to their transcripts is important to understand the global regulatory network underlying protein and RNA fate in living cells. Most footprinting approaches generate RNA fragments bearing a phosphate or cyclic phosphate group at their 3′ end. Unfortunately, all current protocols for library preparation rely only on the presence of a 3′ hydroxyl group. Here, we developed circAID-p-seq, a PCR-free library preparation for 3′ phospho-RNA sequencing. We applied circAID-p-seq to ribosome profiling, which produces fragments protected by ribosomes after endonuclease digestion. CircAID-p-seq, combined with the dedicated computational pipeline circAidMe, facilitates accurate, fast, highly efficient and low-cost sequencing of phospho-RNA fragments from eukaryotic cells and tissues. While assessing circAID-p-seq to portray ribosomes engaged with transcripts, we provide a versatile tool to unravel any 3′-phospho RNA molecules.


2020 ◽  
Vol 17 (5) ◽  
pp. 354-364
Author(s):  
Mohammad Mahmoudi Goumari ◽  
Ibrahim Farhani ◽  
Navid Nezafat ◽  
Shirin Mahmoodi

Infectious diseases have caused historical pandemics in the world. Three strategies, namely sanitation programs, antimicrobial drugs, and vaccines, are considered for the prevention and treatment of infectious diseases. Today, some infectious diseases cause millions of deaths worldwide. Due to the emergence of antibiotic-resistant pathogens, as well as some limitations of traditional vaccines, focusing on novel strategies is essential. Multi-Epitope Vaccines (MEVs), as a novel strategy, are designed using immunoinformatics methods. The basic stages of MEV design are epitope prediction by reliable servers, attachment of epitopes using proper linkers, and physicochemical, immunological and structural evaluation by bioinformatics tools. Advantages such as cost-effectiveness, high safety, shorter design time, the use of natural adjuvants, and satisfactory preclinical evaluation distinguish MEVs from other types of vaccines. Therefore, MEVs are promising vaccines against resistant diseases such as lower respiratory infection and diarrhea.


2020 ◽  
Author(s):  
Ramachandro Majji

BACKGROUND Cancer is one of the deadly diseases prevalent worldwide, and patients with cancer are rescued only when the cancer is detected at a very early stage. Early detection of cancer is essential because, in the final stage, the chance of survival is limited. The symptoms of cancers are rigorous and therefore, all the symptoms should be studied properly before the diagnosis. OBJECTIVE To propose an automatic prediction system for classifying cancer as malignant or benign. METHODS This paper introduces a novel strategy based on the JayaAnt lion optimization-based Deep recurrent neural network (JayaALO-based DeepRNN) for cancer classification. The steps followed in the developed model are data normalization, data transformation, feature dimension detection, and classification. The first step is the data normalization. The goal of data normalization is to eliminate data redundancy and to mitigate the storage of objects in a relational database that maintains the same information in several places. After that, the data transformation is carried out using a log transformation, which makes the patterns more interpretable, helps satisfy the underlying assumptions, and reduces skew. Also, non-negative matrix factorization is employed for reducing the feature dimension. Finally, the proposed JayaALO-based DeepRNN method effectively classifies cancer based on the reduced-dimension features to produce a satisfactory result. RESULTS The proposed JayaALO-based DeepRNN showed improved results with a maximal accuracy of 95.97%, a maximal sensitivity of 95.95%, and a maximal specificity of 96.96%. CONCLUSIONS The resulting output of the proposed JayaALO-based DeepRNN is used for cancer classification.
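The claim that a log transformation reduces skew can be illustrated with a small sketch. This is not the authors' pipeline: the choice of `log1p`, the population skewness formula, and the toy values are assumptions made purely for illustration.

```python
import math
from typing import List, Sequence


def skewness(values: Sequence[float]) -> float:
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def log_transform(values: Sequence[float]) -> List[float]:
    """Apply log(1 + x) so that zero-valued features stay finite."""
    return [math.log1p(v) for v in values]


# Toy right-skewed feature with one large outlier (illustrative only).
raw = [1.0, 2.0, 3.0, 5.0, 8.0, 13.0, 200.0]
logged = log_transform(raw)

print(f"skew before: {skewness(raw):.2f}, after: {skewness(logged):.2f}")
```

On this toy feature the outlier dominates the raw distribution, and the transformed values have a visibly smaller skewness.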


Genes ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 283
Author(s):  
Eyal Seroussi

Determination of the relative copy numbers of mixed molecular species in nucleic acid samples is often the objective of biological experiments, including Single-Nucleotide Polymorphism (SNP), indel and gene copy-number characterization, and quantification of CRISPR-Cas9 base editing, cytosine methylation, and RNA editing. Standard dye-terminator chromatograms are a widely accessible, cost-effective information source from which copy-number proportions can be inferred. However, the rate of incorporation of dye terminators is dependent on the dye type, the adjacent sequence string, and the secondary structure of the sequenced strand. These variable rates complicate inferences and have driven scientists to resort to complex and costly quantification methods. Because these complex methods introduce their own biases, researchers are rethinking whether rectifying distortions in sequencing trace files and using direct sequencing for quantification will enable comparably accurate assessment. Indeed, recent developments in software tools (e.g., TIDE, ICE, EditR, BEEP and BEAT) indicate that quantification based on direct Sanger sequencing is gaining in scientific acceptance. This commentary reviews the common obstacles in quantification and the latest insights and developments relevant to estimating copy-number proportions based on direct Sanger sequencing, concluding that bidirectional sequencing and sophisticated base calling are the keys to identifying and avoiding sequence distortions.
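As a rough illustration of how a proportion might be read from a chromatogram, the sketch below estimates the fraction of one species at a mixed position from two peak heights, with an optional per-dye gain to stand in for unequal incorporation rates. It is a simplified assumption, not the method of any of the tools named above; the function name, the gain parameters, and the peak values are hypothetical.

```python
def allele_fraction(
    peak_a: float, peak_b: float, gain_a: float = 1.0, gain_b: float = 1.0
) -> float:
    """Estimate the proportion of species A at a mixed position.

    peak_a and peak_b are peak heights for the two bases at one position.
    gain_a and gain_b are hypothetical correction factors describing how
    strongly each dye terminator is incorporated in this sequence context;
    dividing by them puts both peaks on a common scale.
    """
    if gain_a <= 0 or gain_b <= 0:
        raise ValueError("gains must be positive")
    corrected_a = peak_a / gain_a
    corrected_b = peak_b / gain_b
    total = corrected_a + corrected_b
    if total <= 0:
        raise ValueError("no signal at this position")
    return corrected_a / total


# Uncorrected estimate from raw peak heights (illustrative values).
naive = allele_fraction(300.0, 100.0)

# Same peaks, assuming the dye for A is over-represented 1.5-fold.
corrected = allele_fraction(300.0, 100.0, gain_a=1.5)
```

The gap between the two estimates shows why uncorrected peak ratios can mislead when incorporation rates differ between dyes.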


Cancers ◽  
2021 ◽  
Vol 13 (15) ◽  
pp. 3876
Author(s):  
Chiao-En Wu ◽  
Chen-Yang Huang ◽  
Chiao-Ping Chen ◽  
Yi-Ru Pan ◽  
John Wen-Cheng Chang ◽  
...  

Background: Intrahepatic cholangiocarcinoma (iCCA) is an adenocarcinoma arising from the intrahepatic bile duct. It is the second most common primary liver cancer and has a poor prognosis. Activation of p53 by targeting its negative regulators, MDM2 and WIP1, is a potential therapy for wild-type p53 cancers, but few reports for iCCA or liver adenocarcinoma exist. Methods: Both RBE and SK-Hep-1 liver adenocarcinoma cell lines were treated with the HDM201 (Siremadlin) MDM2-p53 binding antagonist alone or in combination with the GSK2830371 WIP1 phosphatase inhibitor. Cell proliferation, clonogenicity, protein and mRNA expression, cell cycle distribution, and RNA sequencing were performed to investigate the effect and mechanism of this combination. Results: GSK2830371 alone demonstrated minimal activity on proliferation and colony formation, but potentiated growth inhibition (two-fold decrease in GI50) and cytotoxicity (four-fold decrease in IC50) by HDM201 on RBE and SK-Hep-1 cells. HDM201 increased p53 protein expression, leading to transactivation of downstream targets (p21 and MDM2). Combination with GSK2830371 increased p53 phosphorylation, resulting in an increase in both p53 accumulation and p53-dependent trans-activation. G2/M arrest was observed by flow cytometry after this treatment combination. RNA sequencing identified 21 significantly up-regulated genes and five downregulated genes following p53 reactivation by HDM201 in combination with GSK2830371 at 6 h and 24 h time points compared with untreated controls. These genes were predominantly known transcriptional targets regulated by the p53 signaling pathway, indicating enhanced p53 activation as the predominant effect of this combination. Conclusion: The current study demonstrated that GSK2830371 enhanced the p53-dependent antiproliferative and cytotoxic effect of HDM201 on RBE and SK-Hep-1 cells, providing a novel strategy for potentiating the efficacy of targeting the p53 pathway in iCCA.


BMC Biology ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Milda Mickutė ◽  
Kotryna Kvederavičiūtė ◽  
Aleksandr Osipenko ◽  
Raminta Mineikaitė ◽  
Saulius Klimašauskas ◽  
...  

Abstract Background Targeted installation of designer chemical moieties on biopolymers provides an orthogonal means for their visualisation, manipulation and sequence analysis. Although high-throughput RNA sequencing is a widely used method for transcriptome analysis, certain steps, such as 3′ adapter ligation in strand-specific RNA sequencing, remain challenging due to structure- and sequence-related biases introduced by RNA ligases, leading to misrepresentation of particular RNA species. Here, we remedy this limitation by adapting two RNA 2′-O-methyltransferases from the Hen1 family for orthogonal chemo-enzymatic click tethering of a 3′ sequencing adapter that supports cDNA production by reverse transcription of the tagged RNA. Results We showed that the ssRNA-specific DmHen1 and dsRNA-specific AtHEN1 can be used to efficiently append an oligonucleotide adapter to the 3′ end of target RNA for sequencing library preparation. Using this new chemo-enzymatic approach, we identified miRNAs and prokaryotic small non-coding sRNAs in probiotic Lactobacillus casei BL23. We found that compared to a reference conventional RNA library preparation, methyltransferase-Directed Orthogonal Tagging and RNA sequencing, mDOT-seq, avoids misdetection of unspecific highly-structured RNA species, thus providing better accuracy in identifying the groups of transcripts analysed. Our results suggest that mDOT-seq has the potential to advance analysis of eukaryotic and prokaryotic ssRNAs. Conclusions Our findings provide a valuable resource for studies of the RNA-centred regulatory networks in Lactobacilli and pave the way to developing novel transcriptome and epitranscriptome profiling approaches in vitro and inside living cells. 
As RNA methyltransferases share the structure of the AdoMet-binding domain and several specific cofactor binding features, the basic principles of our approach could be easily translated to other AdoMet-dependent enzymes for the development of modification-specific RNA-seq techniques.


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 964
Author(s):  
Sarka Benesova ◽  
Mikael Kubista ◽  
Lukas Valihrach

MicroRNAs (miRNAs) are a class of small RNA molecules that have an important regulatory role in multiple physiological and pathological processes. Their disease-specific profiles and presence in biofluids are properties that enable miRNAs to be employed as non-invasive biomarkers. In the past decades, several methods have been developed for miRNA analysis, including small RNA sequencing (RNA-seq). Small RNA-seq enables genome-wide profiling and analysis of known, as well as novel, miRNA variants. Moreover, its high sensitivity allows for profiling of low input samples such as liquid biopsies, which have now found applications in diagnostics and prognostics. Still, due to technical bias and the limited ability to capture the true miRNA representation, its potential remains unfulfilled. The introduction of many new small RNA-seq approaches that tried to minimize this bias, has led to the existence of the many small RNA-seq protocols seen today. Here, we review all current approaches to cDNA library construction used during the small RNA-seq workflow, with particular focus on their implementation in commercially available protocols. We provide an overview of each protocol and discuss their applicability. We also review recent benchmarking studies comparing each protocol’s performance and summarize the major conclusions that can be gathered from their usage. The result documents variable performance of the protocols and highlights their different applications in miRNA research. Taken together, our review provides a comprehensive overview of all the current small RNA-seq approaches, summarizes their strengths and weaknesses, and provides guidelines for their applications in miRNA research.


Plants ◽  
2020 ◽  
Vol 10 (1) ◽  
pp. 31
Author(s):  
Jia-Rong Xiao ◽  
Pei-Che Chung ◽  
Hung-Yi Wu ◽  
Quoc-Hung Phan ◽  
Jer-Liang Andrew Yeh ◽  
...  

The strawberry (Fragaria × ananassa Duch.) is a high-value crop with an annual cultivated area of ~500 ha in Taiwan. Over 90% of strawberry cultivation is in Miaoli County. Unfortunately, various diseases significantly decrease strawberry production. The leaf and fruit disease became an epidemic in 1986. From 2010 to 2016, anthracnose crown rot caused the loss of 30–40% of seedlings and ~20% of plants after transplanting. The automation of agriculture and image recognition techniques are indispensable for detecting strawberry diseases. We developed an image recognition technique for the detection of strawberry diseases using a convolutional neural network (CNN) model. CNN is a powerful deep learning approach that has been used to enhance image recognition. In the proposed technique, two different datasets containing the original and feature images are used for detecting the following strawberry diseases—leaf blight, gray mold, and powdery mildew. Specifically, leaf blight may affect the crown, leaf, and fruit and show different symptoms. By using the ResNet50 model with a training period of 20 epochs for 1306 feature images, the proposed CNN model achieves a classification accuracy rate of 100% for leaf blight cases affecting the crown, leaf, and fruit; 98% for gray mold cases, and 98% for powdery mildew cases. In 20 epochs, the accuracy rate of 99.60% obtained from the feature image dataset was higher than that of 1.53% obtained from the original one. This proposed model provides a simple, reliable, and cost-effective technique for detecting strawberry diseases.


BioTechniques ◽  
2012 ◽  
Vol 53 (6) ◽  
Author(s):  
Paul Coupland ◽  
Tamir Chandra ◽  
Mike Quail ◽  
Wolf Reik ◽  
Harold Swerdlow
