SMAP: exploiting high-throughput sequencing data of patient derived xenografts

Mapping Intimacies ◽

10.1101/440008 ◽

2018 ◽

Author(s):

Yuna Blum ◽

Aurélien de Reyniès ◽

Nelson Dusetti ◽

Juan Iovanna ◽

Laetitia Marisa ◽

...

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Ad Hoc ◽

Population Analysis ◽

Sequencing Analysis ◽

Sequencing Data ◽

Processing Methods ◽

High Throughput Sequencing Data ◽

Rnaseq Data ◽

Mapping Process

Background: Patient-derived xenograft is the model of reference in oncology for drug response analyses. Xenograft samples have the specificity of being composed of cells from both the graft and the host species. Sequencing analysis of xenograft samples therefore requires specific processing methods to properly reconstruct genomic profiles of both the host and graft compartments. Results: We propose a novel xenograft sequencing process pipeline termed SMAP, for Simultaneous mapping. SMAP integrates the distinction of host and graft sequencing reads into the mapping process by simultaneously aligning to both genome references. We show that SMAP increases the accuracy of species assignment while reducing the number of discarded ambiguous reads compared to other existing methods. Moreover, SMAP includes a module called SMAP-fuz to improve the detection of chimeric transcript fusions in xenograft RNAseq data. Finally, we apply SMAP to a real dataset and show the relevance of pathway and cell population analysis of the tumoral and stromal compartments. Conclusions: In high-throughput sequencing analysis of xenografts, our results show that: i. the use of ad hoc sequence processing methods is essential, ii. high sequence homology does not introduce a significant bias when proper methods are used and iii. the detection of fusion transcripts can be improved using our approach. SMAP is available on GitHub: cit-bioinfo.github.io/SMAP.
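The idea of aligning to both genome references at once and then separating host from graft reads can be illustrated with a minimal sketch. It is not SMAP's implementation: it assumes reads were already aligned to a combined reference whose contig names carry a hypothetical species prefix (`human_`, `mouse_`), and that each read comes with a list of (contig, alignment score) hits. The function name `assign_species` and the tie rule are assumptions made for illustration.

```python
from collections import Counter


def assign_species(hits, prefixes=("human_", "mouse_")):
    """Assign a read to the species of its best-scoring hit.

    hits: list of (contig_name, alignment_score) pairs for one read.
    The contig prefixes are a hypothetical naming convention for a
    combined host + graft reference, not something defined by SMAP.
    """
    best = {}
    for contig, score in hits:
        for prefix in prefixes:
            if contig.startswith(prefix):
                # Keep the best score seen for each species.
                best[prefix] = max(score, best.get(prefix, float("-inf")))
    if not best:
        return "unmapped"
    top = max(best.values())
    winners = [p for p, s in best.items() if s == top]
    # A tie between species cannot be resolved from scores alone.
    return winners[0].rstrip("_") if len(winners) == 1 else "ambiguous"


# Toy input: read name -> alignment hits (made-up values).
reads = {
    "r1": [("human_chr1", 98), ("mouse_chr4", 90)],
    "r2": [("mouse_chr2", 100)],
    "r3": [("human_chr7", 95), ("mouse_chr5", 95)],
    "r4": [],
}
summary = Counter(assign_species(h) for h in reads.values())
print(dict(summary))
```

Because both species compete for each read in a single pass, only reads with equally good hits in both genomes end up in the ambiguous class in this sketch.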


GenOO: A Modern Perl Framework for High Throughput Sequencing analysis

10.1101/019265 ◽

2015 ◽

Cited By ~ 2

Author(s):

Manolis Maragkakis ◽

Panagiotis Alexiou ◽

Zissimos Mourelatos

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Complex Analysis ◽

Object Oriented ◽

Sequencing Analysis ◽

Sequencing Data ◽

Analysis Tools ◽

High Throughput Sequencing Data ◽

Biological Entities ◽

Computational Structures

Background: High throughput sequencing (HTS) has become one of the primary experimental tools used to extract genomic information from biological samples. Bioinformatics tools are continuously being developed for the analysis of HTS data. Beyond some well-defined core analyses, such as quality control or genomic alignment, the consistent development of custom tools and the representation of sequencing data in organized computational structures and entities remains a challenging effort for bioinformaticians. Results: In this work, we present GenOO [jee-noo], an open-source, object-oriented (OO) Perl framework specifically developed for the design and implementation of HTS analysis tools. GenOO models biological entities such as genes and transcripts as Perl objects, and includes relevant modules, attributes and methods that allow for the manipulation of high throughput sequencing data. GenOO integrates these elements in a simple and transparent way, which allows for the creation of complex analysis pipelines while minimizing the overhead for the researcher. GenOO has been designed with flexibility in mind, and has an easily extendable modular structure with minimal requirements for external tools and libraries. As an example of the framework’s capabilities and usability, we present a short and simple walkthrough of a custom use case in HTS analysis. Conclusions: GenOO is a tool of high software quality which can be efficiently used for advanced HTS analyses. It has been used to develop several custom analysis tools, leading to a number of published works. Using GenOO as a core development module can greatly benefit users by reducing the overhead and complexity of managing HTS data and the biological entities at hand.


Faculty Opinions recommendation of Coalescent Inference Using Serially Sampled, High-Throughput Sequencing Data from Intrahost HIV Infection.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726132071.793531014 ◽

2017 ◽

Author(s):

Sarah Rowland-Jones ◽

Sophie Andrews

Keyword(s):

Hiv Infection ◽

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

High Throughput Sequencing Data


BlindCall: ultra-fast base-calling of high-throughput sequencing data by blind deconvolution

Bioinformatics ◽

10.1093/bioinformatics/btu010 ◽

2014 ◽

Vol 30 (9) ◽

pp. 1214-1219 ◽

Cited By ~ 6

Author(s):

C. Ye ◽

C. Hsiao ◽

H. Corrada Bravo

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Blind Deconvolution ◽

Sequencing Data ◽

Base Calling ◽

High Throughput Sequencing Data


Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding

MycoKeys ◽

10.3897/mycokeys.39.28109 ◽

2018 ◽

Vol 39 ◽

pp. 29-40 ◽

Cited By ~ 21

Author(s):

Sten Anslan ◽

R. Henrik Nilsson ◽

Christian Wurzbacher ◽

Petr Baldrian ◽

Leho Tedersoo ◽

...

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Computation Time ◽

Potential Effect ◽

Data Sets ◽

Sequencing Data ◽

Operational Taxonomic Units ◽

High Throughput Sequencing Data ◽

Recent Developments

Along with recent developments in high-throughput sequencing (HTS) technologies and thus fast accumulation of HTS data, there has been a growing need and interest for developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of the application of different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors and generate Operational Taxonomic Units, although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon dataset. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.


Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis

Genomics ◽

10.1016/j.ygeno.2017.01.005 ◽

2017 ◽

Vol 109 (2) ◽

pp. 83-90 ◽

Cited By ~ 44

Author(s):

Yan Guo ◽

Yulin Dai ◽

Hui Yu ◽

Shilin Zhao ◽

David C. Samuels ◽

...

Keyword(s):

Data Analysis ◽

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

High Throughput Sequencing Data ◽

Sequencing Data Analysis


HTSeq - A Python framework to work with high-throughput sequencing data

10.1101/002824 ◽

2014 ◽

Cited By ~ 242

Author(s):

Simon Anders ◽

Paul Theodor Pyl ◽

Wolfgang Huber

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Rapid Development ◽

Differential Expression Analysis ◽

Rna Seq ◽

Sequencing Data ◽

Standard Work ◽

Data Formats ◽

High Throughput Sequencing Data ◽

Python Package

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard work flows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information, variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index, https://pypi.python.org/pypi/HTSeq
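Counting the overlap of reads with genes, the task the abstract attributes to htseq-count, can be sketched in plain Python. This sketch does not use the HTSeq API and is not htseq-count's algorithm: the function name `count_reads`, the half-open interval convention, and the simplified rule (count a read only when it overlaps exactly one gene) are assumptions made for illustration.

```python
from collections import Counter


def count_reads(genes, reads):
    """Count reads per gene by interval overlap.

    genes: {name: (chrom, start, end)}, reads: [(chrom, start, end)].
    Intervals are treated as half-open [start, end); this convention
    is an assumption of the sketch.
    """
    counts = Counter()
    for chrom, start, end in reads:
        # Two half-open intervals overlap when each starts before the other ends.
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and g_start < end
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif hits:
            counts["ambiguous"] += 1  # overlaps more than one gene
        else:
            counts["no_feature"] += 1  # overlaps no gene
    return counts


# Toy annotation and alignments (made-up coordinates).
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
reads = [
    ("chr1", 110, 160),
    ("chr1", 190, 240),
    ("chr1", 250, 290),
    ("chr2", 10, 60),
    ("chr1", 300, 350),
]
counts = count_reads(genes, reads)
print(dict(counts))
```

The linear scan over all genes keeps the sketch short; a real tool would index features by genomic coordinate, which is the kind of data structure the abstract says HTSeq provides.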


SEED 2: a user-friendly platform for amplicon high-throughput sequencing data analyses

Bioinformatics ◽

10.1093/bioinformatics/bty071 ◽

2018 ◽

Vol 34 (13) ◽

pp. 2292-2294 ◽

Cited By ~ 59

Author(s):

Tomáš Větrovský ◽

Petr Baldrian ◽

Daniel Morais

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

Data Analyses ◽

High Throughput Sequencing Data ◽

User Friendly


Computational Analysis of High Throughput Sequencing Data

Methods in Molecular Biology - Bioinformatics for Omics Data ◽

10.1007/978-1-61779-027-0_9 ◽

2011 ◽

pp. 199-217 ◽

Cited By ~ 4

Author(s):

Steve Hoffmann

Keyword(s):

High Throughput ◽

Computational Analysis ◽

High Throughput Sequencing ◽

Sequencing Data ◽

High Throughput Sequencing Data


High-throughput sequencing data of soil bacterial communities from Tweefontein indigenous and commercial forests, South Africa

Data in Brief ◽

10.1016/j.dib.2019.104916 ◽

2020 ◽

Vol 28 ◽

pp. 104916

Author(s):

Adenike Eunice Amoo ◽

Ben Jesuorsemwen Enagbonma ◽

Olubukola Oluranti Babalola

Keyword(s):

South Africa ◽

High Throughput ◽

Bacterial Communities ◽

High Throughput Sequencing ◽

Sequencing Data ◽

Soil Bacterial Communities ◽

High Throughput Sequencing Data ◽

Soil Bacterial


hypeR: an R package for geneset enrichment workflows

Bioinformatics ◽

10.1093/bioinformatics/btz700 ◽

2019 ◽

Cited By ~ 3

Author(s):

Anthony Federico ◽

Stefano Monti

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

R Package ◽

Use Cases ◽

Sequencing Data ◽

Wide Audience ◽

Popular Method ◽

High Throughput Sequencing Data ◽

One Stop ◽

Recent Version

Summary: Geneset enrichment is a popular method for annotating high-throughput sequencing data. Existing tools fall short in providing the flexibility to tackle the varied challenges researchers face in such analyses, particularly when analyzing many signatures across multiple experiments. We present a comprehensive R package for geneset enrichment workflows that offers multiple enrichment, visualization, and sharing methods in addition to novel features such as hierarchical geneset analysis and built-in markdown reporting. hypeR is a one-stop solution to performing geneset enrichment for a wide audience and range of use cases. Availability and implementation: The most recent version of the package is available at https://github.com/montilab/hypeR. Contact: [email protected] or [email protected]
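Geneset enrichment is often computed as an over-representation test, and a minimal sketch of that calculation follows. The abstract does not say which tests hypeR implements, so the hypergeometric test here is an assumption, and the functions `hypergeom_pvalue` and `enrich` are hypothetical names rather than part of the hypeR package.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: background size, K: geneset size, n: signature size,
    k: number of signature genes that fall in the geneset.
    """
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no extra bounds are needed.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def enrich(signature, geneset, background):
    """Over-representation p-value of a geneset within a signature."""
    background = set(background)
    sig = set(signature) & background  # ignore genes outside the background
    gs = set(geneset) & background
    overlap = len(sig & gs)
    return hypergeom_pvalue(overlap, len(sig), len(gs), len(background))


# Toy example with made-up gene names.
background = [f"g{i}" for i in range(10)]
geneset = ["g0", "g1", "g2", "g3"]
signature = ["g0", "g1", "g9"]
p_value = enrich(signature, geneset, background)
print(p_value)
```

When many genesets are tested against one signature, the resulting p-values would normally be corrected for multiple testing, which this sketch leaves out.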
