iWhale: a computational pipeline based on Docker and SCons for detection and annotation of somatic variants in cancer WES data

Author(s):  
Andrea Binatti ◽  
Silvia Bresolin ◽  
Stefania Bortoluzzi ◽  
Alessandro Coppe

Abstract Whole exome sequencing (WES) is a powerful approach for discovering sequence variants in cancer cells, but its time effectiveness is limited by the complexity and issues of WES data analysis. Here we present iWhale, a customizable pipeline based on Docker and SCons, reliably detecting somatic variants by three complementary callers (MuTect2, Strelka2 and VarScan2). The results are combined to obtain a single variant call format file for each sample and variants are annotated by integrating a wide range of information extracted from several reference databases, ultimately allowing variant and gene prioritization according to different criteria. iWhale allows users to conduct a complex series of WES analyses with a powerful yet customizable and easy-to-use tool, running on most operating systems (macOS, GNU/Linux and Windows). iWhale code is freely available at https://github.com/alexcoppe/iWhale and the Docker image is downloadable from https://hub.docker.com/r/alexcoppe/iwhale.
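
The abstract does not say how iWhale merges the three callers' results, so the following is only a minimal sketch of one common approach: take the union of all calls and record which callers support each variant. The function name, the variant key layout and the toy calls are assumptions made for illustration, not iWhale's implementation.

```python
from collections import defaultdict

# A variant is keyed by (chromosome, position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def combine_calls(calls_by_caller: dict[str, set[Variant]]) -> dict[Variant, list[str]]:
    """Map every reported variant to the sorted list of callers that reported it."""
    support: dict[Variant, list[str]] = defaultdict(list)
    for caller in sorted(calls_by_caller):  # sorted so the output is deterministic
        for variant in calls_by_caller[caller]:
            support[variant].append(caller)
    return dict(support)


# Hypothetical toy calls, not real data.
calls = {
    "MuTect2": {("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")},
    "Strelka2": {("chr1", 100, "A", "T")},
    "VarScan2": {("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")},
}
combined = combine_calls(calls)
```

Keeping the list of supporting callers, rather than only a merged set, would let a later step prioritize variants by how many callers agree on them.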

2016 ◽  
Author(s):  
Brent S. Pedersen ◽  
Ryan M. Layer ◽  
Aaron R. Quinlan

Abstract Background: The integration of genome annotations and reference databases is critical to the identification of genetic variants that may be of interest in studies of disease or other traits. However, comprehensive variant annotation with diverse file formats is difficult with existing methods. Results: We have developed vcfanno as a flexible toolset that simplifies the annotation of genetic variants in VCF format. Vcfanno can extract and summarize multiple attributes from one or more annotation files and append the resulting annotations to the INFO field of the original VCF file. Vcfanno also integrates the Lua scripting language so that users can easily develop custom annotations and metrics. By leveraging a new parallel “chromosome sweeping” algorithm, it enables rapid annotation of both whole-exome and whole-genome datasets. We demonstrate this performance by annotating over 85.3 million variants in less than 17 minutes (>85,000 variants per second) with 50 attributes from 17 commonly used genome annotation resources. Conclusions: Vcfanno is a flexible software package that provides researchers with the ability to annotate genetic variation with a wide range of datasets and reference databases in diverse genomic formats. Availability: The vcfanno source code is available at https://github.com/brentp/vcfanno under the MIT license, and platform-specific binaries are available at https://github.com/brentp/vcfanno/releases. Detailed documentation is available at http://brentp.github.io/vcfanno/, and the code underlying the analyses presented can be found at https://github.com/brentp/vcfanno/tree/master/scripts/paper.
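
The idea behind a sweep over sorted inputs can be illustrated with a simplified, sequential, single-chromosome sketch. This is not vcfanno's parallel implementation. The function name, the half-open interval convention and the toy data are assumptions: both inputs are sorted by position, so each annotation interval is admitted once and discarded as soon as the sweep has passed its end.

```python
def sweep_annotate(positions, intervals):
    """Annotate sorted variant positions with labels of overlapping intervals.

    positions: variant positions in ascending order.
    intervals: (start, end, label) tuples sorted by start, half-open [start, end).
    Returns a list of (position, [labels]) pairs.
    """
    results = []
    active = []  # intervals that have started and may still overlap
    next_interval = 0
    for pos in positions:
        # Admit every interval that starts at or before the current position.
        while next_interval < len(intervals) and intervals[next_interval][0] <= pos:
            active.append(intervals[next_interval])
            next_interval += 1
        # Discard intervals that ended at or before the current position.
        active = [iv for iv in active if iv[1] > pos]
        results.append((pos, [iv[2] for iv in active]))
    return results


# Hypothetical toy input, not real annotation data.
annotated = sweep_annotate(
    [5, 15, 25],
    [(0, 10, "a"), (12, 20, "b"), (14, 30, "c")],
)
```

Because neither input is rescanned from the start, the work grows with the combined size of the two sorted streams plus the number of overlaps, rather than with their product.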


Author(s):  
Michael F Lin ◽  
Xiaodong Bai ◽  
William J Salerno ◽  
Jeffrey G Reid

Abstract Summary: Variant Call Format (VCF), the prevailing representation for germline genotypes in population sequencing, suffers rapid size growth as larger cohorts are sequenced and more rare variants are discovered. We present Sparse Project VCF (spVCF), an evolution of VCF with judicious entropy reduction and run-length encoding, delivering >10X size reduction for modern studies with practically minimal information loss. spVCF interoperates with VCF efficiently, including tabix-based random access. We demonstrate its effectiveness with the DiscovEHR and UK Biobank whole-exome sequencing cohorts. Availability and implementation: Apache-licensed reference implementation: github.com/mlin/spVCF. Supplementary information: Supplementary data are available at Bioinformatics online.
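
Run-length encoding in general can be sketched as below. This shows the generic technique applied to one row of genotype cells and is not the spVCF file format, whose details are not given in the abstract. The function names and the toy row are assumptions.

```python
from itertools import groupby


def rle_encode(cells):
    """Collapse consecutive identical cells into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(cells)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original cell list."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical genotype row: mostly reference calls with one rare variant.
row = ["0/0"] * 5 + ["0/1"] + ["0/0"] * 3
encoded = rle_encode(row)
decoded = rle_decode(encoded)
```

A row dominated by long runs of the same genotype shrinks to a handful of pairs, which is why this kind of encoding suits large cohorts in which most samples do not carry a given rare variant.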


2006 ◽  
Vol 52 (10) ◽  
pp. 1855-1863 ◽  
Author(s):  
Giulia Amicarelli ◽  
Daniel Adlerstein ◽  
Erlet Shehi ◽  
Fengfei Wang ◽  
G Mike Makrigiorgos

Abstract Background: Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. Materials and Methods: We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled “amplifier”, and an “anchor”. The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. Results: The system detected and genotyped KRAS sequence variants down to ∼0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. Conclusion: OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.


2014 ◽  
Vol 23 (24) ◽  
pp. 6607-6615 ◽  
Author(s):  
Mengmeng Du ◽  
Paul L. Auer ◽  
Shuo Jiao ◽  
Jeffrey Haessler ◽  
David Altshuler ◽  
...  

Synthesis ◽  
2019 ◽  
Vol 52 (01) ◽  
pp. 40-50 ◽  
Author(s):  
Nikita M. Chernov ◽  
Roman V. Shutov ◽  
Anastasia E. Potapova ◽  
Igor P. Yakovlev

We report an easy and powerful approach to the synthesis of novel chromeno[4,3-d]pyrimidine-5-acetic acids through ANRORC reaction of electron-deficient 3-vinylchromones and 1,3-N,N-binucleophiles. The reaction proceeds under mild conditions (EtOH, rt) and is applicable to a wide range of substrates. The described compounds show fluorescence in the violet-blue range (390–460 nm) with Stokes shift of 40–80 nm and moderate quantum yield (0.15–0.20). As the electron-withdrawing group is conserved in the form of an acetic acid fragment, these compounds may readily be functionalized or conjugated to a required substrate for (bio)analytical purposes.


Nanomaterials ◽  
2020 ◽  
Vol 10 (8) ◽  
pp. 1613
Author(s):  
Yunfeng Yan ◽  
Hangwei Ding

Immunotherapy has recently become a promising strategy for the treatment of a wide range of cancers. However, the broad implementation of cancer immunotherapy suffers from inadequate efficacy and toxic side effects. Integrating pH-responsive nanoparticles into immunotherapy is a powerful approach to tackle these challenges because they are able to target the tumor tissues and organelles of antigen-presenting cells (APCs) which have a characteristic acidic microenvironment. The spatiotemporal control of immunotherapeutic drugs using pH-responsive nanoparticles endows cancer immunotherapy with enhanced antitumor immunity and reduced off-tumor immunity. In this review, we first discuss the cancer-immunity circle and how nanoparticles can modulate the key steps in this circle. Then, we highlight the recent advances in cancer immunotherapy with pH-responsive nanoparticles and discuss the perspective for this emerging area.


2018 ◽  
Vol 55 (6) ◽  
pp. 422.2-429 ◽  
Author(s):  
Mathilde Lefebvre ◽  
Anne Dieux-Coeslier ◽  
Geneviève Baujat ◽  
Elise Schaefer ◽  
Saint-Onge Judith ◽  
...  

Background: Segmentation defects of the vertebrae (SDV) are non-specific features found in various syndromes. The molecular bases of SDV are not fully elucidated due to the wide range of phenotypes and classification issues. The genes involved are in the Notch signalling pathway, which is a key system in somitogenesis. Here we report on mutations identified in a diagnosis cohort of SDV. We focused on spondylocostal dysostosis (SCD) and the phenotype of these patients in order to establish a diagnostic strategy when confronted with SDV. Patients and methods: We used DNA samples from a cohort of 73 patients and performed targeted sequencing of the five known SCD-causing genes (DLL3, MESP2, LFNG, HES7 and TBX6) in the first 48 patients and whole-exome sequencing (WES) in 28 relevant patients. Results: Ten diagnoses, including four biallelic variants in TBX6, two biallelic variants in LFNG and DLL3, and one in MESP2 and HES7, were made with the gene panel, and two diagnoses, including biallelic variants in FLNB and one variant in MEOX1, were made by WES. The diagnostic yield of the gene panel was 10/73 (13.7%) in the global cohort but 8/10 (80%) in the subgroup meeting the SCD criteria; the diagnostic yield of WES was 2/28 (8%). Conclusion: After negative array CGH, targeted sequencing of the five known SCD genes should only be performed in patients who meet the diagnostic criteria of SCD. The low proportion of candidate genes identified by WES in our cohort suggests the need to consider more complex genetic architectures in cases of SDV.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Teresita M. Porter ◽  
Dave M. Morris ◽  
Nathan Basiliko ◽  
Mehrdad Hajibabaei ◽  
Daniel Doucet ◽  
...  

Abstract Terrestrial arthropod fauna have been suggested as a key indicator of ecological integrity in forest systems. Because phenotypic identification is expert-limited, a shift towards DNA metabarcoding could improve scalability and democratize the use of forest floor arthropods for biomonitoring applications. The objective of this study was to establish the level of field sampling and DNA extraction replication needed for arthropod biodiversity assessments from soil. Processing 15 individually collected soil samples recovered significantly higher median richness (488–614 sequence variants) than pooling the same number of samples (165–191 sequence variants) prior to DNA extraction, and we found no significant richness differences when using 1 or 3 pooled DNA extractions. Beta diversity was robust to changes in methodological regimes. Though our ability to identify taxa to species rank was limited, we were able to use arthropod COI metabarcodes from forest soil to assess richness, distinguish among sites, and recover site indicators based on unnamed exact sequence variants. Our results highlight the need to continue DNA barcoding local taxa during COI metabarcoding studies to help build reference databases. Altogether, these sampling considerations support the use of soil arthropod COI metabarcoding as a scalable method for biomonitoring.


2014 ◽  
Vol 15 (Suppl 7) ◽  
pp. S12 ◽  
Author(s):  
Mark TW Ebbert ◽  
Mark E Wadsworth ◽  
Kevin L Boehme ◽  
Kaitlyn L Hoyt ◽  
Aaron R Sharp ◽  
...  

1992 ◽  
Vol 281 (2) ◽  
pp. 359-368 ◽  
Author(s):  
L M Forrester ◽  
C J Henderson ◽  
M J Glancey ◽  
D J Back ◽  
B K Park ◽  
...  

Cytochrome P450s play a central role in the metabolism and disposition of an extremely wide range of drugs and chemical carcinogens. Individual differences in the expression of these enzymes may be an important determinant in susceptibility to adverse drug reactions, chemical toxins and mutagens. In this paper, we have measured the relative levels of expression of cytochrome P450 isoenzymes from eight gene families or subfamilies in a panel of twelve human liver samples in order to determine the individuality in their expression and whether any forms are co-regulated. Isoenzymes were identified in most cases on Western blots based on the mobility of authentic recombinant human cytochrome P450 standards. The levels of the following P450 proteins correlated with each other: CYP2A6, CYP2B6 and a protein from the CYP2C gene subfamily, CYP2E1 and a member of the CYP2A gene subfamily, CYP2C8, CYP3A3/A4 and total cytochrome P450 content. Also, the levels of two proteins in the CYP4A gene subfamily were highly correlated. These correlations are consistent with the relative regulation of members of these gene families in rats or mice. In addition, the level of expression of specific isoenzymes has also been compared with the rate of metabolism of a panel of drugs, carcinogens and model P450 substrates. These latter studies demonstrate and confirm that the correlations obtained in this manner represent a powerful approach towards the assignment of the metabolism of substrates by specific human P450 isoenzymes.

