iWhale: a computational pipeline based on Docker and SCons for detection and annotation of somatic variants in cancer WES data

Briefings in Bioinformatics ◽

10.1093/bib/bbaa065 ◽

2020 ◽

Author(s):

Andrea Binatti ◽

Silvia Bresolin ◽

Stefania Bortoluzzi ◽

Alessandro Coppe

Keyword(s):

Operating Systems ◽

Sequence Variants ◽

Computational Pipeline ◽

Variant Call Format ◽

Variant Call ◽

Whole Exome ◽

Wide Range ◽

Powerful Approach ◽

Variant Call Format File ◽

Reference Databases

Abstract Whole exome sequencing (WES) is a powerful approach for discovering sequence variants in cancer cells, but its time effectiveness is limited by the complexity and issues of WES data analysis. Here we present iWhale, a customizable pipeline based on Docker and SCons, reliably detecting somatic variants by three complementary callers (MuTect2, Strelka2 and VarScan2). The results are combined to obtain a single variant call format file for each sample, and variants are annotated by integrating a wide range of information extracted from several reference databases, ultimately allowing variant and gene prioritization according to different criteria. iWhale allows users to conduct a complex series of WES analyses with a powerful yet customizable and easy-to-use tool, running on most operating systems (macOS, GNU/Linux and Windows). iWhale code is freely available at https://github.com/alexcoppe/iWhale and the Docker image is downloadable from https://hub.docker.com/r/alexcoppe/iwhale.
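As a rough illustration of what combining the output of several callers can look like, the following Python sketch groups calls by variant and records which callers reported each one. It is not iWhale's actual merging logic, which the abstract does not describe; the function name `combine_calls`, the `(chrom, pos, ref, alt)` key and the example variants are all assumptions made for illustration.

```python
from collections import defaultdict


def combine_calls(calls_by_caller):
    """Map each variant key (chrom, pos, ref, alt) to the sorted list of callers reporting it.

    Hypothetical helper: a minimal sketch, not the merging strategy used by iWhale.
    """
    support = defaultdict(set)
    for caller, variants in calls_by_caller.items():
        for variant in variants:
            support[variant].add(caller)
    return {variant: sorted(callers) for variant, callers in support.items()}


# Made-up example calls, one list per caller.
calls = {
    "MuTect2": [("chr1", 100, "A", "T"), ("chr2", 200, "G", "C")],
    "Strelka2": [("chr1", 100, "A", "T")],
    "VarScan2": [("chr1", 100, "A", "T"), ("chr3", 300, "C", "G")],
}
combined = combine_calls(calls)
for variant, callers in sorted(combined.items()):
    print(variant, callers)
```

Keeping the list of supporting callers per variant would let a later step prioritize variants by how many callers agree, which is one possible prioritization criterion among many.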

Vcfanno: fast, flexible annotation of genetic variants

10.1101/041863 ◽

2016 ◽

Author(s):

Brent S. Pedersen ◽

Ryan M. Layer ◽

Aaron R. Quinlan

Keyword(s):

Genetic Variants ◽

Source Code ◽

Variant Annotation ◽

Link Type ◽

File Formats ◽

Whole Exome ◽

Wide Range ◽

Reference Databases ◽

Scripting Language ◽

Genome Annotations

Abstract Background: The integration of genome annotations and reference databases is critical to the identification of genetic variants that may be of interest in studies of disease or other traits. However, comprehensive variant annotation with diverse file formats is difficult with existing methods. Results: We have developed vcfanno as a flexible toolset that simplifies the annotation of genetic variants in VCF format. Vcfanno can extract and summarize multiple attributes from one or more annotation files and append the resulting annotations to the INFO field of the original VCF file. Vcfanno also integrates the Lua scripting language so that users can easily develop custom annotations and metrics. By leveraging a new parallel “chromosome sweeping” algorithm, it enables rapid annotation of both whole-exome and whole-genome datasets. We demonstrate this performance by annotating over 85.3 million variants in less than 17 minutes (>85,000 variants per second) with 50 attributes from 17 commonly used genome annotation resources. Conclusions: Vcfanno is a flexible software package that provides researchers with the ability to annotate genetic variation with a wide range of datasets and reference databases in diverse genomic formats. Availability: The vcfanno source code is available at https://github.com/brentp/vcfanno under the MIT license, and platform-specific binaries are available at https://github.com/brentp/vcfanno/releases. Detailed documentation is available at http://brentp.github.io/vcfanno/, and the code underlying the analyses presented can be found at https://github.com/brentp/vcfanno/tree/master/scripts/paper.
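To make the idea of appending an annotation to the INFO field concrete, here is a minimal Python sketch that operates on a single tab-separated VCF data line. It does not use or reproduce vcfanno itself; the helper `append_info`, the key `AF_example` and the sample record are hypothetical and chosen only for illustration.

```python
def append_info(vcf_line, key, value):
    """Return the VCF data line with key=value appended to its INFO column.

    Hypothetical helper: INFO is the eighth tab-separated column of a VCF
    data line, and "." marks an empty INFO field.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    info = fields[7]
    entry = f"{key}={value}"
    fields[7] = entry if info in (".", "") else f"{info};{entry}"
    return "\t".join(fields)


# Made-up record: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO.
record = "chr1\t100\t.\tA\tT\t50\tPASS\tDP=30"
annotated = append_info(record, "AF_example", "0.01")
print(annotated)
```

A complete tool would also declare each new key in the VCF header; this sketch leaves the header untouched.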

Sparse project VCF: efficient encoding of population genotype matrices

Bioinformatics ◽

10.1093/bioinformatics/btaa1004 ◽

2020 ◽

Author(s):

Michael F Lin ◽

Xiaodong Bai ◽

William J Salerno ◽

Jeffrey G Reid

Keyword(s):

Rare Variants ◽

Random Access ◽

Supplementary Information ◽

Variant Call Format ◽

Variant Call ◽

Whole Exome ◽

Size Growth ◽

Entropy Reduction ◽

Reference Implementation ◽

Minimal Information

Abstract Summary: Variant Call Format (VCF), the prevailing representation for germline genotypes in population sequencing, suffers rapid size growth as larger cohorts are sequenced and more rare variants are discovered. We present Sparse Project VCF (spVCF), an evolution of VCF with judicious entropy reduction and run-length encoding, delivering >10X size reduction for modern studies with practically minimal information loss. spVCF interoperates with VCF efficiently, including tabix-based random access. We demonstrate its effectiveness with the DiscovEHR and UK Biobank whole-exome sequencing cohorts. Availability and implementation: Apache-licensed reference implementation: github.com/mlin/spVCF. Supplementary information: Supplementary data are available at Bioinformatics online.
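Run-length encoding pays off on genotype matrices because most samples carry the same genotype at a rare-variant site. The Python sketch below shows generic run-length encoding of one row of genotype strings; it is not the spVCF file format, whose details the abstract does not give, and the function names and example row are assumptions for illustration.

```python
from itertools import groupby


def run_length_encode(genotypes):
    """Collapse consecutive identical genotype strings into (genotype, count) runs."""
    return [(gt, len(list(group))) for gt, group in groupby(genotypes)]


def run_length_decode(runs):
    """Expand (genotype, count) runs back into the original list."""
    return [gt for gt, count in runs for _ in range(count)]


# Made-up row: ten samples, one heterozygous carrier of a rare variant.
row = ["0/0"] * 6 + ["0/1"] + ["0/0"] * 3
runs = run_length_encode(row)
print(runs)
```

Here ten cells collapse to three runs, and decoding the runs restores the row exactly, so this particular step loses no information.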

Genotype-Specific Signal Generation Based on Digestion of 3-Way DNA Junctions: Application to KRAS Variation Detection

Clinical Chemistry ◽

10.1373/clinchem.2006.068817 ◽

2006 ◽

Vol 52 (10) ◽

pp. 1855-1863 ◽

Cited By ~ 11

Author(s):

Giulia Amicarelli ◽

Daniel Adlerstein ◽

Erlet Shehi ◽

Fengfei Wang ◽

G Mike Makrigiorgos

Keyword(s):

Nucleic Acid ◽

Concordance Rate ◽

Signal Generation ◽

Sequence Variants ◽

Wild Type ◽

Specific Signal ◽

Novel Technology ◽

Dna Junctions ◽

Wide Range ◽

Kras Codon

Abstract Background: Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. Materials and Methods: We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled “amplifier”, and an “anchor”. The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. Results: The system detected and genotyped KRAS sequence variants down to ∼0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. Conclusion: OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.

Whole-exome imputation of sequence variants identified two novel alleles associated with adult body height in African Americans

Human Molecular Genetics ◽

10.1093/hmg/ddu361 ◽

2014 ◽

Vol 23 (24) ◽

pp. 6607-6615 ◽

Cited By ~ 9

Author(s):

Mengmeng Du ◽

Paul L. Auer ◽

Shuo Jiao ◽

Jeffrey Haessler ◽

David Altshuler ◽

...

Keyword(s):

African Americans ◽

Body Height ◽

Sequence Variants ◽

Whole Exome ◽

Adult Body ◽

Novel Alleles

Convenient Synthesis of Fluorescent Chromeno[4,3-d]pyrimidines from Electron-Deficient 3-Vinylchromones

Synthesis ◽

10.1055/s-0039-1690723 ◽

2019 ◽

Vol 52 (01) ◽

pp. 40-50 ◽

Cited By ~ 1

Author(s):

Nikita M. Chernov ◽

Roman V. Shutov ◽

Anastasia E. Potapova ◽

Igor P. Yakovlev

Keyword(s):

Stokes Shift ◽

Mild Conditions ◽

Convenient Synthesis ◽

Acid Fragment ◽

Wide Range ◽

Powerful Approach ◽

Reaction Proceeds ◽

Electron Withdrawing Group ◽

Blue Range ◽

Acetic Acids

We report an easy and powerful approach to the synthesis of novel chromeno[4,3-d]pyrimidine-5-acetic acids through ANRORC reaction of electron-deficient 3-vinylchromones and 1,3-N,N-binucleophiles. The reaction proceeds under mild conditions (EtOH, rt) and is applicable to a wide range of substrates. The described compounds show fluorescence in the violet-blue range (390–460 nm) with Stokes shift of 40–80 nm and moderate quantum yield (0.15–0.20). As the electron-withdrawing group is conserved in the form of an acetic acid fragment, these compounds may readily be functionalized or conjugated to a required substrate for (bio)analytical purposes.

pH-Responsive Nanoparticles for Cancer Immunotherapy: A Brief Review

Nanomaterials ◽

10.3390/nano10081613 ◽

2020 ◽

Vol 10 (8) ◽

pp. 1613

Author(s):

Yunfeng Yan ◽

Hangwei Ding

Keyword(s):

Cancer Immunotherapy ◽

Ph Responsive ◽

Promising Strategy ◽

Cancer Immunity ◽

Tumor Tissues ◽

Wide Range ◽

Powerful Approach ◽

Key Steps ◽

Antigen Presenting ◽

Spatiotemporal Control

Immunotherapy has recently become a promising strategy for the treatment of a wide range of cancers. However, the broad implementation of cancer immunotherapy suffers from inadequate efficacy and toxic side effects. Integrating pH-responsive nanoparticles into immunotherapy is a powerful approach to tackle these challenges because they are able to target the tumor tissues and organelles of antigen-presenting cells (APCs) which have a characteristic acidic microenvironment. The spatiotemporal control of immunotherapeutic drugs using pH-responsive nanoparticles endows cancer immunotherapy with enhanced antitumor immunity and reduced off-tumor immunity. In this review, we first discuss the cancer-immunity circle and how nanoparticles can modulate the key steps in this circle. Then, we highlight the recent advances in cancer immunotherapy with pH-responsive nanoparticles and discuss the perspective for this emerging area.

Diagnostic strategy in segmentation defect of the vertebrae: a retrospective study of 73 patients

Journal of Medical Genetics ◽

10.1136/jmedgenet-2017-104939 ◽

2018 ◽

Vol 55 (6) ◽

pp. 422.2-429 ◽

Cited By ~ 3

Author(s):

Mathilde Lefebvre ◽

Anne Dieux-Coeslier ◽

Geneviève Baujat ◽

Elise Schaefer ◽

Saint-Onge Judith ◽

...

Keyword(s):

Array Cgh ◽

Diagnostic Yield ◽

Gene Panel ◽

Targeted Sequencing ◽

Diagnostic Strategy ◽

Notch Signalling Pathway ◽

Whole Exome ◽

Spondylocostal Dysostosis ◽

Wide Range ◽

Molecular Bases

Abstract Background: Segmentation defects of the vertebrae (SDV) are non-specific features found in various syndromes. The molecular bases of SDV are not fully elucidated due to the wide range of phenotypes and classification issues. The genes involved are in the Notch signalling pathway, which is a key system in somitogenesis. Here we report on mutations identified in a diagnosis cohort of SDV. We focused on spondylocostal dysostosis (SCD) and the phenotype of these patients in order to establish a diagnostic strategy when confronted with SDV. Patients and methods: We used DNA samples from a cohort of 73 patients and performed targeted sequencing of the five known SCD-causing genes (DLL3, MESP2, LFNG, HES7 and TBX6) in the first 48 patients and whole-exome sequencing (WES) in 28 relevant patients. Results: Ten diagnoses, including four biallelic variants in TBX6, two biallelic variants in LFNG and DLL3, and one in MESP2 and HES7, were made with the gene panel, and two diagnoses, including biallelic variants in FLNB and one variant in MEOX1, were made by WES. The diagnostic yield of the gene panel was 10/73 (13.7%) in the global cohort but 8/10 (80%) in the subgroup meeting the SCD criteria; the diagnostic yield of WES was 2/28 (8%). Conclusion: After negative array CGH, targeted sequencing of the five known SCD genes should only be performed in patients who meet the diagnostic criteria of SCD. The low proportion of candidate genes identified by WES in our cohort suggests the need to consider more complex genetic architectures in cases of SDV.

Variations in terrestrial arthropod DNA metabarcoding methods recovers robust beta diversity but variable richness and site indicators

Scientific Reports ◽

10.1038/s41598-019-54532-0 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 2

Author(s):

Teresita M. Porter ◽

Dave M. Morris ◽

Nathan Basiliko ◽

Mehrdad Hajibabaei ◽

Daniel Doucet ◽

...

Keyword(s):

Dna Extraction ◽

Beta Diversity ◽

Ecological Integrity ◽

Sequence Variants ◽

Key Indicator ◽

Pooled Dna ◽

Dna Metabarcoding ◽

Arthropod Fauna ◽

Reference Databases ◽

Arthropod Biodiversity

Abstract Terrestrial arthropod fauna have been suggested as a key indicator of ecological integrity in forest systems. Because phenotypic identification is expert-limited, a shift towards DNA metabarcoding could improve scalability and democratize the use of forest floor arthropods for biomonitoring applications. The objective of this study was to establish the level of field sampling and DNA extraction replication needed for arthropod biodiversity assessments from soil. Processing 15 individually collected soil samples recovered significantly higher median richness (488–614 sequence variants) than pooling the same number of samples (165–191 sequence variants) prior to DNA extraction, and we found no significant richness differences when using 1 or 3 pooled DNA extractions. Beta diversity was robust to changes in methodological regimes. Though our ability to identify taxa to species rank was limited, we were able to use arthropod COI metabarcodes from forest soil to assess richness, distinguish among sites, and recover site indicators based on unnamed exact sequence variants. Our results highlight the need to continue DNA barcoding local taxa during COI metabarcoding studies to help build reference databases. Altogether, these sampling considerations support the use of soil arthropod COI metabarcoding as a scalable method for biomonitoring.

Variant Tool Chest: an improved tool to analyze and manipulate variant call format (VCF) files

BMC Bioinformatics ◽

10.1186/1471-2105-15-s7-s12 ◽

2014 ◽

Vol 15 (Suppl 7) ◽

pp. S12 ◽

Cited By ~ 7

Author(s):

Mark TW Ebbert ◽

Mark E Wadsworth ◽

Kevin L Boehme ◽

Kaitlyn L Hoyt ◽

Aaron R Sharp ◽

...

Keyword(s):

Variant Call Format ◽

Variant Call

Relative expression of cytochrome P450 isoenzymes in human liver and association with the metabolism of drugs and xenobiotics

Biochemical Journal ◽

10.1042/bj2810359 ◽

1992 ◽

Vol 281 (2) ◽

pp. 359-368 ◽

Cited By ~ 167

Author(s):

L M Forrester ◽

C J Henderson ◽

M J Glancey ◽

D J Back ◽

B K Park ◽

...

Keyword(s):

Cytochrome P450 ◽

Human Liver ◽

Gene Families ◽

Cytochrome P450s ◽

Western Blots ◽

Human Cytochrome ◽

Wide Range ◽

Powerful Approach ◽

Highly Correlated ◽

Gene Subfamily

Cytochrome P450s play a central role in the metabolism and disposition of an extremely wide range of drugs and chemical carcinogens. Individual differences in the expression of these enzymes may be an important determinant in susceptibility to adverse drug reactions, chemical toxins and mutagens. In this paper, we have measured the relative levels of expression of cytochrome P450 isoenzymes from eight gene families or subfamilies in a panel of twelve human liver samples in order to determine the individuality in their expression and whether any forms are co-regulated. Isoenzymes were identified in most cases on Western blots based on the mobility of authentic recombinant human cytochrome P450 standards. The levels of the following P450 proteins correlated with each other: CYP2A6, CYP2B6 and a protein from the CYP2C gene subfamily, CYP2E1 and a member of the CYP2A gene subfamily, CYP2C8, CYP3A3/A4 and total cytochrome P450 content. Also, the levels of two proteins in the CYP4A gene subfamily were highly correlated. These correlations are consistent with the relative regulation of members of these gene families in rats or mice. In addition, the level of expression of specific isoenzymes has also been compared with the rate of metabolism of a panel of drugs, carcinogens and model P450 substrates. These latter studies demonstrate and confirm that the correlations obtained in this manner represent a powerful approach towards the assignment of the metabolism of substrates by specific human P450 isoenzymes.
