Collection and Storage of HLA NGS Genotyping Data for the 17th International HLA and Immunogenetics Workshop

Mapping Intimacies ◽

10.1101/116871 ◽

2017 ◽

Author(s):

Chia-Jung Chang ◽

Kazutoyo Osoegawa ◽

Robert P. Milius ◽

Martin Maiers ◽

Wenzhong Xiao ◽

...

Keyword(s):

Next Generation Sequencing ◽

Killer Cell ◽

Human Leukocyte ◽

Markup Language ◽

Next Generation ◽

File Transfer ◽

Specific Priming ◽

Ngs Data ◽

And Storage ◽

Generation Sequencing

Abstract: For over 50 years, the International HLA and Immunogenetics Workshops (IHIW) have advanced the fields of histocompatibility and immunogenetics (H&I) via community sharing of technology, experience and reagents, and the establishment of ongoing collaborative projects. In the fall of 2017, the 17th IHIW will focus on the application of next generation sequencing (NGS) technologies for clinical and research goals in the H&I fields. NGS technologies have the potential to allow dramatic insights and advances in these fields, but the scope and sheer quantity of data associated with NGS raise challenges for their analysis, collection, exchange and storage. The 17th IHIW has adopted a centralized approach to these issues, and we have been developing the tools, services and systems to create an effective system for capturing and managing these NGS data. We have worked with NGS platform and software developers to define a set of distinct but equivalent NGS typing reports that record NGS data in a uniform fashion. The 17th IHIW database applies our standards, tools and services to collect, validate and store those structured, multi-platform data in an automated fashion. We are creating community resources to enable exploration of the vast store of curated sequence and allele-name data in the IPD-IMGT/HLA Database, with the goal of creating a long-term community resource that integrates these curated data with new NGS sequence and polymorphism data, for advanced analyses and applications.

Abbreviations: CSV, Comma-Separated Values; GFE, Gene Feature Enumeration; GL, Genotype List; HLA, Human Leukocyte Antigen; HML, Histoimmunogenetics Markup Language; H&I, Histocompatibility and Immunogenetics; IHIW, International HLA and Immunogenetics Workshop; IMGT, ImMunoGeneTics; IPD, ImmunoPolymorphism Database; IUPAC, International Union of Pure and Applied Chemistry; KIR, Killer-cell Immunoglobulin-like Receptor; MIRING, Minimum Information for Reporting Immunogenomic NGS Genotyping; NGS, Next Generation Sequencing; PI, Principal Investigator; RSCA, Reference Strand Conformation Analysis; rSSO, Reverse Sequence-Specific Oligo; SBT, Sequence-Based Typing; sFTP, secure File Transfer Protocol; SS, Sequence-Specific; SSO, Sequence-Specific Oligo; SSP, Sequence-Specific Priming; WMDA, World Marrow Donor Association; WS, Workshop; XML, eXtensible Markup Language


Histoimmunogenetics Markup Language 1.0: Reporting Next Generation Sequencing-based HLA and KIR Genotyping

10.1101/014951 ◽

2015 ◽

Author(s):

Robert P Milius ◽

Michael Heuer ◽

Daniel Valiga ◽

Kathryn J Doroschak ◽

Caleb J. Kennedy ◽

...

Keyword(s):

Next Generation Sequencing ◽

Data Exchange ◽

Consensus Sequence ◽

Markup Language ◽

Next Generation ◽

Multiple Group ◽

Specific Priming ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing

We present an electronic format for exchanging data for HLA and KIR genotyping with extensions for next-generation sequencing (NGS). This format addresses NGS data exchange by refining the Histoimmunogenetics Markup Language (HML) to conform to the proposed Minimum Information for Reporting Immunogenomic NGS Genotyping (MIRING) reporting guidelines (miring.immunogenomics.org). Our refinements of HML include two major additions. First, NGS is supported by new XML structures to capture additional NGS data and metadata required to produce a genotyping result, including analysis-dependent (dynamic) and method-dependent (static) components. A full genotype, consensus sequence, and the surrounding metadata are included directly, while the raw sequence reads and platform documentation are externally referenced. Second, genotype ambiguity is fully represented by integrating Genotype List Strings, which use a hierarchical set of delimiters to represent allele and genotype ambiguity in a complete and accurate fashion. HML also continues to enable the transmission of legacy methods (e.g. site-specific oligonucleotide, sequence-specific priming, and sequence based typing (SBT)), adding features such as allowing multiple group-specific sequencing primers, and fully leveraging techniques that combine multiple methods to obtain a single result, such as SBT integrated with NGS.
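
The hierarchical delimiters of a Genotype List (GL) String can be illustrated with a short parser. This is a minimal sketch in Python, not part of HML or of any published tool. It assumes the commonly documented GL String delimiter order (`^` between loci, `|` between ambiguous genotypes, `+` between gene copies, `~` between phased alleles, `/` between ambiguous alleles). The function names `parse_gl_string` and `alleles` and the example string are hypothetical.

```python
from typing import List

# GL String delimiters, outermost first (assumed precedence, see lead-in):
# '^' loci, '|' ambiguous genotypes, '+' gene copies,
# '~' alleles phased on one haplotype, '/' ambiguous alleles.
DELIMITERS = ["^", "|", "+", "~", "/"]


def parse_gl_string(gl: str, level: int = 0):
    """Recursively split a GL String into nested lists, one level per delimiter."""
    if level == len(DELIMITERS):
        return gl  # innermost level: a single allele name
    parts = gl.split(DELIMITERS[level])
    return [parse_gl_string(part, level + 1) for part in parts]


def alleles(gl: str) -> List[str]:
    """Return each distinct allele name in a GL String, in order of first appearance."""
    seen: List[str] = []
    token = ""
    for ch in gl + "^":  # trailing delimiter flushes the last token
        if ch in DELIMITERS:
            if token and token not in seen:
                seen.append(token)
            token = ""
        else:
            token += ch
    return seen


# Hypothetical example: two ambiguous genotypes at one locus.
example = "HLA-A*01:01/HLA-A*01:02+HLA-A*24:02|HLA-A*01:03+HLA-A*24:03"
tree = parse_gl_string(example)
print(len(tree[0]))      # number of ambiguous genotypes at the first locus
print(alleles(example))  # every allele named in the string
```

The nesting of the returned lists mirrors the delimiter hierarchy, so `tree[0][0][0][0]` holds the ambiguous alleles of the first gene copy of the first genotype at the first locus.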


WBFQC: A new approach for compressing next-generation sequencing data splitting into homogeneous streams

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972001850018x ◽

2018 ◽

Vol 16 (05) ◽

pp. 1850018 ◽

Cited By ~ 1

Author(s):

Sanjeev Kumar ◽

Suneeta Agarwal ◽

Ranvijay

Keyword(s):

Next Generation Sequencing ◽

Genomic Data ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Compression Technique ◽

Compression Algorithms ◽

Ngs Data ◽

And Storage ◽

Generation Sequencing

Genomic data now play a vital role in a number of fields, such as personalized medicine, forensics, drug discovery, sequence alignment and agriculture. With advances in next-generation sequencing (NGS) technology and reductions in its cost, these data are growing exponentially. NGS data are being generated more rapidly than they can be meaningfully analyzed. Thus, there is much scope for developing novel data compression algorithms that directly facilitate data analysis along with data transfer and storage. An innovative compression technique is proposed here to address the problem of transmitting and storing large NGS data sets. This paper presents a lossless, non-reference-based FastQ file compression approach that segregates the data into three different streams and then applies an appropriate and efficient compression algorithm to each. Experiments show that the proposed approach (WBFQC) outperforms other state-of-the-art approaches for compressing NGS data in terms of compression ratio (CR) and compression and decompression time. It also offers random access over compressed genomic data. An open source FastQ compression tool is also provided ( http://www.algorithm-skg.com/wbfqc/home.html ).
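
The idea of splitting a FastQ file into homogeneous streams can be sketched as follows. This is not the WBFQC algorithm: the abstract does not name the per-stream compressors, so general-purpose `zlib` stands in for all three, and the function names are hypothetical. The sketch also assumes four-line records whose separator line is a bare `+`, which is what makes the round trip lossless here.

```python
import zlib
from typing import Dict, List, Tuple


def split_fastq(text: str) -> Tuple[List[str], List[str], List[str]]:
    """Split FastQ text into identifier, sequence and quality streams."""
    lines = text.strip().split("\n")
    if len(lines) % 4 != 0:
        raise ValueError("FastQ input must consist of four-line records")
    # Lines 0, 1 and 3 of each record; line 2 is the '+' separator.
    return lines[0::4], lines[1::4], lines[3::4]


def compress_streams(text: str) -> Dict[str, bytes]:
    """Compress each stream separately (zlib is a stand-in compressor)."""
    ids, seqs, quals = split_fastq(text)
    streams = {"ids": ids, "seqs": seqs, "quals": quals}
    return {
        name: zlib.compress("\n".join(values).encode("ascii"), 9)
        for name, values in streams.items()
    }


def decompress_streams(blobs: Dict[str, bytes]) -> str:
    """Rebuild the FastQ text from the three compressed streams."""
    ids, seqs, quals = (
        zlib.decompress(blobs[name]).decode("ascii").split("\n")
        for name in ("ids", "seqs", "quals")
    )
    records: List[str] = []
    for ident, seq, qual in zip(ids, seqs, quals):
        records.extend([ident, seq, "+", qual])
    return "\n".join(records) + "\n"


# Tiny made-up input with two reads.
fastq = "@read1\nACGTACGT\n+\nIIIIHHHH\n@read2\nTTGGCCAA\n+\nFFFFEEEE\n"
blobs = compress_streams(fastq)
restored = decompress_streams(blobs)
print(restored == fastq)
```

Keeping the streams separate lets each compressor see data with a single statistical profile (free-text identifiers, a four-letter nucleotide alphabet, and quality characters), which is the motivation the abstract gives for the split.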


A New Human Leukocyte Antigen Typing Algorithm Combined With Currently Available Genotyping Tools Based on Next-Generation Sequencing Data and Guidelines to Select the Most Likely Human Leukocyte Antigen Genotype

Frontiers in Immunology ◽

10.3389/fimmu.2021.688183 ◽

2021 ◽

Vol 12 ◽

Author(s):

Miseon Lee ◽

Jeong-Han Seo ◽

Sungjae Song ◽

In Hye Song ◽

Su Yeon Kim ◽

...

Keyword(s):

Next Generation Sequencing ◽

Human Leukocyte Antigen ◽

Human Leukocyte ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Leukocyte Antigen ◽

Hla Type ◽

Hla Genotyping ◽

Ngs Data ◽

Generation Sequencing

Background: High-precision human leukocyte antigen (HLA) genotyping is crucial for anti-cancer immunotherapy, but existing tools predicting HLA genotypes using next-generation sequencing (NGS) data are insufficiently accurate. Materials and Methods: We compared availability, accuracy, correction score, and complementary ratio of eight HLA genotyping tools (OptiType, HLA-HD, PHLAT, seq2HLA, arcasHLA, HLAscan, HLA*LA, and Kourami) using 1,005 cases from the 1000 Genomes Project data. We created a new HLA-genotyping algorithm that combines tools based on the precision and accuracy of their combinations. Then, we assessed the new algorithm’s performance in 39 in-house samples with normal whole-exome sequencing (WES) data and polymerase chain reaction–sequencing-based typing (PCR-SBT) results. Results: Regardless of the type of tool, the calls presented concordantly by more than six tools showed high accuracy and precision. The accuracy of the group with at least six concordant calls was 100% (97/97) in HLA-A, 98.2% (112/114) in HLA-B, and 97.3% (142/146) in HLA-C. The precision of the group with at least six concordant calls was over 98% in HLA-ABC. We additionally calculated the accuracy of the combined tools, considering the complementary ratio and the accuracy of each tool, and the accuracy was over 98% in all groups with six or more concordant calls. We created a new algorithm that matches the above results. The algorithm selects the HLA type if more than six out of eight tools present a matching type; otherwise, the HLA type is determined experimentally through PCR-SBT. When we applied the new algorithm to 39 in-house cases, there were more than six matching calls in all of HLA-A, B, and C, and the accuracy of these concordant calls was 100%. Conclusions: HLA genotyping accuracy using NGS data could be increased by combining the current HLA genotyping tools. This new algorithm could also be useful for preliminary screening to decide whether to perform an additional PCR-based experimental method instead of using tools with NGS data.
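
The selection rule described in the Results can be sketched as a simple vote across tools. This is an illustrative sketch, not the authors' implementation: the function name `consensus_call` is hypothetical, each tool's call is simplified to a single string, and the threshold is left as a parameter because the abstract says both "more than six" and "at least six". The default of six is an assumption.

```python
from collections import Counter
from typing import Dict, Optional


def consensus_call(calls: Dict[str, Optional[str]],
                   min_concordant: int = 6) -> Optional[str]:
    """Return the most common call if enough tools agree, else None.

    A return value of None means "no consensus"; under the rule in the
    abstract, such a sample would be typed experimentally by PCR-SBT.
    """
    # Ignore tools that produced no call for this sample.
    votes = Counter(call for call in calls.values() if call is not None)
    if not votes:
        return None
    allele, count = votes.most_common(1)[0]
    return allele if count >= min_concordant else None


# Made-up calls from the eight tools named in the abstract.
calls = {
    "OptiType": "HLA-A*02:01", "HLA-HD": "HLA-A*02:01",
    "PHLAT": "HLA-A*02:01", "seq2HLA": "HLA-A*02:01",
    "arcasHLA": "HLA-A*02:01", "HLAscan": "HLA-A*02:01",
    "HLA*LA": "HLA-A*02:06", "Kourami": None,
}
print(consensus_call(calls))                    # six tools agree
print(consensus_call(calls, min_concordant=7))  # stricter threshold: no consensus
```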


NGSremix: A software tool for estimating pairwise relatedness between admixed individuals from next-generation sequencing data

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab174 ◽

2021 ◽

Author(s):

Anne Krogh Nøhr ◽

Kristian Hanghøj ◽

Genis Garcia Erill ◽

Zilong Li ◽

Ida Moltke ◽

...

Keyword(s):

Next Generation Sequencing ◽

Genetic Research ◽

Likelihood Estimation ◽

Software Tool ◽

Estimation Methods ◽

Next Generation Sequencing Data ◽

Next Generation ◽

Sequencing Data ◽

Ngs Data ◽

Generation Sequencing

Abstract: Estimation of relatedness between pairs of individuals is important in many genetic research areas. When estimating relatedness, it is important to account for admixture if this is present. However, the methods that can account for admixture are all based on genotype data as input, which is a problem for low-depth next-generation sequencing (NGS) data from which genotypes are called with high uncertainty. Here we present a software tool, NGSremix, for maximum likelihood estimation of relatedness between pairs of admixed individuals from low-depth NGS data, which takes the uncertainty of the genotypes into account via genotype likelihoods. Using both simulated and real NGS data for admixed individuals with an average depth of 4x or below, we show that our method works well and clearly outperforms the commonly used state-of-the-art relatedness estimation methods (PLINK, KING, relateAdmix, and ngsRelate), all of which perform quite poorly. Hence, NGSremix is a useful new tool for estimating relatedness in admixed populations from low-depth NGS data. NGSremix is implemented in C/C++ as multi-threaded software and is freely available on GitHub at https://github.com/KHanghoj/NGSremix.
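
To show why low-depth data leave genotypes uncertain, the following sketch computes genotype likelihoods at one biallelic site under a textbook error model. It is not NGSremix code and does not reproduce how that tool obtains its likelihoods. The function name, the uniform per-base error rate, and the example reads are all assumptions made for illustration.

```python
import math
from typing import List, Tuple


def genotype_likelihoods(bases: List[str], ref: str, alt: str,
                         error: float = 0.01) -> Tuple[float, float, float]:
    """Return P(reads | genotype) for 0, 1 and 2 copies of the alt allele.

    Assumed model: each read samples one of the two chromosomes at random,
    and a sequencing error turns the true base into any of the other three
    bases with equal probability.
    """
    def p_base(observed: str, allele: str) -> float:
        return 1.0 - error if observed == allele else error / 3.0

    likelihoods = []
    for alt_copies in (0, 1, 2):
        frac_alt = alt_copies / 2.0
        log_l = 0.0
        for base in bases:
            log_l += math.log(frac_alt * p_base(base, alt)
                              + (1.0 - frac_alt) * p_base(base, ref))
        likelihoods.append(math.exp(log_l))
    return likelihoods[0], likelihoods[1], likelihoods[2]


# Depth 1: a single reference read cannot rule out a heterozygote.
one_read = genotype_likelihoods(["A"], ref="A", alt="G")
# Depth 3 with one alternate read: the heterozygote is the best supported.
three_reads = genotype_likelihoods(["A", "A", "G"], ref="A", alt="G")
print(one_read)
print(three_reads)
```

With a single read, the heterozygous genotype keeps roughly half the likelihood of the homozygous reference genotype, which is why a hard genotype call at this depth discards information that a likelihood-based method can retain.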


Mining and Development of Novel SSR Markers Using Next Generation Sequencing (NGS) Data in Plants

Molecules ◽

10.3390/molecules23020399 ◽

2018 ◽

Vol 23 (2) ◽

pp. 399 ◽

Cited By ~ 41

Author(s):

Sima Taheri ◽

Thohirah Lee Abdullah ◽

Mohd Yusop ◽

Mohamed Hanafi ◽

Mahbod Sahebi ◽

...

Keyword(s):

Next Generation Sequencing ◽

Ssr Markers ◽

Next Generation ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing


Appendix A: Common File Types Used in Next-Generation Sequencing (NGS) Data Analysis

Next-Generation Sequencing Data Analysis ◽

10.1201/b19532-20 ◽

2016 ◽

pp. 199-202

Keyword(s):

Data Analysis ◽

Next Generation Sequencing ◽

Next Generation ◽

Ngs Data Analysis ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing


Identification of Genetic Hereditary Predisposition to Hematologic Malignancies By Clinical Next-Generation Sequencing

Blood ◽

10.1182/blood.v126.23.3854.3854 ◽

2015 ◽

Vol 126 (23) ◽

pp. 3854-3854 ◽

Cited By ~ 2

Author(s):

Amy E Knight Johnson ◽

Lucia Guidugli ◽

Kelly Arndt ◽

Gorka Alkorta-Aranburu ◽

Viswateja Nelakuditi ◽

...

Keyword(s):

Next Generation Sequencing ◽

Sanger Sequencing ◽

Family Members ◽

Hematologic Malignancies ◽

Dyskeratosis Congenita ◽

Molecular Diagnostic ◽

Next Generation ◽

Hereditary Predisposition ◽

Ngs Data ◽

Generation Sequencing

Abstract Introduction: Myelodysplastic syndrome (MDS) and acute leukemia (AL) are a clinically diverse and genetically heterogeneous group of hematologic malignancies. Familial forms of MDS/AL have been increasingly recognized in recent years, and can occur as a primary event or secondary to genetic syndromes, such as inherited bone marrow failure syndromes (IBMFS). It is critical to confirm a genetic diagnosis in patients with hereditary predisposition to hematologic malignancies in order to provide prognostic information and cancer risk assessment, and to aid in identification of at-risk or affected family members. In addition, a molecular diagnosis can help tailor medical management, including informing the selection of family members for allogeneic stem cell transplantation donors. Until recently, clinical testing options for this diverse group of hematologic malignancy predisposition genes were limited to the evaluation of single genes by Sanger sequencing, which is a time-consuming and expensive process. To improve the diagnosis of hereditary predisposition to hematologic malignancies, our CLIA-licensed laboratory has recently developed Next-Generation Sequencing (NGS) panel-based testing for these genes. Methods: Thirty-six patients with personal and/or family history of aplastic anemia, MDS or AL were referred for clinical diagnostic testing. DNA from the referred patients was obtained from cultured skin fibroblasts or peripheral blood and was utilized for preparing libraries with the SureSelectXT Enrichment System. Libraries were sequenced on an Illumina MiSeq instrument and the NGS data were analyzed with a custom bioinformatic pipeline, targeting a panel of 76 genes associated with IBMFS and/or familial MDS/AL. Results: Pathogenic and highly likely pathogenic variants were identified in 7 out of 36 patients analyzed, providing a positive molecular diagnostic rate of 20%. Overall, 6 out of the 7 pathogenic changes identified were novel. 
In 2 unrelated patients with MDS, heterozygous pathogenic sequence changes were identified in the GATA2 gene. Heterozygous pathogenic changes in the following autosomal dominant genes were each identified in a single patient: RPS26 (Diamond-Blackfan anemia 10), RUNX1 (familial platelet disorder with propensity to myeloid malignancy), TERT (dyskeratosis congenita 4) and TINF2 (dyskeratosis congenita 3). In addition, one novel heterozygous sequence change (c.826+5_826+9del, p.?) in the Fanconi anemia associated gene FANCA was identified. RNA analysis demonstrated that this variant causes skipping of exon 9 and results in a premature stop codon in exon 10. Further review of the NGS data provided evidence of an additional large heterozygous multi-exon deletion in FANCA in the same patient. This large deletion was confirmed using array-CGH (comparative genomic hybridization). Conclusions: This study demonstrates the effectiveness of using NGS technology to identify patients with a hereditary predisposition to hematologic malignancies. As many of the genes associated with hereditary predisposition to hematologic malignancies have similar or overlapping clinical presentations, analysis of a diverse panel of genes is an efficient and cost-effective approach to molecular diagnostics for these disorders. Unlike Sanger sequencing, NGS technology also has the potential to identify large exonic deletions and duplications. In addition, the RNA splicing assay has proven to be helpful in clarifying the pathogenicity of variants suspected to affect splicing. This approach will also allow for identification of a molecular defect in patients who may have an atypical presentation of disease. Disclosures: No relevant conflicts of interest to declare.


PATH-03. CLINICAL UTILITY OF NEXT GENERATION SEQUENCING IN IDH-WILDTYPE GLIOBLASTOMA: THE DANA-FARBER CANCER INSTITUTE EXPERIENCE

Neuro-Oncology ◽

10.1093/neuonc/noaa215.685 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii164-ii164

Author(s):

Mary Jane Lim-Fat ◽

Gilbert Youssef ◽

Mehdi Touat ◽

Bryan Iorgulescu ◽

Eleanor Woodward ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Next Generation Sequencing ◽

Immune Checkpoint Blockade ◽

Next Generation ◽

Single Nucleotide Variants ◽

Dana Farber Cancer Institute ◽

Clinical Records ◽

Ngs Data ◽

Generation Sequencing

Abstract BACKGROUND Comprehensive next generation sequencing (NGS) is available through many academic institutions and commercial entities, and is incorporated in practice guidelines for glioblastoma (GBM). We retrospectively evaluated the practice patterns and utility of incorporating NGS data into routine care of GBM patients at a clinical trials-focused academic center. METHODS We identified 1,011 consecutive adult patients with histologically confirmed GBM with OncoPanel testing, a targeted exome NGS platform of 447 cancer-associated genes at Dana Farber Cancer Institute (DFCI), from 2013-2019. We selected and retrospectively reviewed clinical records of all IDH-wildtype GBM patients treated at DFCI. RESULTS We identified 557 GBM IDH-wildtype patients, of whom 227 were male (40.7%). OncoPanel testing revealed 833 single nucleotide variants and indels in 44 therapeutically relevant genes (Tier 1 or 2 mutations) including PIK3CA (n=51), BRAF (n=9), FGFR1 (n=8), MSH2 (n=4), MSH6 (n=2) and MLH1 (n=1). Copy number analysis revealed 509 alterations in 18 therapeutically relevant genes including EGFR amplification (n=186), PDGFRA amplification (n=39) and CDKN2A/2B homozygous loss (n=223). Median overall survival was 17.5 months for the whole cohort. Seventy-four therapeutic clinical trials accrued 144 patients in the upfront setting (25.9%) and 203 patients (36.4%) at recurrence. Altogether, NGS data for 107 patients (19.2%) were utilized for clinical trial enrollment or targeted therapy indications. High mutational burden (>17 mutations/Mb) was identified in 11/464 samples (2.4%); 3 of these 11 patients received immune checkpoint blockade. Four patients received compassionate use therapy targeting EGFRvIII (rindopepimut, n=2), CDK4/6 (abemaciclib, n=1) and BRAFV600E (dabrafenib/trametinib, n=1). 
CONCLUSION While NGS has greatly improved diagnosis and molecular classification, we highlight that NGS remains underutilized in selecting therapy in GBM, even in a setting where clinical trials and off-label therapies are relatively accessible. Continued efforts to develop better targeted therapies and efficient clinical trial design are required to maximize the potential benefits of genomically-stratified data.


EVALUATION OF A NOVEL ULTRA-HIGH-RESOLUTION ASSAY AND ANALYTIC SOFTWARE FOR NEXT GENERATION SEQUENCING BASED GENOTYPING OF 11 HUMAN LEUKOCYTE ANTIGEN LOCI

Transplantation Journal ◽

10.1097/01.tp.0000700080.78085.be ◽

2020 ◽

Vol 104 (S3) ◽

pp. S310-S310

Author(s):

Ji Yeon Kim ◽

Hae-In Song ◽

Mi Mi Jang ◽

Kyoungyong Jung ◽

Soo Ho Park ◽

...

Keyword(s):

Next Generation Sequencing ◽

High Resolution ◽

Human Leukocyte Antigen ◽

Human Leukocyte ◽

Next Generation ◽

Leukocyte Antigen ◽

Generation Sequencing


ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines

F1000Research ◽

10.12688/f1000research.6157.2 ◽

2015 ◽

Vol 4 ◽

pp. 50 ◽

Cited By ~ 7

Author(s):

Michael T. Wolfinger ◽

Jörg Fallmann ◽

Florian Eggenhofer ◽

Fabian Amman

Keyword(s):

Next Generation Sequencing ◽

Sequence Motif ◽

Software Components ◽

Sequencing Analysis ◽

Next Generation ◽

File Formats ◽

Next Generation Sequencing Analysis ◽

Next Generation Sequencing Ngs ◽

Ngs Data ◽

Generation Sequencing

Recent achievements in next-generation sequencing (NGS) technologies have led to a high demand for reusable software components to easily compile customized analysis workflows for big genomics data. We present ViennaNGS, an integrated collection of Perl modules focused on building efficient pipelines for NGS data processing. It comes with functionality for extracting and converting features from common NGS file formats, computation and evaluation of read mapping statistics, as well as normalization of RNA abundance. Moreover, ViennaNGS provides software components for identification and characterization of splice junctions from RNA-seq data, parsing and condensing sequence motif data, automated construction of Assembly and Track Hubs for the UCSC genome browser, as well as wrapper routines for a set of commonly used NGS command line tools.
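
As an illustration of what normalization of RNA abundance involves, the sketch below computes transcripts per million (TPM) from read counts and feature lengths. The abstract does not say which normalization ViennaNGS applies, so TPM is an assumption chosen as a common measure. The code is Python, not the toolbox's Perl API, and the function name and input values are hypothetical.

```python
from typing import Dict


def tpm(counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """Compute transcripts per million from raw counts and feature lengths."""
    # Reads per base: corrects for longer features collecting more reads.
    rates = {gene: counts[gene] / lengths[gene] for gene in counts}
    total = sum(rates.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    # Scale so that the values across all features sum to one million.
    return {gene: rate / total * 1e6 for gene, rate in rates.items()}


# Made-up counts and lengths (in bases) for three features.
counts = {"geneA": 100, "geneB": 200, "geneC": 0}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 500}
values = tpm(counts, lengths)
print(values)
```

In this example geneB has twice the reads of geneA but is four times as long, so its TPM value comes out lower, which is the length correction the normalization is meant to provide.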
