The cancer precision medicine knowledge base for structured clinical-grade mutations and interpretations

Linda Huang; Helen Fernandes; Hamid Zia; Peyman Tavassoli; Hanna Rennert; David Pisapia; Marcin Imielinski; Andrea Sboner; Mark A Rubin; Michael Kluk; Olivier Elemento

doi:10.1093/jamia/ocw148

The cancer precision medicine knowledge base for structured clinical-grade mutations and interpretations

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocw148 ◽

2016 ◽

Vol 24 (3) ◽

pp. 513-519 ◽

Cited By ~ 18

Author(s):

Linda Huang ◽

Helen Fernandes ◽

Hamid Zia ◽

Peyman Tavassoli ◽

Hanna Rennert ◽

...

Keyword(s):

Knowledge Base ◽

Precision Medicine ◽

Web Application ◽

Cancer Genomics ◽

Application Programming Interface ◽

Application Framework ◽

Clinical Grade ◽

Online Application ◽

Collaborative Editing ◽

Clinical Cancer

Objective: This paper describes the Precision Medicine Knowledge Base (PMKB; https://pmkb.weill.cornell.edu), an interactive online application for collaborative editing, maintenance, and sharing of structured clinical-grade cancer mutation interpretations. Materials and Methods: PMKB was built using the Ruby on Rails Web application framework. Leveraging existing standards such as the Human Genome Variation Society variant description format, we implemented a data model that links variants to tumor-specific and tissue-specific interpretations. Key features of PMKB include support for all major variant types, standardized authentication, distinct user roles including high-level approvers, and detailed activity history. A REpresentational State Transfer (REST) application-programming interface (API) was implemented to query the PMKB programmatically. Results: At the time of writing, PMKB contains 457 variant descriptions with 281 clinical-grade interpretations. The EGFR, BRAF, KRAS, and KIT genes are associated with the largest numbers of interpretable variants. PMKB’s interpretations have been used in over 1500 AmpliSeq tests and 750 whole-exome sequencing tests. The interpretations are accessed either directly via the Web interface or programmatically via the existing API. Discussion: An accurate and up-to-date knowledge base of genomic alterations of clinical significance is critical to the success of precision medicine programs. The open-access, programmatically accessible PMKB represents an important attempt at creating such a resource in the field of oncology. Conclusion: The PMKB was designed to help collect and maintain clinical-grade mutation interpretations and facilitate reporting for clinical cancer genomic testing. The PMKB was also designed to enable the creation of clinical cancer genomics automated reporting pipelines via an API.

Download Full-text

The Precision Medicine Knowledge Base: an online application for collaborative editing, maintenance and sharing of structured clinical-grade cancer mutations interpretations

10.1101/059824 ◽

2016 ◽

Cited By ~ 4

Author(s):

Linda Huang ◽

Helen Fernandes ◽

Hamid Zia ◽

Peyman Tavassoli ◽

Hanna Rennert ◽

...

Keyword(s):

Knowledge Base ◽

Precision Medicine ◽

Web Application ◽

Cancer Genomics ◽

Application Framework ◽

Clinical Grade ◽

Online Application ◽

Collaborative Editing ◽

Clinical Cancer ◽

Cancer Mutations

ABSTRACTObjectiveThis paper describes the Precision Medicine Knowledge Base (PMKB; https://pmkb.weill.cornell.edu), an interactive online application for collaborative editing, maintenance and sharing of structured clinical-grade cancer mutations interpretations.Materials and MethodsPMKB was built using the Ruby on Rails Web application framework. Leveraging existing standards such as Human Genome Variation Society (HGVS) variant description format, we implemented a data model that links variants to tumor-specific and tissue-specific interpretations. Key features of PMKB include support for all major variant types, standardized authentication, distinct user roles including high-level approvers, detailed activity history. A REpresentational State Transfer (REST) application-programming interface (API) was implemented to query the PMKB programmatically.ResultsAt the time of writing, PMKB contains 457 variant descriptions with 281 clinical-grade interpretations. The EGFR, BRAF, KRAS, and KIT genes are associated with the largest numbers of interpretable variants. The PMKB’s interpretations have been used in over 1,500 AmpliSeq tests and 750 whole exome sequencing tests. The interpretations are accessed either directly via the Web interface or programmatically via the existing API.DiscussionAn accurate and up-to-date knowledge base of genomic alterations of clinical significance is critical to the success of precision medicine programs. The open-access, programmatically accessible PMKB represents an important attempt at creating such a resource in the field of oncology.ConclusionThe PMKB was designed to help collect and maintain clinical-grade mutation interpretations and facilitates reporting for clinical cancer genomic testing. The PMKB was also designed to enable the creation of clinical cancer genomics automated reporting pipelines via an API.

Download Full-text

Multiomic Integration of Public Oncology Databases in Bioconductor

JCO Clinical Cancer Informatics ◽

10.1200/cci.19.00119 ◽

2020 ◽

pp. 958-971

Author(s):

Marcel Ramos ◽

Ludwig Geistlinger ◽

Sehyun Oh ◽

Lucas Schiffer ◽

Rimsha Azhar ◽

...

Keyword(s):

Web Application ◽

Cancer Genomics ◽

Application Programming Interface ◽

Data Representation ◽

The Cancer Genome Atlas ◽

Data Sets ◽

Data Types ◽

Data Infrastructure ◽

Integrative Framework ◽

Pan Cancer

PURPOSE Investigations of the molecular basis for the development, progression, and treatment of cancer increasingly use complementary genomic assays to gather multiomic data, but management and analysis of such data remain complex. The cBioPortal for cancer genomics currently provides multiomic data from > 260 public studies, including The Cancer Genome Atlas (TCGA) data sets, but integration of different data types remains challenging and error prone for computational methods and tools using these resources. Recent advances in data infrastructure within the Bioconductor project enable a novel and powerful approach to creating fully integrated representations of these multiomic, pan-cancer databases. METHODS We provide a set of R/Bioconductor packages for working with TCGA legacy data and cBioPortal data, with special considerations for loading time; efficient representations in and out of memory; analysis platform; and an integrative framework, such as MultiAssayExperiment. Large methylation data sets are provided through out-of-memory data representation to provide responsive loading times and analysis capabilities on machines with limited memory. RESULTS We developed the curatedTCGAData and cBioPortalData R/Bioconductor packages to provide integrated multiomic data sets from the TCGA legacy database and the cBioPortal web application programming interface using the MultiAssayExperiment data structure. This suite of tools provides coordination of diverse experimental assays with clinicopathological data with minimal data management burden, as demonstrated through several greatly simplified multiomic and pan-cancer analyses. CONCLUSION These integrated representations enable analysts and tool developers to apply general statistical and plotting methods to extensive multiomic data through user-friendly commands and documented examples.

Download Full-text

DGIdb 3.0: a redesign and expansion of the drug-gene interaction database

10.1101/200527 ◽

2017 ◽

Cited By ~ 3

Author(s):

Kelsy C. Cotto ◽

Alex H. Wagner ◽

Yang-Yang Feng ◽

Susanna Kiwala ◽

Adam C. Coffman ◽

...

Keyword(s):

User Interface ◽

Web Application ◽

Response Times ◽

Gene Interaction ◽

Application Programming Interface ◽

Gene Interactions ◽

Application Framework ◽

Interaction Database ◽

Search Filters ◽

Ease Of Access

ABSTRACTThe Drug-Gene Interaction Database (DGIdb, www.dgidb.org) consolidates, organizes, and presents drug-gene interactions and gene druggability information from papers, databases, and web resources. DGIdb normalizes content from more than thirty disparate sources and allows for user-friendly advanced browsing, searching and filtering for ease of access through an intuitive web user interface, application programming interface (API), and public cloud-based server image. DGIdb v3.0 represents a major update of the database. Nine of the previously included twenty-eight sources were updated. Six new resources were added, bringing the total number of sources to thirty-three. These updates and additions of sources have cumulatively resulted in 56,309 interaction claims. This has also substantially expanded the comprehensive catalogue of druggable genes and antineoplastic drug-gene interactions included in the DGIdb. Along with these content updates, v3.0 has received a major overhaul of its codebase, including an updated user interface, preset interaction search filters, consolidation of interaction information into interaction groups, greatly improved search response times, and upgrading the underlying web application framework. In addition, the expanded API features new endpoints which allow users to extract more detailed information about queried drugs, genes, and drug-gene interactions, including listings of PubMed IDs (PMIDs), interaction type, and other interaction metadata.

Download Full-text

Multiplatform Application Technology – Based Heutagogy on Learning Batik: A Curriculum Development Framework

Indonesian Journal of Science and Technology ◽

10.17509/ijost.v5i1.18754 ◽

2020 ◽

Vol 5 (1) ◽

pp. 45-61 ◽

Cited By ~ 3

Author(s):

Isma Widiaty ◽

Lala Septem Riza ◽

Ade Gafar Abdullah ◽

Sugeng Rifqi Mubaroq

Keyword(s):

High School Students ◽

Web Application ◽

Application Programming Interface ◽

Application Framework ◽

Vocational High School ◽

School Students ◽

Web Based ◽

Development Framework ◽

Application Programming ◽

Main Components

This study aimed to design a batik learning medium for vocational high school students in based on multiplatform. The application made was expected to support heutagogy approach – based learning and to deal with the development of science and technology integrated in the curriculum of vocational high schools. The application developed, namely e-botik, was an integration of several previously-designed applications using Code ignitor (CI) framework. The database used was My-SQL. It is commonly known that Code igniter is an open source web application framework utilized to create dynamic PHP applications. In this study, e-botik consisted of three main components including interface, database, and application programming interface (API). Some of the applications combined were ARtikon_joyful (Android-based), Video Kasumedangan Batik (movie player), Nalungtik Batik (desktop-based), Digi_Learnik (web-based), Batik UPI (manual), Batik Cireundeu (manual), and Lembar Balik (manual). The combination proceeded web-based so that it was compatible with various operating systems. The application (e-botik) was designed and then tested. The test was performed through whitebox testing and blackbox testing. The results of the test showed that it ran well and was able to be used a batik learning media. It is expected that students can utilize e-botik in selecting topics of learning batik in accordance with their competences and needs. This condition enables e-botik to support learning batik through heutagogical approach. In addition, the application was also validated in terms of both system and usage aspects.

Download Full-text

Analysis of the Optimal Application of Blockchain-Based Smart Lockers in the Logistics Industry Based on FFD-SAGA and Grey Decision-Making

Symmetry ◽

10.3390/sym13020329 ◽

2021 ◽

Vol 13 (2) ◽

pp. 329

Author(s):

Shen-Tsu Wang ◽

Meng-Hua Li ◽

Chun-Chi Lien

Keyword(s):

Web Application ◽

Success Factors ◽

Service Providers ◽

Cost Effective ◽

Application Programming Interface ◽

Fractional Factorial ◽

Office Workers ◽

Logistics Industry ◽

Blockchain Technology ◽

Multiple Attribute

Blockchain technology has been applied to logistics tracking, but it is not cost-effective. The development of smart lockers has solved the problem of repeated distribution to improve logistics efficiency, thereby becoming a solution with convenience and privacy compared to the in-store purchase and pickup alternative. This study prioritized the key factors of smart lockers using a simulated annealing–genetic algorithm by fractional factorial design (FFD-SAGA) and grey relational analysis, and investigated the main users of smart lockers by grey multiple attribute decision analysis. The results show that the Web application programming interface (API) concatenation and money flow provider are the key success factors of smart lockers, and office workers are the main users of the lockers. Hence, how to better meet the needs of office workers will be an issue of concern for service providers.

Download Full-text

A case of me: clinical cancer sequencing and the future of precision medicine

Molecular Case Studies ◽

10.1101/mcs.a000349 ◽

2015 ◽

Vol 1 (1) ◽

pp. a000349 ◽

Cited By ~ 6

Author(s):

Lukas D. Wartman

Keyword(s):

Precision Medicine ◽

Clinical Cancer ◽

The Future

Download Full-text

Genetic Gastric Cancer Susceptibility in the International Clinical Cancer Genomics Community Research Network

Cancer Genetics ◽

10.1016/j.cancergen.2017.08.001 ◽

2017 ◽

Vol 216-217 ◽

pp. 111-119 ◽

Cited By ~ 15

Author(s):

Thomas Slavin ◽

Susan L. Neuhausen ◽

Christina Rybak ◽

Ilana Solomon ◽

Bita Nehoray ◽

...

Keyword(s):

Gastric Cancer ◽

Cancer Genomics ◽

Cancer Susceptibility ◽

Research Network ◽

Community Research ◽

Clinical Cancer

Download Full-text

Use of Linked Data principles for semantic management of scanned documents

Transinformação ◽

10.1590/2318-08892016000200010 ◽

2016 ◽

Vol 28 (2) ◽

pp. 241-251 ◽

Cited By ~ 1

Author(s):

Luciane Lena Pessanha Monteiro ◽

Mark Douglas de Azevedo Jacyntho

Keyword(s):

Decision Making ◽

Knowledge Base ◽

Web Application ◽

Linked Data ◽

World Wide ◽

Decision Making Process ◽

Whole Process ◽

The World ◽

Scanned Documents ◽

The Web

The study addresses the use of the Semantic Web and Linked Data principles proposed by the World Wide Web Consortium for the development of Web application for semantic management of scanned documents. The main goal is to record scanned documents describing them in a way the machine is able to understand and process them, filtering content and assisting us in searching for such documents when a decision-making process is in course. To this end, machine-understandable metadata, created through the use of reference Linked Data ontologies, are associated to documents, creating a knowledge base. To further enrich the process, (semi)automatic mashup of these metadata with data from the new Web of Linked Data is carried out, considerably increasing the scope of the knowledge base and enabling to extract new data related to the content of stored documents from the Web and combine them, without the user making any effort or perceiving the complexity of the whole process.

Download Full-text

Safety Measures and Auto Detection against SQL Injection Attacks

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijeat.b3316.129219 ◽

2019 ◽

Vol 8 (4) ◽

pp. 2827-2833

Keyword(s):

Web Application ◽

Application Programming Interface ◽

Database Management System ◽

Transport Layer ◽

Prototype System ◽

Large Network ◽

Security Measures ◽

Sql Injection ◽

Sql Injection Attack ◽

The Web

The SQL injection attack (SQLIA) occurred when the attacker integrating a code of a malicious SQL query into a valid query statement via a non-valid input. As a result the relational database management system will trigger these malicious query that cause to SQL injection attack. After successful execution, it may interrupts the CIA (confidentiality, integrity and availability) of web API. The vulnerability of Web Application Programming Interface (API) is the prior concern for any programming. The Web API is mainly based of Simple Object Access Protocol (SOAP) protocol which provide its own security and Representational State Transfer (REST) is provide the architectural style to security measures form transport layer. Most of the time developers or newly programmers does not follow the standards of safe programming and forget to validate their input fields in the form. This vulnerability in the web API opens the door for the threats and it’s become a cake walk for the attacker to exploit the database associated with the web API. The objective of paper is to automate the detection of SQL injection attack and secure the poorly coded web API access through large network traffic. The Snort and Moloch approaches are used to develop the hybrid model for auto detection as well as analyze the SQL injection attack for the prototype system

Download Full-text

SPECIES: Supporting big-data-driven research

Biodiversity Information Science and Standards ◽

10.3897/biss.3.36095 ◽

2019 ◽

Vol 3 ◽

Cited By ~ 1

Author(s):

Raul Sierra-Alcocer ◽

Christopher Stephens ◽

Juan Barrios ◽

Constantino González‐Salazar ◽

Juan Carlos Salazar Carrillo ◽

...

Keyword(s):

Web Application ◽

Application Programming Interface ◽

Data Driven ◽

Spatial Correlations ◽

Species Occurrence ◽

Global Biodiversity Information Facility ◽

Application Programming ◽

Abiotic Variables ◽

Programming Interface ◽

Biodiversity Information

SPECIES (Stephens et al. 2019) is a tool to explore spatial correlations in biodiversity occurrence databases. The main idea behind the SPECIES project is that the geographical correlations between the distributions of taxa records have useful information. The problem, however, is that if we have thousands of species (Mexico's National System of Biodiversity Information has records of around 70,000 species) then we have millions of potential associations, and exploring them is far from easy. Our goal with SPECIES is to facilitate the discovery and application of meaningful relations hiding in our data. The main variables in SPECIES are the geographical distributions of species occurrence records. Other types of variables, like the climatic variables from WorldClim (Hijmans et al. 2005), are explanatory data that serve for modeling. The system offers two modes of analysis. In one, the user defines a target species, and a selection of species and abiotic variables; then the system computes the spatial correlations between the target species and each of the other species and abiotic variables. The request from the user can be as small as comparing one species to another, or as large as comparing one species to all the species in the database. A user may wonder, for example, which species are usual neighbors of the jaguar, this mode could help answer this question. The second mode of analysis gives a network perspective, in it, the user defines two groups of taxa (and/or environmental variables), the output in this case is a correlation network where the weight of a link between two nodes represents the spatial correlation between the variables that the nodes represent. For example, one group of taxa could be hummingbirds (Trochilidae family) and the second flowers of the Lamiaceae family. This output would help the user analyze which pairs of hummingbird and flower are highly correlated in the database. SPECIES data architecture is optimized to support fast hypotheses prototyping and testing with the analysis of thousands of biotic and abiotic variables. It has a visualization web interface that presents descriptive results to the user at different levels of detail. The methodology in SPECIES is relatively simple, it partitions the geographical space with a regular grid and treats a species occurrence distribution as a present/not present boolean variable over the cells. Given two species (or one species and one abiotic variable) it measures if the number of co-occurrences between the two is more (or less) than expected. If it is more than expected indicates a signal of a positive relation, whereas if it is less it would be evidence of disjoint distributions. SPECIES provides an open web application programming interface (API) to request the computation of correlations and statistical dependencies between variables in the database. Users can create applications that consume this 'statistical web service' or use it directly to further analyze the results in frameworks like R or Python. The project includes an interactive web application that does exactly that: requests analysis from the web service and lets the user experiment and visually explore the results. We believe this approach can be used on one side to augment the services provided from data repositories; and on the other side, facilitate the creation of specialized applications that are clients of these services. This scheme supports big-data-driven research for a wide range of backgrounds because end users do not need to have the technical know-how nor the infrastructure to handle large databases. Currently, SPECIES hosts: all records from Mexico's National Biodiversity Information System (CONABIO 2018) and a subset of Global Biodiversity Information Facility data that covers the contiguous USA (GBIF.org 2018b) and Colombia (GBIF.org 2018a). It also includes discretizations of environmental variables from WorldClim, from the Environmental Rasters for Ecological Modeling project (Title and Bemmels 2018), from CliMond (Kriticos et al. 2012), and topographic variables (USGS EROS Center 1997b, USGS EROS Center 1997a). The long term plan, however, is to incrementally include more data, specially all data from the Global Biodiversity Information Facility. The code of the project is open source, and the repositories are available online (Front-end, Web Services Application Programming Interface, Database Building scripts). This presentation is a demonstration of SPECIES' functionality and its overall design.

Download Full-text