Obstacles to the reuse of study metadata in ClinicalTrials.gov

Scientific Data ◽

10.1038/s41597-020-00780-z ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Laura Miron ◽

Rafael S. Gonçalves ◽

Mark A. Musen

Keyword(s):

Clinical Studies ◽

Free Text ◽

Biomedical Data ◽

Biomedical Ontologies ◽

Experimental Protocol ◽

Data Types ◽

Eligibility Criteria ◽

Government Regulations ◽

Contact Information ◽

Mesh Terms

Abstract: Metadata that are structured using principled schemas and that use terms from ontologies are essential to making biomedical data findable and reusable for downstream analyses. The largest source of metadata that describes the experimental protocol, funding, and scientific leadership of clinical studies is ClinicalTrials.gov. We evaluated whether values in 302,091 trial records adhere to expected data types and use terms from biomedical ontologies, whether records contain fields required by government regulations, and whether structured elements could replace free-text elements. Contact information, outcome measures, and study design are frequently missing or underspecified. Important fields for search, such as condition and intervention, are not restricted to ontologies, and almost half of the conditions are not denoted by MeSH terms, as recommended. Eligibility criteria are stored as semi-structured free text. Enforcing the presence of all required elements, requiring values for certain fields to be drawn from ontologies, and creating a structured eligibility criteria element would improve the reusability of data from ClinicalTrials.gov in systematic reviews, meta-analyses, and matching of eligible patients to trials.
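
The kind of check described in this abstract (whether condition values are drawn from a controlled vocabulary) can be illustrated with a minimal Python sketch. The record layout, the function name `fraction_outside_vocabulary`, and the three-term vocabulary are all assumptions made for illustration; they are not the authors' pipeline, the ClinicalTrials.gov schema, or the real MeSH term list.

```python
def fraction_outside_vocabulary(records, vocabulary):
    """Return the share of condition values that are not vocabulary terms.

    Matching is case-insensitive and ignores surrounding whitespace.
    """
    vocab = {term.casefold() for term in vocabulary}
    conditions = [c for r in records for c in r.get("conditions", [])]
    if not conditions:
        return 0.0
    missing = sum(1 for c in conditions if c.strip().casefold() not in vocab)
    return missing / len(conditions)


# Toy vocabulary standing in for a real ontology (assumption, not MeSH itself).
vocabulary_subset = {"Asthma", "Stroke", "Diabetes Mellitus, Type 2"}

# Toy trial records with a hypothetical layout.
records = [
    {"id": "trial-1", "conditions": ["Asthma", "bad cough"]},
    {"id": "trial-2", "conditions": ["stroke"]},
    {"id": "trial-3", "conditions": ["T2DM"]},
]

share = fraction_outside_vocabulary(records, vocabulary_subset)
print(f"{share:.0%} of condition values are outside the vocabulary")
```

Exact string matching is the simplest possible choice here; a real evaluation would likely also need synonym handling, which this sketch leaves out.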

Unsupervised deep learning on biomedical data with BoltzmannMachines.jl

10.1101/578252 ◽

2019 ◽

Cited By ~ 2

Author(s):

Stefan Lenz ◽

Moritz Hess ◽

Harald Binder

Keyword(s):

Input Data ◽

Genomic Data ◽

Small Sample ◽

Pattern Detection ◽

Biomedical Data ◽

Data Types ◽

Boltzmann Machines ◽

Link Type ◽

Unsupervised Deep Learning ◽

Small Sample Sizes

Abstract: Deep Boltzmann machines (DBMs) are models for unsupervised learning in the field of artificial intelligence, promising to be useful for dimensionality reduction and pattern detection in clinical and genomic data. Multimodal and partitioned DBMs alleviate the problem of small sample sizes and make it possible to combine different input data types in one DBM model. We present the package “BoltzmannMachines” for the Julia programming language, which makes this model class available for practical use in working with biomedical data. Availability: Notebook with example data: http://github.com/stefan-m-lenz/BMs4BInf2019 Julia package: http://github.com/stefan-m-lenz/BoltzmannMachines.jl

Where to search top-K biomedical ontologies?

Briefings in Bioinformatics ◽

10.1093/bib/bby015 ◽

2018 ◽

Vol 20 (4) ◽

pp. 1477-1491 ◽

Cited By ~ 1

Author(s):

Daniela Oliveira ◽

Anila Sahar Butt ◽

Armin Haller ◽

Dietrich Rebholz-Schuhmann ◽

Ratnesh Sahay

Keyword(s):

Search Engines ◽

Ground Truth ◽

Systematic Evaluation ◽

Free Text ◽

Biomedical Data ◽

Biomedical Ontologies ◽

Daily Work ◽

Ranking Algorithms ◽

Retrieval Mechanism ◽

The Right

Abstract Motivation: Searching for precise terms and terminological definitions in the biomedical data space is problematic, as researchers find overlapping, closely related and even equivalent concepts in a single or multiple ontologies. Search engines that retrieve ontological resources often suggest an extensive list of search results for a given input term, which leads to the tedious task of selecting the best-fit ontological resource (class or property) for the input term and reduces user confidence in the retrieval engines. A systematic evaluation of these search engines is necessary to understand their strengths and weaknesses in different search requirements. Result: We have implemented seven comparable Information Retrieval ranking algorithms to search through ontologies and compared them against four search engines for ontologies. Free-text queries have been performed, the outcomes have been judged by experts and the ranking algorithms and search engines have been evaluated against the expert-based ground truth (GT). In addition, we propose a probabilistic GT that is developed automatically to provide deeper insights and confidence to the expert-based GT as well as evaluating a broader range of search queries. Conclusion: The main outcome of this work is the identification of key search factors for biomedical ontologies together with search requirements and a set of recommendations that will help biomedical experts and ontology engineers to select the best-suited retrieval mechanism in their search scenarios. We expect that this evaluation will allow researchers and practitioners to apply the current search techniques more reliably and that it will help them to select the right solution for their daily work. Availability: The source code (of seven ranking algorithms), ground truths and experimental results are available at https://github.com/danielapoliveira/bioont-search-benchmark

Estimating the scale of biomedical data generation using text mining

10.1101/182857 ◽

2017 ◽

Author(s):

Gabriel Rosenfeld ◽

Dawei Lin

Keyword(s):

Text Mining ◽

Biomedical Research ◽

Word Embedding ◽

Research Articles ◽

Free Text ◽

Similar Amount ◽

Biomedical Data ◽

Data Types ◽

Data Repositories ◽

The Impact

Abstract: While the impact of biomedical research has traditionally been measured using bibliographic metrics such as citation or journal impact factor, the data itself is an output which can be directly measured to provide additional context about a publication’s impact. Data are a resource that can be repurposed and reused, providing dividends on the original investment used to support the primary work. Moreover, data are the cornerstone upon which a tested hypothesis is rejected or accepted and specific scientific conclusions are reached. Understanding how and where data are being produced enhances the transparency and reproducibility of the biomedical research enterprise. Most biomedical data are not directly deposited in data repositories and are instead found in the publication within figures or attachments, making them hard to measure. We attempted to address this challenge by using recent advances in word embedding to identify the technical and methodological features of terms used in the free text of articles’ methods sections. We created term usage signatures for five types of biomedical research data, which were used in univariate clustering to correctly identify a large fraction of positive control articles and a set of manually annotated articles where generation of data types could be validated. The approach was then used to estimate the fraction of PLOS articles generating each biomedical data type over time. Out of all PLOS articles analyzed (n = 129,918), ~7%, 19%, 12%, 18%, and 6% generated flow cytometry, immunoassay, genomic microarray, microscopy, and high-throughput sequencing data, respectively.
The estimate portends a vast amount of biomedical data being produced: in 2016, if other publishers generated a similar amount of data, then roughly 40,000 NIH-funded research articles would produce ~56,000 datasets consisting of the five data types we analyzed. One Sentence Summary: Application of a word-embedding model trained on the methods sections of research articles allows for estimation of the production of diverse biomedical data types using text mining.

The Role of Leukotrienes Inhibitors in the Management of Chronic Inflammatory Diseases

Recent Patents on Inflammation & Allergy Drug Discovery ◽

10.2174/1872213x14666200130095040 ◽

2020 ◽

Vol 14 (1) ◽

pp. 15-31

Author(s):

Deepak Meshram ◽

Khushbo Bhardwaj ◽

Charulata Rathod ◽

Gail B. Mahady ◽

Kapil K. Soni

Keyword(s):

Respiratory Diseases ◽

Inflammatory Diseases ◽

Leukotriene B4 ◽

Published Data ◽

Free Text ◽

Plant Origin ◽

Medical Subject Headings ◽

Leukotriene Antagonists ◽

Mesh Terms ◽

Mediators Of Inflammation

Background: Leukotrienes are powerful mediators of inflammation and interact with specific receptors in target cell membranes to initiate an inflammatory response. Thus, leukotrienes (LTs) are considered to be potent mediators of inflammatory diseases including allergic rhinitis, inflammatory bowel disease and asthma. Leukotriene B4 and the series of cysteinyl leukotrienes (C4, D4, and E4) are metabolites of arachidonic acid metabolism that cause inflammation. The cysteinyl LTs are known to increase vascular permeability, bronchoconstriction and mucus secretion. Objectives: To review the published data for leukotriene inhibitors of plant origin and the recent patents for leukotriene inhibitors, as well as their role in the management of inflammatory diseases. Methods: Published data for leukotriene antagonists of plant origin were searched from 1938 to 2019, without language restrictions, using relevant keywords in both free text and Medical Subject Headings (MeSH terms) format. Literature and patent searches in the field of leukotriene inhibitors were carried out by using numerous scientific databases including Science Direct, PubMed, MEDLINE, Google Patents, US Patents, US Patent Applications, Abstract of Japan, German Patents, European Patents, WIPO and NAPRALERT. Finally, data from these information resources were analyzed and reported in the present study. Results: Currently, numerous anti-histaminic medicines are available including chlorpheniramine, brompheniramine, cetirizine, and clemastine. Furthermore, specific leukotriene antagonists from allopathic medicines are also available including zileuton, montelukast, pranlukast and zafirlukast and are considered effective and safe medicines as compared to the first generation medicines. The present study reports leukotriene antagonists from natural products and certain recent patents that could be an alternative medicine in the management of inflammation in respiratory diseases.
Conclusion: The present study highlights recent updates on the pharmacology and patents on leukotriene antagonists in the management of inflammatory respiratory diseases.

Autonomy and control in the wish to die in terminally ill patients: A systematic integrative review

Palliative & Supportive Care ◽

10.1017/s1478951521000985 ◽

2021 ◽

pp. 1-8

Author(s):

Andrea Rodríguez-Prat ◽

Donna M. Wilson ◽

Remei Agulles

Keyword(s):

Cultural Context ◽

Personal Autonomy ◽

Empirical Studies ◽

Academic Library ◽

Integrative Review ◽

Free Text ◽

Screening Process ◽

Mesh Terms ◽

Wish To Die ◽

And Control

Abstract Background/Objective Personal autonomy and control are major concepts for people with life-limiting conditions. Patients who express a wish to die (WTD) are often thought to want it because of a loss of autonomy or control. The research conducted so far has not focused on personal beliefs and perspectives, and little is known about patients’ understanding of autonomy and control in this context. The aim of this review was to analyze what role autonomy and control may play in relation to the WTD expressed by people with life-limiting conditions. Methods A systematic integrative review was conducted. The search strategy used MeSH terms in combination with free-text searching of the EBSCO Discovery Service (which provides access to multiple academic library literature databases, including PubMed and CINAHL), as well as the large PsycINFO, Scopus, and Web of Science library literature databases from their inception until February 2019. The search was updated to January 2021. Results After the screening process, 85 full texts were included for the final analysis. Twenty-seven studies, recording the experiences of 1,824 participants, were identified. The studies were conducted in Australia (n = 5), Canada (n = 5), USA (n = 5), The Netherlands (n = 3), Spain (n = 2), Sweden (n = 2), Switzerland (n = 2), Finland (n = 1), Germany (n = 1), and the UK (n = 1). Three themes were identified: (1) the presence of autonomy for the WTD, (2) the different ways in which autonomy is conceptualized, and (3) the socio-cultural context of research participants. Significance of results Despite the importance given to the concept of autonomy in the WTD discourse, only a few empirical studies have focused on personal interests. Comprehending the context is crucial because personal understandings of autonomy are shaped by socio-cultural–ethical backgrounds and these impact personal WTD attitudes.

A Framework for Systematic Assessment of Clinical Trial Population Representativeness Using Electronic Health Records Data

Applied Clinical Informatics ◽

10.1055/s-0041-1733846 ◽

2021 ◽

Vol 12 (04) ◽

pp. 816-825

Author(s):

Yingcheng Sun ◽

Alex Butler ◽

Ibrahim Diallo ◽

Jae Hyun Kim ◽

Casey Ta ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Electronic Health Records ◽

The United States ◽

Design Stage ◽

Common Data Model ◽

Free Text ◽

Eligibility Criteria ◽

Health Records ◽

Electronic Health

Abstract Background Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to the lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of a clinical trial's study population. Objectives This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. Methods We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness for each clinical trial. Results We calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States, respectively, using this framework. With the use of overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of T2DM trials had poor population representativeness. Conclusion This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
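
To make the idea of quantifying representativeness concrete, here is a deliberately simplified Python sketch. The abstract does not state the metric the framework uses, so the sketch assumes the simplest possible one: the fraction of a patient cohort that satisfies every eligibility criterion. The `Patient` fields, the three criteria, and the four-patient cohort are hypothetical; the published framework generates database queries against the OMOP Common Data Model rather than evaluating in-memory predicates.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical, heavily reduced patient record."""
    age: int
    hba1c: float      # glycated hemoglobin, percent
    pregnant: bool


# Each eligibility criterion is modeled as a predicate on a patient.
# These three criteria are invented for illustration only.
criteria = [
    lambda p: 18 <= p.age <= 75,
    lambda p: p.hba1c >= 6.5,
    lambda p: not p.pregnant,
]


def eligible_fraction(patients, criteria):
    """Share of patients who satisfy all criteria (0.0 for an empty cohort)."""
    if not patients:
        return 0.0
    eligible = sum(1 for p in patients if all(check(p) for check in criteria))
    return eligible / len(patients)


# Toy cohort: only the first patient passes all three criteria.
cohort = [
    Patient(age=54, hba1c=7.2, pregnant=False),
    Patient(age=81, hba1c=8.0, pregnant=False),
    Patient(age=40, hba1c=6.0, pregnant=False),
    Patient(age=33, hba1c=7.0, pregnant=True),
]

fraction = eligible_fraction(cohort, criteria)
print(f"Eligible share of the cohort: {fraction:.0%}")
```

Under this simplified reading, a low eligible share for the target population would correspond to the "overly restrictive eligibility criteria" the abstract mentions.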

Association Between Apical Periodontitis and TNF-α -308 G>A Gene Polymorphism: A Systematic Review and Meta-Analysis

Brazilian Dental Journal ◽

10.1590/0103-6440201701491 ◽

2017 ◽

Vol 28 (5) ◽

pp. 535-542 ◽

Cited By ~ 4

Author(s):

Alessandro Guimarães Salles ◽

Lívia Azeredo Alves Antunes ◽

Patrícia Arriaga Carvalho ◽

Erika Calvano Küchler ◽

Leonardo Santos Antunes

Keyword(s):

Systematic Review ◽

Meta Analysis ◽

Apical Periodontitis ◽

Permanent Teeth ◽

Genotype Distribution ◽

Nucleotide Polymorphisms ◽

Eligibility Criteria ◽

Single Nucleotide ◽

Mesh Terms ◽

Tnf Α

Abstract Currently, investigations have focused on the identification of Single Nucleotide Polymorphisms (SNP) involved in host response and its ability to generate an immunity deficiency. The aim of this study was to perform a systematic review (SR) and meta-analysis to evaluate the association between TNF-α -308 G>A polymorphism and apical periodontitis (AP) phenotypes. A broad search for studies was conducted. The following databases were used: PubMed, Scopus, Web of Science, and VHL (Medline, SciELO, Ibecs, and Lilacs). The MeSH terms “Periapical Periodontitis,” “Periapical Abscess,” “Polymorphism, Genetic,” and “Polymorphism, Single Nucleotide” were used. MeSH synonyms, related terms, and free terms were included. Clinical investigations of individuals with different AP phenotypes in permanent teeth were selected. After application of the eligibility criteria, selected studies were qualified by assessing their methodological quality. A fixed effect model was used for the meta-analysis. The initial search identified 71 references. After excluding duplicate abstracts, 33 were selected. From these, two were eligible for quality assessment and were classified as being of moderate evidence. The included studies did not demonstrate an association between AP and TNF-α -308 G>A SNP. However, the meta-analysis demonstrated an association between the genotype distribution and AP phenotype (OR = 0.49; confidence interval = 0.25, 0.96; p = 0.04). The role of TNF-α -308 G>A SNP in AP phenotypes is debatable. Further studies are needed to confirm and understand the underlying mechanisms of the identified association.
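
For readers unfamiliar with fixed effect pooling, the following Python sketch shows one common variant: inverse-variance weighting of log odds ratios, with each study's standard error recovered from its 95% confidence interval. The abstract says only that a fixed effect model was used, so the choice of inverse-variance weighting (rather than, say, Mantel-Haenszel) is an assumption, and the two input studies are made-up numbers, not data from the review.

```python
import math


def pooled_odds_ratio(studies, z=1.96):
    """Fixed effect (inverse-variance) pooling of odds ratios.

    studies: iterable of (odds_ratio, ci_lower, ci_upper), where the
    interval is assumed to be a 95% confidence interval.
    Returns (pooled_or, pooled_ci_lower, pooled_ci_upper).
    """
    total_weight = 0.0
    weighted_log_sum = 0.0
    for odds_ratio, lower, upper in studies:
        # Standard error of log(OR), recovered from the CI width.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        total_weight += weight
        weighted_log_sum += weight * math.log(odds_ratio)
    pooled_log = weighted_log_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


# Two illustrative studies with identical estimates (invented values).
toy_studies = [(0.5, 0.25, 1.0), (0.5, 0.25, 1.0)]
pooled, low, high = pooled_odds_ratio(toy_studies)
print(f"Pooled OR {pooled:.2f} (95% CI {low:.2f} to {high:.2f})")
```

With two identical studies the pooled estimate equals the individual estimate while the interval narrows, which is how pooling can yield a statistically significant result even when no single included study does.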

Does Socioeconomic Position Affect Knowledge of the Risk Factors and Warning Signs of Stroke in the WHO European Region? A Systematic Literature Review

10.21203/rs.3.rs-22447/v2 ◽

2020 ◽

Author(s):

Katie Stack ◽

Wendy Robertson ◽

Clare Blackburn

Keyword(s):

Risk Factors ◽

Socioeconomic Position ◽

Stroke Risk ◽

Educational Interventions ◽

Warning Signs ◽

European Region ◽

Free Text ◽

Stroke Risk Factors ◽

Mesh Terms ◽

Increasing Knowledge

Abstract Background: Strokes are one of the leading causes of death worldwide. People with a lower socioeconomic position (SEP) (i.e. with regards to education, income and occupation) are at a higher risk of having a stroke and have worse clinical outcomes compared to the general population. Good knowledge levels about stroke risk factors and warning signs are key to prolonging life and reducing health issues caused by stroke. This systematic review examined differences in knowledge of stroke risk factors and warning signs with regards to SEP in the WHO European region. Methods: MEDLINE, Embase, Web of Science, PsycINFO and CINAHL were systematically searched using appropriate Medical Subject Headings (MeSH) terms and free text, combining search terms with Boolean operators. Two independent reviewers selected studies in two stages (title and abstract, and full-text), and screened reference lists of included studies. Only studies in English and based in the WHO European region were included. Results: Screening identified 2,118 records. In the final review, 20 articles were included, with 67,309 study participants between them. Out of 17 studies that looked at stroke risk factors, 11 found increasing knowledge to be associated with higher SEP, four found no difference by SEP, one showed a mixed pattern and one outlier study found increasing knowledge of risk factors to be associated with a lower SEP. Out of 19 studies that looked at stroke warning signs or symptoms, 15 found there to be better knowledge of warning signs with a higher SEP, three found there to be no difference, and the same outlier study found increasing knowledge of warning signs with a lower SEP. Studies that seemed to have a higher quality rating found increasing knowledge of stroke with a higher SEP. A meta-analysis was not possible due to heterogeneity of studies. Conclusions: In the WHO European region, better knowledge of stroke risk factors and warning signs is associated with a higher SEP. 
Public health campaigns and educational interventions aiming to increase stroke knowledge should be targeted at people with a lower SEP.

LandScape: a web application for interactive genomic summary visualization

10.1101/866087 ◽

2019 ◽

Author(s):

Wenlong Jia ◽

Hechen Li ◽

Shiying Li ◽

Shuaicheng Li

Keyword(s):

Genetic Information ◽

Web Application ◽

Genomic Research ◽

File Format ◽

Data Types ◽

Web Based ◽

Link Type ◽

Level Data ◽

Real Time Visualization ◽

Information Landscape

Abstract Summary: Visualizing integrated-level data from genomic research remains a challenge, as it requires sufficient coding skills and experience. Here, we present LandScapeoviz, a web-based application for interactive and real-time visualization of summarized genetic information. LandScape utilizes a well-designed file format that is capable of handling various data types, and offers a series of built-in functions to customize the appearance, explore results, and export high-quality diagrams that are available for publication. Availability and implementation: LandScape is deployed at bio.oviz.org/demo-project/analyses/landscape for online use. Documentation and demo data are freely available on this website and GitHub (github.com/Nobel-Justin/Oviz-Bio-demo).
