Detection and normalization of medical terms using domain-specific term frequency and adaptive ranking

Author(s):  
Mi-Young Kim ◽  
Randy Goebel

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pilar López-Úbeda ◽  
Alexandra Pomares-Quimbaya ◽  
Manuel Carlos Díaz-Galiano ◽  
Stefan Schulz

Abstract

Background: Controlled vocabularies are fundamental resources for information extraction from clinical texts using natural language processing (NLP). Standard language resources in the healthcare domain, such as the UMLS Metathesaurus or SNOMED CT, are widely used for this purpose, but they have limitations such as the lexical ambiguity of clinical terms. However, most such terms are unambiguous within text restricted to a given clinical specialty, which is one rationale, among others, for classifying clinical texts by the specialty to which they belong.

Results: This paper addresses this limitation by proposing and applying a method that automatically extracts Spanish medical terms, classified and weighted per sub-domain, using Spanish MEDLINE titles and abstracts as input. The hypothesis is that biomedical NLP tasks benefit from collections of domain terms that are specific to clinical sub-domains. We use PubMed queries to generate sub-domain-specific corpora from Spanish titles and abstracts, from which token n-grams are collected and metrics of relevance, discriminatory power, and broadness per sub-domain are computed. The generated term set, called the Spanish core vocabulary about clinical specialties (SCOVACLIS), was made available to the scientific community and used in a text classification problem, obtaining an improvement of 6 percentage points in F-measure over the baseline using a Multilayer Perceptron, thus supporting the hypothesis that a specialized term set improves NLP tasks.

Conclusion: The creation and validation of SCOVACLIS support the hypothesis that specialty-specific term sets reduce the level of ambiguity when compared to a specialty-independent, broad-scope vocabulary.
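
To make the pipeline concrete, the following minimal sketch counts token n-grams per sub-domain corpus and scores each term. The abstract names relevance, discriminatory power, and broadness but does not define them, so the formulas used here (within-domain normalized frequency, that frequency's share relative to other sub-domains, and the fraction of sub-domains containing the term) are illustrative stand-ins, not the authors' definitions, as are the sub-domain names.

```python
# A minimal sketch, assuming simple stand-in metrics: the abstract names
# relevance, discriminatory power, and broadness per sub-domain but does not
# define them, so the formulas below are illustrative, not the authors'.
from collections import Counter

def ngrams(tokens, n_max=3):
    """Yield all token n-grams of length 1..n_max."""
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

def subdomain_term_metrics(corpora):
    """corpora: dict mapping sub-domain name -> list of tokenized documents.
    Returns: sub-domain -> term -> (relevance, discrimination, broadness)."""
    counts = {d: Counter(g for doc in docs for g in ngrams(doc))
              for d, docs in corpora.items()}
    totals = {d: sum(c.values()) for d, c in counts.items()}
    n_domains = len(corpora)
    metrics = {}
    for d, c in counts.items():
        metrics[d] = {}
        for term, freq in c.items():
            # Relevance: normalized frequency within the sub-domain.
            relevance = freq / totals[d]
            # Discrimination: share of the term's mass held by this sub-domain.
            elsewhere = sum(counts[o][term] / totals[o]
                            for o in counts if o != d)
            discrimination = relevance / (relevance + elsewhere)
            # Broadness: fraction of sub-domains in which the term occurs.
            broadness = sum(term in counts[o] for o in counts) / n_domains
            metrics[d][term] = (relevance, discrimination, broadness)
    return metrics

# Toy example with tokenized Spanish titles per (hypothetical) sub-domain:
corpora = {
    "cardiologia": [["infarto", "agudo", "de", "miocardio"]],
    "neurologia":  [["accidente", "cerebrovascular", "agudo"]],
}
print(subdomain_term_metrics(corpora)["cardiologia"]["agudo"])
```

In this toy setup, "agudo" appears in both sub-domains, so it receives a low discrimination score and maximal broadness, while sub-domain-specific terms like "miocardio" score as highly discriminative.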


2021 ◽  
Author(s):  
Chuanxiao Li ◽  
Wenqiang Li ◽  
Zhong Tang ◽  
Song Li ◽  
Hai Xiang

Abstract

As a vital step in a text classification (TC) task, the assignment of term weights has a great influence on TC performance. Many term weighting schemes are available, such as term frequency-inverse document frequency (TF-IDF) and term frequency-relevance frequency (TF-RF), and they all consist of a local part (TF) and a global part (e.g., IDF, RF). Most of these schemes apply logarithmic processing to their global parts, which raises the question of whether such processing is appropriate for all of them. In fact, because the ratio between local and global weight that results from logarithmic processing differs across settings, a given term weighting scheme often yields diverse text classification results on different text sets, i.e., it shows poor robustness. To explore how logarithmic processing of the global weight influences the classification results of term weighting schemes, TF-RF is selected as the representative scheme, because it achieves better performance than other schemes that adopt logarithmic processing. Two propositions about the relation between the TF part and the RF part, together with corresponding methods, are then formulated on the basis of TF-RF, and two groups of experiments are conducted on the two methods. The first group of experiments shows that one method (denoted TF-ERF) is more beneficial than the other (denoted ETF-RF). The second group shows that TF-ERF not only outperforms TF-RF but also obtains better performance than other existing term weighting schemes.
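
For reference, the standard TF-RF weight multiplies the local term frequency by rf = log2(2 + a / max(1, c)), where a and c count the positive- and negative-class training documents containing the term (Lan et al., 2009). The sketch below implements this baseline and, since the abstract does not give the TF-ERF and ETF-RF formulas, adds only a hypothetical log-free global part to illustrate how the logarithm shifts the local/global balance.

```python
# The standard TF-RF weight (Lan et al., 2009) and a hypothetical log-free
# variant. The abstract does not define TF-ERF or ETF-RF, so only the
# baseline here is standard; tf_rf_no_log is an illustrative assumption
# showing how the logarithm compresses the global part.
import math

def rf(pos_df, neg_df):
    """Relevance frequency: log2(2 + a / max(1, c)), where a and c are the
    numbers of positive- and negative-class documents containing the term."""
    return math.log2(2 + pos_df / max(1, neg_df))

def tf_rf(tf, pos_df, neg_df):
    """Standard TF-RF term weight: local term frequency times global RF."""
    return tf * rf(pos_df, neg_df)

def tf_rf_no_log(tf, pos_df, neg_df):
    """Hypothetical variant without the logarithm on the global part."""
    return tf * (2 + pos_df / max(1, neg_df))

# A term occurring 3 times in a document, present in 50 positive and
# 2 negative training documents:
print(tf_rf(3, 50, 2))        # ~14.26: the log keeps the global part moderate
print(tf_rf_no_log(3, 50, 2)) # 81.0: without it, the global part dominates
```

The gap between the two outputs shows the robustness issue the abstract describes: with the logarithm, the global part stays on a scale comparable to typical term frequencies, while without it the global part can dwarf the local part entirely.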

