CTSS: A Tool for Efficient Information Extraction with Soft Matching Rules for Text Mining

2008 ◽  
Vol 4 (5) ◽  
pp. 375-381 ◽  
Author(s):  
A. Christy ◽  
P. Thambidura
2015 ◽  
Vol 6 (4) ◽  
pp. 35-49 ◽  
Author(s):  
Laurent Issertial ◽  
Hiroshi Tsuji

This paper proposes CFP Manager, a system specialized for the IT field and designed to ease the search for conferences suited to one's needs. At present, the handling of calls for papers (CFPs) faces two problems. For emails, the sheer quantity of CFPs received means they are easily skimmed over. For websites, a review of some of the main CFP aggregators available online points to a lack of usable search criteria. The system addresses these problems through an architecture of three components. The first is an Information Extraction module that extracts relevant information (such as date and location) from CFPs using a rule-based text mining algorithm. The second enriches the extracted data with external data from ontology models. The third displays the data and lets the end user run complex queries on the CFP dataset, so that only suitable CFPs are returned. To validate the proposal, the authors compute the well-known precision and recall metrics for the information extraction component, obtaining averages of 0.95 for precision and 0.91 for recall on three different datasets of 100 CFPs each. The paper finally discusses the validity of the approach by comparing the system, for different queries, with two systems already available online (WikiCFP and IEEE Conference Search) and with a basic text search approach standing in for searching an email inbox. On a 100-CFP dataset, thanks to the wide variety of usable data and the ability to perform complex queries, the system surpasses the basic text search method and WikiCFP by not returning the false positives they usually return, and it obtains results close to those of the IEEE system.
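
The precision and recall figures quoted in this abstract follow the standard definitions: precision is the share of extracted items that are correct, and recall is the share of correct items that were extracted. The following is a minimal sketch of that calculation, not the authors' evaluation code; the function name `precision_recall` and the sample (field, value) pairs are hypothetical and chosen only for illustration.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted items against a gold-standard set."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (field, value) pairs for a single CFP; not data from the paper.
gold = {
    ("date", "2015-06-01"),
    ("location", "Osaka"),
    ("deadline", "2015-02-15"),
    ("topic", "text mining"),
}
extracted = {
    ("date", "2015-06-01"),
    ("location", "Osaka"),
    ("deadline", "2015-03-01"),  # wrong value, counts as a false positive
}

precision, recall = precision_recall(extracted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three extracted pairs are correct and two of the four gold pairs are found, so precision and recall differ. Averaging such per-dataset scores is one plausible reading of the reported averages, though the abstract does not say how they were aggregated.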


2021 ◽  
Author(s):  
Tatsawan Timakum ◽  
Min Song ◽  
Qing Xie

Abstract Background: E-mental healthcare is the convergence of digital technologies with mental health services. It has been developed to fill a gap in healthcare for people who need mental wellbeing support and may never otherwise receive psychological treatment. This study aimed to apply text mining techniques to analyze the large body of e-mental health research and to report on research clusters and trends, as well as the co-occurrence of biomedical terms and information technology use in this field. Methods: The e-mental health research data was obtained from 3,663 bibliographic records from Web of Science (WoS) and 3,172 full-text articles from PubMed Central (PMC). The text mining techniques utilized for this study included bibliometric analysis, information extraction, and visualization. Results: The e-mental health research topic trends primarily involved e-health care services and medical informatics research. The research falls into 16 clusters, which relate to mental illness, e-health, diseases, IT, and self-management. Based on the information extraction analysis, in the biomedical domain, a "depression" entity was frequently detected, and it pairs with other entities in the network with a betweenness centrality weighted at 0.046869 (e.g., depression-online, depression-diabetes, depression-measure, and depression-mobile). The IT entity-relations of "mobile" were the most frequently found (weighted at 0.043466). The top pairs are related to depression, mobile health, and text message. Conclusions: E-mental health research trends focused on disease-related depression and on using IT for treatment and prevention, primarily via online and mobile devices. AI and machine learning are also being studied for e-mental healthcare. The results illustrate that physical illness is likely to cause a mental health problem, and they identify the IT that was applied to help manage and mitigate mental health impacts.


2021 ◽  
Vol 3 ◽  
Author(s):  
Luke T. Slater ◽  
Andreas Karwath ◽  
Robert Hoehndorf ◽  
Georgios V. Gkoutos

Semantic similarity is a useful approach for comparing patient phenotypes, and it has potential as an effective method for exploiting text-derived phenotypes in differential diagnosis, text and document classification, and outcome prediction. While approaches for context disambiguation are commonly used in text mining applications, forming a standard component of information extraction pipelines, their effects on semantic similarity calculations have not been widely explored. In this work, we evaluate how including or excluding negated and uncertain mentions of concepts in text-derived phenotypes affects the similarity of patients and the use of those profiles to predict diagnosis. We report on the effectiveness of these approaches and find a very small, yet significant, improvement in performance when classifying primary diagnosis over MIMIC-III patient visits.
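
To make the idea of including or excluding negated and uncertain mentions concrete, here is a small illustrative sketch. It uses plain Jaccard set overlap as a stand-in for the similarity measure, which is an assumption: the abstract does not name the measure the authors used. The `Mention` type, the `profile` and `jaccard` helpers, and the two patient records are all hypothetical.

```python
from typing import NamedTuple


class Mention(NamedTuple):
    """A concept mention found in clinical text (hypothetical structure)."""
    concept: str
    negated: bool = False
    uncertain: bool = False


def profile(mentions, keep_negated, keep_uncertain):
    """Build a patient's concept set, optionally dropping negated/uncertain mentions."""
    return {
        m.concept
        for m in mentions
        if (keep_negated or not m.negated) and (keep_uncertain or not m.uncertain)
    }


def jaccard(a, b):
    """Jaccard overlap of two concept sets; a stand-in for a semantic similarity measure."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical patients; "chest pain" is negated for A, "cough" is uncertain for B.
patient_a = [Mention("fever"), Mention("cough"), Mention("chest pain", negated=True)]
patient_b = [Mention("fever"), Mention("chest pain"), Mention("cough", uncertain=True)]

with_all = jaccard(profile(patient_a, True, True), profile(patient_b, True, True))
filtered = jaccard(profile(patient_a, False, False), profile(patient_b, False, False))
print(f"all mentions: {with_all:.2f}, affirmed only: {filtered:.2f}")
```

When every mention is kept, the two patients look identical; once negated and uncertain mentions are dropped, they share only one of three concepts. This toy case exaggerates the effect for clarity, whereas the abstract reports only a very small improvement in practice.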

