The Development of the Military Service Identification Tool: Identifying Military Veterans in a Clinical Research Database Using Natural Language Processing and Machine Learning (Preprint)

Mapping Intimacies ◽

10.2196/preprints.15852 ◽

2019 ◽

Author(s):

Daniel Leightley ◽

David Pernet ◽

Sumithra Velupillai ◽

Robert J Stewart ◽

Katharine M Mark ◽

...

Keyword(s):

Machine Learning ◽

Health Care ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Language Processing ◽

Predictive Value ◽

Military Service ◽

Well Being ◽

Free Text ◽

The Military

BACKGROUND Electronic health care records (EHRs) are a rich source of health-related information, with potential for secondary research use. In the United Kingdom, there is no national marker for identifying those who have previously served in the Armed Forces, making analysis of the health and well-being of veterans using EHRs difficult. OBJECTIVE This study aimed to develop a tool to identify veterans from free-text clinical documents recorded in a psychiatric EHR database. METHODS Veterans were manually identified using the South London and Maudsley (SLaM) Biomedical Research Centre Clinical Record Interactive Search—a database holding secondary mental health care electronic records for the SLaM National Health Service Foundation Trust. An iterative approach was taken; first, a structured query language (SQL) method was developed, which was then refined using natural language processing and machine learning to create the Military Service Identification Tool (MSIT) to identify if a patient was a civilian or veteran. Performance, defined as correct classification of veterans compared with incorrect classification, was measured using positive predictive value, negative predictive value, sensitivity, F1 score, and accuracy (otherwise termed Youden Index). RESULTS A gold standard dataset of 6672 free-text clinical documents was manually annotated by human coders. Of these documents, 66.00% (4470/6672) were then used to train the SQL and MSIT approaches and 34.00% (2202/6672) were used for testing the approaches. To develop the MSIT, an iterative 2-stage approach was undertaken. In the first stage, an SQL method was developed to identify veterans using a keyword rule–based approach. This approach obtained an accuracy of 0.93 in correctly predicting civilians and veterans, a positive predictive value of 0.81, a sensitivity of 0.75, and a negative predictive value of 0.95. 
This method informed the second stage, which was the development of the MSIT using machine learning, which, when tested, obtained an accuracy of 0.97, a positive predictive value of 0.90, a sensitivity of 0.91, and a negative predictive value of 0.98. CONCLUSIONS The MSIT has the potential to be used in identifying veterans in the United Kingdom from free-text clinical documents, providing new and unique insights into the health and well-being of this population and their use of mental health care services.

The Development of the Military Service Identification Tool: Identifying Military Veterans in a Clinical Research Database Using Natural Language Processing and Machine Learning

JMIR Medical Informatics ◽

10.2196/15852 ◽

2020 ◽

Vol 8 (5) ◽

pp. e15852

Author(s):

Daniel Leightley ◽

David Pernet ◽

Sumithra Velupillai ◽

Robert J Stewart ◽

Katharine M Mark ◽

...

Keyword(s):

Machine Learning ◽

Health Care ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Language Processing ◽

Predictive Value ◽

Military Service ◽

Well Being ◽

Free Text ◽

The Military

Background Electronic health care records (EHRs) are a rich source of health-related information, with potential for secondary research use. In the United Kingdom, there is no national marker for identifying those who have previously served in the Armed Forces, making analysis of the health and well-being of veterans using EHRs difficult. Objective This study aimed to develop a tool to identify veterans from free-text clinical documents recorded in a psychiatric EHR database. Methods Veterans were manually identified using the South London and Maudsley (SLaM) Biomedical Research Centre Clinical Record Interactive Search—a database holding secondary mental health care electronic records for the SLaM National Health Service Foundation Trust. An iterative approach was taken; first, a structured query language (SQL) method was developed, which was then refined using natural language processing and machine learning to create the Military Service Identification Tool (MSIT) to identify if a patient was a civilian or veteran. Performance, defined as correct classification of veterans compared with incorrect classification, was measured using positive predictive value, negative predictive value, sensitivity, F1 score, and accuracy (otherwise termed Youden Index). Results A gold standard dataset of 6672 free-text clinical documents was manually annotated by human coders. Of these documents, 66.00% (4470/6672) were then used to train the SQL and MSIT approaches and 34.00% (2202/6672) were used for testing the approaches. To develop the MSIT, an iterative 2-stage approach was undertaken. In the first stage, an SQL method was developed to identify veterans using a keyword rule–based approach. This approach obtained an accuracy of 0.93 in correctly predicting civilians and veterans, a positive predictive value of 0.81, a sensitivity of 0.75, and a negative predictive value of 0.95. 
This method informed the second stage, which was the development of the MSIT using machine learning, which, when tested, obtained an accuracy of 0.97, a positive predictive value of 0.90, a sensitivity of 0.91, and a negative predictive value of 0.98. Conclusions The MSIT has the potential to be used in identifying veterans in the United Kingdom from free-text clinical documents, providing new and unique insights into the health and well-being of this population and their use of mental health care services.
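The abstract lists the F1 score among the performance measures but reports only the positive predictive value and sensitivity for each stage. As a hedged illustration (not the authors' code), the F1 score implied by those figures can be computed as the harmonic mean of positive predictive value (precision) and sensitivity (recall). The function name `f1_from_ppv_sensitivity` is hypothetical, and because the inputs are rounded to two decimals the results are only approximate.

```python
def f1_from_ppv_sensitivity(ppv: float, sensitivity: float) -> float:
    """Return the F1 score: the harmonic mean of PPV (precision) and sensitivity (recall)."""
    if ppv + sensitivity == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values reported in the abstract for each stage.
REPORTED = {
    "SQL keyword rules": {"ppv": 0.81, "sensitivity": 0.75},
    "MSIT (machine learning)": {"ppv": 0.90, "sensitivity": 0.91},
}

if __name__ == "__main__":
    for stage, metrics in REPORTED.items():
        f1 = f1_from_ppv_sensitivity(metrics["ppv"], metrics["sensitivity"])
        print(f"{stage}: implied F1 is about {f1:.2f}")
```

Under these assumptions the implied F1 score is about 0.78 for the SQL stage and about 0.90 for the MSIT; the values in the full paper may differ slightly because they would be computed from unrounded counts.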

Development and utilization of an intelligent application for aiding COVID-19 diagnosis

10.1101/2020.03.18.20035816 ◽

2020 ◽

Cited By ~ 12

Author(s):

Zirui Meng ◽

Minjin Wang ◽

Huan Song ◽

Shuo Guo ◽

Yanbing Zhou ◽

...

Keyword(s):

Machine Learning ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Model Prediction ◽

Predictive Value ◽

External Validation ◽

Prediction Performance ◽

Development And Utilization ◽

Laboratory Examinations ◽

Diagnosis Model

Abstract Background COVID-19 has been spreading globally since its emergence, but diagnostic resources are relatively insufficient. Results To relieve the shortage of resources for diagnosing COVID-19, we developed a machine learning-based diagnosis model on the basis of laboratory examination indicators from a total of 620 samples, and subsequently implemented it as a COVID-19 diagnosis aid app to facilitate promotion. Conclusions External validation showed satisfactory model prediction performance (i.e., the positive predictive value and negative predictive value were 86.35% and 84.62%, respectively), which guarantees the promising use of this tool for extensive screening.

Development and Validation of a Predictive Model for Coronary Artery Disease Using Machine Learning

Frontiers in Cardiovascular Medicine ◽

10.3389/fcvm.2021.614204 ◽

2021 ◽

Vol 8 ◽

Author(s):

Chen Wang ◽

Yue Zhao ◽

Bingyu Jin ◽

Xuedong Gan ◽

Bin Liang ◽

...

Keyword(s):

Machine Learning ◽

Coronary Artery Disease ◽

Coronary Artery ◽

Random Forest ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Predictive Value ◽

Validation Cohort ◽

Random Forest Algorithm ◽

Artery Disease

Early identification of coronary artery disease (CAD) can prevent the progress of CAD and effectually lower the mortality rate, so we intended to construct and validate a machine learning model to predict the risk of CAD based on conventional risk factors and lab test data. There were 3,112 CAD patients and 3,182 controls enrolled from three centers in China. We compared the baseline and clinical characteristics between two groups. Then, Random Forest algorithm was used to construct a model to predict CAD and the model was assessed by receiver operating characteristic (ROC) curve. In the development cohort, the Random Forest model showed a good AUC 0.948 (95%CI: 0.941–0.954) to identify CAD patients from controls, with a sensitivity of 90%, a specificity of 85.4%, a positive predictive value of 0.863 and a negative predictive value of 0.894. Validation of the model also yielded a favorable discriminatory ability with the AUC, sensitivity, specificity, positive predictive value, and negative predictive value of 0.944 (95%CI: 0.934–0.955), 89.5%, 85.8%, 0.868, and 0.886 in the validation cohort 1, respectively, and 0.940 (95%CI: 0.922–0.960), 79.5%, 94.3%, 0.932, and 0.823 in the validation cohort 2, respectively. An easy-to-use tool that combined 15 indexes to assess the CAD risk was constructed and validated using Random Forest algorithm, which showed favorable predictive capability (http://45.32.120.149:3000/randomforest). Our model is extremely valuable for clinical practice, which will be helpful for the management and primary prevention of CAD patients.

The Perpetuation of Gender Norms and White Hegemonic Patriarchy in the South African Defence Force as Represented in André van der Merwe’s Moffie (2006)

Gender Questions ◽

10.25159/2412-8457/7240 ◽

2020 ◽

Vol 8 (2) ◽

Author(s):

Bernard Nolen Fortuin

Keyword(s):

South Africa ◽

Military Service ◽

Well Being ◽

The Novel ◽

White Family ◽

White Children ◽

White Masculinity ◽

Party Government ◽

Compulsory Military Service ◽

The Military

With the institution of compulsory military service in South Africa in 1948 the National Party government effected a tool well shaped for the construction of hegemonic masculinities. Through this, and other structures like schools and families, white children were shaped into submissive abiding citizens. Due to the brutal nature of a militarised society, gender roles become strictly defined and perpetuated. As such, white men’s time served on the border also “toughened” them up and shaped them into hegemonic copies of each other, ready to enforce patriarchal and racist ideologies. In this article, I look at how the novel Moffie by André Carl van der Merwe (2006) illustrates hegemonic white masculinity in South Africa and how it has long been strictly regulated to perpetuate the well-being of the white family as representative of the capitalist state. I discuss the novel by looking at the ways in which the narrator is marked by service in the military, which functions as a socialising agent, but as importantly by the looming threat of the application of the term “moffie” to himself, by self or others.

Added value of diffusion-weighted MRI in assessment of pleural lesions

Egyptian Journal of Radiology and Nuclear Medicine ◽

10.1186/s43055-021-00557-3 ◽

2021 ◽

Vol 52 (1) ◽

Author(s):

Youssriah Yahia Sabri ◽

Ikram Hamed Mahmoud ◽

Lamis Tarek El-Gendy ◽

Mohamed Raafat Abd El-Mageed ◽

Sally Fouad Tadros

Keyword(s):

Positive Predictive Value ◽

Negative Predictive Value ◽

Predictive Value ◽

Diagnostic Value ◽

Added Value ◽

Adc Value ◽

Pleural Diseases ◽

Conventional Mri ◽

Malignant Lesions

Abstract Background There are many causes of pleural disease including variable benign and malignant etiologies. DWI is a non-enhanced functional MRI technique that allows qualitative and quantitative characterization of tissues based on their water molecule diffusivity. The aim of this study was to evaluate the diagnostic value of DWI-MRI in detection and characterization of pleural diseases and its capability in differentiating benign from malignant pleural lesions. Results Conventional MRI was able to discriminate benign from malignant lesions by using morphological features (contour and thickness) with sensitivity 89.29%, specificity 76%, positive predictive value 89%, negative predictive value 76.92%, and accuracy 85.37%. Using the ADC value as a quantitative parameter of DWI, we found that ADC values of malignant pleural diseases were significantly lower than those of benign lesions (P < 0.001). Hence, we discovered that using an ADC mean value of 1.68 × 10⁻³ mm²/s as a cutoff value can differentiate malignant from benign pleural diseases with sensitivity 89.3%, specificity 100%, positive predictive value 100%, negative predictive value 81.2%, and accuracy 92.68% (P < 0.001). Conclusion Although DWI-MRI is unable to differentiate between malignant and benign pleural effusion, its combined morphological and functional information provides a valid non-invasive method to accurately characterize pleural soft tissue diseases, differentiating benign from malignant lesions with higher specificity and accuracy than conventional MRI.
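The cutoff rule described in the abstract can be sketched as a simple threshold test. This is only an illustration under stated assumptions: the function name `classify_by_adc` and the example ADC values are hypothetical, and the abstract does not say how a value exactly equal to the cutoff is handled, so the sketch arbitrarily treats it as benign.

```python
# Cutoff reported in the abstract, in mm^2/s.
ADC_CUTOFF_MM2_PER_S = 1.68e-3


def classify_by_adc(adc_mean: float, cutoff: float = ADC_CUTOFF_MM2_PER_S) -> str:
    """Label a lesion from its mean ADC value (mm^2/s).

    The abstract reports that malignant lesions have lower ADC values,
    so values below the cutoff are labelled malignant.
    Assumption: a value exactly at the cutoff is labelled benign.
    """
    return "malignant" if adc_mean < cutoff else "benign"


if __name__ == "__main__":
    # Hypothetical ADC means, not taken from the study.
    for adc in (1.20e-3, 1.68e-3, 2.10e-3):
        print(f"ADC {adc:.2e} mm^2/s -> {classify_by_adc(adc)}")
```

A single threshold like this is what makes it possible to report sensitivity and specificity at one operating point; a different cutoff would trade one against the other.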

Accuracy of Medical Examiner’s Assessment for Near–Real-Time Surveillance of Fatal Drug Overdoses, King County, Washington, March 2017–February 2018

Public Health Reports ◽

10.1177/00333549211008455 ◽

2021 ◽

pp. 003335492110084

Author(s):

Kirsten Vannice ◽

Julia Hood ◽

Nicole Yarid ◽

Meagan Kay ◽

Richard Harruff ◽

...

Keyword(s):

Public Health ◽

Positive Predictive Value ◽

Real Time ◽

Negative Predictive Value ◽

Drug Overdose ◽

Predictive Value ◽

Death Certificate ◽

King County ◽

Public Health Response ◽

Drug Overdoses

Objectives Up-to-date information on the occurrence of drug overdose is critical to guide public health response. The objective of our study was to evaluate a near–real-time fatal drug overdose surveillance system to improve timeliness of drug overdose monitoring. Methods We analyzed data on deaths in the King County (Washington) Medical Examiner’s Office (KCMEO) jurisdiction that occurred during March 1, 2017–February 28, 2018, and that had routine toxicology test results. Medical examiners (MEs) classified probable drug overdoses on the basis of information obtained through the death investigation and autopsy. We calculated sensitivity, positive predictive value, specificity, and negative predictive value of MEs’ classification by using the final death certificate as the gold standard. Results KCMEO investigated 2480 deaths; 1389 underwent routine toxicology testing, and 361 were toxicologically confirmed drug overdoses from opioid, stimulant, or euphoric drugs. Sensitivity of the probable overdose classification was 83%, positive predictive value was 89%, specificity was 96%, and negative predictive value was 94%. Probable overdoses were classified a median of 1 day after the event, whereas the final death certificate confirming an overdose was received by KCMEO an average of 63 days after the event. Conclusions King County MEs’ probable overdose classification provides a near–real-time indicator of fatal drug overdoses, which can guide rapid local public health responses to the drug overdose epidemic.

Role of annexin A2 and osteopontin for early diagnosis of hepatocellular carcinoma in hepatitis C virus patients

Egyptian Liver Journal ◽

10.1186/s43066-019-0004-9 ◽

2019 ◽

Vol 9 (1) ◽

Author(s):

Abd El-Fattah F. Hanno ◽

Fatma M. Abd El-Aziz ◽

Akram A. Deghady ◽

Ehab H. El-Kholy ◽

Aborawy I. Aborawy

Keyword(s):

Hepatocellular Carcinoma ◽

Chronic Hepatitis ◽

Hepatitis C ◽

Early Diagnosis ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Chronic Hepatitis C ◽

Predictive Value ◽

Annexin A2 ◽

Alpha Fetoprotein

Abstract Background Liver cancer is the fifth most common cancer and the second most frequent cause of cancer-related death globally. Early stages of hepatocellular carcinoma (0&A) can be treated with curative procedures. The aim of this work was to evaluate the role of annexin A2 and osteopontin for early diagnosis of hepatocellular carcinoma in hepatitis C virus patients. Methods The study was carried out on 80 patients classified into two groups. Group A had 40 chronic hepatitis C patients without hepatocellular carcinoma, while group B had 40 chronic hepatitis C patients with early hepatocellular carcinoma (stages; 0&A). All patients were subjected to thorough history taking, clinical examination, liver function tests, renal function tests, serum alpha-fetoprotein, serum osteopontin, and serum annexin A2. Results Serum alpha-fetoprotein was found to be statistically significantly higher in patients with the hepatocellular carcinoma group than the chronic hepatitis C group. The ROC curve for alpha-fetoprotein for detection of HCC was significant, its diagnostic performance was 0.818* (p < 0.001*), and the cutoff point for predicting the probability for HCC was 6.0 (ng/ml) with sensitivity of 77.50%, specificity of 82.50%, positive predictive value of 81.60%, negative predictive value of 78.6%, and accuracy of 80%. Serum osteopontin was found to be statistically significantly higher in patients from the hepatocellular carcinoma group than the chronic hepatitis C group. The ROC curve for osteopontin was significant, its diagnostic performance was 0.739* (p < 0.001*), the cutoff point was 13.2 (ng/ml) with sensitivity of 65.0%, specificity of 90.0%, positive predictive value of 86.70%, negative predictive value of 72.0%, and accuracy of 77.0%. Serum annexin A2 was found to be statistically significantly higher in patients from the hepatocellular carcinoma group than the chronic hepatitis C group. 
The ROC curve for annexin A2 was significant, its diagnostic performance was 0.927* (p < 0.001*), the cutoff point was 10.1(ng/ml) with sensitivity of 85.0%, specificity of 85.0%, positive predictive value of 85.0%, negative predictive value of 85.0%, and accuracy of 85.0%. Conclusions Osteopontin had better specificity but lower sensitivity than serum alpha-fetoprotein for early diagnosis of hepatocellular carcinoma. Annexin A2 had better diagnostic sensitivity and specificity than alpha-fetoprotein for early diagnosis of hepatocellular carcinoma.

POS0706 PERFORMANCES OF DIFFERENT CLASSIFICATION CRITERIA FOR SYSTEMIC LUPUS ERYTHEMATOSUS IN A SINGLE CENTER COHORT FROM TURKEY

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2021-eular.366 ◽

2021 ◽

Vol 80 (Suppl 1) ◽

pp. 602.1-603

Author(s):

E. S. Torun ◽

E. Bektaş ◽

F. Kemik ◽

M. Bektaş ◽

C. Cetin ◽

...

Keyword(s):

Systemic Lupus Erythematosus ◽

Positive Predictive Value ◽

Negative Predictive Value ◽

Antiphospholipid Syndrome ◽

Lupus Erythematosus ◽

Predictive Value ◽

Sjogren's Syndrome ◽

Classification Criteria ◽

Systemic Lupus ◽

Acr Criteria

Background: Recently developed EULAR/ACR classification criteria for systemic lupus erythematosus (SLE) have important differences compared to the 2012 Systemic Lupus International Collaborating Clinics (SLICC) SLE classification criteria and the revised 1997 American College of Rheumatology (ACR) criteria: the obligatory entry criterion of antinuclear antibody (ANA) positivity is introduced and a “weighted” approach is used [1]. Sensitivity and specificity of these three criteria have been debated and may vary in different populations and clinical settings. Objectives: We aim to compare the performances of three criteria sets/rules in a large cohort of patients and relevant diseased controls from a reference center with dedicated clinics for SLE and other autoimmune/inflammatory connective tissue diseases from Turkey. Methods: We reviewed the medical records of SLE patients and diseased controls for clinical and laboratory features relevant to all sets of criteria. Criteria sets/rules were analysed based on sensitivity, positive predictive value, specificity and negative predictive value, using clinical diagnosis with at least 6 months of follow-up as the gold standard. A subgroup analysis was performed in ANA positive patients for both SLE patients and diseased controls. SLE patients that did not fulfil 2012 SLICC criteria and 2019 EULAR/ACR criteria and diseased controls that fulfilled these criteria were evaluated. Results: A total of 392 SLE patients and 294 non-SLE diseased controls (48 undifferentiated connective tissue disease, 51 Sjögren’s syndrome, 43 idiopathic inflammatory myopathy, 50 systemic sclerosis, 52 primary antiphospholipid syndrome, 15 rheumatoid arthritis, 15 psoriatic arthritis and 20 ANCA associated vasculitis) were included into the study. One hundred and fourteen patients (16.6%) were ANA negative. Sensitivity was more than 90% for 2012 SLICC criteria and 2019 EULAR/ACR criteria and positive predictive value was more than 90% for all three criteria (Table 1). Specificity was the highest for 1997 ACR criteria. Negative predictive value was 76.9% for ACR criteria, 88.4% for SLICC criteria and 91.7% for EULAR/ACR criteria. In only ANA positive patients, sensitivity was 79.6% for 1997 ACR criteria, 92.2% for 2012 SLICC criteria and 96.1% for 2019 EULAR/ACR criteria. Specificity was 92.6% for ACR criteria, 87.8% for SLICC criteria and 85.2% for EULAR/ACR criteria. Eleven clinically diagnosed SLE patients had insufficient number of items for both 2012 SLICC and 2019 EULAR/ACR criteria. Both criteria were fulfilled by 16 diseased controls: 9 with Sjögren’s syndrome, 5 with antiphospholipid syndrome, one with dermatomyositis and one with systemic sclerosis.

Table 1. Sensitivity, positive predictive value, specificity and negative predictive value of 1997 ACR, 2012 SLICC and 2019 EULAR/ACR classification criteria

Criteria          SLE (+)            SLE (-)            Sensitivity (%)   Positive Predictive Value (%)   Specificity (%)   Negative Predictive Value (%)
1997 ACR          (+) 308, (-) 84    (+) 15, (-) 279    78.6              95.4                            94.9              76.9
2012 SLICC        (+) 357, (-) 35    (+) 26, (-) 268    91.1              93.2                            91.2              88.4
2019 EULAR/ACR    (+) 368, (-) 24    (+) 28, (-) 266    93.8              92.9                            90.5              91.7

Conclusion: In this cohort, although all three criteria have sufficient specificity, sensitivity and negative predictive value of 1997 ACR criteria are the lowest. Overall, 2019 EULAR/ACR and 2012 SLICC criteria have a comparable performance, but if only ANA positive cases and controls are analysed, the specificity of both criteria decreases to less than 90%. Some SLE patients with a clinical diagnosis lacked sufficient number of criteria. Mostly, patients with Sjögren’s syndrome or antiphospholipid syndrome are prone to misclassification by both recent criteria.

References: [1] Aringer M, Costenbader K, Daikh D, et al. 2019 European League Against Rheumatism/American College of Rheumatology classification criteria for systemic lupus erythematosus. Ann Rheum Dis 2019;78:1151-1159.

Disclosure of Interests: None declared
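The four measures in Table 1 follow directly from the 2×2 counts given there (criteria-positive and criteria-negative patients among SLE cases and diseased controls). The sketch below, with the hypothetical helper `diagnostic_metrics`, recomputes them from those counts using the standard definitions; it is an illustration, not the authors' analysis code.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, PPV, specificity and NPV (as percentages) from 2x2 counts.

    tp: SLE patients fulfilling the criteria
    fn: SLE patients not fulfilling the criteria
    fp: diseased controls fulfilling the criteria
    tn: diseased controls not fulfilling the criteria
    """
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "ppv": 100 * tp / (tp + fp),
        "specificity": 100 * tn / (tn + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Counts taken from Table 1 of the abstract, ordered as (tp, fn, fp, tn).
TABLE_1_COUNTS = {
    "1997 ACR": (308, 84, 15, 279),
    "2012 SLICC": (357, 35, 26, 268),
    "2019 EULAR/ACR": (368, 24, 28, 266),
}

if __name__ == "__main__":
    for name, counts in TABLE_1_COUNTS.items():
        metrics = diagnostic_metrics(*counts)
        summary = ", ".join(f"{key} {value:.1f}%" for key, value in metrics.items())
        print(f"{name}: {summary}")
```

The recomputed values agree with the reported percentages to within 0.1 percentage point, the small differences being attributable to rounding.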

Sentiment Analysis Techniques Applied to Raw-Text Data from a Csq-8 Questionnaire about Mindfulness in Times of COVID-19 to Improve Strategy Generation

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18126408 ◽

2021 ◽

Vol 18 (12) ◽

pp. 6408

Author(s):

Mario Jojoa Acosta ◽

Gema Castillo-Sánchez ◽

Begonya Garcia-Zapirain ◽

Isabel de la Torre Díez ◽

Manuel Franco-Martín

Keyword(s):

Health Care ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Transfer Learning ◽

Language Processing ◽

Health Care Professionals ◽

Ground Truth ◽

Relevant Information ◽

Free Text

The use of artificial intelligence in health care has grown quickly. In this sense, we present our work related to the application of Natural Language Processing techniques, as a tool to analyze the sentiment perception of users who answered two questions from the CSQ-8 questionnaires with raw Spanish free-text. Their responses are related to mindfulness, which is a novel technique used to control stress and anxiety caused by different factors in daily life. As such, we proposed an online course where this method was applied in order to improve the quality of life of health care professionals in COVID 19 pandemic times. We also carried out an evaluation of the satisfaction level of the participants involved, with a view to establishing strategies to improve future experiences. To automatically perform this task, we used Natural Language Processing (NLP) models such as swivel embedding, neural networks, and transfer learning, so as to classify the inputs into the following three categories: negative, neutral, and positive. Due to the limited amount of data available—86 registers for the first and 68 for the second—transfer learning techniques were required. The length of the text had no limit from the user’s standpoint, and our approach attained a maximum accuracy of 93.02% and 90.53%, respectively, based on ground truth labeled by three experts. Finally, we proposed a complementary analysis, using computer graphic text representation based on word frequency, to help researchers identify relevant information about the opinions with an objective approach to sentiment. The main conclusion drawn from this work is that the application of NLP techniques in small amounts of data using transfer learning is able to obtain enough accuracy in sentiment analysis and text classification stages.

Applying natural language processing and machine learning techniques to patient experience feedback: a systematic review

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2020-100262 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100262

Author(s):

Mustafa Khanbhai ◽

Patrick Anyadi ◽

Joshua Symons ◽

Kelsey Flott ◽

Ara Darzi ◽

...

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Patient Experience ◽

Language Processing ◽

Performance Metrics ◽

Free Text ◽

Patient Feedback

Objectives Unstructured free-text patient feedback contains rich information, and analysing these data manually would require a lot of personnel resources which are not available in most healthcare organisations. The aim was to undertake a systematic review of the literature on the use of natural language processing (NLP) and machine learning (ML) to process and analyse free-text patient experience data. Methods Databases were systematically searched to identify articles published between January 2000 and December 2019 examining NLP to analyse free-text patient feedback. Due to the heterogeneous nature of the studies, a narrative synthesis was deemed most appropriate. Data related to the study purpose, corpus, methodology, performance metrics and indicators of quality were recorded. Results Nineteen articles were included. The majority (80%) of studies applied language analysis techniques on patient feedback from social media sites (unsolicited) followed by structured surveys (solicited). Supervised learning was frequently used (n=9), followed by unsupervised (n=6) and semisupervised (n=3). Comments extracted from social media were analysed using an unsupervised approach, and free-text comments held within structured surveys were analysed using a supervised approach. Reported performance metrics included the precision, recall and F-measure, with support vector machine and Naïve Bayes being the best performing ML classifiers. Conclusion NLP and ML have emerged as an important tool for processing unstructured free text. Both supervised and unsupervised approaches have their role depending on the data source. With the advancement of data analysis tools, these techniques may be useful to healthcare organisations to generate insight from the volumes of unstructured free-text data.
