Re-examination of Rule-Based Methods in Deidentification of Electronic Health Records: Algorithm Development and Validation

Zhenyu Zhao; Muyun Yang; Buzhou Tang; Tiejun Zhao

doi:10.2196/17622

Re-examination of Rule-Based Methods in Deidentification of Electronic Health Records: Algorithm Development and Validation

JMIR Medical Informatics ◽

10.2196/17622 ◽

2020 ◽

Vol 8 (4) ◽

pp. e17622

Author(s):

Zhenyu Zhao ◽

Muyun Yang ◽

Buzhou Tang ◽

Tiejun Zhao

Keyword(s):

Ensemble Learning ◽

High Performance ◽

Data Set ◽

Rule Based ◽

Rule Composition ◽

Open Issue ◽

Rule System ◽

Clinical Records ◽

Development And Validation ◽

Effective Contribution

Background: Deidentification of clinical records is a critical step before their publication. This is usually treated as a type of sequence labeling task, and ensemble learning is one of the best performing solutions. Under the framework of multi-learner ensemble, the significance of a candidate rule-based learner remains an open issue.

Objective: The aim of this study is to investigate whether a rule-based learner is useful in a hybrid deidentification system and to offer suggestions on how to build and integrate a rule-based learner.

Methods: We chose a data-driven rule learner named transformation-based error-driven learning (TBED) and integrated it into the best performing hybrid system in this task.

Results: On the popular Informatics for Integrating Biology and the Bedside (i2b2) deidentification data set, experiments showed that TBED can offer high performance with its generated rules. Integrating the rule-based model into an ensemble framework reached an F1 score of 96.76%, the best performance reported in the community.

Conclusions: We demonstrated that the rule-based method offers an effective contribution to the current ensemble learning approach for the deidentification of clinical records. Such a rule system can be learned automatically by TBED, avoiding the high cost and low reliability of manual rule composition. In particular, boosting the ensemble model with rules yielded the best reported performance for the deidentification of clinical records.
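The abstract above describes rules that are learned automatically by transformation-based error-driven learning. As a rough illustration of that general idea, and not of the authors' system, the sketch below starts from a baseline labeling and greedily adds whichever rule removes the most remaining errors. The single rule template ("relabel a token when the previous token is a given word"), the label names, the toy sentence, and all function names are assumptions made for this example.

```python
def apply_rule(rule, tokens, labels):
    """Apply one rule: change label `old` to `new` where the previous token is `prev_word`."""
    old, new, prev_word = rule
    out = list(labels)
    for i in range(1, len(tokens)):
        if labels[i] == old and tokens[i - 1] == prev_word:
            out[i] = new
    return out


def count_errors(labels, gold):
    """Number of positions where the current labels disagree with the gold labels."""
    return sum(1 for l, g in zip(labels, gold) if l != g)


def learn_rules(tokens, gold, baseline, max_rules=10):
    """Greedy error-driven rule learning over a single labeled sequence."""
    current = list(baseline)
    rules = []
    for _ in range(max_rules):
        # Propose one candidate rule per remaining error (hypothetical template).
        candidates = {
            (current[i], gold[i], tokens[i - 1])
            for i in range(1, len(tokens))
            if current[i] != gold[i]
        }
        # Pick the candidate that leaves the fewest errors; ties break on the rule tuple.
        best = min(
            candidates,
            key=lambda r: (count_errors(apply_rule(r, tokens, current), gold), r),
            default=None,
        )
        if best is None:
            break
        updated = apply_rule(best, tokens, current)
        # Stop when the best rule no longer reduces the error count.
        if count_errors(updated, gold) >= count_errors(current, gold):
            break
        current = updated
        rules.append(best)
    return rules, current


# Toy data, invented for illustration only.
tokens = ["Dr", "Smith", "saw", "Mr", "Jones", "on", "Monday"]
gold = ["O", "NAME", "O", "O", "NAME", "O", "DATE"]
rules, final = learn_rules(tokens, gold, ["O"] * len(tokens))
for old, new, prev_word in rules:
    print(f"{old} -> {new} when the previous token is {prev_word!r}")
```

The learned rules form an ordered list that can be replayed on new text, which is what makes such a learner easy to inspect and to combine with other models in an ensemble.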


Re-examination of Rule-Based Methods in Deidentification of Electronic Health Records: Algorithm Development and Validation (Preprint)

10.2196/preprints.17622 ◽

2019 ◽

Author(s):

Zhenyu Zhao ◽

Muyun Yang ◽

Buzhou Tang ◽

Tiejun Zhao

Keyword(s):

Ensemble Learning ◽

High Performance ◽

Data Set ◽

Rule Based ◽

Rule Composition ◽

Open Issue ◽

Rule System ◽

Clinical Records ◽

Development And Validation ◽

Effective Contribution



Development and validation of a high-performance thin-layer chromatography method for the determination of artesunate and amodiaquine in tablet formulations

JPC - Journal of Planar Chromatography - Modern TLC ◽

10.1556/1006.2017.30.4.11 ◽

2017 ◽

Vol 30 (4) ◽

pp. 307-312

Author(s):

Yonah H. Mwalwisi ◽

Seraphina C. Omolo ◽

Ludwig Hoellein ◽

Danstan H. Shewiyo ◽

Ulrike Holzgrabe ◽

...

Keyword(s):

Thin Layer ◽

Thin Layer Chromatography ◽

High Performance ◽

Chromatography Method ◽

Thin Layer Chromatography Method ◽

Tablet Formulations ◽

Development And Validation ◽

Layer Chromatography


Development and validation of a high-performance thin-layer chromatographic method for the determination of biomarker β-amyrin in the leaves of different Ficus species

JPC - Journal of Planar Chromatography - Modern TLC ◽

10.1556/1006.2015.28.3.5 ◽

2015 ◽

Vol 28 (3) ◽

pp. 223-228 ◽

Cited By ~ 1

Author(s):

Omer A. Basudan ◽

Perwez Alam ◽

Nasir A. Siddiqui ◽

Mohamed F. Alajmi ◽

Adnan J. Alrehaily ◽

...

Keyword(s):

Thin Layer ◽

High Performance ◽

Chromatographic Method ◽

Development And Validation


Development and Validation of a High-Performance Thin-Layer Chromatography Method for the Quantification of Rutin in the Fruit Pulp of Benincasa hispida (Thunb.) Cogniaux

JPC - Journal of Planar Chromatography - Modern TLC ◽

10.1556/1006.2019.32.5.4 ◽

2019 ◽

Vol 32 (5) ◽

pp. 371-377 ◽

Cited By ~ 4

Author(s):

Anshul Shakya ◽

Neelutpal Gogoi ◽

Sushil Kumar Chaudhary ◽

Hans Raj Bhat ◽

Surajit Kumar Ghosh

Keyword(s):

Thin Layer ◽

Thin Layer Chromatography ◽

High Performance ◽

Fruit Pulp ◽

Chromatography Method ◽

Thin Layer Chromatography Method ◽

Benincasa Hispida ◽

Development And Validation ◽

Layer Chromatography


Development and validation of fingerprints of Turnera diffusa extracts obtained by use of high-performance liquid chromatography with diode array detection and chemometric methods

Acta Chromatographica ◽

10.1556/achrom.21.2009.2.3 ◽

2009 ◽

Vol 21 (2) ◽

pp. 217-235 ◽

Cited By ~ 5

Author(s):

A. Garza-Juárez ◽

N. Waksman-De-Torres ◽

R. Ramírez-Durón ◽

M. L. Salazar Cavazos

Keyword(s):

High Performance Liquid Chromatography ◽

Liquid Chromatography ◽

High Performance ◽

Diode Array ◽

Diode Array Detection ◽

Chemometric Methods ◽

Array Detection ◽

Development And Validation


Development and validation of high-performance thin-layer chromatographic method for the simultaneous determination of rifampicin, isoniazid, and pyrazinamide in a fixed dosage combination tablet

JPC - Journal of Planar Chromatography - Modern TLC ◽

10.1556/jpc.27.2014.5.11 ◽

2014 ◽

Vol 27 (5) ◽

pp. 392-397 ◽

Cited By ~ 1

Author(s):

Kagisha Védaste ◽

Kayitare Egide ◽

Kayumba Claver ◽

Eliangiringa Kaale

Keyword(s):

Thin Layer ◽

Simultaneous Determination ◽

High Performance ◽

Chromatographic Method ◽

Combination Tablet ◽

Development And Validation


Development and Validation of High-Performance Thin-Layer Chromatography Method for Estimation of Teneligliptin in Bulk and in Pharmaceutical Formulation

Archives of Natural and Medicinal Chemistry ◽

10.29011/2577-0195.100008 ◽

2017 ◽

Vol 2 (3) ◽

Keyword(s):

Thin Layer ◽

Thin Layer Chromatography ◽

High Performance ◽

Pharmaceutical Formulation ◽

Chromatography Method ◽

Thin Layer Chromatography Method ◽

Development And Validation ◽

Layer Chromatography


Hugoniot Data and Equation of State Parameters for an Ultra-High Performance Concrete

Journal of Dynamic Behavior of Materials ◽

10.1007/s40870-021-00315-6 ◽

2021 ◽

Author(s):

C. Sauer ◽

F. Bagusat ◽

M.-L. Ruiz-Ripoll ◽

C. Roller ◽

M. Sauer ◽

...

Keyword(s):

Equation Of State ◽

High Performance ◽

High Performance Concrete ◽

Applied Pressure ◽

High Strength ◽

Parameter Study ◽

Ultra High Performance Concrete ◽

Data Set ◽

Shock Response ◽

Particle Velocities

Abstract: This work aims at the characterization of a modern concrete material. For this purpose, we perform two experimental series of inverse planar plate impact (PPI) tests with the ultra-high performance concrete B4Q, using two different witness plate materials. Hugoniot data in the range of particle velocities from 180 to 840 m/s and stresses from 1.1 to 7.5 GPa is derived from both series. Within the experimental accuracy, they can be seen as one consistent data set. Moreover, we conduct corresponding numerical simulations and find a reasonably good agreement between simulated and experimentally obtained curves. From the simulated curves, we derive numerical Hugoniot results that serve as a homogenized, mean shock response of B4Q and add further consistency to the data set. Additionally, the comparison of simulated and experimentally determined results allows us to identify experimental outliers. Furthermore, we perform a parameter study which shows that a significant influence of the applied pressure dependent strength model on the derived equation of state (EOS) parameters is unlikely. In order to compare the current results to our own partially reevaluated previous work and selected recent results from literature, we use simulations to numerically extrapolate the Hugoniot results. Considering their inhomogeneous nature, a consistent picture emerges for the shock response of the discussed concrete and high-strength mortar materials. Hugoniot results from this and earlier work are presented for further comparisons. In addition, a full parameter set for B4Q, including validated EOS parameters, is provided for the application in simulations of impact and blast scenarios.


An approach for document retrieval using cluster-based inverted indexing

Journal of Information Science ◽

10.1177/01655515211018401 ◽

2021 ◽

pp. 016555152110184

Author(s):

Gunjan Chandwani ◽

Anil Ahlawat ◽

Gaurav Dubey

Keyword(s):

High Performance ◽

Clustering Algorithm ◽

Pearson Correlation ◽

Relevant Information ◽

Document Retrieval ◽

Bhattacharyya Distance ◽

Data Set ◽

Query Matching ◽

Inverted Indexing ◽

Query Optimisation

Document retrieval plays an important role in knowledge management because it helps us discover relevant information in existing data. This article proposes a cluster-based inverted indexing algorithm for document retrieval. First, pre-processing removes unnecessary and redundant words from the documents. Then, the documents are indexed by the cluster-based inverted indexing algorithm, which integrates the piecewise fuzzy C-means (piFCM) clustering algorithm with inverted indexing. Once the documents are indexed, user queries are matched using the Bhattacharyya distance. Finally, the query is optimised using the Pearson correlation coefficient, and the relevant documents are retrieved. The performance of the proposed algorithm is analysed on the WebKB and Twenty Newsgroups data sets. The analysis shows that the proposed algorithm offers high performance, with a precision of 1, recall of 0.70 and F-measure of 0.8235. The proposed document retrieval system retrieves the most relevant documents and speeds up the storing and retrieval of information.
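Two quantities in the abstract above are easy to make concrete: the Bhattacharyya distance used for query matching, and the F-measure implied by the reported precision and recall. The sketch below is only an illustration under stated assumptions; it treats a query and a document as normalized term-frequency distributions stored in dictionaries, which may differ from the representation the authors use, and the function names and toy distributions are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions given as dicts.

    Both inputs are assumed to be normalized (values sum to 1).
    Returns infinity when the distributions share no terms.
    """
    coefficient = sum(
        math.sqrt(p.get(term, 0.0) * q.get(term, 0.0)) for term in set(p) | set(q)
    )
    return math.inf if coefficient == 0 else -math.log(coefficient)


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-measure)."""
    return 2 * precision * recall / (precision + recall)


# Toy term distributions, invented for illustration only.
query = {"index": 0.5, "cluster": 0.5}
doc_close = {"index": 0.5, "cluster": 0.5}
doc_far = {"concrete": 1.0}

print(bhattacharyya_distance(query, doc_close))
print(bhattacharyya_distance(query, doc_far))
print(round(f_measure(1.0, 0.70), 4))
```

A smaller distance means the query and document distributions overlap more, so ranking documents by ascending distance is one plausible way to use it for matching. The F-measure computed from the reported precision of 1 and recall of 0.70 rounds to the reported 0.8235.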


Dental Age Estimation: Development and Validation of a Reference Data Set for Kuwaiti Children, Adolescents, and Young Adults

Archives of Oral Biology ◽

10.1016/j.archoralbio.2021.105130 ◽

2021 ◽

pp. 105130

Author(s):

Anfal Karimi ◽

Muawia A. Qudeimat ◽

Victoria S. Lucas ◽

Graham Roberts

Keyword(s):

Young Adults ◽

Age Estimation ◽

Reference Data ◽

Adolescents And Young Adults ◽

Dental Age ◽

Dental Age Estimation ◽

Data Set ◽

Kuwaiti Children ◽

Development And Validation
