On the Potential of Taxonomic Graphs to Improve Applicability and Performance for the Classification of Biomedical Patents

Kai Frerich; Mark Bukowski; Sandra Geisler; Robert Farkas

doi:10.3390/app11020690

Applied Sciences ◽

10.3390/app11020690 ◽

2021 ◽

Vol 11 (2) ◽

pp. 690

Author(s):

Kai Frerich ◽

Mark Bukowski ◽

Sandra Geisler ◽

Robert Farkas

Keyword(s):

Technology Management ◽

Classification Performance ◽

Ensemble Classification ◽

Combination Rules ◽

Patent Classification ◽

Tree Graphs ◽

Fusion Methods ◽

And Performance ◽

Artificial Neural Network (ANN)

A core task in technology management in biomedical engineering and beyond is the classification of patents into domain-specific categories. This task is increasingly automated by machine learning, with the fuzzy language of patents causing particular problems. Striving for higher classification performance, researchers have developed increasingly complex models based not only on text but also on a wealth of distinct (meta)data and methods. However, this makes it difficult to access and integrate data and to fuse distinct predictions. Although the established Cooperative Patent Classification (CPC) offers a plethora of information, it is rarely used in automated patent categorization. Thus, we combine taxonomic and textual information into an ensemble classification system and compare stacking and fixed combination rules as fusion methods. Various classifiers are trained on title/abstract and on both the CPC and IPC (International Patent Classification) assignments of 1230 patents covering six categories of future biomedical innovation. The taxonomies are modeled as tree graphs, parsed, and transformed by Dissimilarity Space Embedding (DSE) into real-valued vectors. The classifier ensemble tops the basic performance by nearly 10 points, reaching F1 = 78.7% when stacked with a feed-forward Artificial Neural Network (ANN). Taxonomic base classifiers perform nearly as well as the text-based learners. Moreover, an ensemble of only CPC and IPC learners reaches F1 = 71.2% as a fully language-independent and straightforward approach that relies on established algorithms and readily available integrated data, enabling new possibilities for technology management.
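The fixed combination rules mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule names (`mean`, `product`, `max`), the function names, and the three-class toy probabilities are assumptions chosen for illustration, and the abstract does not state which rules were actually evaluated.

```python
from math import prod


def combine(probabilities, rule="mean"):
    """Fuse class-probability vectors from several base classifiers.

    probabilities: list of lists, one probability vector per classifier.
    Returns a fused vector renormalized to sum to 1.
    """
    n_classes = len(probabilities[0])
    # Regroup so that each column holds all classifiers' scores for one class.
    columns = [[p[c] for p in probabilities] for c in range(n_classes)]
    if rule == "mean":
        fused = [sum(col) / len(col) for col in columns]
    elif rule == "product":
        fused = [prod(col) for col in columns]
    elif rule == "max":
        fused = [max(col) for col in columns]
    else:
        raise ValueError(f"unknown rule: {rule}")
    total = sum(fused)
    return [value / total for value in fused]


def predict(probabilities, rule="mean"):
    """Return the index of the class with the highest fused score."""
    fused = combine(probabilities, rule)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of three base learners (text, CPC, IPC) for one patent.
text_p = [0.6, 0.3, 0.1]
cpc_p = [0.2, 0.7, 0.1]
ipc_p = [0.3, 0.6, 0.1]
fused_label = predict([text_p, cpc_p, ipc_p], rule="mean")
```

Fixed rules like these need no extra training, whereas stacking (the other fusion method named in the abstract) fits a second-level learner on the base classifiers' outputs.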

Chemometrics assisted method for classification of mango juice by FTIR spectroscopic data

Bangladesh Journal of Scientific and Industrial Research ◽

10.3329/bjsir.v52i2.32909 ◽

2017 ◽

Vol 52 (2) ◽

pp. 73-80

Author(s):

MN Uddin ◽

AK Majumder ◽

S Ahamed ◽

BK Saha ◽

B Mumtaz

Keyword(s):

Spectral Data ◽

Spectroscopic Data ◽

Classification Performance ◽

Training Data ◽

Statistical Techniques ◽

Cheap Method ◽

Artificial Neural Network (ANN) ◽

Mango Juice ◽

Mango Juices

Commercial mango juices in Bangladesh are adulterated with heavy use of simple sugars, which poses a serious threat to public health. The present study aims to develop a chemometrics-assisted method that uses FTIR spectral data to classify commercial mango juices as adulterated or not with excessive glucose, fructose and sucrose. Two statistical techniques, Artificial Neural Network (ANN) and Partial Least Squares-Discriminant Analysis (PLS-DA), were assessed for their classification efficiency. Before calibration, spectral data were de-noised with Savitzky-Golay (S-G) filtering. Concentrations of simple sugars were classified as within or over certain limits. Spectral values of 64 synthetic mixture solutions were used as training data to develop the models, and 15 spectra of real mango juice were used as test data. PLS-DA showed better classification performance than ANN. From these findings, we developed a method for classifying mango juices adulterated with heavy use of simple sugars (glucose, fructose and sucrose). It is a simple and cheap method for consumers, manufacturers and quality-regulating authorities to classify mango juices as adulterated or safe. Bangladesh J. Sci. Ind. Res. 52(2), 73-80, 2017
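The Savitzky-Golay de-noising step can be sketched as follows. The abstract does not give the window length or polynomial order, so the 5-point quadratic window used here is an assumption, as is leaving the two edge points on each side unsmoothed; `savgol_smooth` is a hypothetical helper name.

```python
# Standard Savitzky-Golay smoothing weights for a 5-point window and a
# quadratic (or cubic) local polynomial: (-3, 12, 17, 12, -3) / 35.
SG_COEFFS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savgol_smooth(signal):
    """Smooth a 1-D sequence with a 5-point Savitzky-Golay filter.

    Interior points are replaced by the value of a least-squares quadratic
    fitted to the surrounding window. Edge points are copied unchanged
    (an assumption made to keep the sketch short).
    """
    half = len(SG_COEFFS) // 2
    smoothed = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half : i + half + 1]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window))
    return smoothed


# A noise-free quadratic passes through unchanged, while an isolated
# spike is damped.
quadratic = [x * x for x in range(7)]
spike = [0, 0, 0, 35, 0, 0, 0]
smoothed_quadratic = savgol_smooth(quadratic)
smoothed_spike = savgol_smooth(spike)
```

Because the filter fits a local polynomial instead of averaging, it tends to preserve peak shapes better than a moving average, which matters for spectral bands.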

Protective clothing against chemicals. Test methods and performance classification of chemical protective clothing materials, seams, joins and assemblages

10.3403/30338921 ◽

2018 ◽

Keyword(s):

Protective Clothing ◽

Test Methods ◽

Chemical Protective Clothing ◽

And Performance

Strategy and Performance across Size Classification of Small Firms in the United States

International Business Review ◽

10.21739/ibr.2004.12.8.2.207 ◽

2004 ◽

Vol 8 (2) ◽

pp. 207

Author(s):

Jin Han Kim

Keyword(s):

United States ◽

Small Firms ◽

The United States ◽

Size Classification ◽

And Performance

An Optimized Approach for Breast Cancer Classification for Histopathological Images Based on Hybrid Feature Set

Current Medical Imaging (formerly Current Medical Imaging Reviews) ◽

10.2174/1573405616666200423085826 ◽

2020 ◽

Vol 16 ◽

Cited By ~ 1

Author(s):

Inzamam Mashood Nasir ◽

Muhammad Rashid ◽

Jamal Hussain Shah ◽

Muhammad Sharif ◽

Muhammad Yahiya Haider Awan ◽

...

Keyword(s):

Breast Cancer ◽

Cancer Detection ◽

State Of The Art ◽

Hybrid Approach ◽

Classification Performance ◽

Diagnose Breast Cancer ◽

Histopathological Images ◽

And Performance ◽

Learned Features ◽

Intelligent Healthcare

Background: Breast cancer is considered the most perilous disease among females worldwide, and the rate of new cases is increasing yearly. Many researchers have proposed efficient algorithms to diagnose breast cancer at early stages, which have increased efficiency and performance by utilizing the learned features of gold-standard histopathological images. Objective: Most of these systems have used either traditional handcrafted features or deep features, which carry a lot of noise and redundancy that ultimately decrease the performance of the system. Methods: A hybrid approach is proposed that fuses and optimizes the properties of handcrafted and deep features to classify breast cancer images. HOG and LBP features are serially fused with the pretrained models VGG19 and InceptionV3. PCR and ICR are used to evaluate the classification performance of the proposed method. Results: The method concentrates on histopathological images to classify breast cancer. The performance is compared with state-of-the-art techniques, and an overall patient-level accuracy of 97.2% and an image-level accuracy of 96.7% are recorded. Conclusion: The proposed hybrid method achieves the best performance compared to previous methods, and it can be used for intelligent healthcare systems and early breast cancer detection.

Breast Cancer Detection and Classification using Traditional Computer Vision Techniques: A Comprehensive Review

Current Medical Imaging (formerly Current Medical Imaging Reviews) ◽

10.2174/1573405616666200406110547 ◽

2020 ◽

Vol 16 ◽

Cited By ~ 3

Author(s):

Saliha Zahoor ◽

Ikram Ullah Lali ◽

Muhammad Attique Khan ◽

Kashif Javed ◽

Waqar Mehmood

Keyword(s):

Breast Cancer ◽

Computer Aided Diagnosis ◽

Diagnose Breast Cancer ◽

Initial Stage ◽

Computer Aided ◽

Diagnosis And Classification ◽

And Performance ◽

Tools And Techniques ◽

Aided Diagnosis

Breast cancer is a common and dangerous disease among women, and many women worldwide die from it. However, diagnosing breast cancer at an initial stage can save women's lives. Several techniques and methods exist to diagnose cancer in breast tissue. This paper presents image processing, machine learning and deep learning methods and techniques for diagnosing breast cancer. This work will help in adopting better choices and reliable methods to diagnose breast cancer at an initial stage and so save women's lives. To detect breast masses, microcalcifications and malignant cells, different techniques are used in the phases of Computer-Aided Diagnosis (CAD) systems, such as preprocessing, segmentation, feature extraction, and classification. We report a detailed analysis of different techniques and methods with their usage and performance measurement. From the reported results, it is concluded that, to save women's lives, it is essential to improve the methods for diagnosing breast cancer at an initial stage by improving the results of CAD systems. Furthermore, the segmentation and classification phases remain challenging for researchers seeking to diagnose breast cancer accurately. Therefore, more advanced tools and techniques are still essential for the accurate diagnosis and classification of breast cancer.

A Study on the Auxiliary Diagnosis of Thyroid Disease Images Based on Multiple Dimensional Deep Learning Algorithms

Current Medical Imaging (formerly Current Medical Imaging Reviews) ◽

10.2174/1573405615666190115155223 ◽

2020 ◽

Vol 16 (3) ◽

pp. 199-205

Author(s):

Yuejun Liu ◽

Yifei Xu ◽

Xiangzheng Meng ◽

Xuguang Wang ◽

Tianxu Bai

Keyword(s):

Deep Learning ◽

Learning Algorithms ◽

Region Of Interest ◽

Classification Performance ◽

Thyroid Diseases ◽

Great Success ◽

Learning Models ◽

Good Classification Performance ◽

SPECT Images

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multiple dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multiple dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images. The performance of different deep learning models is evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the number of data samples is expanded. Four deep learning models (CNN, Inception, VGG16 and RNN) are evaluated. Results: The deep learning based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model performs best, with an accuracy of 96.2% and an AUC of 99.6%. In particular, the VGG16 model with a changing learning rate works best. Conclusion: The four deep learning models (standard CNN, Inception, VGG16 and RNN) are efficient for the classification of thyroid diseases from SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.

Road Characteristics Detection Based on Joint Convolutional Neural Networks with Adaptive Squares

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10060377 ◽

2021 ◽

Vol 10 (6) ◽

pp. 377

Author(s):

Chiao-Ling Kuo ◽

Ming-Hua Tsai

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Autonomous Vehicles ◽

Detection Accuracy ◽

Geospatial Information ◽

Combination Rules ◽

And Performance ◽

Road Characteristics ◽

Machine Readable ◽

Background Image

The importance of road characteristics has been highlighted, as road characteristics are fundamental structures established to support many transportation-relevant services. However, there is still considerable room for improvement in the types of road characteristics detected and in detection performance. Taking advantage of geographically tiled maps with high update rates, remarkable accessibility, and increasing availability, this paper proposes a novel, simple deep-learning-based approach, namely joint convolutional neural networks (CNNs) adopting adaptive squares with combination rules, to detect road characteristics from roadmap tiles. The proposed joint CNNs are responsible for foreground and background image classification and for classifying various types of road characteristics from the previously identified foreground images, which raises detection accuracy. The adaptive squares with combination rules help focus efficiently on road characteristics, augmenting the ability to detect them and providing optimal detection results. Five types of road characteristics (crossroads, T-junctions, Y-junctions, corners, and curves) are exploited, and experimental results demonstrate successful outcomes with outstanding performance in practice. The information on the exploited road characteristics, with location and type, is thus converted from human-readable to machine-readable form. The results will benefit many applications such as feature point reminders, road condition reports, or alert detection for users, drivers, and even autonomous vehicles. We believe this approach will also open a new path for object detection and geospatial information extraction from valuable map tiles.

Natural Disasters Intensity Analysis and Classification Based on Multispectral Images Using Multi-Layered Deep Convolutional Neural Network

Sensors ◽

10.3390/s21082648 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2648

Author(s):

Muhammad Aamir ◽

Tariq Ali ◽

Muhammad Irfan ◽

Ahmad Shaf ◽

Muhammad Zeeshan Azam ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Natural Disasters ◽

Deep Convolutional Neural Network ◽

Multispectral Images ◽

Learning Techniques ◽

Proposed Model ◽

Disaster Intensity ◽

And Performance

Natural disasters not only disturb the human ecological system but also destroy the properties and critical infrastructures of human societies and even lead to permanent change in the ecosystem. Disaster can be caused by naturally occurring events such as earthquakes, cyclones, floods, and wildfires. Many deep learning techniques have been applied by various researchers to detect and classify natural disasters to overcome losses in ecosystems, but detection of natural disasters still faces issues due to the complex and imbalanced structures of images. To tackle this problem, we propose a multilayered deep convolutional neural network. The proposed model works in two blocks: Block-I convolutional neural network (B-I CNN), for detection and occurrence of disasters, and Block-II convolutional neural network (B-II CNN), for classification of natural disaster intensity types with different filters and parameters. The model is tested on 4428 natural images and performance is calculated and expressed as different statistical values: sensitivity (SE), 97.54%; specificity (SP), 98.22%; accuracy rate (AR), 99.92%; precision (PRE), 97.79%; and F1-score (F1), 97.97%. The overall accuracy for the whole model is 99.92%, which is competitive and comparable with state-of-the-art algorithms.
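The statistical values listed in this abstract follow from confusion-matrix counts. The sketch below shows the standard binary definitions only; the counts are hypothetical and are not taken from the paper, which may also aggregate its metrics differently (for example across several classes).

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from raw counts.

    tp/fp/tn/fn: true positives, false positives, true negatives,
    false negatives. Returns a dict of values in the range [0, 1].
    """
    sensitivity = tp / (tp + fn)        # recall, true positive rate
    specificity = tn / (tn + fp)        # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)          # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

F1 is the harmonic mean of precision and sensitivity, so it is dominated by the lower of the two.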

Automated classification of clinical trial eligibility criteria text based on ensemble learning and metric learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-021-01492-z ◽

2021 ◽

Vol 21 (S2) ◽

Author(s):

Kun Zeng ◽

Yibin Xu ◽

Ge Lin ◽

Likeng Liang ◽

Tianyong Hao

Keyword(s):

Clinical Trial ◽

Ensemble Learning ◽

Metric Learning ◽

Classification Performance ◽

Ensemble Model ◽

Automated Classification ◽

Eligibility Criteria ◽

Data Imbalance ◽

The Impact

Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features become more distinguishable. Soft Voting is applied to obtain the final classification of the ensemble model. The dataset is from standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement. Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that the classification performance was improved significantly by our ensemble model. In addition, metric learning was able to improve word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
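How focal loss counteracts data imbalance can be seen in a minimal single-example sketch. The focusing parameter `gamma=2.0` is a commonly used value and is an assumption here, since the abstract does not report the setting; the function name and the toy probabilities are likewise illustrative.

```python
import math


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one example.

    probs: predicted class probabilities (should sum to 1).
    target: index of the true class.
    gamma: focusing parameter; gamma = 0 gives plain cross-entropy.
    """
    p_t = probs[target]
    # The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    # examples, so hard (often minority-class) examples dominate training.
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


easy = focal_loss([0.9, 0.1], target=0)   # confident and correct
hard = focal_loss([0.1, 0.9], target=0)   # confident and wrong
plain_ce = focal_loss([0.9, 0.1], target=0, gamma=0.0)
```

With `gamma=2.0`, the easy example's loss is scaled by 0.01 relative to cross-entropy, while the hard example keeps 81% of its cross-entropy loss.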

FRI0524 THE ACR’S RHEUMATOLOGY INFORMATICS SYSTEM FOR EFFECTIVENESS (RISE) REGISTRY SUPPORTS SMALL RHEUMATOLOGY PRACTICES FOR FEDERAL QUALITY REPORTING PROGRAM

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.6220 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 861-862

Author(s):

Z. Izadi ◽

T. Johansson ◽

J. Li ◽

G. Schmajuk ◽

J. Yazdany

Keyword(s):

Quality Measures ◽

Quality Reporting ◽

Practice Characteristics ◽

Informatics System ◽

Group Practices ◽

Status Assessment ◽

And Performance ◽

Reporting Programs

Background: The Rheumatology Informatics System for Effectiveness (RISE) Registry was developed by the ACR to help rheumatologists improve quality of care and meet federal reporting requirements. In the current quality program administered by the U.S. Centers for Medicare and Medicaid Services, rheumatologists are scored on quality measures, and performance is tied to financial incentives or penalties. Rheumatoid arthritis (RA)-specific quality measures can only be submitted through RISE to federal programs. Objectives: This study used data from the RISE registry to investigate rheumatologists’ federal reporting patterns on five RA-specific quality measures in 2018 and investigated the effect of practice characteristics on federal reporting of these measures. Methods: We analyzed data on all rheumatologists who continuously participated in RISE between Jan 2017 and Dec 2018 and who had patients eligible for at least one RA-specific measure. Five measures were examined: tuberculosis screening before biologic use, disease activity assessment, functional status assessment, assessment and classification of disease prognosis, and glucocorticoid management. We assessed whether or not rheumatologists reported specific quality measures via RISE. We investigated the effect of practice characteristics (practice structure; number of providers; geographic region) on the likelihood of reporting using adjusted analyses that controlled for measure performance (performance in 2018; change in performance from 2017; and performance relative to national average performance). Analyses accounted for clustering by practice. Results: Data from 799 providers from 207 practices managing 213,757 RA patients were examined. The most common practice structure was a single-specialty group practice (53%), followed by solo (28%) and multi-specialty group practice (12%). Most providers (73%) had patients eligible for all five RA quality measures. Federal reporting of quality measures through RISE varied significantly by provider, ranging from no reporting (60%) to reporting all eligible RA measures (12.2%). Reporting through RISE also varied significantly by quality measure and was highest for functional status assessment (36%) and lowest for assessment and classification of disease prognosis (20%). Small practices (1-4 providers) were more likely to report all eligible RA quality measures compared to larger practices (21%, 6%; p<0.001). In adjusted analyses, solo practices were more likely than single-specialty group practices to report RA measures (42%, 31%; p<0.027), while multi-specialty group practices were less likely (18%, 31%; p<0.001). Additionally, higher performance in 2018 and performance ≥ the national average performance were associated with federal reporting of the measures through RISE (p≤0.004). Conclusion: Forty percent of U.S. rheumatologists participating in RISE used the registry for federal quality reporting. Physicians using RISE for reporting were disproportionately in small and solo practices, suggesting that the registry is fulfilling an important role in helping these practices participate in national quality reporting programs. Supporting small practices is especially important given the workforce shortages in rheumatology. We observed that practices reporting through RISE had higher measure performance than other participating practices, which suggests that the registry is facilitating quality improvement. Studies are ongoing to further investigate the impact of federal quality reporting programs and RISE participation on the quality of rheumatologic care in the United States. Disclaimer: This data was supported by the ACR’s RISE Registry. However, the views expressed represent those of the authors, not necessarily those of the ACR. Disclosure of Interests: Zara Izadi: None declared, Tracy Johansson: None declared, Jing Li: None declared, Gabriela Schmajuk Grant/research support from: Pfizer, Jinoos Yazdany Grant/research support from: Pfizer