Machine-learning dimensionality reduction for multi-objective design of photonic devices

Power Allocation for Multi-user Cooperation: a Multi-Objective and Machine Learning Approach

2021 IEEE 93rd Vehicular Technology Conference (VTC2021-Spring) ◽

10.1109/vtc2021-spring51267.2021.9448892 ◽

2021 ◽

Author(s):

Kezhong Jin ◽

Hosung Park ◽

Zhenzhou Tang

Keyword(s):

Machine Learning ◽

Power Allocation ◽

Learning Approach ◽

User Cooperation ◽

Multi Objective ◽

Machine Learning Approach

FRI0585 HIGH-THROUGHPUT METHODOLOGY FOR EMR-BASED IDENTIFICATION OF CLINICAL SUB-PHENOTYPES IN COMPLEX PATIENT POPULATIONS

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.3489 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 897.2-897

Author(s):

M. Maurits ◽

T. Huizinga ◽

M. Reinders ◽

S. Raychaudhuri ◽

E. Karlson ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Dimensionality Reduction ◽

High Throughput ◽

Brain Cancer ◽

Machine Learning Techniques ◽

Summary Statistics ◽

Medical Problems ◽

Learning Techniques ◽

Icd Codes

Background: Heterogeneity in disease populations complicates discovery of risk factors. To identify risk factors for subpopulations of diseases, we need analytical methods that can deal with unidentified disease subgroups.

Objectives: Inspired by successful approaches from the Big Data field, we developed a high-throughput approach to identify subpopulations within patients with heterogeneous, complex diseases using the wealth of information available in Electronic Medical Records (EMRs).

Methods: We extracted longitudinal healthcare-interaction records coded by 1,853 PheCodes [1] of the 64,819 patients from the Boston's Partners-Biobank. Through dimensionality reduction using t-SNE [2] we created a 2D embedding of 32,424 of these patients (set A). We then identified distinct clusters post-t-SNE using DBscan [3] and visualized the relative importance of individual PheCodes within them using specialized spectrographs. We replicated this procedure in the remaining 32,395 records (set B).

Results: Summary statistics of both sets were comparable (Table 1).

Table 1. Summary statistics of the total Partners Biobank dataset and the 2 partitions.

                    Set A        Set B        Total
Entries             12,200,311   12,177,131   24,377,442
Patients            32,424       32,395       64,819
Patient years       369,546.33   368,597.92   738,144.2
Unique ICD codes    25,056       24,953       26,305
Unique PheCodes     1,851        1,853        1,853

We found 284 clusters in set A and 295 in set B, of which 63.4% from set A could be mapped to a cluster in set B with a median (range) correlation of 0.24 (0.03 – 0.58). Clusters represented similar yet distinct clinical phenotypes; e.g. patients diagnosed with "other headache syndrome" were separated into four distinct clusters characterized by migraines, neurofibromatosis, epilepsy or brain cancer, all resulting in patients presenting with headaches (Fig. 1 & 2). Though EMR databases tend to be noisy, our method was also able to differentiate misclassification from true cases; SLE patients with RA codes clustered separately from true RA cases.

Figure 1. Two dimensional representation of Set A generated using dimensionality reduction (tSNE) and clustering (DBScan).

Figure 2. Phenotype Spectrographs (PheSpecs) of four clusters characterized by "Other headache syndromes", driven by codes relating to migraine, epilepsy, neurofibromatosis or brain cancer.

Conclusion: We have shown that EMR data can be used to identify and visualize latent structure in patient categorizations, using an approach based on dimension reduction and clustering machine learning techniques. Our method can identify misclassified patients as well as separate patients with similar problems into subsets with different associated medical problems. Our approach adds a new and powerful tool to aid in the discovery of novel risk factors in complex, heterogeneous diseases.

References:
[1] Denny, J.C. et al. Bioinformatics (2010)
[2] van der Maaten et al. Journal of Machine Learning Research (2008)
[3] Ester, M. et al. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining. (1996)

Disclosure of Interests: Marc Maurits: None declared, Thomas Huizinga Grant/research support from: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Consultant of: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Marcel Reinders: None declared, Soumya Raychaudhuri: None declared, Elizabeth Karlson: None declared, Erik van den Akker: None declared, Rachel Knevel: None declared
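
The clustering step in the abstract above (density-based clustering of points in a 2D embedding) can be illustrated with a small sketch. This is not the authors' code: it is a plain-Python, brute-force version of DBSCAN, and the function name `dbscan`, the toy `embedding` coordinates, and the `eps` and `min_pts` values are all assumptions chosen for illustration. The t-SNE step that would produce the embedding is not shown.

```python
import math

def dbscan(points, eps, min_pts):
    """Label each 2D point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force radius query; the result includes point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nbrs)
    return labels

# Hypothetical 2D embedding: two dense groups and one isolated point.
embedding = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
labels = dbscan(embedding, eps=0.2, min_pts=3)
print(labels)
```

On this toy input the two dense groups receive separate cluster ids and the isolated point is labelled as noise. For tens of thousands of patients, a library implementation with a spatial index would be the practical choice.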

IoT Bonet and Network Intrusion Detection using Dimensionality Reduction and Supervised Machine Learning

2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON) ◽

10.1109/uemcon51285.2020.9298146 ◽

2020 ◽

Author(s):

Madhuri Gurunathrao Desai ◽

Yong Shi ◽

Kun Suo

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Dimensionality Reduction ◽

Supervised Machine Learning ◽

Network Intrusion Detection ◽

Network Intrusion

A hybrid machine learning–based multi-objective supervisory control strategy of a full-scale wastewater treatment for cost-effective and sustainable operation under varying influent conditions

Journal of Cleaner Production ◽

10.1016/j.jclepro.2021.125853 ◽

2021 ◽

Vol 291 ◽

pp. 125853 ◽

Cited By ~ 1

Author(s):

SungKu Heo ◽

KiJeon Nam ◽

Shahzeb Tariq ◽

Juin Yau Lim ◽

Junkyu Park ◽

...

Keyword(s):

Machine Learning ◽

Wastewater Treatment ◽

Control Strategy ◽

Supervisory Control ◽

Cost Effective ◽

Full Scale ◽

Multi Objective ◽

Sustainable Operation ◽

Hybrid Machine

A Strategy for Dimensionality Reduction and Data Analysis Applied to Microstructure–Property Relationships of Nanoporous Metals

Materials ◽

10.3390/ma14081822 ◽

2021 ◽

Vol 14 (8) ◽

pp. 1822

Author(s):

Norbert Huber

Keyword(s):

Machine Learning ◽

Dimensionality Reduction ◽

Work Hardening ◽

Large Range ◽

Scaling Law ◽

Principal Component ◽

Underlying Structure ◽

Structure Property ◽

Work Hardening Rate ◽

Nanoporous Metals

Nanoporous metals, with their complex microstructure, represent an ideal candidate for the development of methods that combine physics, data, and machine learning. The preparation of nanoporous metals via dealloying allows for tuning of the microstructure and macroscopic mechanical properties within a large design space, dependent on the chosen dealloying conditions. Specifically, it is possible to define the solid fraction, ligament size, and connectivity density within a large range. These microstructural parameters have a large impact on the macroscopic mechanical behavior. This makes this class of materials an ideal science case for the development of strategies for dimensionality reduction, supporting the analysis and visualization of the underlying structure–property relationships. Efficient finite element beam modeling techniques were used to generate ~200 data sets for macroscopic compression and nanoindentation of open pore nanofoams. A strategy consisting of dimensional analysis, principal component analysis, and machine learning allowed for data mining of the microstructure–property relationships. It turned out that the scaling law of the work hardening rate has the same exponent as the Young’s modulus. Simple linear relationships are derived for the normalized work hardening rate and hardness. The hardness to yield stress ratio is not limited to 1, as commonly assumed for foams, but spreads over a large range of values from 0.5 to 3.
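
The statement that two scaling laws share an exponent can be checked, in principle, by fitting a power law y = c · x^n to each property in log-log space and comparing the fitted n. The sketch below is only an illustration of that check: the function name `fit_power_law`, the synthetic data, the prefactors, and the exponent 2.5 are placeholders and are not values from the paper.

```python
import math

def fit_power_law(x, y):
    """Least-squares fit of y = c * x**n in log-log space; returns (c, n)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    n = sxy / sxx                 # slope of the log-log line = exponent
    c = math.exp(my - n * mx)     # intercept of the log-log line = log(c)
    return c, n

# Hypothetical, noise-free data (placeholder exponent and prefactors).
solid_fraction = [0.2, 0.3, 0.4, 0.5]
modulus = [3.0 * phi ** 2.5 for phi in solid_fraction]
hardening_rate = [0.7 * phi ** 2.5 for phi in solid_fraction]

c_e, n_e = fit_power_law(solid_fraction, modulus)
c_h, n_h = fit_power_law(solid_fraction, hardening_rate)
print(f"modulus exponent: {n_e:.3f}, hardening-rate exponent: {n_h:.3f}")
```

With real simulation data the two exponents would agree only within the fit uncertainty, so a comparison of confidence intervals would be needed rather than an exact match.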

Evolutionary Machine Learning for Multi-Objective Class Solutions in Medical Deformable Image Registration

Algorithms ◽

10.3390/a12050099 ◽

2019 ◽

Vol 12 (5) ◽

pp. 99 ◽

Cited By ~ 2

Author(s):

Kleopatra Pirpinia ◽

Peter A. N. Bosman ◽

Jan-Jakob Sonke ◽

Marcel van Herk ◽

Tanja Alderliesten

Keyword(s):

Machine Learning ◽

Image Registration ◽

State Of The Art ◽

Deformable Image Registration ◽

Optimization Approach ◽

High Quality ◽

Trade Off ◽

Multi Objective ◽

Current State ◽

Image Artefacts

Current state-of-the-art medical deformable image registration (DIR) methods optimize a weighted sum of key objectives of interest. Having a pre-determined weight combination that leads to high-quality results for any instance of a specific DIR problem (i.e., a class solution) would facilitate clinical application of DIR. However, such a combination can vary widely for each instance and is currently often manually determined. A multi-objective optimization approach for DIR removes the need for manual tuning, providing a set of high-quality trade-off solutions. Here, we investigate machine learning for a multi-objective class solution, i.e., not a single weight combination, but a set thereof, that, when used on any instance of a specific DIR problem, approximates such a set of trade-off solutions. To this end, we employed a multi-objective evolutionary algorithm to learn sets of weight combinations for three breast DIR problems of increasing difficulty: 10 prone-prone cases, 4 prone-supine cases with limited deformations and 6 prone-supine cases with larger deformations and image artefacts. Clinically-acceptable results were obtained for the first two problems. Therefore, for DIR problems with limited deformations, a multi-objective class solution can be machine learned and used to compute straightforwardly multiple high-quality DIR outcomes, potentially leading to more efficient use of DIR in clinical practice.
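
The relationship between a weighted sum of objectives and a set of trade-off solutions can be sketched as follows. This is a toy illustration, not the evolutionary algorithm used in the paper: the function name `best_for_weight`, the two objective values per candidate, and the weights are all hypothetical, and both objectives are assumed to be minimized.

```python
def best_for_weight(candidates, w):
    """Return the candidate minimizing w*f1 + (1-w)*f2 (both objectives minimized)."""
    return min(candidates, key=lambda c: w * c[0] + (1 - w) * c[1])

# Hypothetical registration outcomes as (f1, f2) pairs, e.g. an image
# dissimilarity term and a deformation-magnitude term.
candidates = [(0.10, 0.90), (0.30, 0.40), (0.80, 0.05), (0.60, 0.60)]

# A small set of weight combinations, applied to the same instance.
weights = [0.1, 0.5, 0.9]
picks = {w: best_for_weight(candidates, w) for w in weights}
for w, c in picks.items():
    print(f"w={w}: {c}")
```

Each weight selects a different trade-off, while the candidate that is worse than another in both objectives is never selected. A single fixed weight returns only one of these outcomes, which is why a set of weights is needed to approximate a set of trade-off solutions.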

Analysis of the Geometrical Evolution in On-the-Fly Surface-Hopping Nonadiabatic Dynamics with Machine Learning Dimensionality Reduction Approaches: Classical Multidimensional Scaling and Isometric Feature Mapping

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.7b00394 ◽

2017 ◽

Vol 13 (10) ◽

pp. 4611-4623 ◽

Cited By ~ 11

Author(s):

Xusong Li ◽

Yu Xie ◽

Deping Hu ◽

Zhenggang Lan

Keyword(s):

Machine Learning ◽

Multidimensional Scaling ◽

Dimensionality Reduction ◽

Nonadiabatic Dynamics ◽

Feature Mapping ◽

Surface Hopping ◽

Isometric Feature Mapping

Correction to Analysis of the Geometrical Evolution in On-the-Fly Surface-Hopping Nonadiabatic Dynamics with Machine Learning Dimensionality Reduction Approaches: Classical Multidimensional Scaling and Isometric Feature Mapping

Journal of Chemical Theory and Computation ◽

10.1021/acs.jctc.7b01155 ◽

2017 ◽

Vol 13 (12) ◽

pp. 6434-6434 ◽

Cited By ~ 3

Author(s):

Xusong Li ◽

Yu Xie ◽

Deping Hu ◽

Zhenggang Lan

Keyword(s):

Machine Learning ◽

Multidimensional Scaling ◽

Dimensionality Reduction ◽

Nonadiabatic Dynamics ◽

Feature Mapping ◽

Surface Hopping ◽

Isometric Feature Mapping

MLCV: Bridging Machine-Learning-Based Dimensionality Reduction and Free-Energy Calculation

Journal of Chemical Information and Modeling ◽

10.1021/acs.jcim.1c01010 ◽

2021 ◽

Author(s):

Haochuan Chen ◽

Han Liu ◽

Heying Feng ◽

Haohao Fu ◽

Wensheng Cai ◽

...

Keyword(s):

Machine Learning ◽

Free Energy ◽

Dimensionality Reduction ◽

Free Energy Calculation ◽

Energy Calculation

MODC: A Pareto-Optimal Optimization Approach for Network Traffic Classification Based on the Divide and Conquer Strategy

Information ◽

10.3390/info9090233 ◽

2018 ◽

Vol 9 (9) ◽

pp. 233 ◽

Cited By ~ 1

Author(s):

Zuleika Nascimento ◽

Djamel Sadok

Keyword(s):

Machine Learning ◽

Network Traffic ◽

Machine Learning Algorithms ◽

Divide And Conquer ◽

Pareto Optimal ◽

Optimization Approach ◽

Traffic Classification ◽

Multi Objective ◽

Network Traffic Classification ◽

Changes Over Time

Network traffic classification aims to identify the traffic categories or applications of network packets or flows. It is an area that continues to gain attention from researchers because the composition of network traffic changes over time and must be understood to ensure network Quality of Service (QoS). Among the different methods of network traffic classification, the payload-based one (DPI) is the most accurate, but presents some drawbacks, such as the inability to classify encrypted data, concerns regarding users’ privacy, high computational costs, and ambiguity when multiple signatures might match. For that reason, machine learning methods have been proposed to overcome these issues. This work proposes a Multi-Objective Divide and Conquer (MODC) model for network traffic classification that combines supervised and unsupervised machine learning algorithms into a hybrid model based on the divide and conquer strategy. Additionally, it is a flexible model since it allows network administrators to choose from a set of parameters (Pareto-optimal solutions), produced by a multi-objective optimization process, by prioritizing flow or byte accuracy. Our method achieved an average flow accuracy of 94.14% for the analyzed dataset, outperforming the six DPI-based tools investigated, including two commercial ones, and other machine learning-based methods.
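
Choosing among Pareto-optimal parameter sets by prioritizing one accuracy over the other can be sketched as below. This is a generic non-dominated filter, not the MODC optimization itself: the function name `pareto_front` and the (flow accuracy, byte accuracy) pairs are hypothetical values invented for illustration, with both objectives maximized.

```python
def pareto_front(configs):
    """Keep configurations not dominated in (flow_acc, byte_acc), both maximized."""
    def dominates(a, b):
        # a dominates b if it is at least as good in both and better in one.
        return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])
    return [c for c in configs if not any(dominates(o, c) for o in configs)]

# Hypothetical (flow accuracy, byte accuracy) per candidate parameter set.
configs = [(0.92, 0.80), (0.90, 0.88), (0.85, 0.91), (0.89, 0.85), (0.80, 0.70)]

front = pareto_front(configs)
prefer_flow = max(front, key=lambda c: c[0])   # administrator prioritizes flows
prefer_byte = max(front, key=lambda c: c[1])   # administrator prioritizes bytes
print(front, prefer_flow, prefer_byte)
```

The filter discards parameter sets that another set beats on both accuracies; the remaining choice between the survivors is a policy decision rather than an optimization result.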
