Deep Causal Graphs for Causal Inference, Black-Box Explainability and Fairness

Mapping Intimacies ◽

10.3233/faia210162 ◽

2021 ◽

Author(s):

Álvaro Parafita ◽

Jordi Vitrià

Keyword(s):

Machine Learning ◽

Ad Hoc ◽

Single Point ◽

Black Box ◽

Expected Value ◽

Step Process ◽

Alternative Framework ◽

Causal Graphs ◽

Counterfactual Distributions ◽

Using Data

Causal Estimation is usually tackled as a two-step process: identification, to transform a causal query into a statistical estimand, and modelling, to compute this estimand by using data. This reliance on the derived statistical estimand makes these methods ad hoc, used to answer one and only one query. We present an alternative framework called Deep Causal Graphs: with a single model, it answers any identifiable causal query without compromising on performance, thanks to the use of Normalizing Causal Flows, and outputs complex counterfactual distributions instead of single-point estimations of their expected value. We conclude with applications of the framework to Machine Learning Explainability and Fairness.
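
The two-step process described in the abstract can be illustrated with a minimal sketch. It is not the Deep Causal Graphs method: it assumes a hypothetical toy dataset with one binary confounder `z`, a binary treatment `t` and a binary outcome `y`, and it uses the standard backdoor adjustment formula as the identified estimand. All names and counts are illustrative assumptions.

```python
from collections import Counter

# Hypothetical toy records (z, t, y): confounder, treatment, outcome, all binary.
records = (
    [(0, 0, 0)] * 30 + [(0, 0, 1)] * 10
    + [(0, 1, 0)] * 5 + [(0, 1, 1)] * 5
    + [(1, 0, 0)] * 4 + [(1, 0, 1)] * 6
    + [(1, 1, 0)] * 8 + [(1, 1, 1)] * 32
)


def backdoor_mean(rows, t):
    """Step 1 (identification) gives the estimand
    E[Y | do(T=t)] = sum_z P(z) * E[Y | T=t, Z=z];
    step 2 (modelling) estimates it here with plain frequencies."""
    n = len(rows)
    z_counts = Counter(z for z, _, _ in rows)
    total = 0.0
    for z, n_z in z_counts.items():
        stratum = [y for zz, tt, y in rows if zz == z and tt == t]
        if not stratum:
            raise ValueError(f"no rows with T={t}, Z={z}")
        total += (n_z / n) * (sum(stratum) / len(stratum))
    return total


def naive_mean(rows, t):
    """Observational E[Y | T=t], which ignores the confounder."""
    ys = [y for _, tt, y in rows if tt == t]
    return sum(ys) / len(ys)


adjusted_effect = backdoor_mean(records, 1) - backdoor_mean(records, 0)
naive_effect = naive_mean(records, 1) - naive_mean(records, 0)
print(f"adjusted: {adjusted_effect:.3f}, naive: {naive_effect:.3f}")
```

On this toy data the adjusted effect is smaller than the naive difference, because `z` raises both the chance of treatment and the outcome. The estimand is tied to this one query, which is the ad hoc limitation the abstract points to.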

Instant medical care and drug suggestion service using data mining and machine learning based intelligent self-diagnosis medical system

International Journal of Advanced Life Sciences ◽

10.26627/ijals/2017/10.03.0022 ◽

2017 ◽

Vol 10 (03) ◽

pp. 318-325

Author(s):

Sudha M

Keyword(s):

Machine Learning ◽

Data Mining ◽

Medical Care ◽

Medical System ◽

Using Data

Using Machine Learning Methods to Identify Particle Types from Doppler Lidar Measurements in Iceland

Remote Sensing ◽

10.3390/rs13132433 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2433

Author(s):

Shu Yang ◽

Fengchao Peng ◽

Sibylle von Löwis ◽

Guðrún Nína Petersen ◽

David Christian Finger

Keyword(s):

Machine Learning ◽

Weather Conditions ◽

Dust Storms ◽

Machine Learning Algorithms ◽

Lidar Data ◽

Data Sets ◽

Doppler Lidar ◽

Lidar Measurements ◽

Using Data ◽

Filter Noise

Doppler lidars are used worldwide for wind monitoring and, more recently, for the detection of aerosols. Algorithms that automatically classify the signals retrieved from lidar measurements are very useful to users. In this study, we explore the value of machine learning for classifying backscattered signals from Doppler lidars using data from Iceland. We combined supervised and unsupervised machine learning algorithms with conventional lidar data processing methods and trained two models to filter noise signals and classify Doppler lidar observations into different classes, including clouds, aerosols and rain. The results reveal high accuracy for noise identification and for aerosol and cloud classification. However, precipitation is under-detected. The method was tested on data sets from two instruments under different weather conditions, including three dust storms during the summer of 2019. Our results reveal that this method can provide an efficient, accurate and real-time classification of lidar measurements. Accordingly, we conclude that machine learning can open new opportunities for lidar data end-users, such as aviation safety operators, to monitor dust in the vicinity of airports.

Learning to Validate the Predictions of Black Box Machine Learning Models on Unseen Data

Proceedings of the Workshop on Human-In-the-Loop Data Analytics - HILDA'19 ◽

10.1145/3328519.3329126 ◽

2019 ◽

Author(s):

Sergey Redyuk ◽

Sebastian Schelter ◽

Tammo Rukat ◽

Volker Markl ◽

Felix Biessmann

Keyword(s):

Machine Learning ◽

Black Box ◽

Learning Models ◽

Unseen Data ◽

Machine Learning Models

MODES: model-based optimization on distributed embedded systems

Machine Learning ◽

10.1007/s10994-021-06014-6 ◽

2021 ◽

Author(s):

Junjie Shi ◽

Jiang Bian ◽

Jakob Richter ◽

Kuan-Hsun Chen ◽

Jörg Rahnenführer ◽

...

Keyword(s):

Machine Learning ◽

Embedded Systems ◽

Learning Model ◽

Black Box ◽

Distributed Embedded Systems ◽

Data Set ◽

Individual Model ◽

Model Based ◽

Machine Learning Model ◽

Distributed Machine Learning

The predictive performance of a machine learning model depends heavily on its hyper-parameter setting, so hyper-parameter tuning is often indispensable. Normally, such tuning requires the machine learning model to be trained and evaluated on centralized data to obtain a performance estimate. However, in a distributed machine learning scenario, it is not always possible to collect all the data from all nodes due to privacy concerns or storage limitations. Moreover, if data has to be transferred through low-bandwidth connections, the time available for tuning is reduced. Model-Based Optimization (MBO) is a state-of-the-art method for tuning hyper-parameters, but its application to distributed machine learning models or federated learning has received little research attention. This work proposes a framework, MODES, that allows MBO to be deployed on resource-constrained distributed embedded systems. Each node trains an individual model based on its local data. The goal is to optimize the combined prediction accuracy. The presented framework offers two optimization modes: (1) MODES-B considers the whole ensemble as a single black box and optimizes the hyper-parameters of each individual model jointly, and (2) MODES-I considers all models as clones of the same black box, which allows it to efficiently parallelize the optimization in a distributed setting. We evaluate MODES by conducting experiments on the optimization of the hyper-parameters of a random forest and a multi-layer perceptron. The experimental results demonstrate that, with an improvement in terms of mean accuracy (MODES-B), run-time efficiency (MODES-I), and statistical stability for both modes, MODES outperforms the baseline, i.e., carrying out tuning with MBO on each node individually with its local sub-data set.

How to fool a black box machine learning based side-channel security evaluation

Cryptography and Communications ◽

10.1007/s12095-021-00479-x ◽

2021 ◽

Author(s):

Charles-Henry Bertrand Van Ouytsel ◽

Olivier Bronchain ◽

Gaëtan Cassiers ◽

François-Xavier Standaert

Keyword(s):

Machine Learning ◽

Black Box ◽

Security Evaluation ◽

Side Channel

Classification and photometric redshift estimation of quasars in photometric surveys

Proceedings of the International Astronomical Union ◽

10.1017/s1743921320001829 ◽

2020 ◽

Vol 15 (S359) ◽

pp. 40-41

Author(s):

L. M. Izuti Nakazono ◽

C. Mendes de Oliveira ◽

N. S. T. Hirata ◽

S. Jeram ◽

A. Gonzalez ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Nearest Neighbour ◽

Random Forest Algorithm ◽

Photometric Redshift ◽

Using Data

We present a machine learning methodology to separate quasars from galaxies and stars using data from S-PLUS in the Stripe-82 region. In terms of quasar classification, we achieved 95.49% for precision and 95.26% for recall using a Random Forest algorithm. For photometric redshift estimation, we obtained a precision of 6% using k-Nearest Neighbour.
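
For readers unfamiliar with the classification metrics quoted in the abstract, the following is a minimal sketch of how precision and recall are computed for one class. The labels and predictions are hypothetical and are not taken from S-PLUS data or from the authors' classifier.

```python
def precision_recall(y_true, y_pred, positive="quasar"):
    """Return (precision, recall) for the class named by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Precision: share of predicted quasars that really are quasars.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: share of true quasars that the classifier recovered.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels, for illustration only.
y_true = ["quasar", "quasar", "quasar", "star", "galaxy", "star"]
y_pred = ["quasar", "quasar", "star", "quasar", "galaxy", "star"]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

In this toy case one star is mislabelled as a quasar (a false positive) and one quasar is missed (a false negative), so both metrics come out at two thirds.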

Effective Prediction of Heart Disease Using Data Mining and Machine Learning: A Review

2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS) ◽

10.1109/icais50930.2021.9395963 ◽

2021 ◽

Author(s):

Simran Verma ◽

Abhishek Gupta

Keyword(s):

Machine Learning ◽

Data Mining ◽

Heart Disease ◽

Using Data

Distributed Learning Applications in Power Systems: A Review of Methods, Gaps, and Challenges

Energies ◽

10.3390/en14123654 ◽

2021 ◽

Vol 14 (12) ◽

pp. 3654

Author(s):

Nastaran Gholizadeh ◽

Petr Musilek

Keyword(s):

Machine Learning ◽

Power Systems ◽

Learning Algorithm ◽

Single Point ◽

Distributed Learning ◽

Large Data ◽

Multi Agent Systems ◽

Power Quality Monitoring ◽

Multi Agent ◽

Learning Frameworks

In recent years, machine learning methods have found numerous applications in power systems for load forecasting, voltage control, power quality monitoring, anomaly detection, etc. Distributed learning is a subfield of machine learning and a descendant of the multi-agent systems field. Distributed learning is a collaboratively decentralized machine learning algorithm designed to handle large data sizes, solve complex learning problems, and increase privacy. Moreover, it can reduce the risk of a single point of failure compared to fully centralized approaches and lower the bandwidth and central storage requirements. This paper introduces three existing distributed learning frameworks and reviews the applications that have been proposed for them in power systems so far. It summarizes the methods, benefits, and challenges of distributed learning frameworks in power systems and identifies the gaps in the literature for future studies.

Explainable AI: A Review of Machine Learning Interpretability Methods

Entropy ◽

10.3390/e23010018 ◽

2020 ◽

Vol 23 (1) ◽

pp. 18

Author(s):

Pantelis Linardatos ◽

Vasilis Papastefanopoulos ◽

Sotiris Kotsiantis

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Black Box ◽

Learning Systems ◽

Model Complexity ◽

Learning Models ◽

New Methods ◽

Industrial Adoption ◽

Machine Learning Models ◽

The Way

Recent advances in artificial intelligence (AI) have led to its widespread industrial adoption, with machine learning systems demonstrating superhuman performance in a significant number of tasks. However, this surge in performance has often been achieved through increased model complexity, turning such systems into “black box” approaches and causing uncertainty regarding the way they operate and, ultimately, the way they come to decisions. This ambiguity has made it problematic for machine learning systems to be adopted in sensitive yet critical domains, such as healthcare, where their value could be immense. As a result, scientific interest in Explainable Artificial Intelligence (XAI), a field concerned with the development of new methods that explain and interpret machine learning models, has been tremendously reignited over recent years. This study focuses on machine learning interpretability methods; more specifically, a literature review and taxonomy of these methods are presented, as well as links to their programming implementations, in the hope that this survey will serve as a reference point for both theorists and practitioners.

Investigating the Spread of Coronavirus Disease via Edge-AI and Air Pollution Correlation

ACM Transactions on Internet Technology ◽

10.1145/3424222 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1-10

Author(s):

V. Gomathy ◽

K. Janarthanan ◽

Fadi Al-Turjman ◽

R. Sitharthan ◽

M. Rajesh ◽

...

Keyword(s):

Machine Learning ◽

Air Pollution ◽

Mortality Rate ◽

Respiratory Diseases ◽

Past Research ◽

Viral Disease ◽

Machine Learning Method ◽

Learning Method ◽

Death Rates ◽

Using Data

Coronavirus Disease 2019 (COVID-19) is a highly infectious viral disease that affected millions of people worldwide in 2020. Several studies have shown that COVID-19 results in a severe acute respiratory syndrome and may lead to death. Past research has linked a greater number of respiratory diseases to long-term exposure to air pollution. This article investigates the spread of COVID-19 as a result of air pollution by applying a linear-regression machine learning method based on edge computing. The analysis is based on the death rates caused by COVID-19 and on the regional distribution of those death rates relative to hazardous air pollution, using data retrieved from the Copernicus Sentinel-5P satellite. The results obtained in the investigation prove that the mortality rate due to the spread of COVID-19 is 77% higher in areas with polluted air. This investigation also proves that COVID-19 severely affected 68% of the individuals who had been exposed to polluted air.
