Increasing Interpretability of Bayesian Probabilistic Programming Models Through Interactive Representations

Frontiers in Computer Science ◽

10.3389/fcomp.2020.567344 ◽

2020 ◽

Vol 2 ◽

Author(s):

Evdoxia Taka ◽

Sebastian Stein ◽

John H. Williamson

Keyword(s):

Probabilistic Models ◽

Graphical Representation ◽

Decision Makers ◽

Programming Models ◽

Graphical Representations ◽

Seamless Integration ◽

Probabilistic Programming ◽

Uncertainty Visualization ◽

Visualization Tools ◽

Probabilistic Programs

Bayesian probabilistic modeling is supported by powerful computational tools like probabilistic programming and efficient Markov Chain Monte Carlo (MCMC) sampling. However, the results of Bayesian inference are challenging for users to interpret in tasks like decision-making under uncertainty or model refinement. Decision-makers need simultaneous insight into both the model's structure and its predictions, including uncertainty in inferred parameters. This enables better assessment of the risk over all possible outcomes compatible with observations and thus more informed decisions. To support this, we see a need for visualization tools that make probabilistic programs interpretable to reveal the interdependencies in probabilistic models and their inherent uncertainty. We propose the automatic transformation of Bayesian probabilistic models, expressed in a probabilistic programming language, into an interactive graphical representation of the model's structure at varying levels of granularity, with seamless integration of uncertainty visualization. This interactive graphical representation supports the exploration of the prior and posterior distribution of MCMC samples. The interpretability of Bayesian probabilistic programming models is enhanced through the interactive graphical representations, which provide human users with more informative, transparent, and explainable probabilistic models. We present a concrete implementation that translates probabilistic programs to interactive graphical representations and show illustrative examples for a variety of Bayesian probabilistic models.


Probabilistic programming in Python using PyMC3

PeerJ Computer Science ◽

10.7717/peerj-cs.55 ◽

2016 ◽

Vol 2 ◽

pp. e55 ◽

Cited By ~ 510

Author(s):

John Salvatier ◽

Thomas V. Wiecki ◽

Christopher Fonnesbeck

Keyword(s):

Monte Carlo ◽

Programming Languages ◽

Probabilistic Models ◽

Automatic Differentiation ◽

Direct Interaction ◽

Model Specification ◽

Probabilistic Programming ◽

Domain Specific ◽

Complex Models ◽

Probabilistic Programs

Probabilistic programming allows for automatic Bayesian inference on user-defined probabilistic models. Recent advances in Markov chain Monte Carlo (MCMC) sampling allow inference on increasingly complex models. This class of MCMC, known as Hamiltonian Monte Carlo, requires gradient information, which is often not readily available. PyMC3 is a new open-source probabilistic programming framework written in Python that uses Theano to compute gradients via automatic differentiation and to compile probabilistic programs on-the-fly to C for increased speed. Unlike other probabilistic programming languages, PyMC3 allows model specification directly in Python code. The lack of a domain-specific language allows for great flexibility and direct interaction with the model. This paper is a tutorial-style introduction to this software package.
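
To make the role of gradient information in Hamiltonian Monte Carlo concrete, the following is a minimal sketch in plain Python. It is not PyMC3's implementation and does not use PyMC3 or Theano: the function `hmc_sample`, its parameters, and the hand-written gradient are assumptions made for illustration, and the target is a one-dimensional standard normal chosen only because its gradient is trivial to write down.

```python
import math
import random


def hmc_sample(log_prob, grad_log_prob, x0, n_samples,
               step_size=0.2, n_leapfrog=10, seed=0):
    """Draw samples from a 1-D density with basic Hamiltonian Monte Carlo.

    Illustrative only: fixed step size, fixed trajectory length, unit mass.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        # Resample an auxiliary momentum at the start of every trajectory.
        p = rng.gauss(0.0, 1.0)
        x_new, p_new = x, p

        # Leapfrog integration: this is where the gradient is needed.
        p_new += 0.5 * step_size * grad_log_prob(x_new)
        for step in range(n_leapfrog):
            x_new += step_size * p_new
            if step < n_leapfrog - 1:
                p_new += step_size * grad_log_prob(x_new)
        p_new += 0.5 * step_size * grad_log_prob(x_new)

        # Metropolis correction using the Hamiltonian (potential + kinetic).
        h_old = -log_prob(x) + 0.5 * p * p
        h_new = -log_prob(x_new) + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples


# Hypothetical target: standard normal, log p(x) = -x^2 / 2 up to a constant.
draws = hmc_sample(lambda x: -0.5 * x * x, lambda x: -x, x0=0.0, n_samples=5000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(f"mean={mean:.3f}, variance={var:.3f}")
```

In this sketch the gradient is supplied by hand; the abstract's point is that a framework with automatic differentiation removes that burden for user-defined models.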


Probabilistic programming in Python using PyMC3

10.7287/peerj.preprints.1686v1 ◽

2016 ◽

Cited By ~ 7

Author(s):

John Salvatier ◽

Thomas V Wiecki ◽

Christopher Fonnesbeck

Keyword(s):

Monte Carlo ◽

Programming Languages ◽

Probabilistic Models ◽

Automatic Differentiation ◽

Direct Interaction ◽

Model Specification ◽

Probabilistic Programming ◽

Domain Specific ◽

Complex Models ◽

Probabilistic Programs

Probabilistic Programming allows for automatic Bayesian inference on user-defined probabilistic models. Recent advances in Markov chain Monte Carlo (MCMC) sampling allow inference on increasingly complex models. This class of MCMC, known as Hamiltonian Monte Carlo, requires gradient information which is often not readily available. PyMC3 is a new open source Probabilistic Programming framework written in Python that uses Theano to compute gradients via automatic differentiation as well as compile probabilistic programs on-the-fly to C for increased speed. Contrary to other Probabilistic Programming languages, PyMC3 allows model specification directly in Python code. The lack of a domain specific language allows for great flexibility and direct interaction with the model. This paper is a tutorial-style introduction to this software package.


Probabilistic programming in Python using PyMC3

10.7287/peerj.preprints.1686 ◽

2016 ◽

Cited By ~ 8

Author(s):

John Salvatier ◽

Thomas V Wiecki ◽

Christopher Fonnesbeck

Keyword(s):

Monte Carlo ◽

Programming Languages ◽

Probabilistic Models ◽

Automatic Differentiation ◽

Direct Interaction ◽

Model Specification ◽

Probabilistic Programming ◽

Domain Specific ◽

Complex Models ◽

Probabilistic Programs


Proof Theory of the Cut Rule

10.1093/oso/9780198748991.003.0010 ◽

2018 ◽

Author(s):

J. R. B. Cockett ◽

R. A. G. Seely

Keyword(s):

Proof Theory ◽

Graphical Representation ◽

Basic Component ◽

Graphical Representations ◽

Mathematical Language ◽

Basic Logic ◽

Cut Rule ◽

Structural Rules ◽

The One ◽

Categorical Semantics

This chapter describes the categorical proof theory of the cut rule, a very basic component of any sequent-style presentation of a logic, assuming a minimum of structural rules and connectives, in fact, starting with none. It is shown how logical features can be added to this basic logic in a modular fashion, at each stage showing the appropriate corresponding categorical semantics of the proof theory, starting with multicategories, and moving to linearly distributive categories and *-autonomous categories. A key tool is the use of graphical representations of proofs (“proof circuits”) to represent formal derivations in these logics. This is a powerful symbolism, which on the one hand is a formal mathematical language, but crucially, at the same time, has an intuitive graphical representation.


Graphical Representations of a Television Series: A Study with Deaf and Hearing Adolescents

The Spanish Journal of Psychology ◽

10.1017/s1138741600002420 ◽

2010 ◽

Vol 13 (2) ◽

pp. 765-776 ◽

Cited By ~ 1

Author(s):

Cristina Cambra ◽

Aurora Leal ◽

Núria Silvestre

Keyword(s):

Graphical Representation ◽

Background Knowledge ◽

Hearing Impaired ◽

Graphical Representations ◽

Television Series ◽

Linguistic Information ◽

Visual Elements ◽

The Way

The understanding of a television story can be very different depending on the age of the viewer, their background knowledge, the content of the programme and the way in which they combine the information gathered from linguistic, audio and visual elements. This study explores the different ways of interpreting an audiovisual document considering that, due to a hearing impairment, visual, audio and linguistic information could be perceived very differently to the way it is by hearing people. The study involved the participation of 20 deaf and 20 hearing adolescents, aged 12 to 19 years, who, after watching a fragment of a television series, were asked to draw a picture of what had happened in the story. The results show that the graphical representation of the film is similar for both groups in terms of the number of scenes, but there is greater profusion, in the deaf group, of details about the context and characters, and there are differences in their interpretations of some of the sequences in the story.


Communicating earthquake risk: mapped parameters and cartographic representation

Natural Hazards and Earth System Science ◽

10.5194/nhess-11-359-2011 ◽

2011 ◽

Vol 11 (2) ◽

pp. 359-366 ◽

Cited By ~ 7

Author(s):

J. M. Gaspar-Escribano ◽

T. Iturrioz

Keyword(s):

Risk Assessment ◽

Seismic Risk ◽

Risk Information ◽

Decision Makers ◽

Earthquake Risk ◽

Graphical Representations ◽

Related Risk ◽

Different Types ◽

Earthquake Effects ◽

Earthquake Risk Assessment

Earthquake risk assessment is probably the most effective tool for reducing adverse earthquake effects and for developing pre- and post-event planning actions. The related risk information (data and results) is of interest for persons with different backgrounds and interests, including scientists, emergency planners, decision makers and other stakeholders. Hence, it is important to ensure that this information is properly transferred to all persons involved in seismic risk, considering the nature of the information and the particular circumstances of the source and of the receiver of the information. Some experience-based recommendations about the parameters and the graphical representations that can be used to portray earthquake risk information to different types of audiences are presented in this work.


Evaluating probabilistic programming and fast variational Bayesian inference in phylogenetics

10.1101/702944 ◽

2019 ◽

Cited By ~ 1

Author(s):

Mathieu Fourment ◽

Aaron E. Darling

Keyword(s):

Probabilistic Models ◽

Probability Distributions ◽

Mean Field ◽

Black Box ◽

Variational Inference ◽

Machine Learning Techniques ◽

MCMC Methods ◽

Substitution Model ◽

Probabilistic Programming ◽

Phylogenetic Models

Recent advances in statistical machine learning techniques have led to the creation of probabilistic programming frameworks. These frameworks enable probabilistic models to be rapidly prototyped and fit to data using scalable approximation methods such as variational inference. In this work, we explore the use of the Stan language for probabilistic programming in application to phylogenetic models. We show that many commonly used phylogenetic models including the general time reversible (GTR) substitution model, rate heterogeneity among sites, and a range of coalescent models can be implemented using a probabilistic programming language. The posterior probability distributions obtained via the black box variational inference engine in Stan were compared to those obtained with reference implementations of Markov chain Monte Carlo (MCMC) for phylogenetic inference. We find that black box variational inference in Stan is less accurate than MCMC methods for phylogenetic models, but requires far less compute time. Finally, we evaluate a custom implementation of mean-field variational inference on the Jukes-Cantor substitution model and show that a specialized implementation of variational inference can be two orders of magnitude faster and more accurate than a general purpose probabilistic implementation.
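
As a small aid for readers unfamiliar with the Jukes-Cantor substitution model named above, the sketch below computes its standard transition probabilities and the corresponding distance correction. It is not taken from the paper and involves neither Stan nor variational inference; the function names `jc69_transition_probs` and `jc69_distance` are hypothetical, and the branch length `d` is assumed to be measured in expected substitutions per site.

```python
import math


def jc69_transition_probs(d):
    """Return (P[same base], P[one specific different base]) after branch length d.

    Under Jukes-Cantor all four bases are equally frequent and every
    substitution happens at the same rate.
    """
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay
    p_diff = 0.25 - 0.25 * decay
    return p_same, p_diff


def jc69_distance(p_observed):
    """Invert the model: estimate d from the observed fraction of differing sites."""
    if not 0.0 <= p_observed < 0.75:
        raise ValueError("fraction of differing sites must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p_observed / 3.0)


# Usage: each row of the 4x4 transition matrix has one p_same and three p_diff.
p_same, p_diff = jc69_transition_probs(0.3)
print(p_same + 3 * p_diff)          # row sum of the transition matrix
print(jc69_distance(3 * p_diff))    # recovers the branch length used above
```

The model has a single free quantity per branch, which is part of why it is a convenient test case for a specialized inference implementation.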


Artificial Intelligence for Decision Makers

Journal of Emerging Technologies in Accounting ◽

10.2308/jeta-19-04-30-21 ◽

2019

Author(s):

Viktor Elliot ◽

Mari Paananen ◽

Miroslaw Staron

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Decision Making ◽

Machine Learning Algorithms ◽

Decision Makers ◽

Accounting Data ◽

Ethical Concerns ◽

Key Concepts ◽

Basic Understanding ◽

Visualization Tools

We propose an exercise with the purpose of providing a basic understanding of key concepts within AI and extending the understanding of AI beyond mathematics. The exercise allows participants to carry out analysis based on accounting data using visualization tools as well as to develop their own machine learning algorithms that can mimic their decisions. Finally, we also problematize the use of AI in decision-making, with such aspects as biases in data and/or ethical concerns.


Automating statistical diagrammatic representations with data characterization

Information Visualization ◽

10.1177/1473871617715326 ◽

2017 ◽

Vol 17 (4) ◽

pp. 316-334

Author(s):

Pere Millán-Martínez ◽

Pedro Valero-Mora

Keyword(s):

Graphical Representation ◽

Open Data ◽

Graphical Representations ◽

One Dimensional ◽

The Public ◽

Cognitive Studies ◽

Multidimensional Databases ◽

Diagrammatic Representations ◽

Input Variables

The search for an efficient method to enhance data cognition is especially important when managing data from multidimensional databases. Open data policies have dramatically increased not only the volume of data available to the public, but also the need to automate the translation of data into efficient graphical representations. Graphic automation involves producing an algorithm that necessarily contains inputs derived from the type of data. A set of rules is then applied to combine the input variables and produce a graphical representation. Automated systems, however, fail to provide an efficient graphical representation because they either consider only a one-dimensional characterization of variables, which leads to an overwhelmingly large number of available solutions, rely on a compositional algebra that leads to a single solution, or require the user to predetermine the graphical representation. Therefore, we propose a multidimensional characterization of statistical variables that, when complemented with a catalog of graphical representations that match any single combination, presents the user with a more specific set of suitable graphical representations to choose from. Cognitive studies can then determine the most efficient perceptual procedures to further shorten the path to the most efficient graphical representations. The examples used herein are limited to graphical representations with three variables, given that the number of combinations increases drastically as the number of selected variables increases.


Employing Multidimensional Data Visualization Tools to Assess the Impact of Constraint Uncertainties on Complex Design Problems

Volume 2A: 43rd Design Automation Conference ◽

10.1115/detc2017-67902 ◽

2017 ◽

Author(s):

Gary M. Stump ◽

Simon W. Miller ◽

Michael A. Yukish ◽

Christopher M. Farrell

Keyword(s):

Data Visualization ◽

Decision Makers ◽

Multidimensional Data ◽

Design Problems ◽

Design Constraints ◽

Performance Space ◽

Multidimensional Data Visualization ◽

Complex Design ◽

Visualization Tools ◽

The Impact

A potential source of uncertainty within multi-objective design problems can be the exact value of the underlying design constraints. This uncertainty will affect the resulting performance of the selected system commensurate with the level of risk that decision-makers are willing to accept. This research focuses on developing visualization tools that allow decision-makers to specify uncertainty distributions on design constraints and to visualize their effects in the performance space using multidimensional data visualization methods to solve problems with high orders of computational complexity. These visual tools will be demonstrated using an example portfolio design scenario in which the goal of the design problem is to maximize the performance of a portfolio with an uncertain budget constraint.
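
The idea of propagating an uncertain budget constraint into the performance space can be illustrated with a toy Monte Carlo sketch. Everything here is an assumption for illustration and is not the authors' tool or data: the item costs and values are hypothetical, the budget is assumed to be uniformly distributed over a small integer range, and the portfolio is chosen by a standard 0/1 knapsack dynamic program rather than by any method described in the paper.

```python
import random
from collections import Counter


def best_portfolio_value(items, budget):
    """Maximum total value of a subset of (cost, value) items within budget."""
    best = [0] * (budget + 1)
    for cost, value in items:
        # Iterate budgets downwards so each item is used at most once.
        for b in range(budget, cost - 1, -1):
            best[b] = max(best[b], best[b - cost] + value)
    return best[budget]


def performance_under_budget_uncertainty(items, low, high, n_draws=1000, seed=0):
    """Sample integer budgets uniformly in [low, high]; tally achievable values."""
    rng = random.Random(seed)
    outcomes = Counter()
    for _ in range(n_draws):
        budget = rng.randint(low, high)
        outcomes[best_portfolio_value(items, budget)] += 1
    return outcomes


# Hypothetical portfolio: (cost, value) pairs in arbitrary units.
ITEMS = [(3, 4), (4, 5), (2, 3)]
distribution = performance_under_budget_uncertainty(ITEMS, low=4, high=9)
for value in sorted(distribution):
    print(f"performance {value}: {distribution[value]} of 1000 draws")
```

The resulting tally is the kind of performance distribution a decision-maker could then inspect visually when weighing how much budget risk to accept.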
