Abductive learning of quantized stochastic processes with probabilistic finite automata

Ishanu Chattopadhyay; Hod Lipson

doi:10.1098/rsta.2011.0543

Philosophical Transactions of The Royal Society A Mathematical Physical and Engineering Sciences ◽

10.1098/rsta.2011.0543 ◽

2013 ◽

Vol 371 (1984) ◽

pp. 20110543 ◽

Cited By ~ 8

Author(s):

Ishanu Chattopadhyay ◽

Hod Lipson

Keyword(s):

Stochastic Processes ◽

Learning Algorithm ◽

Finite Automata ◽

Causal Structure ◽

Real Data ◽

Stochastic Dynamical Systems ◽

Modelling Framework ◽

Performance Guarantees ◽

Automated Inference ◽

Finite State

We present an unsupervised learning algorithm (GenESeSS) to infer the causal structure of quantized stochastic processes, defined as stochastic dynamical systems evolving over discrete time and producing quantized observations. Assuming ergodicity and stationarity, GenESeSS infers probabilistic finite state automata models from a sufficiently long observed trace. Our approach is abductive: it attempts to infer a simple hypothesis that is consistent with the observations and with a modelling framework that essentially fixes the hypothesis class. The probabilistic automata we infer have no initial or terminal states, have no structural restrictions and are shown to be probably approximately correct (PAC)-learnable. Additionally, we establish rigorous performance guarantees and data requirements, and show that GenESeSS correctly infers long-range dependencies. Modelling and prediction examples on simulated and real data establish relevance to automated inference of causal stochastic structures underlying complex physical phenomena.
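
To make the setting concrete, the following is a minimal sketch of a probabilistic finite state automaton with no initial or terminal state that emits a quantized trace, together with an empirical next-symbol estimate taken from that trace. It is not the GenESeSS algorithm: the two-state machine, its probabilities and the helper names `generate_trace` and `next_symbol_distribution` are all hypothetical, chosen only for illustration.

```python
import random
from collections import Counter

# Hypothetical two-state PFSA over the alphabet {"0", "1"} (illustrative only).
# PFSA[state][symbol] = (probability of emitting symbol, next state)
PFSA = {
    "A": {"0": (0.9, "A"), "1": (0.1, "B")},
    "B": {"0": (0.3, "A"), "1": (0.7, "B")},
}


def generate_trace(pfsa, length, seed=0):
    """Emit `length` symbols; the start state is arbitrary (no initial state)."""
    rng = random.Random(seed)
    state = rng.choice(sorted(pfsa))
    symbols_out = []
    for _ in range(length):
        symbols = sorted(pfsa[state])
        weights = [pfsa[state][s][0] for s in symbols]
        symbol = rng.choices(symbols, weights=weights)[0]
        symbols_out.append(symbol)
        state = pfsa[state][symbol][1]
    return "".join(symbols_out)


def next_symbol_distribution(trace, context):
    """Empirical distribution of the symbol that follows `context` in `trace`."""
    counts = Counter()
    k = len(context)
    for i in range(len(trace) - k):
        if trace[i:i + k] == context:
            counts[trace[i + k]] += 1
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()} if total else {}


trace = generate_trace(PFSA, 20000, seed=1)
print(next_symbol_distribution(trace, "1"))
```

In this toy machine the last emitted symbol determines the current state, so the estimate after the context "1" should approach the emission probabilities of state B as the trace grows.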

Application of a Rough Set-Based Inductive Learning System

Fundamenta Informaticae ◽

10.3233/fi-1993-182-409 ◽

1993 ◽

Vol 18 (2-4) ◽

pp. 209-220

Author(s):

Michael Hadjimichael ◽

Anita Wasilewska

Keyword(s):

Machine Learning ◽

Rough Set ◽

Presidential Election ◽

Predictive Accuracy ◽

Learning Algorithm ◽

Inductive Learning ◽

Real Data ◽

Semantic Content ◽

Learning System ◽

Voter Preferences

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.

BORDERS AND FINITE AUTOMATA

International Journal of Foundations of Computer Science ◽

10.1142/s0129054107005029 ◽

2007 ◽

Vol 18 (04) ◽

pp. 859-871

Author(s):

MARTIN ŠIMŮNEK ◽

BOŘIVOJ MELICHAR

Keyword(s):

Pattern Matching ◽

Hamming Distance ◽

Finite Automata ◽

Music Analysis ◽

Theoretical Description ◽

Specific Form ◽

Distance Measures ◽

Computer Assisted ◽

Finite State ◽

Finite State Transducer

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic stringology keystones used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and others. The paper offers an automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that the ALL_BORDERS algorithm is an implementation of a finite state transducer of a specific form. We describe how such a transducer can be constructed and what form the input string should take. The described transducer finds the set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string under the Hamming distance. Our solution to this problem is again based on transducers. This allows us to use an analogy with automata-based pattern matching methods. Finally, we discuss conditions under which the same principle can be used for other distance measures.
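
As an illustration of the definitions only, the sketch below computes border lengths with the classical prefix-function (failure-function) approach and finds approximate borders by a direct Hamming-distance check. It is not the ALL_BORDERS transducer described in the paper and does not handle don't care symbols. Restricting the output to proper, non-empty borders is an assumption made here, and the function names are hypothetical.

```python
def border_lengths(s):
    """Lengths of all proper, non-empty borders of s, in ascending order."""
    n = len(s)
    pi = [0] * n  # pi[i] = length of the longest proper border of s[:i + 1]
    for i in range(1, n):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    lengths = []
    k = pi[-1] if n else 0
    while k > 0:  # every border of a border is also a border of s
        lengths.append(k)
        k = pi[k - 1]
    return sorted(lengths)


def approximate_border_lengths(s, max_mismatches):
    """Lengths l < len(s) where prefix and suffix of length l differ in at
    most max_mismatches positions (Hamming distance); quadratic time."""
    n = len(s)
    result = []
    for length in range(1, n):
        mismatches = sum(a != b for a, b in zip(s[:length], s[n - length:]))
        if mismatches <= max_mismatches:
            result.append(length)
    return result


print(border_lengths("abacaba"))                 # [1, 3]
print(approximate_border_lengths("abcabd", 1))   # [1, 3]
```

With `max_mismatches=0` the second function reduces to the exact case, which gives a simple cross-check of the two routines.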

Modelling nonlinear dynamics of interacting tipping elements on complex networks: the PyCascades package

The European Physical Journal Special Topics ◽

10.1140/epjs/s11734-021-00155-4 ◽

2021 ◽

Author(s):

Nico Wunderling ◽

Jonathan Krönke ◽

Valentin Wohlfarth ◽

Jan Kohler ◽

Jobst Heitzig ◽

...

Keyword(s):

Dynamical Systems ◽

Open Source Software ◽

Software Package ◽

Model Systems ◽

Stochastic Dynamical Systems ◽

Moisture Recycling ◽

Modelling Framework ◽

Different Types ◽

Open Source Software Package ◽

The Individual

Tipping elements occur in various systems such as in socio-economics, ecology and the climate system. In many cases, the individual tipping elements are not independent of each other, but they interact across scales in time and space. To model systems of interacting tipping elements, we here introduce the PyCascades open source software package for studying interacting tipping elements (10.5281/zenodo.4153102). PyCascades is an object-oriented and easily extendable package written in the programming language Python. It allows for investigating under which conditions potentially dangerous cascades can emerge between interacting dynamical systems, with a focus on tipping elements. With PyCascades it is possible to use different types of tipping elements such as double-fold and Hopf types and interactions between them. PyCascades can be applied to arbitrary complex network structures and has recently been extended to stochastic dynamical systems. This paper provides an overview of the functionality of PyCascades by introducing the basic concepts and the methodology behind it. In the end, three examples are discussed, showing three different applications of the software package. First, the moisture recycling network of the Amazon rainforest is investigated. Second, a model of interacting Earth system tipping elements is discussed. And third, the PyCascades modelling framework is applied to a global trade network.
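
The idea of a cascade between two interacting tipping elements can be sketched in plain Python without the PyCascades API. The block below assumes a commonly used double-fold normal form, dx/dt = -x^3 + x + c, a linear one-way coupling term and forward-Euler integration. The equation, the parameter values and the function name `simulate_pair` are assumptions for illustration and are not taken from the package.

```python
def simulate_pair(c1, c2, coupling, dt=0.01, steps=5000):
    """Forward-Euler integration of two double-fold elements.

    Assumed model (not the PyCascades API):
        dx1/dt = -x1**3 + x1 + c1
        dx2/dt = -x2**3 + x2 + c2 + coupling * (x1 + 1) / 2
    Both elements start in the lower ("untipped") state x = -1.
    """
    x1, x2 = -1.0, -1.0
    for _ in range(steps):
        dx1 = -x1 ** 3 + x1 + c1
        dx2 = -x2 ** 3 + x2 + c2 + coupling * (x1 + 1.0) / 2.0
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Element 1 is forced past its fold (c1 above about 0.385); element 2 is not.
print(simulate_pair(0.5, 0.2, coupling=0.5))  # element 2 is dragged along
print(simulate_pair(0.5, 0.2, coupling=0.0))  # element 2 stays untipped
```

Under these assumptions the second element tips only when the coupling is switched on, which is the kind of cascade condition the abstract refers to.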

RULE-BASED SYLLABIFICATION OF KOREAN WORDS WRITTEN IN LATIN USING DETERMINISTIC FINITE AUTOMATA MODELS

Jurnal Terapan Teknologi Informasi ◽

10.21460/jutei.2018.21.77 ◽

2018 ◽

Vol 2 (1) ◽

pp. 75-85

Author(s):

Rouly Doharma Sihite ◽

Aditya Wikan Mahastama

Keyword(s):

Success Rate ◽

Statistical Approach ◽

Markov Models ◽

Finite Automata ◽

Writing System ◽

Test Results ◽

Writing Systems ◽

Research Focus ◽

Finite State ◽

Catch Up

Transliteration remains a challenge in helping people read or write across writing systems. Korean transliteration has been a topic of research aimed at automating the conversion between Hangul (the Korean writing system) and Latin characters. Previous work on transliterating Hangul to Latin used a statistical approach (72.2% accuracy) and Extended Markov Models (54.9% accuracy). This research focuses on transliterating Latin (romanised) Korean words into Hangul, as many learners of Korean begin with Latin script. The selected method models the probable vowel and consonant forms and the probable vowel and consonant sequences using Finite State Automata, which avoids training. These models are then coded into rules, which were applied to and tested on 100 random Korean words. Initial tests yielded only a 40% success rate, because consonants have to be labeled as the initial or final of a syllable and some consonants were missed by the modeled rules. Additional rules were then added to catch up and merge these consonants into existing proper syllables, which increased the success rate to 92%. Further analysis of this result found that certain consonant sequences cause syllabification problems when they occur in certain positions. Further rules were inserted, yielding a 99% final success rate, which is also the accuracy of transliterating Korean words written in Latin into Hangul characters in compound syllables.

Bayesian inversion of magnetotelluric data considering dimensionality discrepancies

Geophysical Journal International ◽

10.1093/gji/ggaa391 ◽

2020 ◽

Vol 223 (3) ◽

pp. 1565-1583

Author(s):

Hoël Seillé ◽

Gerhard Visser

Keyword(s):

Learning Algorithm ◽

Synthetic Data ◽

Real Data ◽

Error Model ◽

Bayesian Inversion ◽

Magnetotelluric Data ◽

Data Set ◽

Likelihood Functions ◽

Training Images ◽

Phase Tensor

Bayesian inversion of magnetotelluric (MT) data is a powerful but computationally expensive approach to estimate the subsurface electrical conductivity distribution and associated uncertainty. Approximating the Earth subsurface with 1-D physics considerably speeds up calculation of the forward problem, making the Bayesian approach tractable, but can lead to biased results when the assumption is violated. We propose a methodology to quantitatively compensate for the bias caused by the 1-D Earth assumption within a 1-D trans-dimensional Markov chain Monte Carlo sampler. Our approach determines site-specific likelihood functions which are calculated using a dimensionality discrepancy error model derived by a machine learning algorithm trained on a set of synthetic 3-D conductivity training images. This is achieved by exploiting known geometrical dimensional properties of the MT phase tensor. A complex synthetic model which mimics a sedimentary basin environment is used to illustrate the ability of our workflow to reliably estimate uncertainty in the inversion results, even in the presence of strong 2-D and 3-D effects. Using this dimensionality discrepancy error model, we demonstrate that on this synthetic data set our workflow performs better in 80 per cent of the cases compared to the existing practice of using constant errors. Finally, our workflow is benchmarked against real data acquired in Queensland, Australia, and shows its ability to detect the depth to basement accurately.

Preface

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001497000676 ◽

1997 ◽

Vol 11 (07) ◽

pp. 1023-1024

Author(s):

Serge Miguet ◽

Annick Montanvert ◽

P. S. P. Wang

Keyword(s):

Finite State Machine ◽

Finite Automata ◽

Upper Bounds ◽

Two Dimensional ◽

Turing Machines ◽

Closure Properties ◽

Working Space ◽

Finite State ◽

The Individual ◽

Alternating Turing Machines

Several nonclosure properties of each class of sets accepted by two-dimensional alternating one-marker automata, alternating one-marker automata with only universal states, nondeterministic one-marker automata, deterministic one-marker automata, alternating finite automata, and alternating finite automata with only universal states are shown. To do this, we first establish the upper bounds of the working space used by "three-way" alternating Turing machines with only universal states to simulate those "four-way" non-storage machines. These bounds provide us with a simplified and unified proof method for all variants of one-marker and/or alternating finite state machines, without directly analyzing the complex behavior of the individual four-way machine on two-dimensional rectangular input tapes. We also summarize the known closure properties including Boolean closures for all the variants of two-dimensional alternating one-marker automata.

Streaming feature-based causal structure learning algorithm with symmetrical uncertainty

Information Sciences ◽

10.1016/j.ins.2018.04.076 ◽

2018 ◽

Vol 467 ◽

pp. 708-724 ◽

Cited By ~ 2

Author(s):

Jing Yang ◽

Xiaoxue Guo ◽

Ning An ◽

Aiguo Wang ◽

Kui Yu

Keyword(s):

Structure Learning ◽

Learning Algorithm ◽

Causal Structure ◽

Symmetrical Uncertainty ◽

Feature Based

Fast Incremental SVDD Learning Algorithm with the Gaussian Kernel

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013991 ◽

2019 ◽

Vol 33 ◽

pp. 3991-3998 ◽

Cited By ~ 1

Author(s):

Hansi Jiang ◽

Haoyu Wang ◽

Wenhao Hu ◽

Deovrat Kakde ◽

Arin Chaudhuri

Keyword(s):

Outlier Detection ◽

Learning Algorithm ◽

Large Data ◽

Real Data ◽

Gaussian Kernel ◽

Support Vector ◽

Support Vector Data Description ◽

Detection Accuracy ◽

Single Class ◽

Support Vectors

Support vector data description (SVDD) is a machine learning technique that is used for single-class classification and outlier detection. The idea of SVDD is to find a set of support vectors that defines a boundary around data. When dealing with online or large data, existing batch SVDD methods have to be rerun in each iteration. We propose an incremental learning algorithm for SVDD that uses the Gaussian kernel (FISVDD). This algorithm builds on the observation that all support vectors on the boundary have the same distance to the center of the sphere in a higher-dimensional feature space as mapped by the Gaussian kernel function. Each iteration involves only the existing support vectors and the new data point. Moreover, the algorithm is based solely on matrix manipulations; the support vectors and their corresponding Lagrange multipliers α_i are automatically selected and determined in each iteration. It can be seen that the complexity of our algorithm in each iteration is only O(k^2), where k is the number of support vectors. Experimental results on some real data sets indicate that FISVDD demonstrates significant gains in efficiency with almost no loss in either outlier detection accuracy or objective function value.
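
The equal-distance observation can be checked numerically with the standard kernel expression for the squared feature-space distance from a point z to the SVDD center, d^2(z) = K(z, z) - 2 Σ_i α_i K(x_i, z) + Σ_i Σ_j α_i α_j K(x_i, x_j), where K(z, z) = 1 for the Gaussian kernel. The sketch below only evaluates this expression; it is not the incremental algorithm from the paper. The bandwidth parametrization, the hand-picked multipliers and the function names are assumptions for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)) (assumed form)."""
    squared_norm = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_norm / (2.0 * bandwidth ** 2))


def squared_distance_to_center(z, support_vectors, alphas, bandwidth):
    """Squared feature-space distance from z to the center sum_i alpha_i phi(x_i)."""
    cross = sum(a * gaussian_kernel(sv, z, bandwidth)
                for a, sv in zip(alphas, support_vectors))
    center_norm = sum(ai * aj * gaussian_kernel(xi, xj, bandwidth)
                      for ai, xi in zip(alphas, support_vectors)
                      for aj, xj in zip(alphas, support_vectors))
    return 1.0 - 2.0 * cross + center_norm  # K(z, z) = 1 for this kernel


# Hypothetical symmetric configuration; the multipliers are chosen by hand,
# not obtained by solving the SVDD optimization problem.
svs = [(0.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]
for point in svs + [(10.0, 0.0)]:
    print(point, squared_distance_to_center(point, svs, alphas, bandwidth=1.0))
```

In this symmetric toy case both support vectors lie at the same distance from the center, while the far point lies further out and would be flagged as an outlier.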

LIMITED AUTOMATA AND REGULAR LANGUAGES

International Journal of Foundations of Computer Science ◽

10.1142/s0129054114400140 ◽

2014 ◽

Vol 25 (07) ◽

pp. 897-916 ◽

Cited By ~ 15

Author(s):

GIOVANNI PIGHIZZINI ◽

ANDREA PISONI

Keyword(s):

Finite Automata ◽

Upper And Lower Bounds ◽

Finite State Automata ◽

Turing Machines ◽

Double Exponential ◽

Finite State ◽

A Cell ◽

Fixed Constant ◽

The One ◽

Context Free

Limited automata are one-tape Turing machines that are allowed to rewrite the content of any tape cell only in the first d visits, for a fixed constant d. In the case d = 1, namely, when a rewriting is possible only during the first visit to a cell, these models have the same power as finite state automata. We prove state upper and lower bounds for the conversion of 1-limited automata into finite state automata. In particular, we prove a double exponential state gap between nondeterministic 1-limited automata and one-way deterministic finite automata. The gap reduces to a single exponential in the case of deterministic 1-limited automata. This also implies an exponential state gap between nondeterministic and deterministic 1-limited automata. Another consequence is that 1-limited automata can have fewer states than equivalent two-way nondeterministic finite automata. We show that this is true even if we restrict to the case of the one-letter input alphabet. For each d ≥ 2, d-limited automata are known to characterize the class of context-free languages. Using the Chomsky-Schützenberger representation for context-free languages, we present a new conversion from context-free languages into 2-limited automata.

A METHOD FOR INFERRING HIERARCHICAL DYNAMICS IN STOCHASTIC PROCESSES

Advances in Complex Systems ◽

10.1142/s0219525908001507 ◽

2008 ◽

Vol 11 (01) ◽

pp. 1-16 ◽

Cited By ~ 13

Author(s):

OLOF GÖRNERUP ◽

MARTIN NILSSON JACOBI

Keyword(s):

Time Series ◽

Stochastic Processes ◽

A Priori ◽

Main Idea ◽

A Priori Knowledge ◽

Finite State Automaton ◽

State Spaces ◽

Markovian Dynamics ◽

Finite State ◽

Operational Algorithm

Complex systems may often be characterized by their hierarchical dynamics. In this paper we present a method and an operational algorithm that automatically infer this property in a broad range of systems — discrete stochastic processes. The main idea is to systematically explore the set of projections from the state space of a process to smaller state spaces, and to determine which of the projections impose Markovian dynamics on the coarser level. These projections, which we call Markov projections, then constitute the hierarchical dynamics of the system. The algorithm operates on time series or other statistics, so a priori knowledge of the intrinsic workings of a system is not required in order to determine its hierarchical dynamics. We illustrate the method by applying it to two simple processes — a finite state automaton and an iterated map.
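
One standard way to test whether a projection of a Markov chain onto a coarser state space is again Markovian is strong lumpability: for every pair of blocks, all states in the source block must have the same total transition probability into the target block. Whether this coincides with the exact criterion used in the paper is an assumption. The sketch below is illustrative, with a hypothetical transition matrix and function name.

```python
def is_lumpable(transition, partition, tol=1e-12):
    """Strong lumpability test for a row-stochastic matrix.

    transition: list of rows, transition[i][j] = P(i -> j)
    partition:  list of blocks, each a list of state indices
    """
    for source_block in partition:
        for target_block in partition:
            # Total probability of jumping into target_block from each state
            # of source_block; these must agree for the projection to be Markov.
            mass = [sum(transition[i][j] for j in target_block)
                    for i in source_block]
            if max(mass) - min(mass) > tol:
                return False
    return True


# Hypothetical 3-state chain used only for illustration.
P = [
    [0.50, 0.25, 0.25],
    [0.20, 0.40, 0.40],
    [0.20, 0.30, 0.50],
]
print(is_lumpable(P, [[0], [1, 2]]))  # True: states 1 and 2 can be merged
print(is_lumpable(P, [[0, 1], [2]]))  # False: states 0 and 1 behave differently
```

Exploring the set of projections, as the abstract describes, would then amount to running such a test over candidate partitions of the state space.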
