Abductive learning of quantized stochastic processes with probabilistic finite automata

Author(s):  
Ishanu Chattopadhyay ◽  
Hod Lipson

We present an unsupervised learning algorithm (GenESeSS) to infer the causal structure of quantized stochastic processes, defined as stochastic dynamical systems evolving over discrete time and producing quantized observations. Assuming ergodicity and stationarity, GenESeSS infers probabilistic finite state automata models from a sufficiently long observed trace. Our approach is abductive: it attempts to infer a simple hypothesis consistent with the observations and with a modelling framework that essentially fixes the hypothesis class. The probabilistic automata we infer have no initial or terminal states, have no structural restrictions, and are shown to be probably approximately correct-learnable. Additionally, we establish rigorous performance guarantees and data requirements, and show that GenESeSS correctly infers long-range dependencies. Modelling and prediction examples on simulated and real data establish relevance to automated inference of causal stochastic structures underlying complex physical phenomena.

1993 ◽  
Vol 18 (2-4) ◽  
pp. 209-220
Author(s):  
Michael Hadjimichael ◽  
Anita Wasilewska

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.


2007 ◽  
Vol 18 (04) ◽  
pp. 859-871
Author(s):  
MARTIN ŠIMŮNEK ◽  
BOŘIVOJ MELICHAR

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic keystones of stringology, used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and other fields. The paper offers an automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm, which finds all borders of a string with don't care symbols. We show that the ALL_BORDERS algorithm is an implementation of a finite state transducer of a specific form. We describe how such a transducer can be constructed and what form the input string should take. The described transducer finds the set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string under the Hamming distance. Our solution to this problem is again based on transducers, which allows us to use an analogy with automata-based pattern matching methods. Finally, we discuss conditions under which the same principle can be used for other distance measures.
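To make the notions of border and approximate border concrete, here is a minimal Python sketch. It is not the transducer construction from the paper and it does not handle don't care symbols: exact borders are computed with the classic prefix-function (failure-function) technique, and approximate borders by a direct Hamming-distance comparison of each prefix with the suffix of the same length. The function names `all_borders` and `approximate_borders` are hypothetical, chosen for illustration only.

```python
def all_borders(s: str) -> list[int]:
    """Return the lengths of all proper, non-empty borders of s, ascending.

    Uses the prefix function: pi[i] is the length of the longest proper
    border of s[:i + 1]. This is a standard technique, swapped in here
    for the paper's transducer-based method.
    """
    n = len(s)
    pi = [0] * n
    for i in range(1, n):
        k = pi[i - 1]
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k

    # Every border of s is a border of its longest border, so follow the chain.
    borders = []
    k = pi[-1] if n else 0
    while k > 0:
        borders.append(k)
        k = pi[k - 1]
    return sorted(borders)


def approximate_borders(s: str, max_mismatches: int) -> list[int]:
    """Return lengths b (0 < b < len(s)) for which the prefix and suffix of
    length b differ in at most max_mismatches positions (Hamming distance)."""
    n = len(s)
    result = []
    for b in range(1, n):
        mismatches = sum(1 for x, y in zip(s[:b], s[n - b:]) if x != y)
        if mismatches <= max_mismatches:
            result.append(b)
    return result
```

With `max_mismatches=0` the second function reduces to exact borders, which gives a simple consistency check between the two. The brute-force comparison is quadratic in the string length and is meant only to illustrate the definition.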


Author(s):  
Nico Wunderling ◽  
Jonathan Krönke ◽  
Valentin Wohlfarth ◽  
Jan Kohler ◽  
Jobst Heitzig ◽  
...  

Tipping elements occur in various systems such as socio-economics, ecology and the climate system. In many cases, the individual tipping elements are not independent of each other but interact across scales in time and space. To model systems of interacting tipping elements, we here introduce the PyCascades open source software package (10.5281/zenodo.4153102). PyCascades is an object-oriented and easily extendable package written in the programming language Python. It allows for investigating under which conditions potentially dangerous cascades can emerge between interacting dynamical systems, with a focus on tipping elements. With PyCascades it is possible to use different types of tipping elements, such as double-fold and Hopf types, and interactions between them. PyCascades can be applied to arbitrary complex network structures and has recently been extended to stochastic dynamical systems. This paper provides an overview of the functionality of PyCascades by introducing the basic concepts and the methodology behind it. Finally, three examples are discussed, showing three different applications of the software package. First, the moisture recycling network of the Amazon rainforest is investigated. Second, a model of interacting Earth system tipping elements is discussed. Third, the PyCascades modelling framework is applied to a global trade network.
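The idea of a cascade between interacting tipping elements can be sketched in a few lines of plain Python. The sketch below does not use the PyCascades API. It assumes a generic double-fold normal form, dx/dt = -x^3 + x + c, for each element and a linear coupling term `coupling * (x1 + 1)` from the first element to the second; both the coupling form and the function name `simulate_cascade` are assumptions made for illustration. For this normal form the lower stable state disappears once the forcing c exceeds 2/(3·√3) ≈ 0.385.

```python
def simulate_cascade(c1, c2, coupling, dt=0.01, steps=10000):
    """Integrate two double-fold elements with forward Euler.

    Element 1 is driven by c1 only. Element 2 is driven by c2 plus an
    assumed linear coupling that vanishes while element 1 sits at x1 = -1.
    Both elements start in the lower (untipped) state x = -1.
    """
    x1, x2 = -1.0, -1.0
    for _ in range(steps):
        dx1 = -x1**3 + x1 + c1
        dx2 = -x2**3 + x2 + c2 + coupling * (x1 + 1.0)
        x1 += dt * dx1
        x2 += dt * dx2
    return x1, x2


# Element 1 is forced past its critical threshold; element 2 is not forced.
coupled = simulate_cascade(c1=0.5, c2=0.0, coupling=0.3)
uncoupled = simulate_cascade(c1=0.5, c2=0.0, coupling=0.0)
```

In the coupled run both elements end in the upper state, so the tipping of element 1 cascades to element 2. In the uncoupled run element 2 stays at -1. Under these assumptions, the coupling strength determines whether a cascade emerges.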


2018 ◽  
Vol 2 (1) ◽  
pp. 75-85
Author(s):  
Rouly Doharma Sihite ◽  
Aditya Wikan Mahastama

Transliteration remains a challenge in helping people read or write across writing systems. Korean transliteration has been a topic of research aiming to automate the conversion between Hangul (the Korean writing system) and Latin characters. Previous work on transliterating Hangul to Latin used a statistical approach (72.2% accuracy) and Extended Markov Models (54.9% accuracy). This research focuses on transliterating Latin (romanised) Korean words into Hangul, as many learners of Korean begin with Latin script. The selected method models the probable vowel and consonant forms and the probable vowel and consonant sequences using Finite State Automata, which avoids training. These models are then coded into rules, which are applied to and tested on 100 random Korean words. The initial test yielded only a 40% success rate, because consonants have to be labeled as the initial or final of a syllable and some consonants were missed by the modeled rules. Additional rules were then added to catch these consonants and merge them into the proper existing syllables, which increased the success rate to 92%. Further analysis of this result showed that certain consonant sequences cause syllabification problems when they occur in certain positions. Further rules were inserted, yielding a final success rate of 99%, which is also the accuracy of transliterating Korean words written in Latin into Hangul characters in compound syllables.
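Why labeling a consonant as the initial or final of a syllable matters can be seen from the last step of any such pipeline: composing a Hangul syllable block from its parts. The Python sketch below shows only that step, using the arithmetic layout of precomposed Hangul syllables in Unicode (19 initials, 21 medials, 28 finals including "no final", starting at U+AC00). It is not the rule set from the paper; the function name `compose_syllable` and the use of compatibility jamo as lookup keys are assumptions made for illustration.

```python
# Jamo in Unicode composition order; compatibility jamo serve as lookup keys.
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = ["", "ㄱ", "ㄲ", "ㄳ", "ㄴ", "ㄵ", "ㄶ", "ㄷ", "ㄹ", "ㄺ", "ㄻ", "ㄼ",
          "ㄽ", "ㄾ", "ㄿ", "ㅀ", "ㅁ", "ㅂ", "ㅄ", "ㅅ", "ㅆ", "ㅇ", "ㅈ", "ㅊ",
          "ㅋ", "ㅌ", "ㅍ", "ㅎ"]


def compose_syllable(initial: str, medial: str, final: str = "") -> str:
    """Compose one precomposed Hangul syllable from its labeled parts.

    The same consonant has a different index as an initial and as a final,
    so it must be labeled before composition.
    """
    i = INITIALS.index(initial)
    m = MEDIALS.index(medial)
    f = FINALS.index(final)
    return chr(0xAC00 + (i * 21 + m) * 28 + f)


# "han" and "geul" after syllabification and labeling (hypothetical inputs).
word = compose_syllable("ㅎ", "ㅏ", "ㄴ") + compose_syllable("ㄱ", "ㅡ", "ㄹ")
```

Deciding where one syllable ends and the next begins, which the abstract identifies as the main source of errors, happens before this step and is not covered by the sketch.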


2020 ◽  
Vol 223 (3) ◽  
pp. 1565-1583
Author(s):  
Hoël Seillé ◽  
Gerhard Visser

Bayesian inversion of magnetotelluric (MT) data is a powerful but computationally expensive approach to estimate the subsurface electrical conductivity distribution and associated uncertainty. Approximating the Earth subsurface with 1-D physics considerably speeds up calculation of the forward problem, making the Bayesian approach tractable, but can lead to biased results when the assumption is violated. We propose a methodology to quantitatively compensate for the bias caused by the 1-D Earth assumption within a 1-D trans-dimensional Markov chain Monte Carlo sampler. Our approach determines site-specific likelihood functions, which are calculated using a dimensionality discrepancy error model derived by a machine learning algorithm trained on a set of synthetic 3-D conductivity training images. This is achieved by exploiting known geometrical dimensional properties of the MT phase tensor. A complex synthetic model which mimics a sedimentary basin environment is used to illustrate the ability of our workflow to reliably estimate uncertainty in the inversion results, even in the presence of strong 2-D and 3-D effects. Using this dimensionality discrepancy error model, we demonstrate that on this synthetic data set our workflow performs better in 80 per cent of the cases than the existing practice of using constant errors. Finally, our workflow is benchmarked against real data acquired in Queensland, Australia, and shows its ability to detect the depth to basement accurately.


Author(s):  
Serge Miguet ◽  
Annick Montanvert ◽  
P. S. P. Wang

Several nonclosure properties of each class of sets accepted by two-dimensional alternating one-marker automata, alternating one-marker automata with only universal states, nondeterministic one-marker automata, deterministic one-marker automata, alternating finite automata, and alternating finite automata with only universal states are shown. To do this, we first establish upper bounds on the working space used by "three-way" alternating Turing machines with only universal states to simulate those "four-way" non-storage machines. These bounds provide us with a simplified and unified proof method for all the variants of one-marker and/or alternating finite state machines, without directly analyzing the complex behavior of the individual four-way machine on two-dimensional rectangular input tapes. We also summarize the known closure properties, including Boolean closures, for all the variants of two-dimensional alternating one-marker automata.


2018 ◽  
Vol 467 ◽  
pp. 708-724 ◽  
Author(s):  
Jing Yang ◽  
Xiaoxue Guo ◽  
Ning An ◽  
Aiguo Wang ◽  
Kui Yu

Author(s):  
Hansi Jiang ◽  
Haoyu Wang ◽  
Wenhao Hu ◽  
Deovrat Kakde ◽  
Arin Chaudhuri

Support vector data description (SVDD) is a machine learning technique that is used for single-class classification and outlier detection. The idea of SVDD is to find a set of support vectors that defines a boundary around data. When dealing with online or large data, existing batch SVDD methods have to be rerun in each iteration. We propose an incremental learning algorithm for SVDD that uses the Gaussian kernel. This algorithm builds on the observation that all support vectors on the boundary have the same distance to the center of the sphere in a higher-dimensional feature space as mapped by the Gaussian kernel function. Each iteration involves only the existing support vectors and the new data point. Moreover, the algorithm is based solely on matrix manipulations; the support vectors and their corresponding Lagrange multipliers α_i are automatically selected and determined in each iteration. It can be seen that the complexity of our algorithm in each iteration is only O(k^2), where k is the number of support vectors. Experimental results on some real data sets indicate that FISVDD demonstrates significant gains in efficiency with almost no loss in either outlier detection accuracy or objective function value.
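The observation about equal distances can be checked numerically on a toy case. The Python sketch below is not the incremental algorithm from the paper. It only evaluates the standard kernel expression for the squared feature-space distance from a point z to the SVDD center a = Σ α_i φ(x_i), namely K(z, z) - 2 Σ α_i K(x_i, z) + Σ Σ α_i α_j K(x_i, x_j), where K(z, z) = 1 for the Gaussian kernel. The two support vectors, their multipliers (0.5 each, the symmetric solution for two points) and the bandwidth are hypothetical values chosen for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def squared_distance_to_center(z, support_vectors, alphas, bandwidth):
    """Squared feature-space distance from z to the center sum_i alpha_i phi(x_i)."""
    cross = sum(a * gaussian_kernel(sv, z, bandwidth)
                for sv, a in zip(support_vectors, alphas))
    center_norm = sum(ai * aj * gaussian_kernel(xi, xj, bandwidth)
                      for xi, ai in zip(support_vectors, alphas)
                      for xj, aj in zip(support_vectors, alphas))
    return 1.0 - 2.0 * cross + center_norm  # K(z, z) = 1 for the Gaussian kernel


# Hypothetical toy description: two support vectors with equal multipliers.
support_vectors = [(0.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]
bandwidth = 1.0

# Squared radius: the distance of a boundary support vector to the center.
radius_sq = squared_distance_to_center(support_vectors[0], support_vectors, alphas, bandwidth)
inside = squared_distance_to_center((1.0, 0.0), support_vectors, alphas, bandwidth) <= radius_sq
outlier = squared_distance_to_center((10.0, 0.0), support_vectors, alphas, bandwidth) > radius_sq
```

Both support vectors give the same squared distance, the midpoint falls inside the description and the far point is flagged as an outlier. The double sum over support vectors is what makes a single evaluation cost O(k^2) in this naive form.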


2014 ◽  
Vol 25 (07) ◽  
pp. 897-916 ◽  
Author(s):  
GIOVANNI PIGHIZZINI ◽  
ANDREA PISONI

Limited automata are one-tape Turing machines that are allowed to rewrite the content of any tape cell only in the first d visits, for a fixed constant d. In the case d = 1, namely, when a rewriting is possible only during the first visit to a cell, these models have the same power as finite state automata. We prove state upper and lower bounds for the conversion of 1-limited automata into finite state automata. In particular, we prove a double exponential state gap between nondeterministic 1-limited automata and one-way deterministic finite automata. The gap reduces to a single exponential in the case of deterministic 1-limited automata. This also implies an exponential state gap between nondeterministic and deterministic 1-limited automata. Another consequence is that 1-limited automata can have fewer states than equivalent two-way nondeterministic finite automata. We show that this is true even if we restrict to the case of a one-letter input alphabet. For each d ≥ 2, d-limited automata are known to characterize the class of context-free languages. Using the Chomsky-Schützenberger representation for context-free languages, we present a new conversion from context-free languages into 2-limited automata.


2008 ◽  
Vol 11 (01) ◽  
pp. 1-16 ◽  
Author(s):  
OLOF GÖRNERUP ◽  
MARTIN NILSSON JACOBI

Complex systems may often be characterized by their hierarchical dynamics. In this paper we present a method and an operational algorithm that automatically infer this property in a broad range of systems — discrete stochastic processes. The main idea is to systematically explore the set of projections from the state space of a process to smaller state spaces, and to determine which of the projections impose Markovian dynamics on the coarser level. These projections, which we call Markov projections, then constitute the hierarchical dynamics of the system. The algorithm operates on time series or other statistics, so a priori knowledge of the intrinsic workings of a system is not required in order to determine its hierarchical dynamics. We illustrate the method by applying it to two simple processes — a finite state automaton and an iterated map.
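A projection that preserves Markovian dynamics on the coarser level can be illustrated with a known transition matrix. The Python sketch below is not the algorithm from the paper, which operates on time series or other statistics. It assumes the transition matrix is given and tests one standard sufficient condition, strong lumpability: a partition of the states passes if, for every pair of blocks A and B, all states in A have the same total probability of jumping into B. The function name `is_markov_projection` and the example matrix are hypothetical.

```python
def is_markov_projection(P, partition, tol=1e-12):
    """Check strong lumpability of a row-stochastic matrix P under a partition.

    P is a list of rows; partition is a list of blocks, each a list of
    state indices. Returns True if the lumped process is again a Markov
    chain for every initial distribution.
    """
    for block_a in partition:
        for block_b in partition:
            # Probability of moving from each state of block_a into block_b.
            sums = [sum(P[i][j] for j in block_b) for i in block_a]
            if max(sums) - min(sums) > tol:
                return False
    return True


# Hypothetical 3-state chain.
P = [
    [0.5, 0.25, 0.25],
    [0.2, 0.4, 0.4],
    [0.2, 0.3, 0.5],
]

ok = is_markov_projection(P, [[0], [1, 2]])       # states 1 and 2 can be merged
not_ok = is_markov_projection(P, [[0, 1], [2]])   # states 0 and 1 cannot
```

Systematically exploring the set of projections, as the abstract describes, would amount to running such a check over candidate partitions. The number of partitions grows quickly with the number of states, so an exhaustive search is feasible only for small state spaces.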

