Longest-Match Pattern Matching with Weighted Finite State Automata

Author(s):  
Thomas Hanneforth
2007 ◽  
Vol 18 (04) ◽  
pp. 859-871
Author(s):  
MARTIN ŠIMŮNEK ◽  
BOŘIVOJ MELICHAR

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic stringology keystones used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and others. The paper offers the automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that ALL_BORDERS algorithm is an implementation of a finite state transducer of specific form. We describe how such a transducer can be constructed and what should be the input string like. The described transducer finds a set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string when we concern Hamming distance definition. Our solution of this problem is based on transducers again. This allows us to use analogy with automata-based pattern matching methods. Finally we discuss conditions under which the same principle can be used for other distance measures.


2015 ◽  
Vol 8 (3) ◽  
pp. 721-730 ◽  
Author(s):  
Shambhu Sharan ◽  
Arun K. Srivastava ◽  
S. P. Tiwari

2021 ◽  
Vol 178 (1-2) ◽  
pp. 59-76
Author(s):  
Emmanuel Filiot ◽  
Pierre-Alain Reynier

Copyless streaming string transducers (copyless SST) have been introduced by R. Alur and P. Černý in 2010 as a one-way deterministic automata model to define transductions of finite strings. Copyless SST extend deterministic finite state automata with a set of variables in which to store intermediate output strings, and those variables can be combined and updated all along the run, in a linear manner, i.e., no variable content can be copied on transitions. It is known that copyless SST capture exactly the class of MSO-definable string-to-string transductions, and are as expressive as deterministic two-way transducers. They enjoy good algorithmic properties. Most notably, they have decidable equivalence problem (in PSpace). On the other hand, HDT0L systems have been introduced for a while, the most prominent result being the decidability of the equivalence problem. In this paper, we propose a semantics of HDT0L systems in terms of transductions, and use it to study the class of deterministic copyful SST. Our contributions are as follows: (i)HDT0L systems and total deterministic copyful SST have the same expressive power, (ii)the equivalence problem for deterministic copyful SST and the equivalence problem for HDT0L systems are inter-reducible, in quadratic time. As a consequence, equivalence of deterministic SST is decidable, (iii)the functionality of non-deterministic copyful SST is decidable, (iv)determining whether a non-deterministic copyful SST can be transformed into an equivalent non-deterministic copyless SST is decidable in polynomial time.


Sign in / Sign up

Export Citation Format

Share Document