A Composite Boyer-Moore Algorithm for the String Matching Problem

A New Algorithm for Subset Matching Problem Based on Set-String Transformation

Encyclopedia of Information Communication Technology ◽

10.4018/978-1-59904-845-1.ch080 ◽

2009 ◽

pp. 607-615

Author(s):

Yangjun Chen

Keyword(s):

Programming Languages ◽

Linear Time ◽

String Matching ◽

Special Problem ◽

Computer Engineering ◽

Data Types ◽

Matching Problem ◽

Matching Problems ◽

Abstract Data ◽

Geometric Pattern Matching

In computer engineering, a number of programming tasks involve a special problem, the so-called tree matching problem (Cole & Hariharan, 1997), as a crucial step. Examples include the design of interpreters for nonprocedural programming languages, automatic implementation of abstract data types, code optimization in compilers, symbolic computation, context searching in structure editors, and automatic theorem proving. Recently, it has been shown that this problem can be transformed in linear time to another problem, the so-called subset matching problem (Cole & Hariharan, 2002, 2003), which is to find all occurrences of a pattern string p of length m in a text string t of length n, where each pattern and text position is a set of characters drawn from some alphabet S. The pattern is said to occur at text position i if the set p[j] is a subset of the set t[i + j - 1] for all j (1 ≤ j ≤ m). This is a generalization of ordinary string matching and is of interest because an efficient algorithm for this problem implies an efficient solution to the tree matching problem. In addition, as shown in (Indyk, 1997), this problem can also be used to solve general string matching and counting matching (Muthukrishnan, 1997; Muthukrishnan & Palem, 1994), and enables the design of efficient algorithms for several geometric pattern matching problems. In this article, we propose a new algorithm for this problem, which needs only O(n + m) time when the size of S is small and O(n + m·n^0.5) time on average in general cases.
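To make the occurrence condition concrete, the following is a minimal brute-force sketch of subset matching in Python. It is not the algorithm proposed in the article and runs in O(n·m) set comparisons; the function name `subset_match` and the sample data are assumptions made for illustration only.

```python
from typing import List, Set


def subset_match(pattern: List[Set[str]], text: List[Set[str]]) -> List[int]:
    """Return the 1-based text positions i at which the pattern occurs,
    i.e. pattern[j] is a subset of text[i + j - 1] for all 1 <= j <= m.

    Brute-force illustration of the definition, not the article's method.
    """
    m, n = len(pattern), len(text)
    hits = []
    for start in range(n - m + 1):  # 0-based candidate start
        # `<=` on Python sets is the subset test.
        if all(pattern[j] <= text[start + j] for j in range(m)):
            hits.append(start + 1)  # report 1-based, as in the abstract
    return hits


# Hypothetical example data: each position is a set of characters.
text = [{"a", "b"}, {"b"}, {"a", "c"}, {"a", "b"}, {"b", "c"}]
pattern = [{"a"}, {"b"}]
print(subset_match(pattern, text))  # [1, 3, 4]
```

Ordinary string matching is the special case in which every set contains exactly one character.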

On the Comparison Complexity of the String Prefix-Matching Problem

BRICS Report Series ◽

10.7146/brics.v2i46.19947 ◽

1995 ◽

Vol 2 (46) ◽

Author(s):

Dany Breslauer ◽

Livio Colussi ◽

Laura Toniolo

Keyword(s):

Linear Time ◽

String Matching ◽

Random Access ◽

Upper Bounds ◽

Lower And Upper Bounds ◽

Matching Problem ◽

Machine Model ◽

Worst Case ◽

On Line ◽

Sequential Comparison

In this paper we study the exact comparison complexity of the string prefix-matching problem in the deterministic sequential comparison model with equality tests. We derive almost tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line prefix-matching algorithms for any fixed pattern and variable text. Unlike previous results on the comparison complexity of string-matching and prefix-matching algorithms, our bounds are almost tight for any particular pattern. We also consider the special case where the pattern and the text are the same string. This problem, which we call the string self-prefix problem, is similar to the pattern preprocessing step of the Knuth-Morris-Pratt string-matching algorithm, which is used in several comparison-efficient string-matching and prefix-matching algorithms, including our new algorithm. We obtain roughly tight lower and upper bounds on the number of symbol comparisons required in the worst case by on-line self-prefix algorithms. Our algorithms can be implemented in linear time and space in the standard uniform-cost random-access-machine model.
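For orientation, the Knuth-Morris-Pratt preprocessing step that the abstract compares the self-prefix problem to can be sketched as below. This is the textbook prefix (failure) function, shown only as an illustration; it is not the comparison-optimal algorithm of the paper, and the name `prefix_function` is an assumption.

```python
from typing import List


def prefix_function(s: str) -> List[int]:
    """Textbook KMP preprocessing.

    pi[i] is the length of the longest proper prefix of s[:i + 1]
    that is also a suffix of s[:i + 1]. Uses equality tests only.
    """
    pi = [0] * len(s)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(s)):
        # Fall back through shorter borders until s[i] extends one.
        while k > 0 and s[i] != s[k]:
            k = pi[k - 1]
        if s[i] == s[k]:
            k += 1
        pi[i] = k
    return pi


print(prefix_function("abacaba"))  # [0, 0, 1, 0, 1, 2, 3]
```

The loop runs in linear time overall because `k` increases by at most one per character and every fallback strictly decreases it.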

Approximate Chinese String Matching Techniques Based on Pinyin Input Method

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.513-517.1017 ◽

2014 ◽

Vol 513-517 ◽

pp. 1017-1020

Author(s):

Bing Liu ◽

Dan Han ◽

Shuang Zhang

Keyword(s):

Computer Science ◽

Rapid Development ◽

String Matching ◽

Approximate String Matching ◽

Chinese Characters ◽

Matching Problem ◽

Input Method ◽

Large Size ◽

Matching Techniques ◽

Research And Design

String matching is one of the most typical problems in computer science. Previous studies mainly focused on the exact string matching problem. However, with the rapid development of computers and the Internet and the continual emergence of new issues, researching and designing efficient approximate string matching algorithms has gained significant theoretical value and practical importance. Approximate string matching, also called string matching that allows errors, aims to find a pattern string in a text or database while allowing k differences between the pattern string and its occurrences in the text. Although a number of algorithms have been proposed for approximate string matching, few studies focus on large alphabets; most experts are interested in small or medium-sized alphabets. For large alphabets, especially Chinese characters and Asian phonetics, there are few efficient algorithms. For these reasons, this paper focuses on the approximate Chinese string matching problem based on the pinyin input method.
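The notion of finding a pattern with at most k differences can be illustrated with the classic dynamic-programming approach (often attributed to Sellers). The sketch below is a generic illustration over plain strings, not the pinyin-based technique of the paper; the function name `approx_find` and the choice of edit distance (insertions, deletions, substitutions) as the difference measure are assumptions.

```python
from typing import List


def approx_find(pattern: str, text: str, k: int) -> List[int]:
    """Return 1-based end positions in `text` where some substring matches
    `pattern` with at most k edits (insert, delete, substitute)."""
    m = len(pattern)
    # prev[j] = min edits to match pattern[:j] against a suffix of the
    # text read so far; before reading any text this is simply j.
    prev = list(range(m + 1))
    ends = []
    for pos, ch in enumerate(text, start=1):
        cur = [0]  # empty pattern prefix matches anywhere at zero cost
        for j in range(1, m + 1):
            cost = 0 if pattern[j - 1] == ch else 1
            cur.append(min(prev[j - 1] + cost,  # match or substitute
                           prev[j] + 1,         # extra character in text
                           cur[j - 1] + 1))     # pattern character missing
        if cur[m] <= k:
            ends.append(pos)
        prev = cur
    return ends


print(approx_find("abc", "xxabcxx", 0))  # [5]
print(approx_find("abc", "abd", 1))      # [2, 3]
```

The table costs O(m·n) time regardless of alphabet size, which is one reason the alphabet-dependent speedups discussed in the abstract are of interest.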

A Graph Theoretic Model to Solve the Approximate String Matching Problem Allowing for Translocations

Lecture Notes in Computer Science - Combinatorial Algorithms ◽

10.1007/978-3-642-35926-2_20 ◽

2012 ◽

pp. 169-181

Author(s):

Pritom Ahmed ◽

A. S. M. Shohidull Islam ◽

M. Sohel Rahman

Keyword(s):

String Matching ◽

Approximate String Matching ◽

Theoretic Model ◽

Matching Problem ◽

Graph Theoretic

The exact online string matching problem

ACM Computing Surveys ◽

10.1145/2431211.2431212 ◽

2013 ◽

Vol 45 (2) ◽

pp. 1-42 ◽

Cited By ~ 69

Author(s):

Simone Faro ◽

Thierry Lecroq

Keyword(s):

String Matching ◽

Matching Problem

Solution for String Matching Problem of Indian Alphabetical Letters

2009 International Conference on Computer Technology and Development ◽

10.1109/icctd.2009.233 ◽

2009 ◽

Author(s):

Adnan I. Al Rabea ◽

Mohammd Jaber ◽

A.V. Senthil Kumar

Keyword(s):

String Matching ◽

Matching Problem

On the massive string matching problem

2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) ◽

10.1109/fskd.2016.7603199 ◽

2016 ◽

Cited By ~ 1

Author(s):

Yangjun Chen ◽

Yujia Wu

Keyword(s):

String Matching ◽

Massive String ◽

Matching Problem

Modification of Valiant’s algorithm for the string-matching problem

Proceedings of the Institute for System Programming of RAS ◽

10.15514/ispras-2020-32(2)-11 ◽

2020 ◽

Vol 32 (2) ◽

pp. 135-148

Author(s):

Yuliya Alekseevna Susanina ◽

Anna Nikitichna Yaveyn ◽

Semyon Vyacheslavovich Grigorev

Keyword(s):

String Matching ◽

Matching Problem

Parallel Algorithms for String Matching Problem on Single and Two Dimensional Reconfigurable Pipelined Bus Systems

Journal of Computer Science ◽

10.3844/jcssp.2007.754.759 ◽

2007 ◽

Vol 3 (9) ◽

pp. 754-759 ◽

Cited By ~ 5

Author(s):

S.Viswanadha Raju ◽

A.Vinaya Babu

Keyword(s):

Parallel Algorithms ◽

String Matching ◽

Two Dimensional ◽

Matching Problem

Two Dimensional Matching

Pattern Matching Algorithms ◽

10.1093/oso/9780195113679.003.0012 ◽

1997 ◽

Author(s):

A. Amir ◽

M. Farach

Keyword(s):

Pattern Matching ◽

String Matching ◽

Higher Dimensions ◽

Natural Generalization ◽

Theoretical Problem ◽

Two Dimensional ◽

Exact Matching ◽

Matching Problem ◽

Deterministic Algorithms ◽

Special Case

String matching is a basic theoretical problem in computer science, but has been useful in implementing various text editing tasks. The explosion of multimedia requires an appropriate generalization of string matching to higher dimensions. The first natural generalization is that of seeking the occurrences of a pattern in a text where both pattern and text are rectangles. The last few years saw tremendous activity in two dimensional pattern matching algorithms. We naturally had to limit the amount of information that entered this chapter. We chose to concentrate on serial deterministic algorithms for some of the basic issues of two dimensional matching. Throughout this chapter we define our problems in terms of squares rather than rectangles; however, all results presented easily generalize to rectangles. The Exact Two Dimensional Matching Problem is defined as follows: . . . INPUT: Text array T[n x n] and pattern array P[m x m]. OUTPUT: All locations [i, j] in T where there is an occurrence of P, i.e. T[i+k, j+l] = P[k+1, l+1] for 0 ≤ k, l ≤ m-1. . . . A natural way of solving any generalized problem is by reducing it to a special case whose solution is known. It is therefore not surprising that most solutions to the two dimensional exact matching problem use exact string matching algorithms in one way or another. In this section, we present an algorithm for two dimensional matching which relies on reducing a matrix of characters into a one dimensional array. Let P'[1..m] be a pattern which is derived from P by setting P'[i] = P[i,1] P[i,2] … P[i,m], that is, the ith character of P' is the ith row of P. Let T_i[1..n-m+1], for 1 ≤ i ≤ n, be a set of arrays such that T_i[j] = T[i,j] T[i,j+1] … T[i,j+m-1]. Clearly, P occurs at T[i, j] iff P' occurs at T_i[j].
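The row-based reduction can be sketched as follows. This is a deliberately naive illustration of the idea rather than the chapter's algorithm: each length-m row slice of T is treated as one "character" T_i[j], and P' is searched for, as a one dimensional pattern, in the sequence T_1[j], T_2[j], ..., T_n[j] for each column j. That reading of "P' occurs at T_i[j]", the function name `match_2d`, and the sample arrays are assumptions.

```python
from typing import List, Sequence, Tuple


def match_2d(T: Sequence[str], P: Sequence[str]) -> List[Tuple[int, int]]:
    """Return 1-based (i, j) top-left corners where the m x m pattern P
    occurs in the n x n text T, by comparing whole rows as characters."""
    n, m = len(T), len(P)
    p_rows = [tuple(row) for row in P]  # P'[i] = i-th row of P
    hits = []
    for j in range(n - m + 1):
        # column[i] plays the role of T_i[j]: m characters of row i from j.
        column = [tuple(T[i][j:j + m]) for i in range(n)]
        # One dimensional search for P' in this sequence of row slices.
        for i in range(n - m + 1):
            if column[i:i + m] == p_rows:
                hits.append((i + 1, j + 1))
    return hits


# Hypothetical example: a 4 x 4 checkerboard text and a 2 x 2 pattern.
T = ["abab", "baba", "abab", "baba"]
P = ["ab", "ba"]
print(sorted(match_2d(T, P)))  # [(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)]
```

Replacing the inner loop with a linear-time one dimensional matcher, and giving each distinct row slice a compact name so that "characters" compare in constant time, is what turns this reduction into an efficient algorithm.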
