Two Efficient Algorithms for Linear Time Suffix Array Construction

2011 ◽  
Vol 60 (10) ◽  
pp. 1471-1484 ◽  
Author(s):  
Ge Nong ◽  
Sen Zhang ◽  
Wai Hong Chan

2021 ◽  
Vol 11 (2) ◽  
pp. 283-302
Author(s):  
Paul Meurer

I describe several new efficient algorithms for querying large annotated corpora. The search algorithms as they are implemented in several popular corpus search engines are less than optimal in two respects: regular expression string matching in the lexicon is done in linear time, and regular expressions over corpus positions are evaluated starting in those corpus positions that match the constraints of the initial edges of the corresponding network. To address these shortcomings, I have developed an algorithm for regular expression matching on suffix arrays that allows fast lexicon lookup, and a technique for running finite state automata from the edges with the lowest corpus counts. Implementing the lexicon as a suffix array also lends itself to an elegant and efficient treatment of multi-valued and set-valued attributes. The described techniques have been implemented in a fully functional corpus management system and are also used in a treebank query system.
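As a rough illustration of why a suffix array speeds up lexicon lookup, the sketch below finds all occurrences of a plain substring by binary search over the sorted suffixes. It is not the regular expression algorithm of the abstract; the function names `build_suffix_array` and `find_occurrences` are hypothetical, and the construction is a naive sort rather than a linear-time method.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix they begin.
    # Assumption: simplicity over speed; this is not a linear-time construction.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of `pattern` in `text`."""
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in sa[start:lo] begin with the pattern.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(find_occurrences(text, sa, "ana"))
```

Each lookup inspects a logarithmic number of suffixes instead of scanning the whole lexicon, which is the property the abstract relies on.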


2015 ◽  
Vol 15 (01n02) ◽  
pp. 1550005
Author(s):  
WENJUN LIU ◽  
CHENG-KUAN LIN

Fault diagnosis is important for the reliability of interconnection networks. This paper addresses the fault diagnosis of the n-dimensional pancake graph Pn under the comparison diagnosis model. Using the concept of local diagnosability, we first prove that the diagnosability of Pn is n − 1, and that it has the strong local diagnosability property even if there are n − 3 faulty edges. Furthermore, we present efficient algorithms to locate extended star and Hamiltonian path structures in Pn, respectively. According to the works of Li et al. and Lai, the extended star and Hamiltonian path structures can be used to identify all faulty vertices in linear time, provided the number of faulty vertices is no more than n − 1.
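For readers unfamiliar with Pn, the following sketch builds the pancake graph under its standard definition: vertices are the permutations of 1..n, and two permutations are adjacent when one is obtained from the other by reversing a prefix of length 2 to n. The helper names are hypothetical, and the code only constructs the graph; it does not implement the diagnosis algorithms of the paper.

```python
from itertools import permutations


def pancake_neighbors(perm):
    # Reverse every prefix of length 2..n; each reversal gives one neighbor.
    return [perm[:k][::-1] + perm[k:] for k in range(2, len(perm) + 1)]


def pancake_graph(n):
    # Adjacency lists keyed by permutation tuples (n! vertices in total).
    return {p: pancake_neighbors(p) for p in permutations(range(1, n + 1))}


graph = pancake_graph(4)
print(len(graph))
print(graph[(1, 2, 3, 4)])
```

Every vertex has exactly n − 1 neighbors, so the diagnosability n − 1 stated in the abstract equals the vertex degree of Pn.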


1998 ◽  
Vol 07 (02) ◽  
pp. 121-142 ◽  
Author(s):  
ASSEF CHMEISS ◽  
PHILIPPE JEGOU

Recently, efficient algorithms have been proposed to achieve arc- and path-consistency in constraint networks. For example, for arc-consistency, there are linear time algorithms (in the size of the problem) which are efficient in practice (e.g. AC-6 and AC-7). The best path-consistency algorithm proposed is PC-{5|6}, which is a natural generalization of AC-6 to path-consistency. While its theoretical complexity is the best, experiments clearly show that it is not very efficient in practice. In this paper, we propose two algorithms, one for arc-consistency, AC-8, and the second for path-consistency, PC-8. These algorithms are based on the same principle: to exploit minimal supports as AC-6 and PC-{5|6} do, but without recording them. While this approach is of limited interest for AC-8, we show that for path-consistency it significantly outperforms existing algorithms.
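To make the notion of arc-consistency concrete, here is a minimal sketch of the classical AC-3 procedure, which is simpler than the AC-6, AC-7, and AC-8 algorithms named in the abstract and is not taken from the paper. The representation is an assumption: `domains` maps each variable to a set of values, and `constraints` maps each directed arc (x, y) to a predicate on a value pair.

```python
from collections import deque


def revise(domains, constraints, x, y):
    """Drop values of x that have no supporting value in y."""
    allowed = constraints[(x, y)]
    unsupported = {a for a in domains[x]
                   if not any(allowed(a, b) for b in domains[y])}
    domains[x] -= unsupported
    return bool(unsupported)


def ac3(domains, constraints):
    """Enforce arc-consistency in place; return False if a domain empties."""
    queue = deque(constraints)  # start with every directed arc
    while queue:
        x, y = queue.popleft()
        if revise(domains, constraints, x, y):
            if not domains[x]:
                return False
            # The domain of x shrank, so arcs pointing at x must be rechecked.
            queue.extend((z, w) for (z, w) in constraints
                         if w == x and z != y)
    return True


# Hypothetical example: X < Y < Z over the values 1..3.
domains = {v: {1, 2, 3} for v in "XYZ"}
constraints = {
    ("X", "Y"): lambda a, b: a < b,
    ("Y", "X"): lambda a, b: a > b,
    ("Y", "Z"): lambda a, b: a < b,
    ("Z", "Y"): lambda a, b: a > b,
}
print(ac3(domains, constraints), domains)
```

AC-3 searches for a support from scratch on every revision; the algorithms discussed in the abstract differ in how they find and reuse supports.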


2015 ◽  
Vol 57 (2) ◽  
pp. 166-174 ◽  
Author(s):  
H. CHARKHGARD ◽  
M. SAVELSBERGH

We investigate two routing problems that arise when order pickers traverse an aisle in a warehouse. The routing problems can be viewed as Euclidean travelling salesman problems with points on two parallel lines. We show that if the order picker traverses only a section of the aisle and then returns, then an optimal solution can be found in linear time, and if the order picker traverses the entire aisle, then an optimal solution can be found in quadratic time. Moreover, we show how to approximate the routing cost in linear time by computing a minimum spanning tree for the points on the parallel lines.


2014 ◽  
Vol 15 (1) ◽  
Author(s):  
Carl Barton ◽  
Alice Heliou ◽  
Laurent Mouchard ◽  
Solon P Pissis

2006 ◽  
Vol 17 (06) ◽  
pp. 1281-1295 ◽  
Author(s):  
FRANTISEK FRANEK ◽  
WILLIAM F. SMYTH

For certain problems (for example, computing repetitions and repeats, data compression applications) it is not necessary that the suffixes of a string represented in a suffix tree or suffix array should occur in lexicographical order (lexorder). It thus becomes of interest to study possible alternate orderings of the suffixes in these data structures, that may be easier to construct or more efficient to use. In this paper we consider the "reconstruction" of a suffix array based on a given reordering of the alphabet, and we describe simple time- and space-efficient algorithms that accomplish it.
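The effect of reordering the alphabet can be seen with a small sketch that simply rebuilds the suffix array from scratch under a given character order. This is a naive baseline for illustration, not the time- and space-efficient reconstruction described in the paper; the function name and the `alphabet_order` parameter are hypothetical.

```python
def suffix_array(text, alphabet_order):
    # Map each character to its rank in the chosen alphabet ordering.
    rank = {ch: i for i, ch in enumerate(alphabet_order)}
    # Sort suffix start positions by their rank sequences; Python compares
    # lists lexicographically, and a proper prefix sorts before its extension.
    return sorted(range(len(text)),
                  key=lambda i: [rank[c] for c in text[i:]])


print(suffix_array("banana", "abn"))  # usual order a < b < n
print(suffix_array("banana", "nba"))  # reordered alphabet n < b < a
```

The two calls return different permutations of the same suffix positions, which is the object a reconstruction algorithm would have to transform one into the other.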


2009 ◽  
Vol 20 (06) ◽  
pp. 1109-1133 ◽  
Author(s):  
JIE LIN ◽  
YUE JIANG ◽  
DON ADJEROH

We introduce the VST (virtual suffix tree), an efficient data structure for suffix trees and suffix arrays. Starting from the suffix array, we construct the suffix tree, from which we derive the virtual suffix tree. Later, we remove the intermediate step of suffix tree construction, and build the VST directly from the suffix array. The VST provides the same functionality as the suffix tree, including suffix links, but at a much smaller space requirement. It has the same linear time construction even for large alphabets, Σ, requires O(n) space to store (n is the string length), and allows searching for a pattern of length m to be performed in O(m log |Σ|) time, the same time needed for a suffix tree. Given the VST, we show an algorithm that computes all the suffix links in linear time, independent of Σ. The VST requires less space than other recently proposed data structures for suffix trees and suffix arrays, such as the enhanced suffix array [1], and the linearized suffix tree [17]. On average, the space requirement (including that for suffix arrays and suffix links) is 13.8n bytes for the regular VST, and 12.05n bytes in its compact form.

