Lyndon Factorization Algorithms for Small Alphabets and Run-Length Encoded Strings

Algorithms ◽  
2019 ◽  
Vol 12 (6) ◽  
pp. 124
Author(s):  
Sukhpal Ghuman ◽  
Emanuele Giaquinta ◽  
Jorma Tarhio

We present two modifications of Duval’s algorithm for computing the Lyndon factorization of a string. One of the algorithms is designed for strings containing runs of the smallest character. It works best for small alphabets and is able to skip a significant number of characters of the string. Moreover, it can be engineered to have linear time complexity in the worst case. Given a run-length encoded string R of length ρ, the other algorithm computes the Lyndon factorization of R in O(ρ) time and constant space. Experimental results show that the new variations are faster than Duval’s original algorithm in many scenarios.
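
For reference, below is a minimal Python sketch of Duval’s original algorithm, the linear-time, constant-space baseline that both variants modify (function and variable names are ours, not the paper’s):

def duval(s):
    # Emits the Lyndon factorization of s: a non-increasing sequence
    # of Lyndon words whose concatenation is s. O(n) time, O(1) extra
    # space beyond the output list.
    n, i = len(s), 0
    factorization = []
    while i < n:
        j, k = i + 1, i
        # Extend the candidate while it remains a prefix of a power
        # of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every full period (of length j - k) of the word found.
        while i <= k:
            factorization.append(s[i:i + j - k])
            i += j - k
    return factorization

For example, duval("banana") returns ['b', 'an', 'an', 'a'].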

Robotics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 68
Author(s):  
Lei Shi ◽  
Cosmin Copot ◽  
Steve Vanlanduit

In gaze-based Human-Robot Interaction (HRI), it is important to determine human visual intention for interacting with robots. One typical HRI scenario is that a human selects an object by gaze and a robotic manipulator picks it up. In this work, we propose an approach, GazeEMD, that can be used to detect whether a human is looking at an object in HRI applications. We use Earth Mover’s Distance (EMD) to measure the similarity between the hypothetical gazes at objects and the actual gazes. The similarity score is then used to determine whether the human visual intention is on the object. We compare our approach with a fixation-based method and with HitScan with a run length in the scenario of selecting daily objects by gaze. Our experimental results indicate that the GazeEMD approach has higher accuracy and is more robust to noise than the other approaches. Hence, users can lessen their cognitive load by using our approach in real-world HRI scenarios.
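
As a much-simplified illustration of the idea (not the authors’ implementation), the sketch below scores how closely a recorded gaze trace matches a hypothetical gaze fixed on an object’s centre, using SciPy’s one-dimensional Wasserstein (EMD) distance per axis; the threshold value is an arbitrary placeholder:

import numpy as np
from scipy.stats import wasserstein_distance

def looking_at(gaze_x, gaze_y, obj_x, obj_y, threshold=30.0):
    # gaze_x, gaze_y: arrays of gaze sample coordinates (pixels).
    # obj_x, obj_y: the object's centre. The hypothetical gaze is a
    # point mass at the centre, so the EMD reduces to a per-axis
    # comparison against a single value.
    emd = (wasserstein_distance(np.asarray(gaze_x), [obj_x]) +
           wasserstein_distance(np.asarray(gaze_y), [obj_y]))
    return emd < threshold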


2020 ◽  
Vol 37 (06) ◽  
pp. 2050034
Author(s):  
Ali Reza Sepasian ◽  
Javad Tayyebi

This paper studies two types of reverse 1-center problems under a uniform linear cost function, in which edge lengths are allowed to be reduced. In the first type, the aim is to keep the objective value within a prescribed fixed bound [Formula: see text] at minimum cost. The aim of the other is to improve the objective value as much as possible within a given budget. An algorithm based on dynamic programming is proposed to solve the first problem in linear time. This algorithm is then applied as a subroutine to solve the second type in [Formula: see text] time, in which [Formula: see text] is a fixed number depending on the problem parameters. Under the similarity assumption, this algorithm has a better complexity than the quadratic-time algorithm of Nguyen (2013). Numerical experiments are conducted to validate this fact in practice.


2020 ◽  
Author(s):  
Ahsan Sanaullah ◽  
Degui Zhi ◽  
Shaojie Zhang

Durbin’s PBWT, a scalable data structure for haplotype matching, has been successfully applied to identical-by-descent (IBD) segment identification and genotype imputation. Once the PBWT of a haplotype panel is constructed, it supports efficient retrieval of all shared long segments among all individuals (long matches) and efficient queries between an external haplotype and the panel. However, the standard PBWT is an array-based static data structure and does not support dynamic updates of the panel. Here, we generalize the static PBWT to a dynamic data structure, d-PBWT, in which the reverse prefix sorting at each position is represented by linked lists. We developed efficient algorithms for insertion and deletion of individual haplotypes. In addition, we verified that d-PBWT can support all algorithms of PBWT. In doing so, we systematically investigated variations of the set maximal match and long match query algorithms: while they all have average-case time complexity independent of database size, they differ in worst-case complexity, in their linear-time dependence on genome length, and in their reliance on additional data structures.
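
For intuition, here is a minimal sketch of the reverse prefix sorting that underlies the static PBWT (in the spirit of Durbin’s construction); d-PBWT replaces these arrays with linked lists so that haplotypes can be inserted and deleted without rebuilding. The names and the 0/1-string input format are our assumptions:

def pbwt_orders(haps):
    # haps: list of equal-length strings over {'0','1'}, one per
    # haplotype. Returns, for every site k, the positional prefix
    # array: haplotype indices sorted by their reversed prefixes
    # haps[i][:k]. Each step is a stable partition by the allele at
    # site k, so the construction is linear in the panel size.
    ppa = list(range(len(haps)))
    orders = [ppa[:]]
    for k in range(len(haps[0])):
        zeros = [i for i in ppa if haps[i][k] == '0']
        ones = [i for i in ppa if haps[i][k] == '1']
        ppa = zeros + ones
        orders.append(ppa[:])
    return orders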


Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 211 ◽  
Author(s):  
Pierluigi Crescenzi ◽  
Clémence Magnien ◽  
Andrea Marino

The harmonic closeness centrality measure associates with each node of a graph the average of the inverses of its distances to all the other nodes (assuming that unreachable nodes are at infinite distance). This notion has been adapted to temporal graphs (that is, graphs in which edges can appear and disappear over time), and in this paper we address the question of finding the top-k nodes for this metric. Computing the temporal closeness of one node can be done in O(m) time, where m is the number of temporal edges, so computing the closeness exactly for all nodes, in order to find the ones with top closeness, would require O(nm) time, where n is the number of nodes. This time complexity is intractable for large temporal graphs. Instead, we show how this measure can be efficiently approximated by using a “backward” temporal breadth-first search algorithm and a classical sampling technique. Our experimental results show that the approximation is excellent for nodes with high closeness, allowing us to detect them in practice in a fraction of the time needed to compute the exact closeness of all nodes. We validate our approach with an extensive set of experiments.
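
To convey the sampling idea only, here is a hedged sketch on a static directed graph (the paper works on temporal graphs via backward temporal BFS, which this deliberately does not reproduce): sample k reference nodes, run a BFS from each over the reversed edges so that distances towards the reference node are obtained, and rescale the sample mean:

import random
from collections import deque

def approx_harmonic_closeness(adj, k, seed=0):
    # adj: dict mapping each node to a list of out-neighbours.
    # Estimates h(v) = (1/(n-1)) * sum over u != v of 1/d(v, u)
    # from k uniformly sampled reference nodes.
    nodes = list(adj)
    n = len(nodes)
    radj = {v: [] for v in nodes}             # reversed graph
    for u in nodes:
        for v in adj[u]:
            radj[v].append(u)
    est = {v: 0.0 for v in nodes}
    for t in random.Random(seed).sample(nodes, k):
        dist, q = {t: 0}, deque([t])
        while q:                               # backward BFS from t
            u = q.popleft()
            for w in radj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        for v, d in dist.items():
            if v != t:
                est[v] += 1.0 / d              # unreachable pairs add 0
    # sample mean over the k references, rescaled to the full average
    return {v: est[v] * n / (k * (n - 1)) for v in nodes}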


2017 ◽  
Vol 27 (01n02) ◽  
pp. 85-119 ◽  
Author(s):  
Karl Bringmann ◽  
Marvin Künnemann

The Fréchet distance is a well-studied and very popular measure of similarity of two curves. The best known algorithms have quadratic time complexity, which has recently been shown to be optimal assuming the Strong Exponential Time Hypothesis (SETH) [Bringmann, FOCS'14]. To overcome the worst-case quadratic time barrier, restricted classes of curves have been studied that attempt to capture realistic input curves. The most popular such class is that of c-packed curves, for which the Fréchet distance has a (1+ε)-approximation in time O(cn/ε + cn log n) [Driemel et al., DCG'12]. In dimension d ≥ 5 this cannot be improved to O((cn/√ε)^(1−δ)) for any δ > 0 unless SETH fails [Bringmann, FOCS'14]. In this paper, exploiting properties that prevent stronger lower bounds, we present an improved algorithm with time complexity O((cn/√ε) log²(1/ε) + cn log n). This improves upon the algorithm by Driemel et al. for all sufficiently small ε. Moreover, our algorithm's dependence on c, n, and ε is optimal in high dimensions apart from lower-order factors, unless SETH fails. Our main new ingredients are as follows: for filling the classical free-space diagram we project short subcurves onto a line, which yields one-dimensional separated curves with roughly the same pairwise distances between vertices. We then tackle this special case in near-linear time by carefully extending a greedy algorithm for the Fréchet distance of one-dimensional separated curves.
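
For orientation, the quadratic baseline that the SETH lower bound addresses looks like the classic dynamic program below, shown here for the discrete Fréchet distance for brevity (the paper concerns the continuous distance on c-packed curves, which is considerably more involved):

import math

def discrete_frechet(P, Q):
    # P, Q: lists of points (tuples of coordinates).
    # Fills the |P| x |Q| table of the classic O(n^2) recurrence:
    # D[i][j] is the cost of the best coupling ending at (i, j).
    n, m = len(P), len(Q)
    D = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                D[i][j] = d
            elif i == 0:
                D[i][j] = max(D[0][j - 1], d)
            elif j == 0:
                D[i][j] = max(D[i - 1][0], d)
            else:
                D[i][j] = max(min(D[i - 1][j], D[i - 1][j - 1],
                                  D[i][j - 1]), d)
    return D[n - 1][m - 1]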


2010 ◽  
Vol 2 (2) ◽  
Author(s):  
Yu Zheng ◽  
Chee-Meng Chew

In research on multicontact robotic systems, the equilibrium test and contact force distribution are two fundamental problems: they require determining whether feasible contact forces exist subject to the friction constraints, and finding optimal values of those forces for counterbalancing the other wrenches applied to the system and maintaining it in equilibrium. All the wrenches, except those generated by the contact forces, can be treated as a whole, called the external wrench. The external wrench is time-varying in a dynamic system, and both problems usually must be solved in real time. This paper presents an efficient procedure for solving both problems. Using the linearized friction model, the resultant wrenches that can be produced by all contacts constitute a polyhedral convex cone in six-dimensional wrench space. Given an external wrench, the procedure computes the minimum distance between the wrench cone and the required equilibrating wrench, which is equal but opposite to the external wrench. A zero distance implies that the equilibrating wrench lies in the wrench cone and that the external wrench can be resisted by the contacts. A set of linearly independent wrench vectors in the wrench cone is then determined, such that the equilibrating wrench can be written as their positive combination. This procedure always terminates in a finite number of iterations and runs very fast, even in six-dimensional wrench space. Based on it, two contact force distribution methods are provided. One combines the procedure with the linear programming technique, yielding optimal contact forces with linear time complexity. The other directly utilizes the procedure without the aid of any general optimization technique, yielding suboptimal contact forces with nearly constant time complexity. Effective strategies are suggested to ensure solution continuity.
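
As a point of comparison only (the paper’s dedicated distance procedure avoids general-purpose solvers), the equilibrium test can be phrased as an off-the-shelf LP feasibility check: the system is in equilibrium iff the equilibrating wrench −w_ext is a nonnegative combination of the primitive contact wrenches, i.e. the edges of the linearized friction cones. Building the 6×m matrix W of those edges from the contact geometry is assumed to be done beforehand:

import numpy as np
from scipy.optimize import linprog

def equilibrium_test(W, w_ext):
    # W: 6 x m array whose columns are primitive contact wrenches
    # (edges of the linearized friction cones at all contacts).
    # Feasible iff -w_ext lies in the cone {W @ lam : lam >= 0}.
    m = W.shape[1]
    res = linprog(c=np.zeros(m),               # pure feasibility
                  A_eq=W, b_eq=-np.asarray(w_ext),
                  bounds=[(0, None)] * m, method="highs")
    return res.status == 0                     # 0: solved, feasible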


2001 ◽  
Vol 11 (06) ◽  
pp. 707-735 ◽  
Author(s):  
J.-M. CHAMPARNAUD ◽  
D. ZIADI

Two classical non-deterministic automata recognize the language denoted by a regular expression: the position automaton, which is deduced from the position sets defined by Glushkov and McNaughton–Yamada, and the equation automaton, which can be computed via Mirkin's prebases or Antimirov's partial derivatives. Let |E| be the size of the expression and ‖E‖ its alphabetic width, i.e. the number of symbol occurrences. The number of states of the equation automaton is less than or equal to the number of states of the position automaton, which is ‖E‖+1. On the other hand, the worst-case time complexity of Antimirov's algorithm is O(‖E‖³·|E|²), while it is only O(‖E‖·|E|) for the most efficient implementations yielding the position automaton (Brüggemann-Klein; Chang and Paige; Champarnaud et al.). We present an O(|E|²) space and time algorithm to compute the equation automaton. It is based on the notion of canonical derivative, which makes it possible to handle sets of word derivatives efficiently. Canonical derivatives also lead to a new O(|E|²) space and time algorithm to construct the position automaton.
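
To make the comparison point concrete, here is a compact sketch of the Glushkov position functions (nullable, first, last, follow) over a tagged-tuple representation of regular expressions; it illustrates the position automaton only, not the paper’s canonical-derivative algorithm, and the representation is our own choice:

import itertools

# Expressions: ("lit", a), ("cat", e, f), ("alt", e, f), ("star", e).
# linearize() tags each symbol occurrence with a position, so the
# position automaton has an initial state plus ‖E‖ position states.

def linearize(e, ctr=None):
    ctr = ctr or itertools.count(1)
    if e[0] == "lit":
        return ("lit", e[1], next(ctr))
    if e[0] == "star":
        return ("star", linearize(e[1], ctr))
    return (e[0], linearize(e[1], ctr), linearize(e[2], ctr))

def nullable(e):
    if e[0] == "lit":  return False
    if e[0] == "star": return True
    if e[0] == "cat":  return nullable(e[1]) and nullable(e[2])
    return nullable(e[1]) or nullable(e[2])                  # alt

def first(e):
    if e[0] == "lit":  return {e}
    if e[0] == "star": return first(e[1])
    if e[0] == "alt":  return first(e[1]) | first(e[2])
    return first(e[1]) | (first(e[2]) if nullable(e[1]) else set())

def last(e):
    if e[0] == "lit":  return {e}
    if e[0] == "star": return last(e[1])
    if e[0] == "alt":  return last(e[1]) | last(e[2])
    return last(e[2]) | (last(e[1]) if nullable(e[2]) else set())

def follow(e, table):
    # table: positioned literal -> set of literals that may follow it;
    # together with first() and last() this yields all transitions.
    if e[0] == "cat":
        follow(e[1], table); follow(e[2], table)
        for p in last(e[1]):
            table.setdefault(p, set()).update(first(e[2]))
    elif e[0] == "alt":
        follow(e[1], table); follow(e[2], table)
    elif e[0] == "star":
        follow(e[1], table)
        for p in last(e[1]):
            table.setdefault(p, set()).update(first(e[1]))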


Author(s):  
Desmond Schmidt

Multi-Version Documents, or MVDs, as described in Schmidt and Colomb (Schm09), provide a simple format for representing overlapping structures in digital text. They permit the reuse of existing technologies, such as XML, to encode the content of individual versions, while allowing overlapping hierarchies (separate, partial or conditional) and textual variation (insertions, deletions, alternatives and transpositions) to exist within the same document. Most desired operations on MVDs can be performed by simple algorithms in linear time. However, creating and editing MVDs is a much harder and more complex operation, one that resembles the multiple-sequence alignment problem in biology. The inclusion of the transposition operation in the alignment process makes this a hard problem, with no solutions known to be both optimal and practical. However, a suitable heuristic algorithm can be devised, based in part on the most recent biological alignment programs, whose time complexity is quadratic in the worst case and often much lower in practice. The results are satisfactory both in terms of speed and alignment quality. This means that MVDs can be considered a practical and editable format suitable for representing many cases of overlapping structure in digital text.


2013 ◽  
Vol 2013 ◽  
pp. 1-10
Author(s):  
Yonggang Zhang ◽  
Qian Yin ◽  
Xingjun Zhu ◽  
Zhanshan Li ◽  
Sibo Zhang ◽  
...  

Bidirectional singleton arc consistency (BiSAC), an extension of singleton arc consistency (SAC), has been proposed recently. The first contribution of this paper is to state and prove two theorems about BiSAC (one establishes a property of BiSAC, and the other permits the deletion of certain BiSAC-inconsistent values). Secondly, based on these properties, we present two algorithms, denoted BiSAC-DF and BiSAC-DP, to enforce BiSAC. We also prove their correctness and analyze their space and time complexity in detail. In addition, we show that BiSAC-DF admits a worst-case time complexity of O(en²d⁴) and a best case of O(en²d³) when the problem is already BiSAC, while BiSAC-DP attains the same best case when the constraint tightness is small. Finally, experiments on a wide range of CSP instances show that BiSAC-DF and BiSAC-DP are usually around one order of magnitude faster than the existing BiSAC-1; on some special instances, BiSAC-DP is about two orders of magnitude faster.
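
For readers unfamiliar with this family of algorithms, the sketch below shows plain SAC-1 built on AC-3, the simpler baseline that BiSAC strengthens (BiSAC additionally reasons in the reverse direction, which this sketch deliberately omits); the constraint representation, a dict of binary predicates, is our assumption:

from collections import deque

def revise(domains, cons, x, y):
    # Remove values of x lacking support on the constraint between
    # x and y. cons[(x, y)] is a predicate over (value_x, value_y).
    removed = False
    for a in list(domains[x]):
        if not any(cons[(x, y)](a, b) for b in domains[y]):
            domains[x].discard(a)
            removed = True
    return removed

def ac3(domains, cons):
    q = deque(cons)                            # all directed arcs
    while q:
        x, y = q.popleft()
        if revise(domains, cons, x, y):
            if not domains[x]:
                return False                   # domain wipe-out
            q.extend((z, w) for (z, w) in cons if w == x and z != y)
    return True

def sac1(domains, cons):
    # A value a is SAC-inconsistent if enforcing AC after assigning
    # x = a wipes out some domain; such values are deleted until a
    # fixpoint is reached. domains maps variables to sets of values.
    changed = True
    while changed:
        changed = False
        for x in domains:
            for a in list(domains[x]):
                test = {v: set(d) for v, d in domains.items()}
                test[x] = {a}
                if not ac3(test, cons):
                    domains[x].discard(a)
                    changed = True
                    if not domains[x]:
                        return False
    return True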


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Ross Mawhorter ◽  
Ran Libeskind-Hadas

Background: Maximum parsimony reconciliation in the duplication-transfer-loss model is a widely used method for analyzing the evolutionary histories of pairs of entities such as hosts and parasites, symbiont species, and species and genes. While efficient algorithms are known for finding maximum parsimony reconciliations, the number of such reconciliations can be exponential in the size of the trees. Since these reconciliations can differ substantially from one another, making inferences from any one reconciliation may lead to conclusions that are not supported, or are even contradicted, by other maximum parsimony reconciliations. Therefore, there is a need to find small sets of representative reconciliations when the space of solutions is large and diverse.
Results: We provide a general framework for hierarchically clustering the space of maximum parsimony reconciliations. We demonstrate this framework for two specific linkage criteria: one that seeks to maximize the average support of the events found in the reconciliations in each cluster, and one that seeks to minimize the distance between reconciliations in each cluster. We analyze the asymptotic worst-case running times and provide experimental results that demonstrate the viability and utility of this approach.
Conclusions: The hierarchical clustering method proposed here provides a new way to find a set of representative reconciliations in the potentially vast and diverse space of maximum parsimony reconciliations.
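
The linkage criteria above are specialized to reconciliation space, but the clustering skeleton itself is standard agglomerative hierarchical clustering; here is a generic sketch over an explicit pairwise distance matrix between sampled reconciliations (the paper operates on the full MPR space implicitly, which this does not capture):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cluster_reconciliations(dist_matrix, n_clusters):
    # dist_matrix: square symmetric array of pairwise distances
    # between (sampled) reconciliations. Average linkage mirrors the
    # "minimize distance within each cluster" flavour of criterion.
    condensed = squareform(np.asarray(dist_matrix), checks=False)
    Z = linkage(condensed, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")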

