Optimal Prefix Free Codes with Partial Sorting

Jérémy Barbay

doi:10.3390/a13010012

Algorithms ◽

10.3390/a13010012 ◽

2019 ◽

Vol 13 (1) ◽

pp. 12 ◽

Cited By ~ 2

Author(s):

Jérémy Barbay

Keyword(s):

Computational Model ◽

State Of The Art ◽

Constant Factor ◽

Minimal Amount ◽

Text Compression ◽

Worst Case ◽

Multiplicative Factor ◽

Analysis Technique ◽

Positive Weights ◽

Free Code

We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within O(n(1 + lg α)) ⊆ O(n lg n), where the alternation α ∈ [1..n−1] approximates the minimal amount of sorting required by the computation. This asymptotic complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation α. Such results refine the state-of-the-art complexity of Θ(n lg n) in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Besides the new analysis technique, this improvement is obtained by combining a new algorithm, inspired by van Leeuwen’s algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.’s deferred data structure to partially sort a multiset according to the queries performed on it (known since 1988). Preliminary experimental results on text compression by words show α to be polynomially smaller than n, which suggests improvements by at most a constant multiplicative factor in the running time for such applications.
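For orientation, the classic two-queue procedure for already sorted weights, which the abstract attributes to van Leeuwen, can be sketched as follows. This is a minimal illustration of that textbook technique only, not the paper's partial-sorting algorithm; the function name `optimal_code_cost_sorted` and the choice to return only the total weighted codeword length are assumptions made for the example.

```python
from collections import deque


def optimal_code_cost_sorted(weights):
    """Total weighted codeword length of an optimal prefix free code.

    Assumes `weights` is already sorted in non-decreasing order, so the
    whole computation takes linear time (no heap is needed).
    """
    if len(weights) < 2:
        return 0
    leaves = deque(weights)   # input weights, sorted by assumption
    internal = deque()        # merged nodes, produced in non-decreasing order
    cost = 0

    def pop_min():
        # The smallest remaining node is at the front of one of the two queues.
        if not internal or (leaves and leaves[0] <= internal[0]):
            return leaves.popleft()
        return internal.popleft()

    while len(leaves) + len(internal) > 1:
        merged = pop_min() + pop_min()
        internal.append(merged)
        cost += merged        # each merge adds one bit to every leaf below it
    return cost
```

Because merged nodes are created in non-decreasing order, a plain queue suffices for them, which is what removes the lg n factor once the input is sorted.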

Angle Bisector Algorithm and Modified Dynamic Programming Algorithm for Dubins Traveling Salesman Problem

10.36227/techrxiv.14767593.v1 ◽

2021 ◽

Author(s):

Eswara Venkata Kumar Dhulipala

Keyword(s):

Euclidean Distance ◽

Travelling Salesman Problem ◽

Dynamic Programming Algorithm ◽

Constant Factor ◽

Programming Algorithm ◽

Worst Case ◽

Angle Bisector ◽

Set Of Points ◽

The Given ◽

Tour Length

The Dubins Travelling Salesman Problem (DTSP) of finding a minimum length tour through a given set of points is considered. DTSP involves a Dubins vehicle, which is capable of moving only forward with constant speed. In this paper, first, a worst case upper bound is obtained on the DTSP tour length by assuming the DTSP tour sequence to be the same as the Euclidean Travelling Salesman Problem (ETSP) tour sequence. It is noted that, in the worst case, any algorithm that uses the ETSP tour sequence is a constant factor approximation algorithm for DTSP. Next, two new algorithms are introduced, viz., the Angle Bisector Algorithm (ABA) and the Modified Dynamic Programming Algorithm (MDPA). In ABA, the ETSP tour sequence is used as the DTSP tour sequence, and the orientation angle at each point $i_k$ is calculated by using the angle bisector of the relative angle formed between the rays $i_{k}i_{k-1}$ and $i_ki_{k+1}$. In MDPA, the tour sequence and orientation angles are computed in an integrated manner. It is shown that ABA and MDPA are constant factor approximation algorithms and that ABA provides an improved upper bound compared to the Alternating Algorithm (AA) \cite{savla2008traveling}. Through numerical simulations, we show that ABA provides an improved tour length compared to AA, the Single Vehicle Algorithm (SVA) \cite{rathinam2007resource} and the Optimized Heading Algorithm (OHA) \cite{babel2020new,manyam2018tightly} when the Euclidean distance between any two points in the given set of points is at least $4\rho$, where $\rho$ is the minimum turning radius. The time complexity of ABA is comparable with that of AA and SVA and is better than that of OHA. We also show that MDPA provides an improved tour length compared to AA and SVA and is comparable with OHA when there is no constraint on the Euclidean distance between the points. In particular, ABA gives a tour length which is at most 4% more than the ETSP tour length when the Euclidean distance between any two points in the given set of points is at least $4\rho$.
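The geometric step named in the abstract, the bisector of the angle at $i_k$ between the rays towards $i_{k-1}$ and $i_{k+1}$, can be sketched as below. The abstract does not say how the vehicle heading is derived from this bisector, so the sketch stops at the bisector direction. The function name `bisector_angle`, the tuple representation of points and the tie-breaking rule for collinear points are all assumptions made for illustration.

```python
import math


def bisector_angle(prev_pt, pt, next_pt):
    """Direction, in radians, of the bisector of the angle at `pt`
    formed by the rays pt -> prev_pt and pt -> next_pt."""
    ux, uy = prev_pt[0] - pt[0], prev_pt[1] - pt[1]
    vx, vy = next_pt[0] - pt[0], next_pt[1] - pt[1]
    nu, nv = math.hypot(ux, uy), math.hypot(vx, vy)
    if nu == 0 or nv == 0:
        raise ValueError("neighbouring points must differ from pt")
    # Sum of the two unit vectors points along the bisector.
    ux, uy, vx, vy = ux / nu, uy / nu, vx / nv, vy / nv
    bx, by = ux + vx, uy + vy
    if math.hypot(bx, by) < 1e-12:
        # Straight angle (collinear points): the unit vectors cancel.
        # Assumption: fall back to a perpendicular of the first ray.
        bx, by = -uy, ux
    return math.atan2(by, bx)
```

For example, with `pt` at the origin and neighbours on the positive x and y axes, the returned direction is 45 degrees.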

Lossless text compression using GPT-2 language model and Huffman coding

SHS Web of Conferences ◽

10.1051/shsconf/202110204013 ◽

2021 ◽

Vol 102 ◽

pp. 04013

Author(s):

Md. Atiqur Rahman ◽

Mohamed Hamada

Keyword(s):

Data Compression ◽

State Of The Art ◽

Language Model ◽

Huffman Coding ◽

Original Text ◽

Text Compression ◽

Compression Technique ◽

Daily Life Activities ◽

Burrows Wheeler Transform ◽

Compressed Data

Modern daily life activities produce a large amount of information as a result of advances in telecommunication. Storing this information on a digital device or transmitting it over the Internet is challenging, which makes data compression necessary. Research on data compression has therefore become a topic of great interest to researchers. Because compressed data is generally smaller than the original, data compression saves storage and increases transmission speed. In this article, we propose a text compression technique using the GPT-2 language model and Huffman coding. In the proposed method, the Burrows-Wheeler transform and a list of keys are used to reduce the original text file’s length. We then apply the GPT-2 language model followed by Huffman coding for encoding. The proposed method is compared with state-of-the-art techniques used for text compression. Finally, we show that the proposed method achieves a gain in compression ratio compared to the other state-of-the-art methods.

A Multiscale Chaotic Feature Extraction Method for Speaker Recognition

Complexity ◽

10.1155/2020/8810901 ◽

2020 ◽

Vol 2020 ◽

pp. 1-9

Author(s):

Jiang Lin ◽

Yi Yumei ◽

Zhang Maosheng ◽

Chen Defeng ◽

Wang Chao ◽

...

Keyword(s):

Feature Extraction ◽

Speaker Recognition ◽

Extraction Method ◽

State Of The Art ◽

Recognition System ◽

Nonlinear Dynamic Model ◽

Feature Extraction Method ◽

Analysis Technique ◽

Recognition Systems ◽

Environment Noise

In speaker recognition systems, feature extraction is a challenging task under environmental noise conditions. To improve the robustness of the feature, we propose a multiscale chaotic feature for speaker recognition. We use a multiresolution analysis technique to capture finer information on different speakers in the frequency domain. Then, we extract the chaotic characteristics of speech based on the nonlinear dynamic model, which helps to improve the discrimination of features. Finally, we use a GMM-UBM model to develop a speaker recognition system. Our experimental results verify its good performance. Under clean speech and noisy speech conditions, the ERR value of our method is reduced by 13.94% and 26.5%, respectively, compared with the state-of-the-art method.

APPROXIMATION ALGORITHMS FOR A VARIANT OF DISCRETE PIERCING SET PROBLEM FOR UNIT DISKS

International Journal of Computational Geometry & Applications ◽

10.1142/s021819591350009x ◽

2013 ◽

Vol 23 (06) ◽

pp. 461-477 ◽

Cited By ~ 7

Author(s):

MINATI DE ◽

GAUTAM K. DAS ◽

PAZ CARMI ◽

SUBHAS C. NANDY

Keyword(s):

Approximation Algorithms ◽

Simple Algorithm ◽

Constant Factor ◽

Performance Ratio ◽

Approximation Result ◽

Worst Case ◽

Approximation Factor ◽

Minimum Number ◽

Unit Disks ◽

Set Of Points

In this paper, we consider constant factor approximation algorithms for a variant of the discrete piercing set problem for unit disks. Here a set of points P is given; the objective is to choose the minimum number of points in P to pierce the unit disks centered at all the points in P. We first propose a very simple algorithm that produces a 12-approximation result in O(n log n) time. Next, we improve the approximation factor to 4 and then to 3. The worst case running times of these algorithms are O(n^8 log n) and O(n^15 log n) respectively. Apart from the space required for storing the input, the extra work-space requirement for each of these algorithms is O(1). Finally, we propose a PTAS for the same problem. Given a positive integer k, it can produce a solution with performance ratio [Formula: see text] in n^O(k) time.

Finding an Unknown Acyclic Orientation of a Given Graph

Combinatorics Probability Computing ◽

10.1017/s0963548309990289 ◽

2009 ◽

Vol 19 (1) ◽

pp. 121-131 ◽

Cited By ~ 4

Author(s):

OLEG PIKHURKO

Keyword(s):

Complete Graph ◽

Discrete Mathematics ◽

Np Hard ◽

Worst Case ◽

Acyclic Orientation ◽

Multiplicative Factor ◽

The Given

Let c(G) be the smallest number of edges we have to test in order to determine an unknown acyclic orientation of the given graph G in the worst case. For example, if G is the complete graph on n vertices, then c(G) is the smallest number of comparisons needed to sort n numbers. We prove that c(G) ≤ (1/4 + o(1))n^2 for any graph G on n vertices, answering in the affirmative a question of Aigner, Triesch and Tuza [Discrete Mathematics 144 (1995) 3–10]. Also, we show that, for every ϵ > 0, it is NP-hard to approximate the parameter c(G) within a multiplicative factor 74/73 − ϵ.

A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs

BMC Bioinformatics ◽

10.1186/s12859-020-03906-7 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Yubin Xiao ◽

Zheng Xiao ◽

Xiang Feng ◽

Zhiping Chen ◽

Linai Kuang ◽

...

Keyword(s):

Computational Model ◽

Cross Validation ◽

State Of The Art ◽

Prediction Methods ◽

Good Prediction ◽

Average Case ◽

Comparison Results ◽

Disease Associations ◽

Fold Cross Validation

Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and identifying the relationships between lncRNAs and diseases is useful for the diagnosis and treatment of diseases. Due to the high costs and time complexity of traditional bio-experiments, in recent years, more and more computational methods have been proposed by researchers to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods have various limitations as well.

Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be utilized to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which guarantees that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis, respectively. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experiment results show that 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515 respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA.

Conclusion: The simulation results show that FVTLDA has good prediction performance, which makes it a good supplement to future bioinformatics research.

Influence Maximization with Priority in Online Social Networks

Algorithms ◽

10.3390/a13080183 ◽

2020 ◽

Vol 13 (8) ◽

pp. 183

Author(s):

Canh V. Pham ◽

Dung K. T. Ha ◽

Quang C. Vu ◽

Anh N. Su ◽

Huan X. Hoang

Keyword(s):

Social Network ◽

Online Social Networks ◽

Sampling Method ◽

State Of The Art ◽

Influence Maximization ◽

Approximation Solution ◽

Worst Case ◽

New Approach ◽

Influence Spread ◽

Seed Collections

The Influence Maximization (IM) problem, which finds a set of k nodes (called the seed set) in a social network to initiate the influence spread so that the number of influenced nodes after the propagation process is maximized, is an important problem in information propagation and social network analysis. However, previous studies ignored the constraint of priority, which led to inefficient seed collections. In some real situations, companies or organizations often prioritize influencing potential users during their influence diffusion campaigns. In this paper, taking a new approach to these existing works, we propose a new problem called Influence Maximization with Priority (IMP), which finds a seed set of k nodes in a social network that influences the largest number of nodes, subject to the influence spread to a specific set of nodes U (called the priority set) being at least a given threshold T. We show that the problem is NP-hard under the well-known IC model. To find the solution, we propose two efficient algorithms, called Integrated Greedy (IG) and Integrated Greedy Sampling (IGS), with provable theoretical guarantees. IG provides a (1 − (1 − 1/k)^t)-approximation solution, where t ≥ 1 is an outcome of the algorithm. The worst-case approximation ratio is obtained when t = 1 and is equal to 1/k. In addition, IGS is an efficient randomized approximation algorithm based on a sampling method that provides a (1 − (1 − 1/k)^t − ϵ)-approximation solution with probability at least 1 − δ, with ϵ > 0 and δ ∈ (0, 1) as input parameters of the problem. We conduct extensive experiments on various real networks to compare our IGS algorithm to the state-of-the-art algorithms for the IM problem. The results indicate that our algorithm provides better solutions in terms of influence on the priority sets, approximately two to ten times higher than the threshold T, while its running time, memory usage and influence spread also compare well with the others.
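Reading the IG guarantee as 1 − (1 − 1/k)^t, which agrees with the stated worst case of 1/k at t = 1, a few values show how the bound improves as t grows. The helper below is only a numerical illustration of that formula; the name `ig_ratio` is hypothetical and nothing here reproduces the IG algorithm itself.

```python
def ig_ratio(k, t):
    """Approximation ratio 1 - (1 - 1/k)**t for seed set size k and
    algorithm outcome t (both assumed to be integers >= 1)."""
    if k < 1 or t < 1:
        raise ValueError("k and t must both be at least 1")
    return 1.0 - (1.0 - 1.0 / k) ** t


if __name__ == "__main__":
    # With k = 5 the bound starts at 1/k = 0.2 for t = 1 and grows with t.
    for t in (1, 2, 3, 5):
        print(t, round(ig_ratio(5, t), 4))
```

For a fixed k the ratio is increasing in t, so t = 1 is indeed the worst case.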

RF Performance of Si/SiGe MODFETs: A Simulation Study

VLSI Design ◽

10.1155/1998/29629 ◽

1998 ◽

Vol 8 (1-4) ◽

pp. 325-330

Author(s):

S. Roy ◽

A. Asenov ◽

S. Babiker ◽

J. R. Barker ◽

S. P. Beaumont

Keyword(s):

State Of The Art ◽

Strong Dependence ◽

Intrinsic Noise ◽

Performance Potential ◽

Figures Of Merit ◽

Analysis Technique ◽

Channel Velocity ◽

The Difference ◽

Rf Performance ◽

Gate Region

The microwave performance potential of Si/SiGe pseudomorphic MODFETs is studied in comparison with state-of-the-art InGaAs pseudomorphic HEMTs. Both devices have equivalent structures corresponding to a physical HEMT used for calibration. We use an RF analysis technique based on transient Monte Carlo simulations to estimate the intrinsic noise figures, the RF figures of merit fT and fmax, and the effect of contact and gate resistances. Both devices exhibit velocity overshoot below the gate region. It is shown that the difference in noise figures and fT values can be mainly attributed to differences in device channel velocity; fmax exhibits a strong dependence on device contact resistance, eroding some of the performance advantage of the pseudomorphic HEMT.

Modeling of Dive Maneuvers for Executing Autonomous Dives With a Flapping Wing Air Vehicle

Journal of Mechanisms and Robotics ◽

10.1115/1.4037760 ◽

2017 ◽

Vol 9 (6) ◽

Cited By ~ 4

Author(s):

Luke J. Roberts ◽

Hugh A. Bruck ◽

S. K. Gupta

Keyword(s):

Computational Model ◽

Operation Mode ◽

Open Loop ◽

Flapping Wing ◽

Minimal Amount ◽

University Of Maryland ◽

Roll Control ◽

Wind Speeds ◽

Air Vehicle ◽

The University

This paper focuses on the design of dive maneuvers that can be performed outdoors on flapping wing air vehicles (FWAVs) with a minimal amount of on-board computing capability. We present a simple computational model that provides an accuracy of 5 m in open loop operation mode for outdoor dives under wind speeds of up to 3 m/s. This model is executed using a low power, on-board processor. We have also demonstrated that the platform can independently execute roll control through tail positioning, and dive control through wing positioning, to produce safe dive behaviors. These capabilities were used to successfully demonstrate autonomous dive maneuvers on the Robo Raven platform developed at the University of Maryland.

A worst case timing analysis technique for instruction prefetch buffers

Microprocessing and Microprogramming ◽

10.1016/0165-6074(94)90017-5 ◽

1994 ◽

Vol 40 (10-12) ◽

pp. 681-684 ◽

Cited By ~ 6

Author(s):

Minsuk Lee ◽

Sang Lyul Min ◽

Chong Sang Kim

Keyword(s):

Timing Analysis ◽

Worst Case ◽

Analysis Technique
