A Computable Measure of Algorithmic Probability by Finite Approximations with an Application to Integer Sequences

Complexity ◽  
2017 ◽  
Vol 2017 ◽  
pp. 1-10 ◽  
Author(s):  
Fernando Soler-Toscano ◽  
Hector Zenil

Given the widespread use of lossless compression algorithms to approximate algorithmic (Kolmogorov-Chaitin) complexity, and given that generic lossless compression algorithms usually fall short at characterizing features other than statistical ones, making them effectively no different from entropy evaluations, here we explore an alternative and complementary approach. We study formal properties of a Levin-inspired measure m calculated from the output distribution of small Turing machines. We introduce and justify finite approximations m_k that have been used in some applications as an alternative to lossless compression algorithms for approximating algorithmic (Kolmogorov-Chaitin) complexity. We provide proofs of the relevant properties of both m and m_k and compare them to Levin’s Universal Distribution. We provide error estimations of m_k with respect to m. Finally, we present an application to integer sequences from the On-Line Encyclopedia of Integer Sequences, which suggests that our AP-based measures may characterize nonstatistical patterns, and we report interesting correlations with the textual, function, and program description lengths of the said sequences.
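To make the approach concrete, below is a minimal sketch of the Coding Theorem Method that such AP-based measures build on, under illustrative assumptions: it enumerates the small (2,2) Turing-machine space with a fixed step cutoff (the published enumerations use larger spaces and longer runtimes), counts how often each output string is produced by a halting machine, and estimates complexity as -log2 of that frequency.

```python
import itertools
from collections import Counter
from math import log2

def run_tm(rule, max_steps=50):
    """Simulate a 2-state, 2-symbol Turing machine on a blank tape;
    rule maps (state, symbol) -> (write, move, next_state), state 0 halts."""
    tape, pos, state = {}, 0, 1
    visited = {0}
    for _ in range(max_steps):
        write, move, nxt = rule[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        visited.add(pos)
        if nxt == 0:                       # halting state reached
            lo, hi = min(visited), max(visited)
            return ''.join(str(tape.get(i, 0)) for i in range(lo, hi + 1))
        state = nxt
    return None                            # no halt within the step cutoff

# Every (2,2) machine: each of the 4 (state, symbol) pairs takes one of
# 2 writes x 2 moves x 3 next states (0 = halt), i.e. 12^4 = 20736 machines.
instructions = [(w, m, s) for w in (0, 1) for m in (-1, 1) for s in (0, 1, 2)]
pairs = [(st, sy) for st in (1, 2) for sy in (0, 1)]

counts = Counter()
for combo in itertools.product(instructions, repeat=4):
    out = run_tm(dict(zip(pairs, combo)))
    if out is not None:
        counts[out] += 1

total = sum(counts.values())
for s, c in counts.most_common(5):
    # CTM estimate: K(s) ~ -log2 m(s); rarer outputs are more complex
    print(f"{s!r}: m(s) = {c / total:.5f}, -log2 m(s) = {-log2(c / total):.2f}")
```

Frequent outputs such as "0" or "1" receive low complexity estimates, while strings produced by few machines score high; this frequency-based scoring, rather than string statistics, is what lets the measure pick up nonstatistical regularities.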

Author(s):  
T. Narasimhulu

Computer systems and microarchitecture researchers have proposed using hardware data compression units within the memory hierarchies of microprocessors in order to improve performance, energy efficiency, and functionality. However, most past work, and all work on cache compression, has made unsubstantiated assumptions about the performance, power consumption, and area overheads of the proposed compression algorithms and hardware. In this work, we present a lossless compression algorithm that has been designed for fast online data compression, and cache compression in particular. The algorithm has a number of novel features tailored for this application, including combining pairs of compressed lines into one cache line and allowing parallel compression of multiple words while using a single dictionary and without degradation in compression ratio. We reduced the proposed algorithm to a register-transfer-level hardware design, permitting performance, power consumption, and area estimation.
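As an illustration of the single-dictionary, pattern-code style of compression such cache compressors rely on, here is a minimal sketch; the specific pattern codes, bit costs, and dictionary policy below are assumptions for illustration, not the paper's actual hardware encoding.

```python
# Hypothetical pattern codes and bit costs; a real design fixes these in RTL.
def compress_line(words, dict_size=16):
    dictionary, bits = [], 0
    for w in words:                        # one 32-bit word at a time
        if w == 0:
            bits += 2                      # 2-bit code: zero word
        elif w in dictionary:
            bits += 2 + 4                  # code + 4-bit dictionary index
        elif any((w >> 8) == (d >> 8) for d in dictionary):
            bits += 2 + 4 + 8              # code + index + differing low byte
        else:
            bits += 2 + 32                 # code + raw 32-bit word
            if len(dictionary) < dict_size:
                dictionary.append(w)       # fill the dictionary with new words
    return bits                            # compressed size in bits

line = [0x0, 0x12345678, 0x12345690, 0x0, 0xDEADBEEF, 0x12345678]
print(compress_line(line), "bits vs", 32 * len(line), "bits uncompressed")
```

Because each word is classified independently against a shared dictionary, several words can in principle be matched in parallel, which is the property the abstract highlights.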


2019 ◽  
Vol 11 (21) ◽  
pp. 2461 ◽  
Author(s):  
Kevin Chow ◽  
Dion Tzamarias ◽  
Ian Blanes ◽  
Joan Serra-Sagristà

This paper proposes a lossless coder for real-time processing and compression of hyperspectral images. After applying either a predictor or a differential encoder to reduce the bit rate of an image, exploiting the close similarity between pixels in neighboring bands, it uses a compact data structure called k²-raster to further reduce the bit rate. The advantage of using such a data structure is its compactness, with a size comparable to that produced by some classical compression algorithms while still providing direct access to its content for queries without any need for full decompression. Experiments show that using k²-raster alone already achieves much lower rates (up to a 55% reduction), and with preprocessing, the rates are reduced further, by up to 64%. Finally, we provide experimental results showing that the predictor produces a greater rate reduction than differential encoding.
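For intuition, here is a minimal sketch of the two preprocessing options the paper compares, differential encoding versus a predictor, applied across neighboring bands; the least-squares predictor and the synthetic cube are illustrative assumptions, not the paper's exact pipeline, and the k²-raster stage is omitted.

```python
import numpy as np

def differential(cube):
    """Replace each band after the first with its difference from the
    previous band; residuals between similar bands are small."""
    out = cube.copy()
    out[1:] = cube[1:] - cube[:-1]
    return out

def predict(cube):
    """Per-band predictor: scale the previous band by a least-squares
    factor before subtracting, leaving smaller residuals."""
    out = cube.copy()
    for b in range(1, cube.shape[0]):
        prev = cube[b - 1].astype(np.int64)
        cur = cube[b].astype(np.int64)
        alpha = (prev * cur).sum() / max((prev * prev).sum(), 1)
        out[b] = cur - np.round(alpha * prev).astype(np.int64)
    return out

# Synthetic cube of 5 strongly correlated 64x64 bands (12-bit range).
rng = np.random.default_rng(0)
base = rng.integers(0, 4096, (64, 64))
cube = np.stack([base + rng.integers(-20, 21, base.shape)
                 for _ in range(5)]).astype(np.int32)
print("mean |raw values|     :", np.abs(cube[1:]).mean())
print("mean |diff residuals| :", np.abs(differential(cube)[1:]).mean())
print("mean |pred residuals| :", np.abs(predict(cube)[1:]).mean())
```

Either transform leaves small residuals that a subsequent entropy coder or compact structure such as k²-raster can store far more cheaply than the raw pixel values.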


2007 ◽  
Author(s):  
Srikanth Gottipati ◽  
Jamal Goddard ◽  
Michael Grossberg ◽  
Irina Gladkova

2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Pamela Vinitha Eric ◽  
Gopakumar Gopalakrishnan ◽  
Muralikrishnan Karunakaran

This paper proposes a seed-based lossless compression algorithm to compress a DNA sequence, using a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures inherent in DNA sequences by creating an offline dictionary that contains all such repeats along with the details of mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than the existing lossless DNA sequence compression algorithms.
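For intuition, a minimal Lempel-Ziv-style substitution sketch in the same spirit is shown below; it replaces previously seen repeats with (offset, length) references, whereas the paper's method builds an offline dictionary of repeats with explicit mismatch details.

```python
def lz_compress(seq, min_match=4):
    """Greedy LZ-style pass: emit literals or (offset, length) back-references."""
    i, tokens = 0, []
    while i < len(seq):
        best_len, best_off = 0, 0
        for j in range(i):                         # candidate earlier positions
            k = 0
            while (i + k < len(seq) and j + k < i
                   and seq[j + k] == seq[i + k]):
                k += 1
            if k > best_len:
                best_len, best_off = k, i - j
        if best_len >= min_match:
            tokens.append(("ref", best_off, best_len))
            i += best_len
        else:
            tokens.append(("lit", seq[i]))
            i += 1
    return tokens

def lz_decompress(tokens):
    out = []
    for t in tokens:
        if t[0] == "lit":
            out.append(t[1])
        else:
            _, off, length = t
            start = len(out) - off
            out.extend(out[start:start + length])  # copy the earlier repeat
    return "".join(out)

seq = "ACGTACGTACGTTTACGT"
tokens = lz_compress(seq)
print(tokens, lz_decompress(tokens) == seq)
```

DNA's four-letter alphabet and long approximate repeats are what make such substitution schemes competitive; the paper's contribution is deciding which mismatched repeats are still "promising" enough to reference.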


2020 ◽  
Author(s):  
Miaoshan Lu ◽  
Shaowei An ◽  
Ruimin Wang ◽  
Jinyin Wang ◽  
Changbin Yu

ABSTRACT: As mass spectrometers reach higher precision and data-independent acquisition (DIA) becomes common, file sizes are growing rapidly. Beyond the widely used open format mzML (Deutsch 2008), near-lossless and lossless compression algorithms and formats have emerged. The data precision is often tied to the instrument and to the subsequent processing algorithms. Unlike storage-oriented formats, which focus more on lossless compression and compression rate, computation-oriented formats weigh decoding speed and disk-read strategy as heavily as compression rate. Here we describe "Aird", an open-source and computation-oriented format with controllable precision, flexible indexing strategies, and a high compression rate. Aird uses JavaScript Object Notation (JSON) for metadata storage, along with multiple-indexing and reordered-storage strategies for faster random reads of the data. Aird also provides a novel compressor called Zlib-Diff-PforDelta (ZDPD) for m/z data compression. Compared with Zlib alone, the m/z data size is about 65% lower in Aird, and decoding takes merely 33% of the time.

Availability: The Aird SDK is written in Java, allowing researchers to access mass spectrometry data efficiently. It is available at https://github.com/Propro-Studio/Aird-SDK. AirdPro can convert vendor files into Aird files and is available at https://github.com/Propro-Studio/AirdPro
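ZDPD's name suggests delta coding and PForDelta-style packing ahead of Zlib. The following is a minimal sketch of the delta-then-compress idea only, with an assumed fixed-point precision and a sorted m/z array; it is not the actual ZDPD codec, which adds the PForDelta-style bit packing.

```python
import zlib
import numpy as np

def encode_mz(mz, precision=1e-4):
    """Delta-then-deflate: quantize sorted m/z values to fixed point,
    delta code them, and hand the small residuals to zlib."""
    ints = np.round(np.asarray(mz) / precision).astype(np.int64)
    deltas = np.diff(ints, prepend=0)      # gaps between neighbors are small
    return zlib.compress(deltas.astype(np.int32).tobytes(), level=6)

rng = np.random.default_rng(0)
mz = np.sort(rng.uniform(100.0, 2000.0, 100_000))   # a sorted m/z array
raw = zlib.compress(np.round(mz / 1e-4).astype(np.int64).tobytes(), level=6)
print(f"zlib only: {len(raw)} bytes, delta + zlib: {len(encode_mz(mz))} bytes")
```

Sorted m/z values have small, similarly sized gaps, so the delta stream is far more repetitive than the raw values, which is why the combined scheme beats Zlib alone on both size and decoding time.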


2018 ◽  
Vol 12 (11) ◽  
pp. 387 ◽  
Author(s):  
Evon Abu-Taieh ◽  
Issam AlHadid

Multimedia is a highly competitive world, and one property in which this is reflected is the download and upload speed of multimedia elements: text, sound, pictures, and animation. This paper presents CRUSH, a lossless compression algorithm that can be used to compress files. The CRUSH method is fast and simple, with time complexity O(n), where n is the number of elements being compressed. Furthermore, the compressed file is independent of the algorithm and free of unnecessary data structures. The paper compares CRUSH with other compression algorithms: Shannon–Fano coding, Huffman coding, Run-Length Encoding (RLE), Arithmetic Coding, Lempel-Ziv-Welch (LZW), Burrows-Wheeler Transform, Move-to-Front (MTF) Transform, Haar, wavelet tree, Delta Encoding, Rice & Golomb Coding, Tunstall coding, the DEFLATE algorithm, and Run-Length Golomb-Rice (RLGR).
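Since the abstract does not describe CRUSH's mechanism, no sketch of it is attempted here; for reference, a minimal version of one listed baseline, Run-Length Encoding (RLE), is shown below.

```python
from itertools import groupby

def rle_encode(data):
    """Collapse each run of equal symbols into a (symbol, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(data)]

def rle_decode(pairs):
    return "".join(ch * n for ch, n in pairs)

pairs = rle_encode("AAAABBBCCD")
print(pairs, rle_decode(pairs) == "AAAABBBCCD")
```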

