Inverted Index Compression: Latest Research Papers

Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm

Entropy ◽

10.3390/e23030296 ◽

2021 ◽

Vol 23 (3) ◽

pp. 296

Author(s):

Andrzej Chmielowiec ◽

Paweł Litwin

Keyword(s):

Compression Algorithm ◽

Binary Sequences ◽

Inverted Index ◽

Small Decrease ◽

Database Applications ◽

Main Application ◽

Fixed Length ◽

Number Of Zeros ◽

Index Compression ◽

Inverted Index Compression

This article deals with the compression of binary sequences with a given number of ones, which can equivalently be viewed as lists of indexes of a given length. The first part of the article shows that the entropy H of random n-element binary sequences with exactly k elements equal to one satisfies the inequalities k·log₂(0.48·n/k) < H < k·log₂(2.72·n/k). Based on this result, we propose a simple coding that uses fixed-length words. Its main application is the compression of random binary sequences with a large disproportion between the number of zeros and the number of ones. Importantly, the proposed solution allows much faster decompression than Golomb-Rice coding, at the cost of a relatively small decrease in compression efficiency. The proposed algorithm can be particularly useful for database applications in which decompression speed is much more important than the degree of index list compression.
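
The entropy bounds quoted in this abstract can be checked numerically. The sketch below is not taken from the paper: it assumes that H is the entropy of a uniformly random n-bit sequence with exactly k ones, i.e. H = log₂ C(n, k), and the function names `entropy_bits` and `bounds_bits` are hypothetical helpers introduced only for illustration.

```python
import math


def entropy_bits(n: int, k: int) -> float:
    """Entropy in bits, ASSUMING a uniform choice among all n-bit
    sequences with exactly k ones: H = log2(C(n, k))."""
    return math.log2(math.comb(n, k))


def bounds_bits(n: int, k: int) -> tuple[float, float]:
    """Lower and upper bounds on H as quoted in the abstract."""
    lower = k * math.log2(0.48 * n / k)
    upper = k * math.log2(2.72 * n / k)
    return lower, upper


if __name__ == "__main__":
    # Sparse sequences: few ones (k) among many positions (n).
    for n, k in [(1_000, 10), (100_000, 100), (1_000_000, 1_000)]:
        lower, upper = bounds_bits(n, k)
        h = entropy_bits(n, k)
        # Bits per stored index, a rough target for any index-list code.
        print(f"n={n} k={k}: {lower:.1f} < H={h:.1f} < {upper:.1f} "
              f"({h / k:.2f} bits per index)")
```

Dividing H by k gives the average number of bits needed per index, which is a convenient yardstick when comparing a fixed-length word code against Golomb-Rice coding on the same data.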

Techniques for Inverted Index Compression

ACM Computing Surveys ◽

10.1145/3415148 ◽

2020 ◽

Vol 53 (6) ◽

pp. 1-36

Author(s):

Giulio Ermanno Pibiri ◽

Rossano Venturini

Keyword(s):

Inverted Index ◽

Index Compression ◽

Inverted Index Compression

An Improved Group Similarity-Based Association Rule Mining Algorithm in Complex Scenes

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001420590053 ◽

2019 ◽

Vol 34 (02) ◽

pp. 2059005

Author(s):

Guiduo Duan ◽

Xiaotong Wang ◽

Tianxi Huang ◽

Jürgen Kurths

Keyword(s):

Association Rule ◽

Comparison Method ◽

Rule Mining ◽

Complex Scene ◽

Index Compression ◽

Complex Scenes ◽

General Data ◽

Inverted Index Compression ◽

Spark Framework ◽

Group Similarity

Association rule (AR) mining in complex scenes has attracted extensive attention from researchers in recent years. Typically, researchers have focused on an individual algorithm and overlooked general methods for improving the performance of AR mining. Luna et al. presented a general data structure, Speeding-Up AR Structure with Inverted Index Compression (SAII), which can be used in most existing algorithms to improve their performance [IEEE Trans. Cybern. 46(12) (2016) 3059–3072]. However, we found that this algorithm spends a lot of time re-ordering data because it uses a one-to-one comparison method in this process, which is the main reason the speeding-up structure is difficult to build for much larger amounts of data. To overcome this problem, this paper proposes an improved speeding-up AR algorithm based on group similarity and the Apache Spark framework to further reduce memory requirements and runtime. Our simulation results on a police business big dataset show that the improved approach performs well and is better suited to a big data environment.

Optimizing partitioning strategies for faster inverted index compression

Frontiers of Computer Science ◽

10.1007/s11704-016-6252-5 ◽

2019 ◽

Vol 13 (2) ◽

pp. 343-356 ◽

Cited By ~ 2

Author(s):

Xingshen Song ◽

Yuexiang Yang ◽

Yu Jiang ◽

Kun Jiang

Keyword(s):

Inverted Index ◽

Index Compression ◽

Inverted Index Compression

Inverted Index Compression

Encyclopedia of Big Data Technologies ◽

10.1007/978-3-319-77525-8_52 ◽

2019 ◽

pp. 1051-1058 ◽

Cited By ~ 2

Author(s):

Giulio Ermanno Pibiri ◽

Rossano Venturini

Keyword(s):

Inverted Index ◽

Index Compression ◽

Inverted Index Compression

New FastPFOR for Inverted File Compression

Handbook of Research on Biomimicry in Information Retrieval and Knowledge Management - Advances in Web Technologies and Engineering ◽

10.4018/978-1-5225-3004-6.ch006 ◽

2018 ◽

pp. 90-102

Author(s):

V. Glory ◽

S. Domnic

Keyword(s):

Information Retrieval ◽

Response Time ◽

Inverted Index ◽

Compression Technique ◽

Storage Structure ◽

Inverted File ◽

Index Compression ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Inverted Index Compression

The inverted index is used in most Information Retrieval Systems (IRS) to achieve fast query response times, and compression schemes are applied to the index to improve the efficiency of the IRS. In this chapter, the authors study and analyze various compression techniques used for indexing. They also present a new compression technique based on FastPFOR, called New FastPFOR. The storage structure and integer representation of the proposed method can improve its performance in both compression and decompression. A study of existing work shows that recent approaches provide good results either in compression or in decoding, but not in both, so their decompression performance is not satisfactory. To achieve better decompression performance, the authors propose New FastPFOR in this chapter and evaluate it experimentally on TREC collections. The results show that the proposed method can achieve better decompression performance than the existing techniques.

Inverted Index Compression

Encyclopedia of Big Data Technologies ◽

10.1007/978-3-319-63962-8_52-1 ◽

2018 ◽

pp. 1-8 ◽

Cited By ~ 3

Author(s):

Giulio Ermanno Pibiri ◽

Rossano Venturini

Keyword(s):

Inverted Index ◽

Index Compression ◽

Inverted Index Compression

Mapping the Semi-Structured Data to the Structured Data for Inverted Index Compression

International Journal of Database Theory and Application ◽

10.14257/ijdta.2017.10.1.22 ◽

2017 ◽

Vol 10 (1) ◽

pp. 235-244

Author(s):

B. Usharani

Keyword(s):

Structured Data ◽

Inverted Index ◽

Index Compression ◽

Inverted Index Compression

Speeding-Up Association Rule Mining With Inverted Index Compression

IEEE Transactions on Cybernetics ◽

10.1109/tcyb.2015.2496175 ◽

2016 ◽

Vol 46 (12) ◽

pp. 3059-3072 ◽

Cited By ~ 19

Author(s):

Jose Maria Luna ◽

Alberto Cano ◽

Mykola Pechenizkiy ◽

Sebastian Ventura

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Inverted Index ◽

Rule Mining ◽

Index Compression ◽

Inverted Index Compression

Leveraging Context-Free Grammar for Efficient Inverted Index Compression

Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval - SIGIR '16 ◽

10.1145/2911451.2911518 ◽

2016 ◽

Cited By ~ 9

Author(s):

Zhaohua Zhang ◽

Jiancong Tong ◽

Haibing Huang ◽

Jin Liang ◽

Tianlong Li ◽

...

Keyword(s):

Inverted Index ◽

Context Free Grammar ◽

Index Compression ◽

Inverted Index Compression ◽

Context Free
