Index Compression: Latest Research Papers

Compression is used to reduce the size of a table and the storage area it occupies. Oracle already provides this feature for table compression as well as index compression. An index created on a column of a table occupies disk space of its own, and compression reduces that footprint. This matters in industry because a company must purchase disk space according to the size of its data and pays in proportion to the space it uses; that space is better spent on useful records than wasted. This paper uses the Data Pump utility to compress a bitmap index and its table. Data Pump is normally used for logical backups of a database; here it is applied to compression, to release space and to update the locations the index points to. Otherwise the database does not release the space even after records are deleted. This is of special interest when compressing a bitmap index and table space alongside DML (Data Manipulation Language) operations.


An Optimization of Bitmap Index Compression Technique in Bulk Data Movement Infrastructure

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/1099/1/012074 ◽

2021 ◽

Vol 1099 (1) ◽

pp. 012074

Author(s):

Manoj Kumar ◽

Tarun Kumar Gupta ◽

Deepak Umrao Sarwe

Keyword(s):

Bitmap Index ◽

Compression Technique ◽

Data Movement ◽

Index Compression ◽

Bulk Data


Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm

Entropy ◽

10.3390/e23030296 ◽

2021 ◽

Vol 23 (3) ◽

pp. 296

Author(s):

Andrzej Chmielowiec ◽

Paweł Litwin

Keyword(s):

Compression Algorithm ◽

Binary Sequences ◽

Inverted Index ◽

Small Decrease ◽

Database Applications ◽

Main Application ◽

Fixed Length ◽

Number Of Zeros ◽

Index Compression ◽

Inverted Index Compression

This article deals with compression of binary sequences with a given number of ones, which can also be considered as a list of indexes of a given length. The first part of the article shows that the entropy H of random n-element binary sequences with exactly k elements equal to one satisfies the inequalities k·log₂(0.48·n/k) < H < k·log₂(2.72·n/k). Based on this result, we propose a simple coding using fixed-length words. Its main application is the compression of random binary sequences with a large disproportion between the number of zeros and the number of ones. Importantly, the proposed solution allows for much faster decompression than Golomb-Rice coding, with a relatively small decrease in compression efficiency. The proposed algorithm can be particularly useful for database applications in which the speed of decompression is much more important than the degree of index list compression.
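The inequalities quoted in this abstract can be checked numerically. The sketch below assumes the sequences are drawn uniformly at random, so that the entropy is H = log₂ C(n, k); that reading, the function names, and the sample (n, k) pairs are illustrative assumptions and are not taken from the paper.

```python
import math


def entropy_bits(n: int, k: int) -> float:
    """Entropy in bits of a uniformly random n-bit sequence with exactly k ones.

    Assumption: all C(n, k) such sequences are equally likely.
    """
    return math.log2(math.comb(n, k))


def bounds(n: int, k: int) -> tuple[float, float]:
    """Lower and upper bounds on H as quoted in the abstract."""
    lower = k * math.log2(0.48 * n / k)
    upper = k * math.log2(2.72 * n / k)
    return lower, upper


if __name__ == "__main__":
    # Sparse sequences: far more zeros than ones.
    for n, k in [(1000, 10), (10**6, 100), (10**6, 5000)]:
        lo, hi = bounds(n, k)
        h = entropy_bits(n, k)
        print(f"n={n}, k={k}: {lo:.1f} < {h:.1f} < {hi:.1f} is {lo < h < hi}")
```

The upper bound is close to the exact entropy for sparse inputs, which is consistent with the abstract's claim that a simple fixed-length code loses relatively little compression.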


Unified Evaluation of Consolidation Parameters for Low to High Plastic Range of Cohesive Soils

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.2101.09 ◽

2021 ◽

Vol 40 (1) ◽

pp. 93-103

Author(s):

Zia ur Rehman State ◽

Khalid Farooq ◽

Hassan Mujtaba ◽

Usama Khalid

Keyword(s):

Cohesive Soil ◽

Soil Samples ◽

Engineering Properties ◽

Cohesive Soils ◽

High Plasticity ◽

Index Properties ◽

Undisturbed Soil ◽

Index Compression ◽

The Right ◽

Consolidation Parameters

Knowing the engineering properties of geomaterials is imperative for making the right decisions when designing and executing any geotechnical project. For economical and safe geotechnical design, quick characterization of the compressibility properties of cohesive soil is often desirable; these properties are tedious to determine through actual tests. Correlating the consolidation parameters of soils with their index properties therefore has great significance in geotechnical engineering. Several attempts have been made in the past to develop correlations between the consolidation parameters and index properties of cohesive soils, within certain limitations. However, there is still a need to develop such correlations from an extensive database covering a unified plasticity range of soils, i.e., low to high plasticity. In the current study, 148 undisturbed soil specimens were obtained from different areas of Pakistan. Of these, 120 samples were used to develop correlations and 28 samples were used to check the validity of the developed correlations. To enlarge the index properties database, 30 more bentonite-mixed soil samples were prepared and tested accordingly. Correlations to predict consolidation parameters such as compression index, compression ratio, and coefficient of volume compressibility were developed using 150 cohesive soil samples of low to high plasticity. In addition, the performance of these correlations was verified on a set of 40 soil samples and compared with the performance of correlations available in the literature. The percentage deviation in the predicted compressibility characteristics was found to be very small, which supports the reliability of the developed correlations.


Techniques for Inverted Index Compression

ACM Computing Surveys ◽

10.1145/3415148 ◽

2020 ◽

Vol 53 (6) ◽

pp. 1-36

Author(s):

Giulio Ermanno Pibiri ◽

Rossano Venturini

Keyword(s):

Inverted Index ◽

Index Compression ◽

Inverted Index Compression


Adaptation of Combinatorial Encoding Scheme to Index Compression

2020 28th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu49456.2020.9302061 ◽

2020 ◽

Author(s):

Can Ozbey

Keyword(s):

Encoding Scheme ◽

Index Compression ◽

Combinatorial Encoding


Design and Analysis of Optimization and Tuning in Data Warehouses Using Bitmap Indexes

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200904171105 ◽

2020 ◽

Vol 13 ◽

Author(s):

Pankaj Dadheech ◽

Dinesh Goyal ◽

Ankit Kumar ◽

Amit Kumar Gupta

Keyword(s):

Internal Representation ◽

Physical Space ◽

Bitmap Index ◽

Disk Space ◽

Index Compression ◽

Insertion And Deletion ◽

Low Degree ◽

Main Index ◽

High Degree ◽

Bitmap Indexes

Introduction: A bitmap index is a special category of index that uses bitmaps (bit arrays) in a database. Apache stores a bitmap for every index key in a bitmap file. Each main index stores pointers to multiple rows. Maintaining a bitmap index takes considerable time, so bitmap indexes are only appropriate for tables that are updated occasionally. Method: Each bit of the map corresponds to a possible row id. If the bit is 1, the row with that id contains the key value. An internal Oracle function converts the bit position to the corresponding row id, so bitmap indexes offer the same functionality as B-tree indexes despite the different internal representation. If the number of distinct values in the indexed column is small, the bitmap index becomes very efficient in its use of physical space. Result: Oracle provides the following compression features, which are available during various database operations; that is, data can be compressed in the following modes. Several types of backup are possible in the database: whole or partial backup; full or incremental backup; cold (consistent) backup; and hot (inconsistent) backup. Discussion: We study the current compression technologies and add compression of the bitmap index via Data Pump. According to conventional wisdom, a bitmap index is most effective when the number of unique values is small. With Data Pump, however, it does not matter whether the bitmap index is built on a column of high or low cardinality. In this paper we propose the Data Pump utility for releasing disk space in the database after records are deleted. The bitmap index keeps pointing to the old location even after rows are deleted from the table, so deletion alone does not release disk space. Conclusion: In this paper, we present an experimental evaluation of bitmap index compression and of releasing the disk space occupied by database objects such as tables and indexes after records are deleted. Industrial databases frequently perform bulk insertion and deletion of data. Deleting millions of records from a table does not release the occupied disk space immediately. The next step in our research will be to release the disk space together with the deletion of records.
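The bit-per-row idea described under Method can be illustrated with a small in-memory sketch. This is a toy model only: it uses Python integers as bit arrays and list positions as row ids, and the names `build_bitmap_index` and `rows_matching` are hypothetical. It does not reproduce Oracle's internal format or the paper's Data Pump procedure.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is 1 iff row i holds that value."""
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)


def rows_matching(bitmap):
    """Convert the set bit positions of a bitmap back to row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# A low-cardinality column: three distinct values across six rows.
column = ["red", "blue", "red", "green", "blue", "red"]
index = build_bitmap_index(column)

# Predicates combine with bitwise operators before any row is fetched.
red_or_green = index["red"] | index["green"]
print(rows_matching(index["red"]))   # [0, 2, 5]
print(rows_matching(red_or_green))   # [0, 2, 3, 5]
```

With few distinct values there are few bitmaps to store, which is the space argument the abstract makes for low-cardinality columns.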


Weighted minimizer sampling improves long read mapping

Bioinformatics ◽

10.1093/bioinformatics/btaa435 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i111-i118 ◽

Cited By ~ 6

Author(s):

Chirag Jain ◽

Arang Rhie ◽

Haowen Zhang ◽

Claudia Chu ◽

Brian P Walenz ◽

...

Keyword(s):

Unique Feature ◽

Linear Representation ◽

Mapping Accuracy ◽

Read Mapping ◽

Index Compression ◽

Long Read ◽

Reduced Space ◽

Algorithmic Technique ◽

Genomic Regions ◽

Eukaryotic Genomes

Motivation: In this era of exponential data growth, minimizer sampling has become a standard algorithmic technique for rapid genome sequence comparison. This technique yields a sub-linear representation of sequences, enabling their comparison in reduced space and time. A key property of the minimizer technique is that if two sequences share a substring of a specified length, then they can be guaranteed to have a matching minimizer. However, because the k-mer distribution in eukaryotic genomes is highly uneven, minimizer-based tools (e.g. Minimap2, Mashmap) opt to discard the most frequently occurring minimizers from the genome to avoid excessive false positives. By doing so, the underlying guarantee is lost and accuracy is reduced in repetitive genomic regions. Results: We introduce a novel weighted-minimizer sampling algorithm. A unique feature of the proposed algorithm is that it performs minimizer sampling while considering a weight for each k-mer; i.e. the higher the weight of a k-mer, the more likely it is to be selected. By down-weighting frequently occurring k-mers, we are able to meet both objectives: (i) avoid excessive false-positive matches and (ii) maintain the minimizer match guarantee. We tested our algorithm, Winnowmap, using both simulated and real long-read data and compared it to a state-of-the-art long read mapper, Minimap2. Our results demonstrate a reduction in the mapping error-rate from 0.14% to 0.06% in the recently finished human X chromosome (154.3 Mbp), and from 3.6% to 0% within the highly repetitive X centromere (3.1 Mbp). Winnowmap improves mapping accuracy within repeats and achieves these results with sparser sampling, leading to better index compression and competitive runtimes. Availability and implementation: Winnowmap is built on top of the Minimap2 codebase and is available at https://github.com/marbl/winnowmap.
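The match guarantee mentioned in this abstract can be seen in a minimal sketch of plain (unweighted) minimizer sampling. This is not Winnowmap's weighted algorithm: it orders k-mers lexicographically, whereas real tools typically use a hash-based ordering, and the function name, parameters `k` and `w`, and the sequences are illustrative assumptions.

```python
def minimizers(seq, k, w):
    """Return the set of (position, k-mer) minimizers of seq.

    For every window of w consecutive k-mers, keep the lexicographically
    smallest k-mer (the leftmost one on ties).
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: window[j])  # leftmost smallest
        picked.add((start + offset, window[offset]))
    return picked


k, w = 3, 6
shared = "ACGTTGCA"                 # length w + k - 1 = 8: exactly one full window
a = "TTT" + shared + "GGG"
b = "CC" + shared + "AAAA"

kmers_a = {kmer for _, kmer in minimizers(a, k, w)}
kmers_b = {kmer for _, kmer in minimizers(b, k, w)}

# Both sequences contain the same window of w k-mers, so they select the
# same minimizer from it.
print(sorted(kmers_a & kmers_b))
```

Only a subset of k-mers is stored, which is why sparser sampling translates into a smaller index. Discarding frequent minimizers after sampling breaks the guarantee shown here, which is the problem the weighted scheme is designed to avoid.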


Weighted minimizer sampling improves long read mapping

10.1101/2020.02.11.943241 ◽

2020 ◽

Cited By ~ 2

Author(s):

Chirag Jain ◽

Arang Rhie ◽

Haowen Zhang ◽

Claudia Chu ◽

Sergey Koren ◽

...

Keyword(s):

Unique Feature ◽

Linear Representation ◽

Mapping Accuracy ◽

Read Mapping ◽

Index Compression ◽

Long Read ◽

Reduced Space ◽

Algorithmic Technique ◽

Genomic Regions ◽

Eukaryotic Genomes

Motivation: In this era of exponential data growth, minimizer sampling has become a standard algorithmic technique for rapid genome sequence comparison. This technique yields a sub-linear representation of sequences, enabling their comparison in reduced space and time. A key property of the minimizer technique is that if two sequences share a substring of a specified length, then they can be guaranteed to have a matching minimizer. However, because the k-mer distribution in eukaryotic genomes is highly uneven, minimizer-based tools (e.g., Minimap2, Mashmap) opt to discard the most frequently occurring minimizers from the genome in order to avoid excessive false positives. By doing so, the underlying guarantee is lost and accuracy is reduced in repetitive genomic regions. Results: We introduce a novel weighted-minimizer sampling algorithm. A unique feature of the proposed algorithm is that it performs minimizer sampling while taking into account a weight for each k-mer; i.e., the higher the weight of a k-mer, the more likely it is to be selected. By down-weighting frequently occurring k-mers, we are able to meet both objectives: (i) avoid excessive false-positive matches, and (ii) maintain the minimizer match guarantee. We tested our algorithm, Winnowmap, using both simulated and real long-read data and compared it to a state-of-the-art long read mapper, Minimap2. Our results demonstrate a reduction in the mapping error-rate from 0.14% to 0.06% in the recently finished human X chromosome (154.3 Mbp), and from 3.6% to 0% within the highly repetitive X centromere (3.1 Mbp). Winnowmap improves mapping accuracy within repeats and achieves these results with sparser sampling, leading to better index compression and competitive runtimes. Winnowmap is built on top of the Minimap2 codebase (Li, 2018) and is available at https://github.com/marbl/winnowmap.


Forward Index Compression for Instance Retrieval in an Augmented Reality Application

2019 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata47090.2019.9006023 ◽

2019 ◽

Author(s):

Qi Wang ◽

Michal Siedlaczek ◽

Yen-Yu Chen ◽

Michael Gormish ◽

Torsten Suel

Keyword(s):

Augmented Reality ◽

Index Compression ◽

Augmented Reality Application ◽

Instance Retrieval


index compression
Recently Published Documents


An Improved Framework for Bitmap Indexes and their Use in Data Warehouse Optimization

An Optimization of Bitmap Index Compression Technique in Bulk Data Movement Infrastructure

Efficient Inverted Index Compression Algorithm Characterized by Faster Decompression Compared with the Golomb-Rice Algorithm

Unified Evaluation of Consolidation Parameters for Low to High Plastic Range of Cohesive Soils

Techniques for Inverted Index Compression

Adaptation of Combinatorial Encoding Scheme to Index Compression

Design and Analysis of Optimization and Tuning in Data Warehouses Using Bitmap Indexes

Weighted minimizer sampling improves long read mapping

Weighted minimizer sampling improves long read mapping

Forward Index Compression for Instance Retrieval in an Augmented Reality Application
