index compression
Recently Published Documents


TOTAL DOCUMENTS

77
(FIVE YEARS 4)

H-INDEX

12
(FIVE YEARS 0)

Author(s):  
Dr. J. Preetha, Et. al.

Compression techniques are used to reduce the size of a table and the storage it occupies. Oracle already provides this feature for both table compression and index compression. When an index is created on a column of a table, the index itself occupies disk space. Compressing it saves that space, which matters in industry because companies purchase and pay for disk space according to the size of their data; the space is better used for useful records than wasted. This paper uses the data pump utility to compress bitmap indexes and tables. The data pump utility is normally used for logical backups of a database; here it is applied for compression, to release space and to change the location the index points to. Deleting records alone does not release the space. This is of special interest for compressing the bitmap index and tablespace in the presence of DML (Data Manipulation Language) operations.


Entropy ◽  
2021 ◽  
Vol 23 (3) ◽  
pp. 296
Author(s):  
Andrzej Chmielowiec ◽  
Paweł Litwin

This article deals with compression of binary sequences with a given number of ones, which can also be considered as a list of indexes of a given length. The first part of the article shows that the entropy H of random n-element binary sequences with exactly k elements equal to one satisfies the inequalities k·log₂(0.48·n/k) < H < k·log₂(2.72·n/k). Based on this result, we propose a simple coding using fixed length words. Its main application is the compression of random binary sequences with a large disproportion between the number of zeros and the number of ones. Importantly, the proposed solution allows for a much faster decompression compared with the Golomb-Rice coding with a relatively small decrease in the efficiency of compression. The proposed algorithm can be particularly useful for database applications for which the speed of decompression is much more important than the degree of index list compression.
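
As a rough numerical illustration of the inequalities quoted in this abstract, the sketch below assumes that H is the entropy of a uniform distribution over all n-element binary sequences with exactly k ones, i.e. H = log₂ C(n, k). That reading, and the function names `entropy_bits` and `entropy_bounds`, are assumptions made here for illustration and are not taken from the article.

```python
import math


def entropy_bits(n: int, k: int) -> float:
    """Entropy in bits, ASSUMING a uniform choice among the C(n, k)
    binary sequences of length n with exactly k ones."""
    return math.log2(math.comb(n, k))


def entropy_bounds(n: int, k: int) -> tuple[float, float]:
    """Lower and upper bounds on H as quoted in the abstract."""
    lower = k * math.log2(0.48 * n / k)
    upper = k * math.log2(2.72 * n / k)
    return lower, upper


if __name__ == "__main__":
    # Sparse sequences: many more zeros than ones.
    for n, k in [(1_000, 10), (100_000, 100), (1_000_000, 1_000)]:
        lower, upper = entropy_bounds(n, k)
        h = entropy_bits(n, k)
        print(f"n={n}, k={k}: {lower:.1f} < {h:.1f} < {upper:.1f}")
```

Dividing H by k gives an estimate of the number of bits per stored index that any lossless code needs on average under this assumption, which is the baseline against which a fixed-length-word code or Golomb-Rice coding would be compared.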


Author(s):  
Zia ur Rehman State ◽  
Khalid Farooq ◽  
Hassan Mujtaba ◽  
Usama Khalid

Knowing the engineering properties of geomaterials is imperative to make the right decision while designing and executing any geotechnical project. For economical and safe geotechnical design, quick characterization of the compressibility properties of cohesive soil is often desirable; these properties are tedious to determine through actual tests. Therefore, correlating the consolidation parameters of soils with their index properties is of great significance in geotechnical engineering. Several attempts have been made in the past to develop correlations between the consolidation parameters and index properties of cohesive soils, within certain limitations. However, there is still a need to develop such correlations based on an extensive database comprising a unified plasticity range of soils, i.e., low to high plasticity. In the current study, 148 undisturbed soil specimens were obtained from different areas of Pakistan. Of these, 120 samples were used to develop correlations, and 28 samples were used to check the validity of the developed correlations. In order to enhance the index properties database, 30 more bentonite-mixed soil samples were prepared and tested accordingly. Correlations to predict different consolidation parameters such as compression index, compression ratio and coefficient of volume compressibility were developed using 150 cohesive soil samples of low to high plasticity. In addition, the performance of these developed correlations was verified on a set of 40 soil samples and compared with the performance of different correlations available in the literature. The percentage deviation in the prediction of compressibility characteristics through the correlations developed in the present study was found to be very small, which supports the reliability of the developed correlations.


2020 ◽  
Vol 53 (6) ◽  
pp. 1-36
Author(s):  
Giulio Ermanno Pibiri ◽  
Rossano Venturini

Author(s):  
Pankaj Dadheech ◽  
Dinesh Goyal ◽  
Ankit Kumar ◽  
Amit Kumar Gupta

Introduction: A bitmap index is a special kind of database index that uses bitmaps (bit arrays). Apache stores a bitmap for every index key in a bitmap file. Each index key stores pointers to multiple rows. Maintaining bitmap indexes takes considerable time, so they are only appropriate for tables that are updated occasionally. Method: Each bit of the bitmap corresponds to a possible row id. If the bit is 1, the row with that id contains the key value. An internal Oracle function converts the bit position to the corresponding row id, so bitmap indexes offer the same functionality as B-tree indexes despite the different internal representation. If the number of distinct values of the indexed column is small, the bitmap index is very efficient in its use of physical space. Result: Oracle provides compression features that can be applied during various database operations, so data can be compressed in the following modes. Several types of backup are possible in the database: • Whole backup or partial backup • Full backup or incremental backup • Cold (consistent) backup • Hot (inconsistent) backup Discussion: We study current compression technologies and add compression of the bitmap index via the data pump. According to conventional wisdom, a bitmap index is most effective when the column has few unique values. With the data pump, however, it does not matter whether the bitmap index is built on a high-cardinality or a low-cardinality column. In this paper, we propose using the data pump utility to release disk space in the database after records are deleted. The bitmap index keeps pointing to the old location even after rows are deleted from the table, so deletion by itself does not release disk space. Conclusion: In this paper, we present an experimental evaluation of bitmap index compression and of releasing the disk space occupied by database objects such as tables and indexes after records are deleted. Industrial databases frequently perform bulk insertion and deletion of data. Deleting millions of records from a table does not release the occupied disk space immediately. The next step in our research will be to release the disk space as part of the deletion of records.
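
The bitmap index idea described in this abstract (one bitmap per key value, one bit per row id) can be sketched in a few lines. This is a minimal in-memory illustration only: it does not reproduce Oracle's storage format, its row id conversion function, or the data pump utility, and the names `build_bitmap_index` and `rows_matching` are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an integer used as a bitmap.

    Bit i of the bitmap is 1 when row i holds that value.
    (Illustrative layout only, not a real database format.)
    """
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)


def rows_matching(bitmap):
    """Convert a bitmap back into the list of row ids whose bit is set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


if __name__ == "__main__":
    colors = ["red", "blue", "red", "green", "blue", "red"]
    index = build_bitmap_index(colors)

    # Rows where color = 'red'
    print(rows_matching(index["red"]))

    # Rows where color = 'red' OR color = 'blue': a single bitwise OR
    print(rows_matching(index["red"] | index["blue"]))
```

With only three distinct values the index needs three bitmaps regardless of the number of rows, which is why a low number of distinct values is conventionally considered the favourable case.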


2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i111-i118 ◽  
Author(s):  
Chirag Jain ◽  
Arang Rhie ◽  
Haowen Zhang ◽  
Claudia Chu ◽  
Brian P Walenz ◽  
...  

Abstract Motivation In this era of exponential data growth, minimizer sampling has become a standard algorithmic technique for rapid genome sequence comparison. This technique yields a sub-linear representation of sequences, enabling their comparison in reduced space and time. A key property of the minimizer technique is that if two sequences share a substring of a specified length, then they can be guaranteed to have a matching minimizer. However, because the k-mer distribution in eukaryotic genomes is highly uneven, minimizer-based tools (e.g. Minimap2, Mashmap) opt to discard the most frequently occurring minimizers from the genome to avoid excessive false positives. By doing so, the underlying guarantee is lost and accuracy is reduced in repetitive genomic regions. Results We introduce a novel weighted-minimizer sampling algorithm. A unique feature of the proposed algorithm is that it performs minimizer sampling while considering a weight for each k-mer; i.e. the higher the weight of a k-mer, the more likely it is to be selected. By down-weighting frequently occurring k-mers, we are able to meet both objectives: (i) avoid excessive false-positive matches and (ii) maintain the minimizer match guarantee. We tested our algorithm, Winnowmap, using both simulated and real long-read data and compared it to a state-of-the-art long read mapper, Minimap2. Our results demonstrate a reduction in the mapping error-rate from 0.14% to 0.06% in the recently finished human X chromosome (154.3 Mbp), and from 3.6% to 0% within the highly repetitive X centromere (3.1 Mbp). Winnowmap improves mapping accuracy within repeats and achieves these results with sparser sampling, leading to better index compression and competitive runtimes. Availability and implementation Winnowmap is built on top of the Minimap2 codebase and is available at https://github.com/marbl/winnowmap.
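
To make the minimizer guarantee mentioned in this abstract concrete, here is a minimal sketch of plain (unweighted) minimizer sampling. It assumes lexicographic order on k-mers with leftmost tie-breaking, which is a simplification chosen for illustration; it is not Winnowmap's weighted scheme and not the ordering used by Minimap2. The function name `minimizers` and the example sequences are hypothetical.

```python
def minimizers(seq: str, k: int, w: int) -> set[tuple[int, str]]:
    """Return the (position, k-mer) pairs selected as minimizers.

    In every window of w consecutive k-mers, the smallest k-mer is kept.
    ASSUMPTION: lexicographic order, leftmost k-mer wins ties.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    chosen = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        # min() returns the first index attaining the minimum (leftmost).
        offset = min(range(w), key=lambda j: window[j])
        chosen.add((start + offset, window[offset]))
    return chosen


if __name__ == "__main__":
    k, w = 3, 4
    shared = "ACGTTGCAAC"  # length 10 >= w + k - 1 = 6
    a = "TTTT" + shared + "GGGG"
    b = "CCCC" + shared + "AAAA"

    kmers_a = {kmer for _, kmer in minimizers(a, k, w)}
    kmers_b = {kmer for _, kmer in minimizers(b, k, w)}

    # The two sequences share a substring of length >= w + k - 1, so at
    # least one full window is identical and selects the same k-mer.
    print(sorted(kmers_a & kmers_b))
```

Discarding the most frequent minimizers after sampling, as the abstract describes for existing tools, would amount to filtering `chosen` by k-mer frequency, and that filtering is what can break the shared-minimizer property shown here.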


Author(s):  
Chirag Jain ◽  
Arang Rhie ◽  
Haowen Zhang ◽  
Claudia Chu ◽  
Sergey Koren ◽  
...  

Abstract Motivation In this era of exponential data growth, minimizer sampling has become a standard algorithmic technique for rapid genome sequence comparison. This technique yields a sub-linear representation of sequences, enabling their comparison in reduced space and time. A key property of the minimizer technique is that if two sequences share a substring of a specified length, then they can be guaranteed to have a matching minimizer. However, because the k-mer distribution in eukaryotic genomes is highly uneven, minimizer-based tools (e.g., Minimap2, Mashmap) opt to discard the most frequently occurring minimizers from the genome in order to avoid excessive false positives. By doing so, the underlying guarantee is lost and accuracy is reduced in repetitive genomic regions. Results We introduce a novel weighted-minimizer sampling algorithm. A unique feature of the proposed algorithm is that it performs minimizer sampling while taking into account a weight for each k-mer; i.e., the higher the weight of a k-mer, the more likely it is to be selected. By down-weighting frequently occurring k-mers, we are able to meet both objectives: (i) avoid excessive false-positive matches, and (ii) maintain the minimizer match guarantee. We tested our algorithm, Winnowmap, using both simulated and real long-read data and compared it to a state-of-the-art long read mapper, Minimap2. Our results demonstrate a reduction in the mapping error-rate from 0.14% to 0.06% in the recently finished human X chromosome (154.3 Mbp), and from 3.6% to 0% within the highly repetitive X centromere (3.1 Mbp). Winnowmap improves mapping accuracy within repeats and achieves these results with sparser sampling, leading to better index compression and competitive runtimes. Winnowmap is built on top of the Minimap2 codebase (Li, 2018) and is available at https://github.com/marbl/winnowmap.

