Applications of Non-Uniquely Decodable Codes to Privacy-Preserving High-Entropy Data Representation

Algorithms ◽  
2019 ◽  
Vol 12 (4) ◽  
pp. 78
Author(s):  
Muhammed Oğuzhan Külekci ◽  
Yasin Öztürk

Non-uniquely-decodable (non-UD) codes are codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, in which a codeword can be a prefix of others, so the codeword boundary information is essential for correct decoding. Because of this inherent decodability problem, non-UD codes have received little attention, apart from a few studies that proposed representing the disambiguation information efficiently with compressed data structures. Those studies showed that the compression ratio can come quite close to that of Huffman/arithmetic codes, with the additional capability of direct access into the compressed data, a feature that regular Huffman codes lack. In this study, we investigate non-UD codes in another dimension, addressing the privacy of high-entropy data. We focus in particular on massive volumes, typical examples being encoded video or similar multimedia files. Representing such a volume with non-UD coding creates two elements, the disambiguation information and the payload; decoding the original data becomes hard when either of them is missing. We exploit this observation for privacy, and study the space consumption as well as the hardness of that decoding. We conclude that non-uniquely-decodable codes can be an alternative to selective encryption schemes that aim to secure only part of the data when the data is huge. We also provide a freely available software implementation of the proposed scheme.
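To make the construction concrete, here is a minimal Python sketch of a toy non-prefix-free code, assuming, purely for illustration and not as the authors' scheme, that codewords are the bit strings '0', '1', '00', '01', ... assigned in symbol-frequency order. The concatenation of codewords is the payload and the stream of codeword lengths is the disambiguation information; withholding either element leaves the other without a unique parse.

```python
from collections import Counter

def build_codebook(data: bytes) -> dict[int, str]:
    # Enumerate all non-empty bit strings in length order: '0', '1', '00', '01', ...
    # Shorter codewords go to more frequent symbols; the code is NOT prefix-free.
    def nth_bitstring(n: int) -> str:
        return bin(n + 2)[3:]
    ranks = [sym for sym, _ in Counter(data).most_common()]
    return {sym: nth_bitstring(i) for i, sym in enumerate(ranks)}

def encode(data: bytes, book: dict[int, str]) -> tuple[str, list[int]]:
    payload = "".join(book[s] for s in data)   # concatenated codewords
    lengths = [len(book[s]) for s in data]     # the disambiguation information
    return payload, lengths

def decode(payload: str, lengths: list[int], book: dict[int, str]) -> bytes:
    # Without `lengths`, the payload admits exponentially many parses.
    inverse = {v: k for k, v in book.items()}
    out, pos = [], 0
    for ln in lengths:
        out.append(inverse[payload[pos:pos + ln]])
        pos += ln
    return bytes(out)

msg = b"abracadabra"
book = build_codebook(msg)
payload, lengths = encode(msg, book)
assert decode(payload, lengths, book) == msg
```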

2018 ◽  
Vol 28 (2) ◽  
pp. 156 ◽  
Author(s):  
Marwah K Hussien

New partial encryption schemes are proposed, in which a secure encryption algorithm is used to encrypt only part of the compressed data. Partial encryption is applied after the image compression algorithm. Only 0.0244%-25% of the original data is encrypted for two pairs of different grayscale images of size 256 × 256 pixels. As a result, we see a significant reduction of time in the encryption and decryption stages. In the compression step, the Orthogonal Search Algorithm (OSA) for motion estimation (the difference between stereo images) is used. The resulting disparity vectors and the remaining image were compressed by the Discrete Cosine Transform (DCT), quantization, and arithmetic encoding. The compressed image was then encrypted with the Advanced Encryption Standard (AES). The images were then decoded and compared with the original images. Experimental results showed good performance in terms of Peak Signal-to-Noise Ratio (PSNR), Compression Ratio (CR), and processing time. The proposed partial encryption schemes are fast and secure, and do not reduce the compression performance of the underlying compression methods.
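As an illustration of the selective approach, the sketch below encrypts only a leading fraction of an already-compressed byte stream, assuming AES in CTR mode via the Python `cryptography` package; the paper specifies AES but neither a mode nor a library, so those choices and all names here are assumptions.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def partial_encrypt(compressed: bytes, key: bytes, fraction: float):
    """Encrypt only the first `fraction` of the compressed data with AES-CTR."""
    cut = max(1, int(len(compressed) * fraction))
    nonce = os.urandom(16)  # initial counter block for CTR mode
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    head = enc.update(compressed[:cut]) + enc.finalize()
    return nonce, head, compressed[cut:]  # only `head` is ciphertext

def partial_decrypt(nonce: bytes, head: bytes, tail: bytes, key: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
    return dec.update(head) + dec.finalize() + tail

key = os.urandom(32)        # AES-256 key
blob = os.urandom(4096)     # stands in for the DCT/arithmetic-coded bitstream
nonce, head, tail = partial_encrypt(blob, key, 0.0244 / 100)  # 0.0244% as above
assert partial_decrypt(nonce, head, tail, key) == blob
```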


2018 ◽  
Vol 89 ◽  
pp. 82-93
Author(s):  
Pedro Correia ◽  
Luís Paquete ◽  
José Rui Figueira

2019 ◽  
Author(s):  
Corey R. Lawrence ◽  
Jeffery Beem-Miller ◽  
Alison M. Hoyt ◽  
Grey Monroe ◽  
Carlos A. Sierra ◽  
...  

Abstract. Radiocarbon is a critical constraint on our estimates of the timescales of soil carbon cycling; it can aid in identifying mechanisms of carbon stabilization and destabilization and improve forecasts of the soil carbon response to management or environmental change. Despite the wealth of soil radiocarbon data reported over the past 75 years, the ability to apply these data to global-scale questions is limited by our capacity to synthesize and compare measurements generated using a variety of methods. Here we describe the International Soil Radiocarbon Database (ISRaD, soilradiocarbon.org), an open-source archive of soils data that includes data from bulk or whole soils; distinct soil carbon pools isolated in the laboratory by a variety of soil fractionation methods; samples of soil gas or water collected interstitially from within an intact soil profile; CO2 gas isolated from laboratory soil incubations; and fluxes collected in situ from a soil surface. The core of ISRaD is a relational database structured around individual datasets (entries) and organized hierarchically to report soil radiocarbon data measured at different physical and temporal scales, as well as other soil or environmental properties, measured at one or more levels of the hierarchy, that may assist with interpretation and context. Anyone may contribute their own data to the database by entering it into the ISRaD template and subjecting it to quality assurance protocols. ISRaD can be accessed through: (1) a web-based interface, (2) an R package (ISRaD), or (3) direct access to the GitHub repository, which hosts both code and data. The design of ISRaD allows participants to become directly involved in the management, design, and application of ISRaD data. The synthesized dataset is available in two forms: the original data as reported by the authors of the datasets, and an enhanced dataset that includes ancillary geospatial data calculated within the ISRaD framework. ISRaD also provides data management tools in the ISRaD R package that offer a starting point for data analysis. ISRaD is thus a community-based dataset and platform for soil radiocarbon and a wide array of additional soils data, to which data are easy to contribute and to which the community is invited to add tools and ideas for improvement. As a whole, ISRaD provides resources that can aid our evaluation of soil dynamics and improve our understanding of the controls on soil carbon dynamics across a range of spatial and temporal scales. The ISRaD v1.0 dataset (Lawrence et al., 2019) is archived and freely available at https://doi.org/10.5281/zenodo.2613911.
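To illustrate the hierarchical entry structure described above, here is a minimal Python sketch with hypothetical field names; it is an illustration of the relational layout only, not the actual ISRaD template, whose authoritative definition lives in the project repository.

```python
from dataclasses import dataclass, field

@dataclass
class Fraction:
    scheme: str                 # hypothetical: fractionation method, e.g. "density"
    delta_14c: float            # radiocarbon value of the isolated pool (permil)

@dataclass
class Incubation:
    delta_14c: float            # radiocarbon of CO2 respired in the lab (permil)

@dataclass
class Layer:
    top_cm: float
    bottom_cm: float
    bulk_delta_14c: float | None = None
    fractions: list[Fraction] = field(default_factory=list)
    incubations: list[Incubation] = field(default_factory=list)

@dataclass
class Profile:
    lat: float
    lon: float
    layers: list[Layer] = field(default_factory=list)

@dataclass
class Entry:                    # one dataset, the top of the hierarchy
    citation: str
    profiles: list[Profile] = field(default_factory=list)
```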


Author(s):  
V. H. Ayma ◽  
V. A. Ayma ◽  
J. Gutierrez

Abstract. Nowadays, the increasing amount of information provided by hyperspectral sensors requires optimal solutions to ease the subsequent analysis of the produced data. A common issue in this matter concerns the representation of hyperspectral data for classification tasks. Existing approaches address the data representation problem by performing dimensionality reduction over the original data. However, mining complementary features that reduce the redundancy across the multiple levels of hyperspectral images remains challenging. Thus, exploiting the representation power of neural-network-based techniques becomes an attractive alternative. In this work, we propose a novel dimensionality reduction implementation for hyperspectral imaging based on autoencoders, ensuring orthogonality among features to reduce redundancy in hyperspectral data. Experiments conducted on the Pavia University, Kennedy Space Center, and Botswana hyperspectral datasets demonstrate the representation power of our approach, which leads to better classification performance than traditional hyperspectral dimensionality reduction algorithms.
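The following is a minimal PyTorch sketch of one plausible formulation of this idea, assuming an autoencoder trained with a reconstruction loss plus a penalty that pushes the latent Gram matrix toward the identity, so the reduced features are decorrelated; the layer sizes, penalty weight, and training loop are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class OrthoAE(nn.Module):
    def __init__(self, n_bands: int, n_latent: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_bands, 64), nn.ReLU(),
                                     nn.Linear(64, n_latent))
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_bands))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def loss_fn(x, x_hat, z, lam=1e-3):
    recon = ((x - x_hat) ** 2).mean()
    # Push the normalized latent Gram matrix toward the identity,
    # i.e. penalize correlation (redundancy) between latent features.
    gram = (z.T @ z) / z.shape[0]
    eye = torch.eye(z.shape[1], device=z.device)
    return recon + lam * ((gram - eye) ** 2).sum()

# Toy usage on random data shaped like Pavia University pixels (103 bands).
x = torch.randn(200, 103)
model = OrthoAE(n_bands=103, n_latent=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    z, x_hat = model(x)
    loss = loss_fn(x, x_hat, z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```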


2019 ◽  
Vol 13 (S1) ◽  
Author(s):  
Na Yu ◽  
Ying-Lian Gao ◽  
Jin-Xing Liu ◽  
Juan Wang ◽  
Junliang Shang

Abstract Background As one of the most popular data representation methods, non-negative matrix factorization (NMF) has received wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometric structure in the data. At the same time, noise and outliers are inevitably present in the data. Results To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures only the relationships between pairwise sample points, it captures higher-order relationships among multiple sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual, because the L2,1-norm is insensitive to noise and outliers. Conclusions Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that our proposed model outperforms other state-of-the-art methods.
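For concreteness, the objective described here can plausibly be written as follows (a sketch under assumed notation; the paper's exact weighting and constraints may differ), with X the data matrix, W and H the non-negative factors, L_hyp the hypergraph Laplacian, and λ a trade-off parameter:

```latex
\min_{W \ge 0,\; H \ge 0}\;
  \| X - W H \|_{2,1}
  \;+\; \lambda \,\operatorname{Tr}\!\left( H \, L_{\mathrm{hyp}} \, H^{\top} \right),
\qquad
\| A \|_{2,1} = \sum_{j} \Big( \sum_{i} A_{ij}^{2} \Big)^{1/2}.
```

Because the L2,1 residual sums unsquared column norms, a few grossly corrupted samples contribute linearly rather than quadratically to the loss, which is what makes the fit insensitive to noise and outliers.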


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 25949-25963
Author(s):  
Carlos Quijada Fuentes ◽  
Miguel R. Penabad ◽  
Susana Ladra ◽  
Gilberto Gutierrez Retamal

Author(s):  
Soumen Chakrabarti ◽  
Sasidhar Kasturi ◽  
Bharath Balakrishnan ◽  
Ganesh Ramakrishnan ◽  
Rohit Saraf

2004 ◽  
Vol 5 (2) ◽  
pp. 184-189 ◽  
Author(s):  
H. Schoof ◽  
R. Ernst ◽  
K. F. X. Mayer

The completion of the Arabidopsis genome and the large collections of other plant sequences generated in recent years have sparked extensive functional genomics efforts. However, the utilization of this data is inefficient, as data sources are distributed and heterogeneous and efforts at data integration are lagging behind. PlaNet aims to overcome the limitations of individual efforts as well as the limitations of heterogeneous, independent data collections. PlaNet is a distributed effort among European bioinformatics groups and plant molecular biologists to establish a comprehensive integrated database in a collaborative network. Objectives are the implementation of infrastructure and data sources to capture plant genomic information into a comprehensive, integrated platform. This will facilitate the systematic exploration of Arabidopsis and other plants. New methods for data exchange, database integration and access are being developed to create a highly integrated, federated data resource for research. The connection between the individual resources is realized with BioMOBY. BioMOBY provides an architecture for the discovery and distribution of biological data through web services. While knowledge is centralized, data is maintained at its primary source without a need for warehousing. To standardize nomenclature and data representation, ontologies and generic data models are defined in interaction with the relevant communities. Minimal data models should make it simple to allow broad integration, while inheritance allows detail and depth to be added to more complex data objects without losing integration. To allow expert annotation and keep databases curated, local and remote annotation interfaces are provided. Easy and direct access to all data is key to the project.

