Applications of Non-Uniquely Decodable Codes to Privacy-Preserving High-Entropy Data Representation

Algorithms ◽  
2019 ◽  
Vol 12 (4) ◽  
pp. 78
Author(s):  
Muhammed Oğuzhan Külekci ◽  
Yasin Öztürk

Non-uniquely-decodable (non-UD) codes are codes that cannot be uniquely decoded without additional disambiguation information. These are mainly the class of non-prefix-free codes, in which a codeword can be a prefix of others, so the codeword boundary information is essential for correct decoding. Because of this inherent decodability problem, non-UD codes have received little attention, apart from a few studies that proposed representing the disambiguation information efficiently with compressed data structures. Those studies showed that the compression ratio can come quite close to that of Huffman/arithmetic codes, with the additional capability of direct access into the compressed data, a feature that regular Huffman codes lack. In this study, we investigate non-UD codes in another dimension, addressing the privacy of high-entropy data. We focus in particular on massive volumes, typical examples being encoded video or similar multimedia files. Representing such a volume with non-UD coding creates two elements, the disambiguation information and the payload; decoding the original data becomes hard when either of them is missing. We exploit this observation for privacy, and study the space consumption as well as the hardness of that decoding. We conclude that non-uniquely-decodable codes can be an alternative to selective encryption schemes that aim to secure only part of the data when the data is huge. We also provide a freely available software implementation of the proposed scheme.
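To make the construction concrete, here is a minimal Python sketch of a toy non-prefix-free code, assuming, purely for illustration and not as the authors' scheme, that codewords are the bit strings '0', '1', '00', '01', ... assigned in symbol-frequency order. The concatenation of codewords is the payload and the stream of codeword lengths is the disambiguation information; withholding either element leaves the other without a unique parse.

```python
from collections import Counter

def build_codebook(data: bytes) -> dict[int, str]:
    # Enumerate all non-empty bit strings in length order: '0', '1', '00', '01', ...
    # Shorter codewords go to more frequent symbols; the code is NOT prefix-free.
    def nth_bitstring(n: int) -> str:
        return bin(n + 2)[3:]
    ranks = [sym for sym, _ in Counter(data).most_common()]
    return {sym: nth_bitstring(i) for i, sym in enumerate(ranks)}

def encode(data: bytes, book: dict[int, str]) -> tuple[str, list[int]]:
    payload = "".join(book[s] for s in data)   # concatenated codewords
    lengths = [len(book[s]) for s in data]     # the disambiguation information
    return payload, lengths

def decode(payload: str, lengths: list[int], book: dict[int, str]) -> bytes:
    # Without `lengths`, the payload admits exponentially many parses.
    inverse = {v: k for k, v in book.items()}
    out, pos = [], 0
    for ln in lengths:
        out.append(inverse[payload[pos:pos + ln]])
        pos += ln
    return bytes(out)

msg = b"abracadabra"
book = build_codebook(msg)
payload, lengths = encode(msg, book)
assert decode(payload, lengths, book) == msg
```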

2018 ◽  
Vol 28 (2) ◽  
pp. 156 ◽  
Author(s):  
Marwah K Hussien

New partial encryption schemes are proposed, in which a secure encryption algorithm is used to encrypt only part of the compressed data. Partial encryption is applied after the image compression algorithm. Only 0.0244%-25% of the original data is encrypted for two pairs of different grayscale images of size 256 × 256 pixels. As a result, we see a significant reduction of time in the encryption and decryption stages. In the compression step, the Orthogonal Search Algorithm (OSA) for motion estimation (the difference between stereo images) is used. The resulting disparity vectors and the remaining image were compressed by the Discrete Cosine Transform (DCT), quantization, and arithmetic encoding. The compressed image was then encrypted with the Advanced Encryption Standard (AES). The images were then decoded and compared with the original images. Experimental results showed good performance in terms of Peak Signal-to-Noise Ratio (PSNR), Compression Ratio (CR), and processing time. The proposed partial encryption schemes are fast and secure, and do not reduce the compression performance of the underlying compression methods.
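As an illustration of the selective approach, the sketch below encrypts only a leading fraction of an already-compressed byte stream, assuming AES in CTR mode via the Python `cryptography` package; the paper specifies AES but neither a mode nor a library, so those choices and all names here are assumptions.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def partial_encrypt(compressed: bytes, key: bytes, fraction: float):
    """Encrypt only the first `fraction` of the compressed data with AES-CTR."""
    cut = max(1, int(len(compressed) * fraction))
    nonce = os.urandom(16)  # initial counter block for CTR mode
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    head = enc.update(compressed[:cut]) + enc.finalize()
    return nonce, head, compressed[cut:]  # only `head` is ciphertext

def partial_decrypt(nonce: bytes, head: bytes, tail: bytes, key: bytes) -> bytes:
    dec = Cipher(algorithms.AES(key), modes.CTR(nonce)).decryptor()
    return dec.update(head) + dec.finalize() + tail

key = os.urandom(32)        # AES-256 key
blob = os.urandom(4096)     # stands in for the DCT/arithmetic-coded bitstream
nonce, head, tail = partial_encrypt(blob, key, 0.0244 / 100)  # 0.0244% as above
assert partial_decrypt(nonce, head, tail, key) == blob
```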


2018 ◽  
Vol 89 ◽  
pp. 82-93
Author(s):  
Pedro Correia ◽  
Luís Paquete ◽  
José Rui Figueira

2019 ◽  
Author(s):  
Corey R. Lawrence ◽  
Jeffery Beem-Miller ◽  
Alison M. Hoyt ◽  
Grey Monroe ◽  
Carlos A. Sierra ◽  
...  

Abstract. Radiocarbon is a critical constraint on our estimates of the timescales of soil carbon cycling; it can aid in identifying mechanisms of carbon stabilization and destabilization and improve forecasts of the soil carbon response to management or environmental change. Despite the wealth of soil radiocarbon data reported over the past 75 years, the ability to apply these data to global-scale questions is limited by our capacity to synthesize and compare measurements generated using a variety of methods. Here we describe the International Soil Radiocarbon Database (ISRaD, soilradiocarbon.org), an open-source archive of soils data that includes data from bulk or whole soils; distinct soil carbon pools isolated in the laboratory by a variety of soil fractionation methods; samples of soil gas or water collected interstitially from within an intact soil profile; CO2 gas isolated from laboratory soil incubations; and fluxes collected in situ from a soil surface. The core of ISRaD is a relational database structured around individual datasets (entries) and organized hierarchically to report soil radiocarbon data measured at different physical and temporal scales, as well as other soil or environmental properties, measured at one or more levels of the hierarchy, that may assist with interpretation and context. Anyone may contribute their own data to the database by entering it into the ISRaD template and subjecting it to quality assurance protocols. ISRaD can be accessed through: (1) a web-based interface, (2) an R package (ISRaD), or (3) direct access to the GitHub repository, which hosts both code and data. The design of ISRaD allows participants to become directly involved in the management, design, and application of ISRaD data. The synthesized dataset is available in two forms: the original data as reported by the authors of the datasets, and an enhanced dataset that includes ancillary geospatial data calculated within the ISRaD framework. ISRaD also provides data management tools in the ISRaD R package that offer a starting point for data analysis. ISRaD is thus a community-based dataset and platform for soil radiocarbon and a wide array of additional soils data, to which data are easy to contribute and to which the community is invited to add tools and ideas for improvement. As a whole, ISRaD provides resources that can aid our evaluation of soil dynamics and improve our understanding of the controls on soil carbon dynamics across a range of spatial and temporal scales. The ISRaD v1.0 dataset (Lawrence et al., 2019) is archived and freely available at https://doi.org/10.5281/zenodo.2613911.
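To illustrate the hierarchical entry structure described above, here is a minimal Python sketch with hypothetical field names; it is an illustration of the relational layout only, not the actual ISRaD template, whose authoritative definition lives in the project repository.

```python
from dataclasses import dataclass, field

@dataclass
class Fraction:
    scheme: str                 # hypothetical: fractionation method, e.g. "density"
    delta_14c: float            # radiocarbon value of the isolated pool (permil)

@dataclass
class Incubation:
    delta_14c: float            # radiocarbon of CO2 respired in the lab (permil)

@dataclass
class Layer:
    top_cm: float
    bottom_cm: float
    bulk_delta_14c: float | None = None
    fractions: list[Fraction] = field(default_factory=list)
    incubations: list[Incubation] = field(default_factory=list)

@dataclass
class Profile:
    lat: float
    lon: float
    layers: list[Layer] = field(default_factory=list)

@dataclass
class Entry:                    # one dataset, the top of the hierarchy
    citation: str
    profiles: list[Profile] = field(default_factory=list)
```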


Author(s):  
V. H. Ayma ◽  
V. A. Ayma ◽  
J. Gutierrez

Abstract. Nowadays, the increasing amount of information provided by hyperspectral sensors requires optimal solutions to ease the subsequent analysis of the produced data. A common issue in this matter concerns the representation of hyperspectral data for classification tasks. Existing approaches address the data representation problem by performing dimensionality reduction over the original data. However, mining complementary features that reduce the redundancy across the multiple levels of hyperspectral images remains challenging. Thus, exploiting the representation power of neural-network-based techniques becomes an attractive alternative. In this work, we propose a novel dimensionality reduction implementation for hyperspectral imaging based on autoencoders, ensuring orthogonality among features to reduce redundancy in hyperspectral data. Experiments conducted on the Pavia University, Kennedy Space Center, and Botswana hyperspectral datasets demonstrate the representation power of our approach, which leads to better classification performance than traditional hyperspectral dimensionality reduction algorithms.
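The following is a minimal PyTorch sketch of one plausible formulation of this idea, assuming an autoencoder trained with a reconstruction loss plus a penalty that pushes the latent Gram matrix toward the identity, so the reduced features are decorrelated; the layer sizes, penalty weight, and training loop are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class OrthoAE(nn.Module):
    def __init__(self, n_bands: int, n_latent: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_bands, 64), nn.ReLU(),
                                     nn.Linear(64, n_latent))
        self.decoder = nn.Sequential(nn.Linear(n_latent, 64), nn.ReLU(),
                                     nn.Linear(64, n_bands))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def loss_fn(x, x_hat, z, lam=1e-3):
    recon = ((x - x_hat) ** 2).mean()
    # Push the normalized latent Gram matrix toward the identity,
    # i.e. penalize correlation (redundancy) between latent features.
    gram = (z.T @ z) / z.shape[0]
    eye = torch.eye(z.shape[1], device=z.device)
    return recon + lam * ((gram - eye) ** 2).sum()

# Toy usage on random data shaped like Pavia University pixels (103 bands).
x = torch.randn(200, 103)
model = OrthoAE(n_bands=103, n_latent=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    z, x_hat = model(x)
    loss = loss_fn(x, x_hat, z)
    opt.zero_grad()
    loss.backward()
    opt.step()
```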


2019 ◽  
Vol 13 (S1) ◽  
Author(s):  
Na Yu ◽  
Ying-Lian Gao ◽  
Jin-Xing Liu ◽  
Juan Wang ◽  
Junliang Shang

Abstract Background As one of the most popular data representation methods, non-negative matrix factorization (NMF) has received wide attention in clustering and feature selection tasks. However, most previously proposed NMF-based methods do not adequately explore the hidden geometric structure in the data. At the same time, noise and outliers are inevitably present in the data. Results To alleviate these problems, we present a novel NMF framework named robust hypergraph regularized non-negative matrix factorization (RHNMF). In particular, hypergraph Laplacian regularization is imposed to capture the geometric information of the original data. Unlike graph Laplacian regularization, which captures only the relationships between pairwise sample points, it captures higher-order relationships among multiple sample points. Moreover, the robustness of RHNMF is enhanced by using the L2,1-norm constraint when estimating the residual, because the L2,1-norm is insensitive to noise and outliers. Conclusions Clustering and common abnormal expression gene (com-abnormal expression gene) selection are conducted to test the validity of the RHNMF model. Extensive experimental results on multi-view datasets reveal that our proposed model outperforms other state-of-the-art methods.
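For concreteness, the objective described here can plausibly be written as follows (a sketch under assumed notation; the paper's exact weighting and constraints may differ), with X the data matrix, W and H the non-negative factors, L_hyp the hypergraph Laplacian, and λ a trade-off parameter:

```latex
\min_{W \ge 0,\; H \ge 0}\;
  \| X - W H \|_{2,1}
  \;+\; \lambda \,\operatorname{Tr}\!\left( H \, L_{\mathrm{hyp}} \, H^{\top} \right),
\qquad
\| A \|_{2,1} = \sum_{j} \Big( \sum_{i} A_{ij}^{2} \Big)^{1/2}.
```

Because the L2,1 residual sums unsquared column norms, a few grossly corrupted samples contribute linearly rather than quadratically to the loss, which is what makes the fit insensitive to noise and outliers.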


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 25949-25963
Author(s):  
Carlos Quijada Fuentes ◽  
Miguel R. Penabad ◽  
Susana Ladra ◽  
Gilberto Gutierrez Retamal

Author(s):  
Soumen Chakrabarti ◽  
Sasidhar Kasturi ◽  
Bharath Balakrishnan ◽  
Ganesh Ramakrishnan ◽  
Rohit Saraf

2004 ◽  
Vol 5 (2) ◽  
pp. 184-189 ◽  
Author(s):  
H. Schoof ◽  
R. Ernst ◽  
K. F. X. Mayer

The completion of the Arabidopsis genome and the large collections of other plant sequences generated in recent years have sparked extensive functional genomics efforts. However, the utilization of this data is inefficient, as data sources are distributed and heterogeneous and efforts at data integration are lagging behind. PlaNet aims to overcome the limitations of individual efforts as well as the limitations of heterogeneous, independent data collections. PlaNet is a distributed effort among European bioinformatics groups and plant molecular biologists to establish a comprehensive integrated database in a collaborative network. Objectives are the implementation of infrastructure and data sources to capture plant genomic information into a comprehensive, integrated platform. This will facilitate the systematic exploration of Arabidopsis and other plants. New methods for data exchange, database integration and access are being developed to create a highly integrated, federated data resource for research. The connection between the individual resources is realized with BioMOBY. BioMOBY provides an architecture for the discovery and distribution of biological data through web services. While knowledge is centralized, data is maintained at its primary source without a need for warehousing. To standardize nomenclature and data representation, ontologies and generic data models are defined in interaction with the relevant communities. Minimal data models should make it simple to allow broad integration, while inheritance allows detail and depth to be added to more complex data objects without losing integration. To allow expert annotation and keep databases curated, local and remote annotation interfaces are provided. Easy and direct access to all data is key to the project.

