Sensitivity Levels: Optimizing the Performance of Privacy Preserving DNA Alignment

Mapping Intimacies ◽

10.1101/292227 ◽

2018 ◽

Author(s):

Maria Fernandes ◽

Jérémie Decouchant ◽

Marcus Völp ◽

Francisco M Couto ◽

Paulo Esteves-Veríssimo

Keyword(s):

DNA Sequences ◽

High Performance ◽

Privacy Preserving ◽

Sensitive Information ◽

Alignment Algorithms ◽

Private And Public ◽

Genomic Studies ◽

Next Generation Sequencing (NGS) ◽

Performance Gains ◽

DNA Alignment

Abstract. The advent of high throughput next-generation sequencing (NGS) machines made DNA sequencing cheaper, but also put pressure on the genomic life-cycle, which includes aligning millions of short DNA sequences, called reads, to a reference genome. On the performance side, efficient algorithms have been developed, and parallelized on public clouds. On the privacy side, since genomic data are utterly sensitive, several cryptographic mechanisms have been proposed to align reads securely, with a lower performance than the former, which in turn are not secure. This manuscript proposes a novel contribution to improving the privacy performance product in current genomic studies. Building on recent works that argue that genomics data needs to be treated according to a threat-risk analysis, we introduce a multi-level sensitivity classification of genomic variations. Our classification prevents the amplification of possible privacy attacks, thanks to promoting and partitioning mechanisms among sensitivity levels. Thanks to this classification, reads can be aligned, stored, and later accessed, using different security levels. We then extend a recent filter, which detects the reads that carry sensitive information, to classify reads into sensitivity levels. Finally, based on a review of the existing alignment methods, we show that adapting alignment algorithms to read sensitivity allows high performance gains, whilst enforcing high privacy levels. Our results indicate that using sensitivity levels is feasible to optimize the performance of privacy preserving alignment, if one combines the advantages of private and public clouds.
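
As a rough illustration of the idea that reads can be handled differently depending on their sensitivity, the following Python sketch routes reads to an alignment backend per level. The three level names, the `classify_read` stand-in (a toy lookup of known sensitive k-mers), and the backend names are all hypothetical; the abstract does not specify the filter, the number of levels, or the alignment methods used.

```python
from collections import defaultdict
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical levels; the actual classification is not given in the abstract.
    LOW = 0
    MEDIUM = 1
    HIGH = 2


# Hypothetical mapping from sensitivity level to alignment backend.
BACKEND = {
    Sensitivity.LOW: "public-cloud plaintext aligner",
    Sensitivity.MEDIUM: "private-cloud plaintext aligner",
    Sensitivity.HIGH: "cryptographic aligner",
}


def classify_read(read, sensitive_kmers, k):
    """Toy stand-in for a read filter: return the highest level of any
    k-mer of the read that appears in the sensitive dictionary."""
    level = Sensitivity.LOW
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        level = max(level, sensitive_kmers.get(kmer, Sensitivity.LOW))
    return level


def route_reads(reads, sensitive_kmers, k):
    """Group reads by the backend assigned to their sensitivity level."""
    batches = defaultdict(list)
    for read in reads:
        backend = BACKEND[classify_read(read, sensitive_kmers, k)]
        batches[backend].append(read)
    return dict(batches)


# Illustrative data only.
sensitive = {"ACGT": Sensitivity.HIGH, "TTAA": Sensitivity.MEDIUM}
reads = ["GGACGTGG", "CCTTAACC", "GGGGCCCC"]
batches = route_reads(reads, sensitive, k=4)
```

In this sketch only the read containing a high-sensitivity k-mer pays the cost of the slowest backend, which is the kind of trade-off between privacy and performance the abstract describes.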

A Robust Privacy Preserving of Multiple and Binary Attribute by Using Super Modularity with Perturbation

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit183838 ◽

2018 ◽

pp. 121-132

Author(s):

Priya Ranjan ◽

Raj Kumar Paul

Keyword(s):

Data Mining ◽

Data Security ◽

High Performance ◽

Perturbation Technique ◽

Threshold Value ◽

Privacy Preserving ◽

Digital Data ◽

Sensitive Information ◽

Data Mining Algorithms ◽

Mining Algorithms

With the growth of digital data on servers, different data mining approaches are applied to retrieve interesting information for decision making. A major social concern of data mining is privacy and data security. Privacy-preserving mining therefore came into existence, as it validates those data mining algorithms that do not disclose sensitive information. This work provides privacy for sensitive rules that discriminate data on the basis of community, gender, country, etc. Rules are obtained by the Apriori algorithm of association rule mining. Rules that contain a sensitive item set and meet the minimum threshold value are considered sensitive. A perturbation technique is used to hide sensitive rules. The size of large databases is now a major issue, so researchers try to develop high-performance platforms that efficiently secure this kind of data before publishing. The proposed work addresses this digital data security issue by finding the relation between the columns of the dataset, based on highly related association patterns. Super modularity is also used to balance the risk and the utility of the data. Experiments are run on a large dataset that contains all kinds of attributes, in order to implement the features of the proposed work. The experiments showed that the proposed algorithms perform well on large databases. The approach works better in that the maximum lost pattern percentage is zero at a certain value of support.
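
To make the rule-selection step more concrete, here is a minimal Python sketch of support and confidence for one association rule, and of flagging the rule as sensitive when it involves a sensitive item and meets a minimum support threshold. The transactions, item names, threshold values, and helper names are illustrative assumptions, not taken from the paper; the perturbation and super modularity steps are not shown.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    joint = support(transactions, set(antecedent) | set(consequent))
    return joint / base


def is_sensitive(transactions, antecedent, consequent, sensitive_items, min_support):
    """Assumed criterion: the rule touches a sensitive item and its
    support reaches the minimum threshold."""
    rule_items = set(antecedent) | set(consequent)
    touches_sensitive = bool(rule_items & set(sensitive_items))
    return touches_sensitive and support(transactions, rule_items) >= min_support


# Illustrative data only.
transactions = [
    {"gender=F", "loan=denied"},
    {"gender=F", "loan=denied"},
    {"gender=M", "loan=approved"},
    {"gender=F", "loan=approved"},
]
antecedent, consequent = {"gender=F"}, {"loan=denied"}
rule_support = support(transactions, antecedent | consequent)
rule_confidence = confidence(transactions, antecedent, consequent)
```

A hiding step would then modify transactions until such a rule falls below the threshold, ideally without losing non-sensitive patterns.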

Privacy Preserving: Stochastic Channel-Based Federated Learning with Neural Network Pruning (Preprint)

10.2196/preprints.17111 ◽

2019 ◽

Author(s):

Rulin Shao ◽

Hongyu He ◽

Hui Liu ◽

Dianbo Liu

Keyword(s):

Neural Network ◽

Distributed System ◽

High Performance ◽

Averaging Method ◽

Privacy Preserving ◽

Performance Model ◽

Sensitive Information ◽

Privacy Concerns ◽

Network Pruning ◽

Validation Set

BACKGROUND: Artificial neural networks have achieved unprecedented success in a wide variety of domains such as classifying, predicting and recognizing objects. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to keep control over their sensitive information during both training and use. OBJECTIVE: To address this problem, we propose a privacy-preserving method for the distributed system. The proposed method, Stochastic Channel-Based Federated Learning (SCBF), enables the participants to train a high-performance model cooperatively without sharing their inputs. METHODS: Specifically, we design, implement and evaluate a channel-based update algorithm for the central server in a distributed system. The update algorithm selects the channels with regard to the most active features in a training loop and uploads them as learned information from local datasets. A pruning process, which serves as a model accelerator, is applied to the algorithm based on the validation set. RESULTS: We construct a distributed system consisting of 5 clients and 1 server. Our trials show that the Stochastic Channel-Based Federated Learning method can achieve an AUCROC of 0.9776 and an AUCPR of 0.9695 with 10% of channels shared with the server. Compared with the Federated Averaging algorithm, the proposed method achieves 0.05388 higher in AUCROC and 0.09695 higher in AUCPR. In addition, our experiment shows that 57% of the time is saved by the pruning process with only a reduction of 0.0047 in AUCROC performance and a reduction of 0.0068 in AUCPR. CONCLUSIONS: In the experiment, our model shows better performance and a higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server. We also demonstrate that the saturating rate of performance could be promoted by introducing a pruning process, and that further improvement could be achieved by tuning the pruning rate.
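
The idea of sharing only a fraction of channels with the server can be sketched as follows. This is a deliberately simplified, deterministic stand-in: it ranks channels by a per-channel activity score and keeps the top fraction. The scoring input, the function name `select_channels`, and the rounding rule are assumptions for illustration; the abstract does not describe how SCBF scores channels or how the stochastic selection works.

```python
def select_channels(activity_scores, share_fraction):
    """Return the indices of the most active channels, most active first.

    activity_scores: one non-negative score per channel (assumed input).
    share_fraction: fraction of channels to share, e.g. 0.1 for 10%.
    """
    if not 0 < share_fraction <= 1:
        raise ValueError("share_fraction must be in (0, 1]")
    # Always share at least one channel.
    n_share = max(1, round(len(activity_scores) * share_fraction))
    ranked = sorted(
        range(len(activity_scores)),
        key=lambda i: activity_scores[i],
        reverse=True,
    )
    return ranked[:n_share]


# Illustrative scores for a layer with 10 channels.
scores = [0.1, 0.9, 0.3, 0.05, 0.7, 0.2, 0.0, 0.4, 0.6, 0.15]
shared = select_channels(scores, share_fraction=0.2)
```

A client would then upload only the parameters of the selected channels, so the server never sees the full local model.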

Differential Privacy under Dependent Tuples - The Case of Genomic Privacy

Bioinformatics ◽

10.1093/bioinformatics/btz837 ◽

2019 ◽

Author(s):

Nour Almadhoun ◽

Erman Ayday ◽

Özgür Ulusoy

Keyword(s):

Differential Privacy ◽

Genomic Data ◽

Privacy Preserving ◽

Supplementary Information ◽

Sensitive Information ◽

Genomic Databases ◽

Privacy Concerns ◽

Rigorous Approach ◽

Genomic Studies ◽

Inference Attack

Abstract. Motivation: The rapid progress in genome sequencing has led to high availability of genomic data. However, due to growing privacy concerns about the participant’s sensitive information, accessing results and data of genomic studies is restricted to only trusted individuals. On the other hand, paving the way to biomedical discoveries requires granting open access to genomic databases. Privacy-preserving mechanisms can be a solution for granting wider access to such data while protecting their owners. In particular, there has been growing interest in applying the concept of differential privacy (DP) while sharing summary statistics about genomic data. DP provides a mathematically rigorous approach but it does not consider the dependence between tuples in a database, which may degrade the privacy guarantees offered by the DP. Results: In this work, focusing on genomic databases, we show this drawback of DP and we propose techniques to mitigate it. First, using a real-world genomic dataset, we demonstrate the feasibility of an inference attack on differentially private query results by utilizing the correlations between the tuples in the dataset. The results show that the adversary can infer sensitive genomic data about a user from the differentially private query results by exploiting correlations between genomes of family members. Second, we propose a mechanism for privacy-preserving sharing of statistics from genomic datasets to attain privacy guarantees while taking into consideration the dependence between tuples. By evaluating our mechanism on different genomic datasets, we empirically demonstrate that our proposed mechanism can achieve up to 50% better privacy than traditional DP-based solutions. Availability: https://github.com/nourmadhoun/Differential-privacy-genomic-inference-attack. Supplementary information: Supplementary data are available at Bioinformatics online.
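
For context, the traditional baseline that the abstract refers to can be sketched with the standard Laplace mechanism for a count query. This is not the authors' dependency-aware mechanism; it is the textbook version, which calibrates noise to the query sensitivity and treats tuples as independent. The function names, the example count, and the epsilon value are illustrative assumptions.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two independent
    exponential variables with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with Laplace noise. A count changes by at most 1
    when one tuple is added or removed, hence the default sensitivity."""
    return true_count + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


# Illustrative query: how many participants carry a given variant.
rng = random.Random(0)
noisy = private_count(true_count=120, epsilon=0.5, rng=rng)
```

When tuples are correlated, as with genomes of family members, one individual's data can influence several tuples, so noise calibrated to a sensitivity of 1 may hide less than the nominal epsilon suggests. That gap is what the abstract's attack exploits.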

Comparison of multiple DNA alignment algorithms for Labiatae molecular phylogeny inferences

Planta Medica ◽

10.1055/s-0031-1282272 ◽

2011 ◽

Vol 77 (12) ◽

Author(s):

AG Ince ◽

M Karaca ◽

A Aydın

Keyword(s):

Molecular Phylogeny ◽

Alignment Algorithms ◽

DNA Alignment

Denaturing High Performance Liquid Chromatography and Bioinformatics - Two Modern Tools for Extracellular Superoxide Dismutase (SOD3) Gene Promoter Analysis

Revista de Chimie ◽

10.37358/rc.08.7.1893 ◽

2008 ◽

Vol 59 (7) ◽

Author(s):

Corina Samoila ◽

Alfa Xenia Lupea ◽

Andrei Anghel ◽

Marilena Motoc ◽

Gabriela Otiman ◽

...

Keyword(s):

High Performance Liquid Chromatography ◽

Transcription Factors ◽

Liquid Chromatography ◽

DNA Sequences ◽

High Performance ◽

Zinc Finger Protein ◽

High Capacity ◽

Gene Promoter ◽

Experimental Approaches ◽

Myeloid Zinc Finger

Denaturing High Performance Liquid Chromatography (DHPLC) is a relatively new method used for screening DNA sequences, characterized by a high capacity to detect mutations and polymorphisms. This study focuses on the Transgenomic WAVE™ DNA Fragment Analysis (based on the DHPLC separation method) of a 485 bp fragment from the human EC-SOD gene promoter, in order to detect single nucleotide polymorphisms (SNPs) associated with atherosclerosis and risk factors of cardiovascular disease. The fragment of interest was amplified by PCR and analyzed by DHPLC in 100 healthy subjects and 70 patients characterized by atheroma. No different melting profiles were detected for the analyzed DNA samples. A combination of computational methods was used to predict putative transcription factors in the fragment of interest. Several putative transcription factor binding sites were identified in the evolutionarily conserved regions: from the Ets-1 oncogene family, ETS member Elk-1, polyomavirus enhancer activator-3 (PEA3), protein C-Ets-1 (Ets-1), GA binding protein (GABP), and Spi-1 and Spi-B/PU.1 related transcription factors; from the Krueppel-like family, Gut-enriched Krueppel-like factor (GKLF), Erythroid Krueppel-like factor (EKLF), and Basic Krueppel-like factor (BKLF); as well as the GC box and the myeloid zinc finger protein MZF-1. The bioinformatics results need to be investigated further in other studies by experimental approaches.

Hardware-assisted High-performance DNA Alignment System

Proceedings of the 2020 5th International Conference on Intelligent Information Technology ◽

10.1145/3385209.3385223 ◽

2020 ◽

Author(s):

Binh Kieu-Do-Nguyen ◽

Cuong Pham-Quoc ◽

Cong-Kha Pham

Keyword(s):

High Performance ◽

Alignment System ◽

DNA Alignment

A high sensitivity wireless mass-loading surface acoustic wave DNA biosensor

Modern Physics Letters B ◽

10.1142/s0217984914500560 ◽

2014 ◽

Vol 28 (07) ◽

pp. 1450056 ◽

Cited By ~ 9

Author(s):

Hua-Lin Cai ◽

Yi Yang ◽

Yi-Han Zhang ◽

Chang-Jian Zhou ◽

Cang-Ran Guo ◽

...

Keyword(s):

Acoustic Wave ◽

Surface Acoustic Wave ◽

DNA Sequences ◽

Biological Treatment ◽

High Performance ◽

Processing System ◽

High Sensitivity ◽

Treatment Method ◽

Saw Sensor ◽

Target DNA

In this paper, a surface acoustic wave (SAW) biosensor with a gold delay area on a LiNbO3 substrate for detecting DNA sequences is proposed. With well-designed device parameters, the SAW sensor achieves high performance for highly sensitive detection of target DNA. In addition, an effective biological treatment method for DNA immobilization and abundant experimental verification of the sensing effect have made it a reliable device for DNA detection. The loading mass of the probe and target DNA sequences is obtained from the frequency shifts, which are large enough in this work owing to the effective biological treatment. The experimental results show that the biosensor has a high sensitivity of 1.2 pg/ml/Hz, and its high selectivity is also verified by the few responses to other substances. In combination with a wireless transceiver, we develop a wireless receiving and processing system that can directly display the detection results.

Electrochemically Active DNA Probes: Detection of Target DNA Sequences at Femtomole Level by High-Performance Liquid Chromatography with Electrochemical Detection

Analytical Biochemistry ◽

10.1006/abio.1994.1203 ◽

1994 ◽

Vol 218 (2) ◽

pp. 436-443 ◽

Cited By ~ 69

Author(s):

S. Takenaka ◽

Y. Uto ◽

H. Kondo ◽

T. Ihara ◽

M. Takagi

Keyword(s):

High Performance Liquid Chromatography ◽

Liquid Chromatography ◽

Electrochemical Detection ◽

DNA Sequences ◽

High Performance ◽

DNA Probes ◽

Target DNA ◽

Electrochemically Active

Anonymization Based on Improved Bucketization (AIB): A Privacy-Preserving Data Publishing Technique for Improving Data Utility in Healthcare Data

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2021.3901 ◽

2021 ◽

Vol 11 (12) ◽

pp. 3164-3173

Author(s):

R. Indhumathi ◽

S. Sathiya Devi

Keyword(s):

Medical Information ◽

Threshold Value ◽

Privacy Preserving ◽

Data Publishing ◽

Published Data ◽

Sensitive Information ◽

Data Utility ◽

Healthcare Data ◽

Privacy Preserving Data Publishing ◽

Horizontal Partitioning

Data sharing is essential in present biomedical research. A large quantity of medical information is gathered for different objectives of analysis and study. Because of the size of this collection, anonymity is essential. Thus, it is quite important to preserve privacy and prevent leakage of sensitive information of patients. Most anonymization methods, such as generalisation, suppression and perturbation, are proposed to overcome information leakage, but they degrade the utility of the collected data. During data sanitization, the utility is automatically diminished. The main drawback of Privacy Preserving Data Publishing is the difficulty of maintaining the tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper with the intervention of the clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning; (ii) assigning a sensitive index to attributes in the cluster; (iii) verifying each cluster against the privacy threshold; (iv) examining for privacy breach in the Quasi Identifier (QI). To increase the utility of published data, the threshold value is determined based on the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element. As a result, the data utility has been improved. Finally, the evaluation results validate the design presented in the paper and demonstrate that it is effective in improving data utility.

A Comparative study of Multi Agent Based and High-Performance Privacy Preserving Data Mining

International Journal of Computer Applications ◽

10.5120/876-1247 ◽

2010 ◽

Vol 4 (12) ◽

pp. 23-26

Author(s):

Md Faizan Farooqui ◽

Md Muqeem ◽

Dr. Md Rizwan Beg

Keyword(s):

Data Mining ◽

Comparative Study ◽

High Performance ◽

Privacy Preserving ◽

Privacy Preserving Data Mining ◽

Agent Based ◽

Multi Agent
