Private Genomes and Public SNPs: Homomorphic Encryption of Genotypes and Phenotypes for Shared Quantitative Genetics

Genetics ◽  
2020 ◽  
Vol 215 (2) ◽  
pp. 359-372
Author(s):  
Richard Mott ◽  
Christian Fischer ◽  
Pjotr Prins ◽  
Robert William Davies

Sharing human genotype and phenotype data is essential to discover otherwise inaccessible genetic associations, but is a challenge because of privacy concerns. Here, we present a method of homomorphic encryption that obscures individuals’ genotypes and phenotypes, and is suited to quantitative genetic association analysis. Encrypted ciphertext and unencrypted plaintext are analytically interchangeable. The encryption uses a high-dimensional random linear orthogonal transformation key that leaves the likelihood of quantitative trait data unchanged under a linear model with normally distributed errors. It also preserves linkage disequilibrium between genetic variants and associations between variants and phenotypes. It scrambles relationships between individuals: encrypted genotype dosages closely resemble Gaussian deviates, and can be replaced by quantiles from a Gaussian with negligible effects on accuracy. Likelihood-based inferences are unaffected by orthogonal encryption. These include linear mixed models to control for unequal relatedness between individuals, heritability estimation, and including covariates when testing association. Orthogonal transformations can be applied in a modular fashion for multiparty federated mega-analyses where the parties first agree to share a common set of genotype sites and covariates prior to encryption. Each then privately encrypts and shares their own ciphertext, and analyses all parties’ ciphertexts. In the absence of private variants, or knowledge of the key, we show that it is infeasible to decrypt ciphertext using existing brute-force or noise-reduction attacks. We present the method as a challenge to the community to determine its security.
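The invariance claimed in this abstract can be illustrated with a minimal sketch. It assumes the key is an n × n orthogonal matrix applied to every length-n column (phenotype, genotype dosage, and the intercept covariate). Building the key as a product of random Householder reflections is an illustrative assumption, not the authors' construction, and the names `make_key`, `encrypt` and `ols_two_columns` are hypothetical. Because an orthogonal map preserves all inner products between columns, the least-squares coefficients and the residual sum of squares agree between plaintext and ciphertext up to floating-point error.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_key(n, n_reflections, rng):
    # Hypothetical key: random vectors, each defining one Householder reflection.
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_reflections)]

def encrypt(column, key):
    # Apply H_k ... H_1 to one column, where H = I - 2 v v^T / (v^T v).
    # Each reflection is orthogonal, so their product is orthogonal too.
    out = list(column)
    for v in key:
        scale = 2.0 * dot(v, out) / dot(v, v)
        out = [x - scale * vi for x, vi in zip(out, v)]
    return out

def ols_two_columns(a, b, y):
    # Least squares for y ~ ca*a + cb*b via the 2x2 normal equations.
    saa, sab, sbb = dot(a, a), dot(a, b), dot(b, b)
    say, sby = dot(a, y), dot(b, y)
    det = saa * sbb - sab * sab
    ca = (sbb * say - sab * sby) / det
    cb = (saa * sby - sab * say) / det
    rss = sum((yi - ca * ai - cb * bi) ** 2 for ai, bi, yi in zip(a, b, y))
    return (ca, cb), rss

# Simulated plaintext: dosages in {0, 1, 2} and a quantitative trait.
n = 200
data_rng = random.Random(1)
genotype = [float((data_rng.random() < 0.3) + (data_rng.random() < 0.3))
            for _ in range(n)]
phenotype = [1.0 + 0.5 * g + data_rng.gauss(0.0, 1.0) for g in genotype]
intercept = [1.0] * n

# Encrypt every column, including the intercept covariate, with the same key.
key = make_key(n, n, random.Random(2))
enc_genotype = encrypt(genotype, key)
enc_phenotype = encrypt(phenotype, key)
enc_intercept = encrypt(intercept, key)

plain_coefs, plain_rss = ols_two_columns(intercept, genotype, phenotype)
cipher_coefs, cipher_rss = ols_two_columns(enc_intercept, enc_genotype, enc_phenotype)

print("plaintext :", plain_coefs, plain_rss)
print("ciphertext:", cipher_coefs, cipher_rss)
```

The intercept column is encrypted along with the data, so the ciphertext regression uses the transformed column in place of a vector of ones. This is one way to read the abstract's requirement that parties agree on covariates before encryption.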

2020 ◽  
Author(s):  
Richard Mott ◽  
Christian Fischer ◽  
Pjotr Prins ◽  
Robert William Davies

Abstract: Sharing human genotype and phenotype data presents a challenge because of privacy concerns, but is essential in order to discover otherwise inaccessible genetic associations. Here we present a method of homomorphic encryption that obscures individuals’ genotypes and phenotypes and is suited to quantitative genetic association analysis. Encrypted ciphertext and unencrypted plaintext are interchangeable from an analytical perspective. This allows one to store ciphertext on public web services and share data across multiple studies, while maintaining privacy. The encryption method uses as its key a high-dimensional random linear orthogonal transformation that leaves the likelihood of quantitative trait data unchanged under a linear model with normally distributed errors. It also preserves linkage disequilibrium between genetic variants and associations between variants and phenotypes. It scrambles relationships between individuals: encrypted genotype dosages closely resemble Gaussian deviates, and in fact can be replaced by quantiles from a Gaussian with only negligible effects on accuracy. Standard likelihood-based inferences are unaffected by orthogonal encryption. These include the use of mixed linear models to control for unequal relatedness between individuals, the estimation of heritability, and the inclusion of covariates when testing for association. Orthogonal transformations can also be applied in a modular fashion that permits multi-party federated mega-analyses. Under this scheme any number of parties first agree to share a common set of genotype sites and covariates prior to encryption. Each party then privately encrypts and shares their own ciphertext, and analyses the other parties’ ciphertexts. In the absence of private variants, or knowledge of the key, we show that it is infeasible to decrypt ciphertext using existing brute-force or noise reduction attacks. Therefore, we present the method as a challenge to the community to determine its security.


2015 ◽  
Vol 9 (2) ◽  
pp. 2099-2129 ◽  
Author(s):  
Anna Bonnet ◽  
Elisabeth Gassiat ◽  
Céline Lévy-Leduc

F1000Research ◽  
2019 ◽  
Vol 6 ◽  
pp. 748 ◽  
Author(s):  
Malgorzata Nowicka ◽  
Carsten Krieg ◽  
Helena L. Crowell ◽  
Lukas M. Weber ◽  
Felix J. Hartmann ◽  
...  

High-dimensional mass and flow cytometry (HDCyto) experiments have become a method of choice for high-throughput interrogation and characterization of cell populations. Here, we present an updated R-based pipeline for differential analyses of HDCyto data, largely based on Bioconductor packages. We computationally define cell populations using FlowSOM clustering, and facilitate an optional but reproducible strategy for manual merging of algorithm-generated clusters. Our workflow offers different analysis paths, including association of cell type abundance with a phenotype or changes in signalling markers within specific subpopulations, or differential analyses of aggregated signals. Importantly, the differential analyses we show are based on regression frameworks where the HDCyto data is the response; thus, we are able to model arbitrary experimental designs, such as those with batch effects, paired designs and so on. In particular, we apply generalized linear mixed models or linear mixed models to analyses of cell population abundance or cell-population-specific analyses of signaling markers, allowing overdispersion in cell count or aggregated signals across samples to be appropriately modeled. To support the formal statistical analyses, we encourage exploratory data analysis at every step, including quality control (e.g., multi-dimensional scaling plots), reporting of clustering results (dimensionality reduction, heatmaps with dendrograms) and differential analyses (e.g., plots of aggregated signals).


2014 ◽  
Author(s):  
Karl W Broman

Every data visualization can be improved with some level of interactivity. Interactive graphics hold particular promise for the exploration of high-dimensional data. R/qtlcharts is an R package to create interactive graphics for experiments to map quantitative trait loci (QTL; genetic loci that influence quantitative traits). R/qtlcharts serves as a companion to the R/qtl package, providing interactive versions of R/qtl's static graphs, as well as additional interactive graphs for the exploration of high-dimensional genotype and phenotype data.


Genetics ◽  
2016 ◽  
Vol 204 (3) ◽  
pp. 1281-1294 ◽  
Author(s):  
Pierre de Villemereuil ◽  
Holger Schielzeth ◽  
Shinichi Nakagawa ◽  
Michael Morrissey

2020 ◽  
Vol 12 (21) ◽  
pp. 3530
Author(s):  
Libin Jiao ◽  
Lianzhi Huo ◽  
Changmiao Hu ◽  
Ping Tang

Cloud and shadow detection is an essential prerequisite for further remote sensing processing, but edge-precise segmentation remains a challenging issue. In Refined UNet, we considered this task and proposed a two-stage pipeline to achieve edge-precise segmentation. The isolated segmentation regions produced by Refined UNet, however, degrade the visual quality of the results and should be eliminated. Moreover, an end-to-end model is expected to jointly predict and refine the segmentation results. In this paper, we propose the end-to-end Refined UNet v2 to achieve joint prediction and refinement of cloud and shadow segmentation, which is capable of visually neutralizing redundant segmentation pixels or regions. To this end, we inherit the pipeline of Refined UNet, revisit the bilateral message passing in the inference of the conditional random field (CRF), and then develop a novel bilateral strategy derived from the Guided Gaussian filter. Derived from a local linear model of denoising, our v2 can remove a considerable share of isolated segmentation pixels or regions and thus yields “cleaner” results. Compared to the high-dimensional Gaussian filter, the Guided Gaussian filter-based message-passing strategy is straightforward and easy to implement, so that a brute-force implementation can be easily given in GPU frameworks, which is potentially efficient and facilitates embedding. Moreover, we prove that Guided Gaussian filter-based message passing is highly relevant to the Gaussian bilateral term in Dense CRF. Experiments and results demonstrate that our v2 is quantitatively comparable to Refined UNet, but visually outperforms it in terms of noise-free segmentation. The comparison of time consumption also supports the potential efficiency of our v2.


Author(s):  
Xun Wang ◽  
Tao Luo ◽  
Jianfeng Li

Information retrieval in the cloud is common and convenient. Nevertheless, privacy concerns should not be ignored as the cloud is not fully trustable. Fully Homomorphic Encryption (FHE) allows arbitrary operations to be performed on encrypted data, where the decryption of the result of ciphertext operation equals that of the corresponding plaintext operation. Thus, FHE schemes can be utilized for private information retrieval (PIR) on encrypted data. In the FHE scheme proposed by Ducas and Micciancio (DM), only a single homomorphic NOT AND (NAND) operation is allowed between consecutive ciphertext refreshings. Aiming at this problem, an improved FHE scheme is proposed for efficient PIR where homomorphic additions and multiplications are based on linear operations on ciphertext vectors. Theoretical analysis shows that when compared with the DM scheme, the proposed scheme allows multiple homomorphic additions and a single homomorphic multiplication to be performed. The number of allowed homomorphic additions is determined by the ratio of the ciphertext modulus to the upper bound of initial ciphertext noise. Moreover, simulation results show that the proposed scheme is significantly faster than the DM scheme in the homomorphic evaluation for a series of algorithms.
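The relationship between the ciphertext modulus, the initial noise bound, and the number of additions that still decrypt correctly can be sketched with a toy additively homomorphic scheme. The code below is a generic symmetric LWE-style construction with deliberately tiny, insecure parameters. It is neither the DM scheme nor the scheme proposed in this abstract, and every name and constant in it is an assumption made for illustration.

```python
import random

Q = 2 ** 16        # ciphertext modulus (toy value, not secure)
T = 16             # plaintext modulus: messages live in 0..T-1
DELTA = Q // T     # scaling factor that lifts a message above the noise
B = 8              # bound on the noise of a fresh ciphertext
N = 32             # secret-key dimension (toy value)

# Summing k fresh ciphertexts gives noise of magnitude at most k * B.
# Rounding recovers the message while that total stays below DELTA / 2,
# so the guaranteed number of summands scales like Q / (2 * T * B).
MAX_TERMS = (DELTA // 2 - 1) // B

def keygen(rng):
    return [rng.randrange(Q) for _ in range(N)]

def encrypt(m, s, rng):
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-B, B)                      # bounded initial noise
    b = (sum(ai * si for ai, si in zip(a, s)) + DELTA * m + e) % Q
    return a, b

def add(c1, c2):
    # Homomorphic addition is component-wise addition modulo Q.
    (a1, b1), (a2, b2) = c1, c2
    return [(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q

def decrypt(c, s):
    a, b = c
    phase = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    # phase = DELTA * m + noise (mod Q); round to the nearest multiple of DELTA.
    return ((phase + DELTA // 2) // DELTA) % T

rng = random.Random(0)
secret = keygen(rng)
messages = [rng.randrange(T) for _ in range(200)]   # 200 <= MAX_TERMS
ciphertexts = [encrypt(m, secret, rng) for m in messages]

total = ciphertexts[0]
for c in ciphertexts[1:]:
    total = add(total, c)

print("decrypted sum:", decrypt(total, secret))
print("expected sum :", sum(messages) % T)
```

Raising `Q` or lowering `B` increases `MAX_TERMS` proportionally, which mirrors the ratio described in the abstract. Summing more than `MAX_TERMS` ciphertexts may still decrypt correctly by chance, but correctness is no longer guaranteed.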

