Testing Disjointness of Private Datasets

Author(s):  
Aggelos Kiayias ◽  
Antonina Mitrofanova
Author(s):  
Qingsong Ye ◽  
Huaxiong Wang ◽  
Josef Pieprzyk ◽  
Xian-Mo Zhang

Sensors ◽  
2019 ◽  
Vol 19 (4) ◽  
pp. 852 ◽  
Author(s):  
Eric Psota ◽  
Mateusz Mittek ◽  
Lance Pérez ◽  
Ty Schmidt ◽  
Benny Mote

Computer vision systems have the potential to provide automated, non-invasive monitoring of livestock animals; however, the lack of public datasets with well-defined targets and evaluation metrics presents a significant challenge for researchers. Consequently, existing solutions often focus on achieving task-specific objectives using relatively small, private datasets. This work introduces a new dataset and method for instance-level detection of multiple pigs in group-housed environments. The method uses a single fully-convolutional neural network to detect the location and orientation of each animal, where both body part locations and pairwise associations are represented in the image space. Accompanying this method is a new dataset containing 2000 annotated images with 24,842 individually annotated pigs from 17 different locations. The proposed method achieves over 99% precision and over 96% recall when detecting pigs in environments previously seen by the network during training. To evaluate the robustness of the trained network, it is also tested on environments and lighting conditions unseen in the training set, where it achieves 91% precision and 67% recall. The dataset is publicly available for download.
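The precision and recall figures quoted in the abstract above follow the standard definitions over detection counts. The sketch below shows that arithmetic only; the function name and the counts are hypothetical and chosen for illustration, not taken from the paper.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from detection counts."""
    detected = true_pos + false_pos   # everything the detector reported
    actual = true_pos + false_neg     # everything that was really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, for illustration only (not the paper's data).
p, r = precision_recall(true_pos=67, false_pos=7, false_neg=33)
print(f"precision={p:.3f} recall={r:.3f}")
```

A detector can score high precision and much lower recall at the same time, as in the unseen-environment result: few of its detections are wrong, but many animals are missed.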


Author(s):  
Kangwook Lee ◽  
Hoon Kim ◽  
Kyungmin Lee ◽  
Changho Suh ◽  
Kannan Ramchandran

2021 ◽  
Vol 51 (2) ◽  
pp. 1-17
Author(s):  
Nicoleta González Cancelas ◽  
Beatriz Molina Serrano ◽  
Francisco Soler Flores

The Spanish Port System is undergoing a digital transformation towards the concept of Ports 4.0. This entails new regulatory and connectivity requirements, making it necessary to adopt the new digitalization technologies offered by the market. As a first step, digitalizing individual processes helps the members of the port community exchange digital information. The next step is for the information flow between participants of a port community to become reliable, efficient, and paperless, enabled by these technologies. However, for the Spanish port sector, data exchange carries a competitive disadvantage. That is why Federated Learning is proposed. This approach allows several organizations in the port sector to collaborate in the development of models without directly sharing sensitive port data among themselves. Instead of gathering data on a single server, the data remains on each organization's own server, and the algorithms and predictive models travel between them. The goal of this approach is to benefit from a large set of data, which contributes to increased Machine Learning performance while respecting data ownership and privacy. In an inter-institution or “Cross-silo FL” model, different institutions or companies contribute their local datasets to jointly train a machine learning model that discovers patterns in private datasets of high sensitivity and high content. This environment is characterized by a smaller number of participants than the mobile case, with typically better bandwidth and less intermittency.


Author(s):  
T. Shiva Rama Krishna

Malicious software, or malware, continues to pose a major security concern in this digital age as computer users, corporations, and governments witness an exponential growth in malware attacks. Current malware detection solutions adopt static and dynamic analysis of malware signatures and behaviour patterns, which is time-consuming and ineffective in identifying unknown malware. Recent malware uses polymorphic, metamorphic, and other evasive techniques to change its behaviour quickly and to generate large numbers of variants. Since new malware is predominantly a variant of existing malware, machine learning algorithms (MLAs) are now being employed to conduct effective malware analysis. This requires extensive feature engineering, feature learning, and feature representation. By using advanced MLAs such as deep learning, the feature engineering phase can be completely avoided. Though some recent research studies exist in this direction, the performance of the algorithms is biased by the training data. There is a need to mitigate bias and evaluate these methods independently in order to arrive at new enhanced methods for effective zero-day malware detection. To fill this gap in the literature, this work evaluates classical MLAs and deep learning architectures for malware detection, classification, and categorization with both public and private datasets. The train and test splits of the public and private datasets used in the experimental analysis are disjoint from each other and were collected over different timescales. In addition, we propose a novel image processing technique with optimal parameters for MLAs and deep learning architectures. A comprehensive experimental evaluation of these methods indicates that deep learning architectures outperform classical MLAs. Overall, this work proposes an effective visual detection of malware using a scalable and hybrid deep learning framework for real-time deployments.
The visualization and deep learning architectures for the static, dynamic, and image-processing-based hybrid approach in a big data environment constitute a new enhanced method for effective zero-day malware detection.
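The abstract above does not spell out its image processing technique. As a rough illustration of the general idea behind image-based malware analysis, the sketch below lays a file's raw bytes out as rows of grayscale pixel values. The function name, the fixed row width, and the zero-padding are assumptions made for this example, not the authors' method.

```python
def bytes_to_grayscale_rows(data, width):
    """Arrange raw bytes into rows of `width` pixel values (0-255).

    The final row is zero-padded; this padding choice is an assumption.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(data)) % width          # bytes needed to fill the last row
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Five illustrative bytes laid out four pixels per row.
image = bytes_to_grayscale_rows(b"MZ\x90\x00\x03", width=4)
print(image)
```

The resulting 2D grid could then be passed to an image classifier; how the width is chosen and how images are resized are left open here.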


Author(s):  
Cuong Tran ◽  
Ferdinando Fioretto ◽  
Pascal Van Hentenryck ◽  
Zhiyan Yao

Many agencies release datasets and statistics about groups of individuals that are used as input to a number of critical decision processes. To conform with privacy and confidentiality requirements, these agencies are often required to release privacy-preserving versions of the data. This paper studies the release of differentially private datasets and analyzes their impact on some critical resource allocation tasks from a fairness perspective. The paper shows that, when the decisions take differentially private data as input, the noise added to achieve privacy disproportionately impacts some groups over others. The paper analyzes the reasons for these disproportionate impacts and proposes guidelines to mitigate these effects. The proposed approaches are evaluated on critical decision problems that use differentially private census data.
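One simple way to see how privacy noise can affect groups unequally is to add Laplace noise of the same scale to a small and a large count and compare the relative errors. The sketch below does only that; it is not the paper's analysis of allocation decisions, and the group sizes, the epsilon value, and the function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def mean_relative_error(true_count, epsilon, trials, rng):
    """Average |noisy - true| / true for a counting query (sensitivity 1)."""
    scale = 1.0 / epsilon
    total = 0.0
    for _ in range(trials):
        noisy = true_count + laplace_noise(scale, rng)
        total += abs(noisy - true_count) / true_count
    return total / trials


rng = random.Random(0)  # fixed seed so the run is repeatable
small = mean_relative_error(true_count=50, epsilon=0.1, trials=2000, rng=rng)
large = mean_relative_error(true_count=5000, epsilon=0.1, trials=2000, rng=rng)
print(f"small group: {small:.4f}, large group: {large:.4f}")
```

Because the noise scale is the same for both groups, the smaller group sees a much larger relative distortion, which any downstream decision that depends on the counts may inherit.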


Sensors ◽  
2021 ◽  
Vol 21 (17) ◽  
pp. 5848
Author(s):  
Mohamed Chouai ◽  
Petr Dolezel ◽  
Dominik Stursa ◽  
Zdenek Nemec

In the field of computer vision, object detection consists of automatically finding objects in images by giving their positions. The most common fields of application are safety systems (pedestrian detection, identification of behavior) and control systems. Another important application is head/person detection, which underpins road safety, rescue, surveillance, etc. In this study, we developed a new approach based on two parallel DeepLabv3+ networks to improve the performance of the person detection system. For the implementation of our semantic segmentation model, a working methodology with two types of ground truths extracted from the bounding boxes given by the original ground truths was established. The approach has been applied to our two private datasets as well as to a public dataset. To show the performance of the proposed system, a comparative analysis was carried out against two state-of-the-art deep learning semantic segmentation models: SegNet and U-Net. With a global accuracy of 99.14%, the results demonstrate that the developed strategy could be an efficient way to build a deep neural network model for semantic segmentation. This strategy can be used not only for the detection of the human head but also in several other semantic segmentation applications.


2022 ◽  
Vol 3 (1) ◽  
pp. 1-15
Author(s):  
Divya Jyothi Gaddipati ◽  
Jayanthi Sivaswamy

Early detection and treatment of glaucoma is of interest as it is a chronic eye disease leading to an irreversible loss of vision. Existing automated systems rely largely on fundus images for assessment of glaucoma due to their fast acquisition and cost-effectiveness. Optical Coherence Tomographic (OCT) images provide vital and unambiguous information about nerve fiber loss and optic cup morphology, which are essential for disease assessment. However, the high cost of OCT is a deterrent for deployment in screening at large scale. In this article, we present a novel CAD solution wherein both OCT and fundus modality images are leveraged to learn a model that can perform a mapping of fundus to OCT feature space. We show how this model can be subsequently used to detect glaucoma given an image from only one modality (fundus). The proposed model has been validated extensively on four public and two private datasets. It attained an AUC/Sensitivity value of 0.9429/0.9044 on a diverse set of 568 images, which is superior to the figures obtained by a model that is trained only on fundus features. Cross-validation was also done on nearly 1,600 images drawn from a private (OD-centric) and a public (macula-centric) dataset and the proposed model was found to outperform the state-of-the-art method by 8% (public) to 18% (private). Thus, we conclude that fundus to OCT feature space mapping is an attractive option for glaucoma detection.


2021 ◽  
Vol 14 (2) ◽  
pp. 183-200
Author(s):  
Vito Walter Anelli ◽  
Yashar Deldjoo ◽  
Tommaso Di Noia ◽  
Antonio Ferrara

In Machine Learning scenarios, privacy is a crucial concern when models have to be trained with private data coming from users of a service, such as a recommender system, a location-based mobile service, a mobile phone text messaging service providing next word prediction, or a face image classification system. The main issue is that, often, data are collected, transferred, and processed by third parties. These transactions violate new regulations, such as GDPR. Furthermore, users usually are not willing to share private data such as their visited locations, the text messages they wrote, or the photos they took with a third party. On the other hand, users appreciate services that work based on their behaviors and preferences. In order to address these issues, Federated Learning (FL) has been recently proposed as a means to build ML models based on private datasets distributed over a large number of clients, while preventing data leakage. A federation of users is asked to train the same global model on their private data, while a central coordinating server receives the updates computed locally by clients and aggregates them to obtain a better global model, without the need to use clients’ actual data. In this work, we extend the FL approach by pushing forward the state-of-the-art approaches in the aggregation step of FL, which we deem crucial for building a high-quality global model. Specifically, we propose an approach that takes into account a suite of client-specific criteria that constitute the basis for assigning a score to each client based on a priority of criteria defined by the service provider. Extensive experiments on two publicly available datasets indicate the merits of the proposed approach compared to the standard FL baseline.
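The aggregation step described in the abstract above can be pictured as a weighted average of client updates, with each client's weight derived from its score. The sketch below is a generic score-weighted average under that assumption; the function name, the normalization, and the example scores are hypothetical and do not reproduce the paper's scoring criteria.

```python
def aggregate(client_updates, client_scores):
    """Score-weighted average of client parameter vectors.

    Weights are the scores normalized to sum to 1 (an assumption made here).
    """
    total = sum(client_scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    dim = len(client_updates[0])
    global_params = [0.0] * dim
    for update, score in zip(client_updates, client_scores):
        weight = score / total
        for i, value in enumerate(update):
            global_params[i] += weight * value
    return global_params


# Three hypothetical clients, each sending a two-parameter update.
updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
scores = [1.0, 1.0, 2.0]
new_global = aggregate(updates, scores)
print(new_global)
```

With equal scores this reduces to a plain average of the updates; raising one client's score pulls the global parameters towards that client's update, and only the updates, never the raw data, reach the server.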


2009 ◽  
Vol 1 (3) ◽  
pp. 225 ◽  
Author(s):  
Qingsong Ye ◽  
Huaxiong Wang ◽  
Josef Pieprzyk ◽  
Xian Mo Zhang
