Improving Multivariate Microaggregation through Hamiltonian Paths and Optimal Univariate Microaggregation

Armando Maya-López; Fran Casino; Agusti Solanas

doi:10.3390/sym13060916

Symmetry ◽

10.3390/sym13060916 ◽

2021 ◽

Vol 13 (6) ◽

pp. 916

Author(s):

Armando Maya-López ◽

Fran Casino ◽

Agusti Solanas

Keyword(s):

Location Privacy ◽

Hamiltonian Path ◽

Real Life ◽

Personal Data ◽

Heuristic Solution ◽

Hamiltonian Paths ◽

Life Trajectories ◽

Individual Privacy ◽

Benchmark Datasets ◽

The Traveling Salesman Problem

The collection of personal data is growing exponentially and, as a result, individual privacy is increasingly endangered. With the aim of lessening privacy risks whilst maintaining high degrees of data utility, a variety of techniques have been proposed, microaggregation being a very popular one. Microaggregation is a family of perturbation methods whose principle is to aggregate personal data records (i.e., microdata) in groups so as to preserve privacy through k-anonymity. The multivariate microaggregation problem is known to be NP-Hard; however, its univariate version can be optimally solved in polynomial time using the Hansen-Mukherjee (HM) algorithm. In this article, we propose a heuristic solution to the multivariate microaggregation problem inspired by the Traveling Salesman Problem (TSP) and the optimal univariate microaggregation solution. Given a multivariate dataset, we first apply a TSP-tour construction heuristic to generate a Hamiltonian path through all dataset records. Next, we use the order provided by this Hamiltonian path (i.e., a given permutation of the records) as input to the Hansen-Mukherjee algorithm, virtually transforming it into a multivariate microaggregation solver we call Multivariate Hansen-Mukherjee (MHM). Our intuition is that good solutions to the TSP would yield Hamiltonian paths allowing the Hansen-Mukherjee algorithm to find good solutions to the multivariate microaggregation problem. We have tested our method with well-known benchmark datasets. Moreover, to show the usefulness of our approach for protecting location privacy, we have also tested our solution with real-life trajectory datasets. We have compared the results of our algorithm with those of the best performing solutions, and we show that our proposal reduces the information loss resulting from the microaggregation.
Overall, results suggest that transforming the multivariate microaggregation problem into its univariate counterpart by ordering microdata records with a proper Hamiltonian path and applying an optimal univariate solution leads to a reduction of the perturbation error whilst keeping the same privacy guarantees.
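
The two-step idea described in this abstract (order the records along a Hamiltonian path, then partition that order optimally) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the nearest-neighbour rule stands in for whichever TSP tour-construction heuristic the paper uses, and the dynamic programme is an assumed equivalent of the shortest-path formulation behind the Hansen-Mukherjee algorithm, restricted to consecutive groups of size k to 2k-1. All function names are hypothetical.

```python
from math import dist, inf


def nearest_neighbour_path(points):
    """Greedy Hamiltonian path (assumed heuristic): start at record 0 and
    repeatedly move to the closest unvisited record (ties broken by index)."""
    unvisited = set(range(1, len(points)))
    path = [0]
    while unvisited:
        last = points[path[-1]]
        nxt = min(unvisited, key=lambda j: (dist(last, points[j]), j))
        unvisited.remove(nxt)
        path.append(nxt)
    return path


def sse(group):
    """Sum of squared distances from each record to the group centroid."""
    dim = len(group[0])
    centroid = [sum(p[d] for p in group) / len(group) for d in range(dim)]
    return sum(dist(p, centroid) ** 2 for p in group)


def optimal_groups_along_path(points, path, k):
    """Split the path order into consecutive groups of size k..2k-1 so that
    the total within-group SSE is minimal (dynamic programme over prefixes)."""
    n = len(path)
    if n < k:
        raise ValueError("need at least k records")
    ordered = [points[i] for i in path]
    best = [inf] * (n + 1)   # best[e]: minimal SSE for the first e records
    back = [0] * (n + 1)     # back[e]: start index of the last group
    best[0] = 0.0
    for end in range(k, n + 1):
        for size in range(k, min(2 * k - 1, end) + 1):
            start = end - size
            if best[start] == inf:
                continue  # the prefix before this group cannot be partitioned
            cost = best[start] + sse(ordered[start:end])
            if cost < best[end]:
                best[end], back[end] = cost, start
    groups, end = [], n
    while end > 0:
        groups.append(path[back[end]:end])
        end = back[end]
    return groups[::-1], best[n]


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
path = nearest_neighbour_path(points)
groups, total_sse = optimal_groups_along_path(points, path, k=3)
print(groups, round(total_sse, 4))
```

Each returned group contains at least k record indices, so replacing every record by its group centroid would satisfy k-anonymity; the quality of the result depends on how well the path keeps nearby records adjacent.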

Roulette Wheel Selection based Heuristic Algorithm for the Orienteering Problem

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v13i1.2933 ◽

2014 ◽

Vol 13 (1) ◽

pp. 4127-4145

Author(s):

Madhushi Verma ◽

Mukul Gupta ◽

Bijeeta Pal ◽

Prof. K. K. Shukla

Keyword(s):

Triangle Inequality ◽

Time Budget ◽

Control Point ◽

Hamiltonian Path ◽

Real Life ◽

Tourism Industry ◽

Orienteering Problem ◽

Complete Graphs ◽

Roulette Wheel Selection ◽

Roulette Wheel

The orienteering problem (OP) is an NP-Hard graph problem. The nodes of the graph are associated with scores or rewards and the edges with time delays. The goal is to obtain a Hamiltonian path connecting the two necessary check points, i.e., the source and the target, along with a set of control points such that the total collected score is maximized within a specified time limit. OP finds application in several fields such as logistics, transportation networks and the tourism industry. Most of the existing algorithms for OP can only be applied to complete graphs that satisfy the triangle inequality. Real-life scenarios do not guarantee that a direct link exists between every pair of control points or that the triangle inequality is satisfied. To provide a more practical solution, we propose a stochastic greedy algorithm (RWS_OP) that uses the roulette wheel selection method, does not require the triangle inequality to be satisfied, and is capable of handling both complete and incomplete graphs. Based on several experiments on standard benchmark data, we show that RWS_OP is faster, more efficient in terms of time budget utilization and achieves better performance in terms of the total collected score as compared to a recently reported algorithm for incomplete graphs.
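
For readers unfamiliar with roulette wheel selection, the generic technique can be sketched as below. The abstract does not say how RWS_OP weights its candidate control points, so the weights here are purely hypothetical (for instance, a score-to-travel-time ratio), and the function name is an illustration rather than part of the published algorithm.

```python
import random
from itertools import accumulate


def roulette_wheel_select(candidates, weights, rng=random):
    """Pick one candidate with probability proportional to its weight."""
    if not candidates or len(candidates) != len(weights):
        raise ValueError("candidates and weights must be non-empty and aligned")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    cumulative = list(accumulate(weights))      # right edge of each wheel slice
    spin = rng.random() * cumulative[-1]        # uniform point on the wheel
    for candidate, bound in zip(candidates, cumulative):
        if spin < bound:
            return candidate
    return candidates[-1]                       # guard against rounding at the edge


# Hypothetical weights: score collected per unit of travel time.
next_point = roulette_wheel_select(["A", "B", "C"], [4.0, 1.0, 0.5])
print(next_point)
```

Because the choice is random, repeated runs can explore different paths, which is what makes a greedy construction of this kind stochastic.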

An expandable approach for design and personalization of digital, just-in-time adaptive interventions

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy160 ◽

2018 ◽

Vol 26 (3) ◽

pp. 198-210 ◽

Cited By ~ 3

Author(s):

Suat Gonul ◽

Tuncay Namli ◽

Sasja Huisman ◽

Gokce Banu Laleci Erturkmen ◽

Ismail Hakki Toroslu ◽

...

Keyword(s):

Real Life ◽

Personal Data ◽

Just In Time ◽

Intervention Design ◽

Intervention Delivery ◽

Life Conditions ◽

Triggering Conditions ◽

Delivery Strategies ◽

Design Mechanism ◽

Personalized Intervention

Objective: We aim to deliver a framework with 2 main objectives: 1) facilitating the design of theory-driven, adaptive, digital interventions addressing chronic illnesses or health problems and 2) producing personalized intervention delivery strategies to support self-management by optimizing various intervention components tailored to people’s individual needs, momentary contexts, and psychosocial variables. Materials and Methods: We propose a template-based digital intervention design mechanism enabling the configuration of evidence-based, just-in-time, adaptive intervention components. The design mechanism incorporates a rule definition language enabling experts to specify triggering conditions for interventions based on momentary and historical contextual/personal data. The framework continuously monitors and processes personal data space and evaluates intervention-triggering conditions. We benefit from reinforcement learning methods to develop personalized intervention delivery strategies with respect to timing, frequency, and type (content) of interventions. To validate the personalization algorithm, we lay out a simulation testbed with 2 personas, differing in their various simulated real-life conditions. Results: We evaluate the design mechanism by presenting example intervention definitions based on behavior change taxonomies and clinical guidelines. Furthermore, we provide intervention definitions for a real-world care program targeting diabetes patients. Finally, we validate the personalized delivery mechanism through a set of hypotheses, asserting certain ways of adaptation in the delivery strategy, according to the differences in simulation related to personal preferences, traits, and lifestyle patterns. Conclusion: While the design mechanism is sufficiently expandable to meet the theoretical and clinical intervention design requirements, the personalization algorithm is capable of adapting intervention delivery strategies for simulated real-life conditions.

Overcome the Brightness and Jitter Noises in Video Inter-Frame Tampering Detection

Sensors ◽

10.3390/s21123953 ◽

2021 ◽

Vol 21 (12) ◽

pp. 3953

Author(s):

Han Pu ◽

Tianqiang Huang ◽

Bin Weng ◽

Feng Ye ◽

Chenbin Zhao

Keyword(s):

Detection Method ◽

Real Life ◽

Vital Role ◽

Video Forensics ◽

Flow Algorithm ◽

Benchmark Datasets ◽

Media Reports ◽

Intensity Normalization ◽

Inter Frame ◽

Stable Feature

Digital video forensics plays a vital role in judicial forensics, media reports, e-commerce, finance, and public security. Although many methods have been developed, there is currently no efficient solution for real-life videos affected by illumination noise and jitter noise. To solve this issue, we propose a detection method for video inter-frame forgery that adapts to brightness and jitter. For videos with severe brightness changes, we relax the brightness constancy constraint and adopt intensity normalization to propose a new optical flow algorithm. For videos with large jitter noise, we introduce motion entropy to detect the jitter and extract the stable feature of texture changes fraction for double-checking. Experimental results on public benchmark datasets show that, compared with previous algorithms, the proposed method is more accurate and robust for videos with significant brightness variance or heavy jitter.

Towards Faster Mining of Disjunction-Based Concise Representations of Frequent Patterns

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213014500018 ◽

2014 ◽

Vol 23 (02) ◽

pp. 1450001

Author(s):

T. Hamrouni ◽

S. Ben Yahia ◽

E. Mephu Nguifo

Keyword(s):

Empirical Study ◽

Real Life ◽

Search Space ◽

Frequent Patterns ◽

Memory Consumption ◽

Efficient Tool ◽

Condensed Representation ◽

Benchmark Datasets ◽

Condensed Representations ◽

Amount Of Knowledge

In many real-life datasets, the number of extracted frequent patterns has been shown to be huge, hampering the effective exploitation of such an amount of knowledge by human experts. To overcome this limitation, exact condensed representations were introduced in order to offer a small-sized set of elements from which the faithful retrieval of all frequent patterns is possible. In this paper, we introduce a new exact condensed representation based only on particular elements from the disjunctive search space. In this space, a pattern is characterized by its disjunctive support, i.e., the frequency of complementary occurrences (instead of the ubiquitous co-occurrence link) of its items. For several benchmark datasets, this representation has been shown to be interesting in terms of compactness compared to the pioneering approaches of the literature. In this respect, we mainly focus here on proposing an efficient tool for mining this representation. For this purpose, we introduce an algorithm, called DSSRM, dedicated to this task. We also propose several techniques to optimize its mining time as well as its memory consumption. The empirical study carried out on benchmark datasets shows that DSSRM is faster by several orders of magnitude than the MEP algorithm.
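
The contrast between the usual co-occurrence (conjunctive) support and the disjunctive support mentioned in this abstract can be illustrated with a small sketch. It assumes the common reading of disjunctive support as the number of transactions containing at least one item of the pattern; the function names and the toy transactions are hypothetical and are not taken from DSSRM.

```python
def conjunctive_support(pattern, transactions):
    """Number of transactions that contain every item of the pattern."""
    items = set(pattern)
    return sum(1 for t in transactions if items <= set(t))


def disjunctive_support(pattern, transactions):
    """Number of transactions that contain at least one item of the pattern
    (assumed definition of disjunctive support)."""
    items = set(pattern)
    return sum(1 for t in transactions if items & set(t))


transactions = [{"a", "b"}, {"a"}, {"c"}, {"b", "c"}]
print(conjunctive_support({"a", "b"}, transactions))  # 1
print(disjunctive_support({"a", "b"}, transactions))  # 3
```

Under this reading, disjunctive support can only stay the same or grow as items are added to a pattern, whereas conjunctive support can only stay the same or shrink.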

Go to More Parties? Social Occasions as Home to Unexpected Turning Points in Life Trajectories

Social Psychology Quarterly ◽

10.1177/0190272518812010 ◽

2018 ◽

Vol 82 (1) ◽

pp. 51-74 ◽

Cited By ~ 3

Author(s):

Alice Goffman

Keyword(s):

Life Course ◽

Social Inequality ◽

Qualitative Data ◽

Turning Points ◽

Real Life ◽

Emotional Energy ◽

Life Trajectories ◽

Time Out ◽

Collective Effervescence ◽

The Life Course

Reviving classical attention to gathering times as sites of transformation and building on more recent microsociological work, this paper uses qualitative data to show how social occasions open up unexpected bursts of change in the lives of those attending. They do this by pulling people into a special realm apart from normal life, generating collective effervescence and emotional energy, bringing usually disparate people together, forcing public rankings, and requiring complex choreography, all of which combine to make occasions sites of inspiration and connection as well as sites of offense and violation. Rather than a time out from “real” life, social occasions hold an outsized potential to unexpectedly shift the course that real life takes. Implications for microsociology, social inequality, and the life course are considered.

SSID Oracle Attack on Undisclosed Wi-Fi Preferred Network Lists

Wireless Communications and Mobile Computing ◽

10.1155/2018/5153265 ◽

2018 ◽

Vol 2018 ◽

pp. 1-15 ◽

Cited By ~ 2

Author(s):

Ante Dagelić ◽

Toni Perković ◽

Bojan Vujatović ◽

Mario Čagalj

Keyword(s):

Location Privacy ◽

Real Life ◽

User Mobility ◽

Window Of Opportunity ◽

Privacy Concerns ◽

Private Location ◽

Life Tests ◽

Active Attacks ◽

Recommender Algorithm ◽

Dictionary Attacks

Users’ location privacy concerns have been further raised by the omnipresence of today’s Wi-Fi technology. Preferred Network Lists (PNLs) are a particularly interesting source of private location information, as devices store a list of previously used hotspots. Privacy implications of a disclosed PNL have been covered by numerous papers, mostly focusing on passive monitoring attacks. Nowadays, however, more and more devices no longer transmit their PNL in the clear, thus mitigating passive attacks. Hidden PNLs are still vulnerable to active attacks whereby an attacker mounts a fake hotspot with an SSID likely to be contained within the targeted PNL. If the targeted device has this SSID in the corresponding PNL, it will automatically initiate a connection with the fake hotspot, thus disclosing this information to the attacker. By iterating through different SSIDs (from a predefined dictionary), the attacker can eventually reveal a big part of the hidden PNL. Considering user mobility, active attacks usually have to be executed within a short window of opportunity, while targeting nontrivial SSIDs from the user’s PNL. The existing work on active attacks against hidden PNLs often neglects both of these challenges. In this paper we propose a simple mathematical model for analyzing active SSID dictionary attacks, allowing us to optimize the effectiveness of the attack under the above constraints (limited window of opportunity and targeting nontrivial SSIDs). Additionally, we showcase an example method for building an effective SSID dictionary using a top-N recommender algorithm and validate our model through simulations and extensive real-life tests.

Unpacking Privacy: Willingness to pay to protect personal data

10.31234/osf.io/ahwe4 ◽

2019 ◽

Cited By ~ 1

Author(s):

Anya Skatova ◽

Rebecca Louise McDonald ◽

Sinong Ma ◽

Carsten Maple

Keyword(s):

Business Models ◽

Personal Data ◽

Digital Economy ◽

Privacy Preferences ◽

Individual Privacy ◽

Free Service ◽

Different Types ◽

The Government ◽

Evaluation Techniques ◽

Future Business

Data is key for the digital economy, underpinning business models and service provision, and many of these valuable datasets are personal in nature. Information about individual behaviour is collected regularly by organisations. This information has value to businesses, the government and third parties. It is not clear what value this personal data has to consumers themselves. Much of the digital economy is predicated on people sharing personal data; however, if individuals value their privacy, they may choose to withhold this data unless the perceived benefits of sharing outweigh the perceived value of keeping the data private. Further, they might be willing to pay for an otherwise free service if paying allowed them to avoid sharing personal data. We used five evaluation techniques to study preferences for protecting personal data online and found that consumers assign a positive value to keeping a variety of types of personal data private. We show that participants are prepared to pay different amounts to protect different types of data, suggesting that no simple function can be identified for assigning a monetary value to individual privacy in the digital economy. The majority of participants displayed remarkable consistency in their rankings of the importance of different types of data, a finding that indicates the existence of stable individual privacy preferences in protecting personal data. We discuss our findings in the context of research on the value of privacy and privacy preferences, and in terms of implications for future business models and consumer protection.

Location Privacy in the Wake of the GDPR

10.20944/preprints201902.0227.v1 ◽

2019 ◽

Author(s):

Yola Georgiadou ◽

Rolf de By ◽

Ourania Kounadi

Keyword(s):

Data Protection ◽

Location Privacy ◽

Cultural Theory ◽

Personal Data ◽

The European Union ◽

General Data Protection Regulation ◽

Digital Ecosystem ◽

Protection Legislation ◽

General Data ◽

Conducting Research

The General Data Protection Regulation (GDPR) protects the personal data of natural persons and at the same time allows the free movement of such data within the European Union (EU). Hailed as majestic by admirers and dismissed as protectionist by critics, the Regulation is expected to have a profound impact around the world, including in the African Union (AU). For European–African consortia conducting research that may affect the privacy of African citizens, the question is ‘how to protect personal data of data subjects while at the same time ensuring a just distribution of the benefits of a global digital ecosystem?’ We use location privacy as a point of departure, because information about an individual’s location is different from other kinds of personally identifiable information. We analyse privacy at two levels, individual and cultural. Our perspective is interdisciplinary: we draw from computer science to describe three scenarios of transformation of volunteered/observed information to inferred information about a natural person and from cultural theory to distinguish four privacy cultures emerging within the EU in the wake of GDPR. We highlight recent data protection legislation in the AU and discuss factors that may accelerate or inhibit the alignment of data protection legislation in the AU with the GDPR.

A Combination of Spatial Pyramid and Inverted Index for Large-Scale Image Retrieval

Computer Vision ◽

10.4018/978-1-5225-5204-8.ch054 ◽

2018 ◽

pp. 1307-1321

Author(s):

Vinh-Tiep Nguyen ◽

Thanh Duc Ngo ◽

Minh-Triet Tran ◽

Duy-Dinh Le ◽

Duc Anh Duong

Keyword(s):

Image Retrieval ◽

Large Scale ◽

Spatial Information ◽

Real Life ◽

Inverted Index ◽

Bag Of Words ◽

Visual Words ◽

Benchmark Datasets ◽

Large Scale Image Retrieval ◽

Inverted Indexing

Large-scale image retrieval has shown remarkable potential in real-life applications. The standard approach is based on Inverted Indexing, in which images are represented using the Bag-of-Words model. However, one major limitation of both the Inverted Index and the Bag-of-Words representation is that they ignore the spatial information of visual words in image representation and comparison. As a result, retrieval accuracy is decreased. In this paper, the authors investigate an approach to integrate spatial information into the Inverted Index to improve accuracy while maintaining short retrieval time. Experiments conducted on several benchmark datasets (Oxford Building 5K, Oxford Building 5K+100K and Paris 6K) demonstrate the effectiveness of the proposed approach.
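
One simple way to picture an inverted index that carries spatial information is to key the index on a visual word together with the grid cell in which it occurs. The sketch below is an assumption made for illustration only: the abstract does not describe how the authors combine the spatial pyramid with the index, and the 2x2 grid, the vote-counting score, and all names are hypothetical.

```python
from collections import Counter, defaultdict


def cell_of(x, y, grid=2):
    """Map normalised coordinates in [0, 1] to a cell of a grid x grid layout."""
    return (min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))


def build_inverted_index(images, grid=2):
    """images: {image_id: [(visual_word, x, y), ...]} -> {(word, cell): {image_id, ...}}."""
    index = defaultdict(set)
    for image_id, features in images.items():
        for word, x, y in features:
            index[(word, cell_of(x, y, grid))].add(image_id)
    return index


def rank(query_features, index, grid=2):
    """Score each indexed image by the number of (word, cell) keys it shares
    with the query, highest score first."""
    keys = {(word, cell_of(x, y, grid)) for word, x, y in query_features}
    votes = Counter()
    for key in keys:
        for image_id in index.get(key, ()):
            votes[image_id] += 1
    return votes.most_common()


images = {
    "img1": [(7, 0.1, 0.1), (9, 0.8, 0.8)],
    "img2": [(7, 0.9, 0.1), (9, 0.8, 0.9)],
}
index = build_inverted_index(images)
print(rank([(7, 0.2, 0.2), (9, 0.7, 0.7)], index))  # [('img1', 2), ('img2', 1)]
```

Both images contain the same two visual words, so a plain Bag-of-Words index would score them equally; only the cell component of the key separates them in this toy example.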

A Combination of Spatial Pyramid and Inverted Index for Large-Scale Image Retrieval

International Journal of Multimedia Data Engineering and Management ◽

10.4018/ijmdem.2015040103 ◽

2015 ◽

Vol 6 (2) ◽

pp. 37-51 ◽

Cited By ~ 2

Author(s):

Vinh-Tiep Nguyen ◽

Thanh Duc Ngo ◽

Minh-Triet Tran ◽

Duy-Dinh Le ◽

Duc Anh Duong

Keyword(s):

Image Retrieval ◽

Large Scale ◽

Spatial Information ◽

Real Life ◽

Inverted Index ◽

Bag Of Words ◽

Visual Words ◽

Benchmark Datasets ◽

Large Scale Image Retrieval ◽

Inverted Indexing
