Personal Information Classification on Aggregated Android Application’s Permissions

2019 ◽  
Vol 9 (19) ◽  
pp. 3997
Author(s):  
Md Mehedi Hassan Onik ◽  
Chul-Soo Kim ◽  
Nam-Yong Lee ◽  
Jinhong Yang

Application publishers offer millions of Android apps on the Google Play store. Many of these publishers belong to a parent organization and share information with it. Through the ‘Android permission system’, a user permits an app to access sensitive personal data. Large-scale integration of personal data can reveal a user’s identity, yielding new insights and revenue for the organizations involved. Similarly, aggregation of Android app permissions by the parent organizations that own the apps can cause privacy leakage by revealing the user profile. This work classifies risky personal data by proposing a threat model for large-scale app permission aggregation by app publishers and their associated owners. A Google Play application programming interface (API)-assisted web app is developed that visualizes all the permissions an app owner can collectively gather through multiple apps released via several publishers. The work empirically validates the performance of the risk model with two case studies covering the top two Korean app owners, seven publishers, 108 apps, and 720 sets of permissions. With reasonable accuracy, the study finds contact numbers, biometric IDs, addresses, social graphs, human behavior, email, location, and unique IDs to be frequently exposed data. Finally, the work concludes that real-time tracking of aggregated permissions can limit the odds of user profiling.
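The aggregation threat described in this abstract can be sketched in a few lines: permissions requested by individual apps are unioned across every publisher owned by one parent organization, exposing what the owner can collectively learn. All owners, publishers, apps, and permission sets below are hypothetical examples, not data from the study.

```python
# Sketch of cross-publisher permission aggregation by parent organization.
# Every name and permission set here is a hypothetical example.
from collections import defaultdict

apps = [
    {"owner": "OrgA", "publisher": "Pub1", "app": "chat",
     "permissions": {"READ_CONTACTS", "ACCESS_FINE_LOCATION"}},
    {"owner": "OrgA", "publisher": "Pub2", "app": "wallet",
     "permissions": {"USE_FINGERPRINT", "READ_PHONE_STATE"}},
    {"owner": "OrgB", "publisher": "Pub3", "app": "game",
     "permissions": {"INTERNET"}},
]

# Union each app's permissions under its owning organization.
aggregated = defaultdict(set)
for a in apps:
    aggregated[a["owner"]] |= a["permissions"]

# OrgA's two apps, released via two publishers, collectively expose
# contacts, fine location, a biometric ID, and a device identifier.
print(sorted(aggregated["OrgA"]))
```

The point of the sketch is that neither app alone looks alarming; the risk only appears once permissions are viewed at the owner level, which is what the paper's web app visualizes.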

Author(s):  
Anastasia Kozyreva ◽  
Philipp Lorenz-Spreen ◽  
Ralph Hertwig ◽  
Stephan Lewandowsky ◽  
Stefan M. Herzog

People rely on data-driven AI technologies nearly every time they go online, whether they are shopping, scrolling through news feeds, or looking for entertainment. Yet despite their ubiquity, personalization algorithms and the associated large-scale collection of personal data have largely escaped public scrutiny. Policy makers who wish to introduce regulations that respect people’s attitudes towards privacy and algorithmic personalization on the Internet would greatly benefit from knowing how people perceive personalization and personal data collection. To contribute to an empirical foundation for this knowledge, we surveyed public attitudes towards key aspects of algorithmic personalization and people’s data privacy concerns and behavior using representative online samples in Germany (N = 1065), Great Britain (N = 1092), and the United States (N = 1059). Our findings show that people object to the collection and use of sensitive personal information and to the personalization of political campaigning and, in Germany and Great Britain, to the personalization of news sources. Encouragingly, attitudes are independent of political preferences: people across the political spectrum share the same concerns about their data privacy and show similar levels of acceptance regarding personalized digital services and the use of private data for personalization. We also found an acceptability gap: people are more accepting of personalized services than of the collection of personal data and information required for these services. On average, a large majority of respondents rated personalized services as more acceptable than the collection of personal information or data. The acceptability gap can be observed at both the aggregate and the individual level. Across countries, between 64% and 75% of respondents showed an acceptability gap.
Our findings suggest a need for transparent algorithmic personalization that minimizes use of personal data, respects people’s preferences on personalization, is easy to adjust, and does not extend to political advertising.


Author(s):  
Ayse Cufoglu ◽  
Mahi Lohi ◽  
Colin Everiss

Personalization is the adaptation of services to fit the user’s interests, characteristics, and needs. The key to effective personalization is user profiling. Apart from traditional collaborative and content-based approaches, a number of classification and clustering algorithms have been used to classify user-related information to create user profiles. However, they are not able to achieve accurate user profiles. In this paper, we present a new clustering algorithm, namely Multi-Dimensional Clustering (MDC), for user profiling. MDC is a version of the Instance-Based Learner (IBL) algorithm that assigns weights to feature values and considers these weights for the clustering. Three feature weight methods are proposed for MDC and all three have been tested and evaluated. Simulations were conducted using two user profile datasets: a training set (10,000 instances) and a test set (1000 instances). These datasets reflect each user’s personal information, preferences, and interests. Additional simulations and comparisons with existing weighted and non-weighted instance-based algorithms were carried out to demonstrate the performance of the proposed algorithm. Experimental results using the user profile datasets demonstrate that the proposed algorithm achieves better clustering accuracy than the other algorithms. This work is based on the doctoral thesis of the corresponding author.
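The weighted instance-based idea behind MDC can be illustrated with a minimal sketch (not the authors’ exact algorithm): each feature carries a weight, and a mismatch on a heavily weighted feature contributes more to the distance between a new instance and stored profiles. The features, profiles, and weight values below are invented for illustration.

```python
# Illustrative weighted instance-based matching over nominal features.
# Profiles, features, and weights are made-up examples, not the MDC datasets.
def weighted_distance(x, y, weights):
    # A mismatch on a feature contributes that feature's weight.
    return sum(w for xv, yv, w in zip(x, y, weights) if xv != yv)

profiles = {
    "sports_fan": ("male", "18-25", "sports"),
    "news_reader": ("female", "26-35", "news"),
}
# Assumed weights: the interest feature dominates the comparison.
weights = (0.2, 0.3, 0.5)

new_user = ("female", "18-25", "sports")
best = min(profiles, key=lambda p: weighted_distance(new_user, profiles[p], weights))
print(best)  # the shared interest outweighs the gender mismatch
```

An unweighted matcher would see one mismatch against each profile and tie; the feature weights break the tie in favor of the more informative attribute, which is the intuition behind weighting feature values for clustering.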


2010 ◽  
Vol 39 ◽  
pp. 633-662 ◽  
Author(s):  
A. Krause ◽  
E. Horvitz

Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user’s demographics, location, and past search and browsing behavior may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy on the part of users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how to find a provably near-optimal optimization of the utility-privacy tradeoff in an efficient manner. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users’ preferences about privacy and utility via a large-scale survey aimed at eliciting people’s willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
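Near-optimal guarantees for this kind of tradeoff typically rest on greedy selection over a utility function with diminishing returns (submodularity): repeatedly add the attribute whose marginal utility most exceeds its privacy cost. The sketch below uses that generic scheme with made-up utilities and costs; it is not the paper’s actual objective functions or data.

```python
# Greedy attribute selection for a utility-minus-privacy-cost objective.
# Utilities and costs are invented illustrative numbers.
def greedy_select(attrs, utility, cost, budget):
    chosen = []
    while True:
        candidates = [(a, utility(chosen + [a]) - utility(chosen) - cost[a])
                      for a in attrs if a not in chosen]
        best, gain = max(candidates, key=lambda t: t[1], default=(None, 0))
        # Stop when no attribute's marginal utility justifies its privacy cost.
        if best is None or gain <= 0 or len(chosen) >= budget:
            break
        chosen.append(best)
    return chosen

cost = {"location": 0.3, "demographics": 0.1, "history": 0.4}

def utility(s):
    # Diminishing returns: each further attribute adds half as much value.
    base = {"location": 0.6, "demographics": 0.3, "history": 0.5}
    return sum(v * 0.5 ** i for i, v in enumerate(
        sorted((base[a] for a in s), reverse=True)))

print(greedy_select(list(cost), utility, cost, budget=2))
```

With these numbers, location is picked first (gain 0.3), demographics second (gain 0.05), and history is never worth its cost, mirroring the paper’s finding that a small amount of user information already yields most of the personalization benefit.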


Author(s):  
Ilaria Liccardi ◽  
Joseph Pato ◽  
Daniel J. Weitzner

Our personal information, habits, likes, and dislikes can all be deduced from our mobile devices. Safeguarding mobile privacy is therefore of great concern. Transparency and individual control are bedrock principles of privacy, but making informed choices about which mobile apps to use has been shown to be difficult. In order to understand the dynamics of information collection in mobile apps and to demonstrate the value of transparent access to the details of mobile applications’ information access permissions, we gathered information about 528,433 apps on Google Play and analyzed the permissions requested by each app. We develop a quantitative measure of the risk posed by apps by devising a ‘sensitivity score’ that represents the number of requested permissions that read personal information about users when network communication is possible. We found that 54% of apps do not access any personal data. The remaining 46% request between 1 and 20 sensitive permissions and have the ability to transmit the corresponding data outside the phone. The sensitivity of apps differs greatly between free and paid apps as well as across categories and content ratings. Sensitive permissions are often mixed with a large number of low-risk permissions and hence are difficult to identify. Easily available sensitivity scores could help users make more informed decisions about choosing apps that pose less risk of collecting personal information. Even though an app is “self-described” as suitable for a certain subset of users (e.g., children), it might carry content ratings and permission requests that are not appropriate or expected. Our experience in doing this research shows that it is difficult to obtain information about how personal data collected from apps is used or analyzed. In fact, only 0.37% (1,991) of the collected apps were found to declare a “privacy policy”.
Therefore, in order to make real control available to mobile users, app distribution platforms should provide more detailed information about how users’ data, if accessed, is used. To achieve greater transparency and individual control, app distribution platforms that do not currently make raw permission descriptions accessible for analysis could change their design and operating policies to make this data available prior to installation.
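The sensitivity score described in this abstract can be sketched directly: count the requested permissions that read personal information, but only when a network permission makes off-device transmission possible. The permission lists below are small illustrative subsets, not the paper’s exact categorization.

```python
# Sketch of the 'sensitivity score': number of personal-data-reading
# permissions, counted only when network communication is possible.
# SENSITIVE and NETWORK are illustrative subsets, not the paper's full lists.
SENSITIVE = {"READ_CONTACTS", "READ_SMS", "ACCESS_FINE_LOCATION",
             "READ_CALENDAR", "READ_PHONE_STATE"}
NETWORK = {"INTERNET", "ACCESS_NETWORK_STATE"}

def sensitivity_score(requested):
    requested = set(requested)
    if not requested & NETWORK:
        return 0  # data cannot leave the phone, so the risk score is zero
    return len(requested & SENSITIVE)

print(sensitivity_score(["INTERNET", "READ_CONTACTS", "READ_SMS", "VIBRATE"]))  # 2
print(sensitivity_score(["READ_CONTACTS"]))  # 0: no network access
```

The second call shows why the network condition matters: an app that reads contacts but cannot communicate scores zero, while the same read permissions alongside INTERNET count toward the score.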


2020 ◽  
Author(s):  
Anastasia Kozyreva ◽  
Philipp Lorenz-Spreen ◽  
Ralph Hertwig ◽  
Stephan Lewandowsky ◽  
Stefan Michael Herzog

Despite their ubiquity online, personalization algorithms and the associated large-scale collection of personal data have largely escaped public scrutiny. Yet policy makers who wish to introduce regulations that respect people’s attitudes towards privacy and algorithmic personalization on the Internet would greatly benefit from knowing how people perceive different aspects of personalization and data collection. To contribute to an empirical foundation for this knowledge, we surveyed public attitudes on key aspects of algorithmic personalization and on people’s data privacy concerns and behavior, using representative online samples in Germany, Great Britain, and the United States. Our findings show that people object to the collection and use of sensitive personal information and to the personalization of political campaigning and, in Germany and Great Britain, to the personalization of news sources. Encouragingly, attitudes are independent of political preferences: people across the political spectrum share the same concerns about their data privacy and the effects of personalization on news and politics. We also found that people are more accepting of personalized services than of the collection of personal data and information currently collected for these services. This acceptability gap, the difference between the acceptability of personalized online services and the acceptability of the collection and use of the underlying data and information, can be observed at both the aggregate and the individual level. Our findings suggest a need for transparent algorithmic personalization that respects people’s data privacy, can be easily adjusted, and does not extend to political advertising.


2021 ◽  
Vol 6 (1) ◽  
pp. 25-34
Author(s):  
Syafira Auliya ◽  
Lukito Edi Nugroho ◽  
Noor Akhmad Setiawan

The amount of retrievable smartphone data is escalating, while some apps on the smartphone evidently exploit and leak users’ data. These phenomena potentially violate privacy and personal data protection laws, as various studies have shown that technologies such as artificial intelligence can transform smartphone data into personal data through user identification and user profiling. User identification picks out specific users in the data based on the users’ characteristics, while user profiling derives users’ traits (e.g., age and personality) by exploring how the data correlate with personal information. Nevertheless, comprehensive review papers discussing both topics are scarce. This paper thus aims to provide a comprehensive review of user identification and user profiling using smartphone data. Compared to existing review papers, this paper takes a broader lens by reviewing the general applications of smartphone data before focusing on smartphone usage data. It also discusses some possible data sources that can be used in this research topic.


2020 ◽  
Vol 2020 (2) ◽  
pp. 314-335 ◽  
Author(s):  
Álvaro Feal ◽  
Paolo Calciati ◽  
Narseo Vallina-Rodriguez ◽  
Carmela Troncoso ◽  
Alessandra Gorla

Android parental control applications are used by parents to monitor and limit their children’s mobile behaviour (e.g., mobile app usage, web browsing, calling, and texting). To offer this service, parental control apps require privileged access to system resources and access to sensitive data. This may significantly reduce the dangers associated with kids’ online activities, but it raises important privacy concerns. These concerns have so far been overlooked by organizations providing recommendations regarding the use of parental control applications to the public. We conduct the first in-depth study of the Android parental control app ecosystem from a privacy and regulatory point of view. We exhaustively study 46 apps from 43 developers, which together have 20M installs in the Google Play Store. Using a combination of static and dynamic analysis, we find that these apps are on average more permissions-hungry than the top 150 apps in the Google Play Store and tend to request more dangerous permissions with new releases; 11% of the apps transmit personal data in the clear; 34% of the apps gather and send personal information without appropriate consent; and 72% of the apps share data with third parties (including online advertising and analytics services) without mentioning their presence in their privacy policies. In summary, parental control applications lack transparency and compliance with regulatory requirements. This holds even for those applications recommended by European and other national security centers.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5260
Author(s):  
Yi-Bing Lin ◽  
Sheng-Lin Chou

Due to the fast evolution of sensor and Internet of Things (IoT) technologies, several large-scale smart city applications have been commercially developed in recent years. In these developments, acceptance of the contracts is often disputed because the contract specification is not clear, resulting in a great deal of discussion of gray areas. Such disputes often occur in the acceptance processes of smart buildings, mainly because most intelligent building systems are expensive and the operations of the sub-systems are very complex. This paper proposes SpecTalk, a platform that automatically generates the code to make IoT applications conform to Taiwan Association of Information and Communication Standards (TAICS) specifications. SpecTalk generates a program to accommodate the application programming interface of the IoT devices under test (DUTs). The devices can then be tested by SpecTalk following the TAICS data formats. We describe three types of tests: self-test, mutual-test, and visual test. A self-test involves the sensors and the actuators of the same DUT. A mutual-test involves the sensors and the actuators of different DUTs. A visual test uses a monitoring camera to investigate the actuators of multiple DUTs. We conducted these types of tests in commercially deployed smart campus applications. Our experiments show that SpecTalk is feasible and can effectively make IoT implementations conform to TAICS specifications. We also propose a simple analytic model to select the frequency of the control signals for the input patterns in a SpecTalk test. Our study indicates that the control-signal frequency should be selected such that the inter-arrival time between two control signals is larger than 10 times the activation delay of the DUT.
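The closing guideline translates into a simple frequency bound: if the inter-arrival time between control signals must exceed ten times the DUT’s activation delay, the maximum signal frequency follows directly. A minimal sketch, with a hypothetical delay value:

```python
# Frequency bound implied by the guideline: inter-arrival time between
# control signals > 10 * activation delay of the DUT.
# The 50 ms activation delay below is a hypothetical example value.
def max_signal_rate(activation_delay_s):
    # Highest frequency (Hz) whose period still exceeds 10x the delay.
    return 1.0 / (10 * activation_delay_s)

# A DUT that takes 50 ms to actuate should not be driven above 2 Hz.
print(max_signal_rate(0.05))  # 2.0
```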


2021 ◽  
Vol 17 (1) ◽  
Author(s):  
Felix Gille ◽  
Caroline Brall

Public trust is paramount for the proper functioning of data-driven healthcare activities such as digital health interventions, contact tracing, or the build-up of electronic health records. As the use of personal data is the common denominator of these healthcare activities, healthcare actors have an interest in ensuring the privacy and anonymity of the personal data they depend on. Maintaining the privacy and anonymity of personal data contributes to the trustworthiness of these healthcare activities and is associated with the public’s willingness to entrust these activities with their personal data. An analysis of online news readership comments about the failed care.data programme in England revealed that parts of the public have a false understanding of anonymity in the context of privacy protection of personal data as used for healthcare management and medical research. Some of those commenting demanded complete anonymity of their data as a condition for trusting the process of data collection and analysis. As this demand is impossible to fulfil and rests on a false understanding of anonymity, the inability to meet it risks undermining public trust. Since public concerns about the anonymity and privacy of personal data appear to be increasing, a large-scale information campaign about the limits and possibilities of anonymity with respect to the various uses of personal health data is urgently needed to help the public make better informed choices about providing personal data.

