Efficient Sensitive Information Classification and Topic Tracking Based on Tibetan Web Pages

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 55643-55652 ◽  
Author(s):  
Guixian Xu ◽  
Ziheng Yu ◽  
Qi Qi
2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Efthimios Alepis ◽  
Constantinos Patsakis

The extensive adoption of mobile devices in our everyday lives, apart from assisting us through their various enhanced capabilities, has also raised serious privacy concerns. While mobile devices are equipped with numerous sensors which offer context-awareness to their installed apps, these sensors can also be exploited to reveal sensitive information when correlated with other data or sources. Companies have introduced a plethora of privacy-invasive methods to harvest users’ personal data for profiling and monetizing purposes. Nonetheless, until now, these methods were constrained by the environment in which they operate, e.g., browser versus mobile app, and since only a handful of businesses have actual access to both of these environments, the conceivable risks could be calculated and the involved enterprises could be somehow monitored and regulated. This work introduces some novel user deanonymization approaches for device and user fingerprinting in Android. Having Android AOSP as our baseline, we prove that web pages, by using several inherent mechanisms, can cooperate with installed mobile apps to identify which sessions operate on specific devices and consequently further expose users’ privacy.


2020 ◽  
Vol 64 (3) ◽  
pp. 2057-2073
Author(s):  
Jinlin Wang ◽  
Xing Wang ◽  
Hongli Zhang ◽  
Binxing Fang ◽  
Yuchen Yang ◽  
...  

Author(s):  
Snehalata K. Funde ◽  
Gandharba Swain

E-medical service frameworks are becoming popular for treating patients in remote locations, so a large amount of healthcare data, such as the patient’s name, location, contact number, and physical condition, is collected remotely. The large volume of data gathered from these different sources is termed big data. It contains sensitive patient details, such as systolic BP, pulse, temperature, current physical condition, and contact number, that should be identified and categorized appropriately to protect them from misuse. This article presents a weight-based similarity (WBS) method that classifies healthcare big data into two categories: sensitive information and normal information. The proposed method uses a training dataset to categorize the data and comprises three basic steps: information extraction, mapping of the information with the help of the training dataset, and evaluation of the weight of the input data against a threshold value to classify the data. The proposed method produces better results on several evaluation metrics, namely precision, recall, F1 score, and an accuracy value of 92%, in categorizing the big data. The Weka tool is used to compare WBS with other existing classification techniques.
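The three steps named in the abstract above (extraction, mapping against a training vocabulary, and comparing a weight with a threshold) can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so everything here is an assumption for illustration only: the field names, the weights in `TERM_WEIGHTS`, the averaging rule, and the `THRESHOLD` value are all hypothetical and are not the authors' WBS method.

```python
from typing import Dict, Iterable

# Hypothetical weights standing in for values derived from a training dataset.
TERM_WEIGHTS: Dict[str, float] = {
    "systolic_bp": 0.9,
    "pulse": 0.8,
    "temperature": 0.6,
    "contact_number": 0.9,
    "appointment_slot": 0.1,
    "ward": 0.2,
}

# Assumed cut-off; the abstract mentions a threshold but not its value.
THRESHOLD = 0.5


def record_weight(fields: Iterable[str], weights: Dict[str, float]) -> float:
    """Map the extracted fields onto the vocabulary and average their weights."""
    matched = [weights[name] for name in fields if name in weights]
    return sum(matched) / len(matched) if matched else 0.0


def classify(fields: Iterable[str],
             weights: Dict[str, float] = TERM_WEIGHTS,
             threshold: float = THRESHOLD) -> str:
    """Label a record 'sensitive' when its weight reaches the threshold."""
    return "sensitive" if record_weight(fields, weights) >= threshold else "normal"


if __name__ == "__main__":
    print(classify(["systolic_bp", "pulse"]))
    print(classify(["appointment_slot", "ward"]))
```

Averaging is only one plausible way to combine per-field weights; a maximum or a weighted sum would fit the same description equally well.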


It is important to capture users’ privacy requirements through data or information classification during system design. Currently, the citizen-centric perspective of privacy requirements is not well understood. To fill this gap, a study was conducted to investigate citizens’ privacy requirements and needs through their privacy preferences. From the data analysis, a citizen-centric preference set was developed based on the classification of personal and sensitive information obtained through a survey of 350 respondents. The results are presented as a reference table and a sensitivity classification tool, respectively. We therefore suggest using the tool as a method for classifying sensitive and personal information for system design.


2019 ◽  
Vol 2019 ◽  
pp. 1-15 ◽  
Author(s):  
Weiping Wang ◽  
Feng Zhang ◽  
Xi Luo ◽  
Shigeng Zhang

Through well-designed counterfeit websites, phishing induces online users to visit forged web pages in order to obtain their private sensitive information, e.g., account numbers and passwords. Existing antiphishing approaches are mostly based on page-related features, which require crawling the content of web pages as well as accessing third-party search engines or DNS services. This not only makes them inefficient at detecting phishing but also makes them rely heavily on the network environment and third-party services. In this paper, we propose a fast phishing website detection approach called PDRCNN that relies only on the URL of the website. PDRCNN neither needs to retrieve the content of the target website nor uses any third-party services as previous approaches do. It encodes the information of a URL into a two-dimensional tensor and feeds the tensor into a newly designed deep learning neural network to classify the original URL. We first use a bidirectional LSTM network to extract global features of the constructed tensor and give all string information to each character in the URL. After that, we use a CNN to automatically judge which characters play key roles in phishing detection, capture the key components of the URL, and compress the extracted features into a fixed-length vector space. By combining the two types of networks, PDRCNN achieves better performance than using either one of them alone. We built a dataset containing nearly 500,000 URLs obtained through Alexa and PhishTank. Experimental results show that PDRCNN achieves a detection accuracy of 97% and an AUC value of 99%, which is much better than state-of-the-art approaches. Furthermore, the recognition process is very fast: on the trained PDRCNN model, the average per-URL detection time is only 0.4 ms.
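To make the idea of turning a URL into a two-dimensional tensor concrete, here is a minimal sketch of one common character-level scheme. The abstract does not specify how PDRCNN builds its tensor, so the one-hot encoding, the `ALPHABET`, and the `MAX_LEN` value below are assumptions chosen for illustration; the bidirectional LSTM and CNN stages are not shown.

```python
import string
from typing import List

# Assumed character vocabulary: lowercase letters, digits, common URL punctuation.
ALPHABET = string.ascii_lowercase + string.digits + "-._~:/?#[]@!$&'()*+,;=%"
# Column 0 is reserved for characters outside the vocabulary.
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 200  # assumed fixed number of rows (URL is truncated or padded)


def encode_url(url: str, max_len: int = MAX_LEN) -> List[List[int]]:
    """Encode a URL as a max_len x (len(ALPHABET) + 1) one-hot matrix."""
    width = len(ALPHABET) + 1
    rows: List[List[int]] = []
    for ch in url.lower()[:max_len]:
        row = [0] * width
        row[CHAR_TO_INDEX.get(ch, 0)] = 1  # unknown characters go to column 0
        rows.append(row)
    # Pad short URLs with all-zero rows so every tensor has the same shape.
    rows.extend([0] * width for _ in range(max_len - len(rows)))
    return rows


if __name__ == "__main__":
    tensor = encode_url("http://example.com/login")
    print(len(tensor), len(tensor[0]))
```

A fixed shape matters because the downstream network expects inputs of identical dimensions regardless of URL length.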


Abstract: Personal information and web page security are important for everyone, because attackers steal sensitive information or damage websites. Cross-Site Scripting (XSS) is one of the methods attackers use. Since web browsers support the execution of scripting commands embedded in retrieved content, an attacker can exploit this feature maliciously to violate client security. Content Management Systems (CMSs) give web developers an easy way to build personal websites, including people without prior security experience, who are prime targets for attackers. Such users believe that a Content Management System is just a plug-in, but it is really a website. In this paper, we concentrate on the problem of cross-site scripting attacks, one of the most common attacks on the World Wide Web today. In this research, experiments are limited to Joomla and WordPress websites. Finally, we extracted general security guidance and rules for all Content Management System designers. Some of these rules are especially beneficial for Joomla and WordPress developers. In this work, we trained a group of amateurs to develop their websites using Joomla and WordPress following our extracted security guidance. We believe that this work has not been done before.
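The mechanism described above, where a browser executes script embedded in retrieved content, is usually countered by escaping user-supplied text before it is placed into a page. The sketch below shows that general technique with Python's standard `html.escape`; it is not taken from the paper's guidance, and the `render_comment` helper and its markup are hypothetical. Joomla and WordPress are PHP systems with their own escaping functions, which are not shown here.

```python
import html


def render_comment(user_text: str) -> str:
    """Return an HTML fragment with the user-supplied text escaped.

    html.escape replaces &, <, > and (with quote=True) both quote characters,
    so the browser displays the text instead of interpreting it as markup.
    """
    return '<p class="comment">' + html.escape(user_text, quote=True) + "</p>"


if __name__ == "__main__":
    # The script tag is rendered as visible text rather than executed.
    print(render_comment("<script>alert(1)</script>"))
```

Escaping at output time covers text placed in HTML element content; other contexts, such as URLs or inline JavaScript, need their own context-specific encoding.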


AI & Society ◽  
2022 ◽  
Author(s):  
Lise Jaillant ◽  
Annalina Caputo

Abstract: Co-authored by a Computer Scientist and a Digital Humanist, this article examines the challenges faced by cultural heritage institutions in the digital age, which have led to the closure of the vast majority of born-digital archival collections. It focuses particularly on cultural organizations such as libraries, museums and archives, used by historians, literary scholars and other Humanities scholars. Most born-digital records held by cultural organizations are inaccessible due to privacy, copyright, commercial and technical issues. Even when born-digital data are publicly available (as in the case of web archives), users often need to physically travel to repositories such as the British Library or the Bibliothèque Nationale de France to consult web pages. Provided with enough sample data from which to learn and train their models, AI, and more specifically machine learning algorithms, offer the opportunity to improve and ease access to digital archives by learning to perform complex human tasks. These vary from providing intelligent support for searching the archives to automating tedious and time-consuming tasks. In this article, we focus on sensitivity review as a practical solution to unlock digital archives that would allow archival institutions to make non-sensitive information available. This promise to make archives more accessible does not come free of warnings about potential pitfalls and risks: inherent errors, "black box" approaches that make the algorithm inscrutable, and risks related to bias, fake, or partial information. Our central argument is that AI can deliver on its promise to make digital archival collections more accessible, but it also creates new challenges, particularly in terms of ethics. In the conclusion, we insist on the importance of fairness, accountability and transparency in the process of making digital archives more accessible.


Crisis ◽  
2018 ◽  
Vol 39 (3) ◽  
pp. 197-204 ◽  
Author(s):  
Hajime Sueki ◽  
Jiro Ito

Abstract. Background: Gatekeeper training is an effective suicide prevention strategy. However, the appropriate targets of online gatekeeping have not yet been clarified. Aim: We examined the association between the outcomes of online gatekeeping using the Internet and the characteristics of consultation service users. Method: An advertisement to encourage the use of e-mail-based psychological consultation services among viewers was placed on web pages that showed the results of searches using suicide-related keywords. All e-mails received between October 2014 and December 2015 were replied to as part of gatekeeping, and the obtained data (responses to an online questionnaire and the content of the received e-mails) were analyzed. Results: A total of 154 consultation service users were analyzed, 35.7% of whom were male. The median age range was 20–29 years. Online gatekeeping was significantly more likely to be successful when such users faced financial/daily life or workplace problems, or revealed their names (including online names). By contrast, the activity was more likely to be unsuccessful when it was impossible to assess the problems faced by consultation service users. Conclusion: It may be possible to increase the success rate of online gatekeeping by targeting individuals facing financial/daily life or workplace problems with marked tendencies for self-disclosure.

