Hidden Databases: Latest Research Papers

Abstract: Much data on the web is available in hidden databases. Users browse their contents by sending search queries to form-based interfaces or APIs. Yet, hidden databases just return the top-k result entries and limit the number of queries per time interval. Such access restrictions constrict those tasks that require many/specific queries or need to access many/all data entries. For a temporary solution, an unrestricted local snapshot can be created by crawling the hidden database. Yet, keeping the snapshot permanently consistent is challenging due to the access restrictions of its origin. In this paper, we propose a replication approach providing permanent unrestricted access to the local copy of a hidden database with dynamic changes. To this end, we present an algorithm to effectively crawl hidden databases that outperforms the state of the art. Furthermore, we propose a new way to continuously control the consistency of the replicated database in an efficient manner. We also introduce the cloud-based architecture of a replication service for hidden databases. We show the effectiveness of the approach through a variety of reproducible experimental evaluations.

Optimized Processing of a Batch of Aggregate Queries over Hidden Databases

2017 International Conference on Computer and Applications (ICCA) ◽

10.1109/comapp.2017.8079754 ◽

2017 ◽

Author(s):

Eman Rezk ◽

Aboubakr Aqle ◽

Ali Jaoua ◽

Gautam Das ◽

Nan Zhang

Keyword(s):

Aggregate Queries ◽

Hidden Databases

Breaking the Top-k Restriction of the kNN Hidden Databases

2016 7th International Conference on Cloud Computing and Big Data (CCBD) ◽

10.1109/ccbd.2016.066 ◽

2016 ◽

Author(s):

Honglin Li ◽

Zhiguo Gong

Keyword(s):

Hidden Databases

Aggregate Estimation in Hidden Databases with Checkbox Interfaces

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2014.2365800 ◽

2015 ◽

Vol 27 (5) ◽

pp. 1192-1204 ◽

Cited By ~ 1

Author(s):

Hui Yan ◽

Zhiguo Gong ◽

Nan Zhang ◽

Tao Huang ◽

Hua Zhong ◽

...

Keyword(s):

Aggregate Estimation ◽

Hidden Databases

RETRIEVING DEEP WEB DATA THROUGH MULTI-ATTRIBUTES INTERFACES WITH STRUCTURED QUERIES

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194011005396 ◽

2011 ◽

Vol 21 (04) ◽

pp. 523-542 ◽

Cited By ~ 9

Author(s):

JIAN-WEI TIAN ◽

WEN-HUI QI ◽

XIAO-XIAO LIU

Keyword(s):

Web Sites ◽

Web Search ◽

Deep Web ◽

Structured Data ◽

Web Data ◽

Rule Based ◽

Web Interfaces ◽

Novel Approach ◽

The Web ◽

Hidden Databases

A great deal of data on the Web lies in hidden databases, or the deep Web. Most deep Web data is not directly available and can only be accessed through query interfaces. Current research on deep Web search has focused on crawling deep Web data via Web interfaces with keyword queries. However, these keyword-based methods have inherent limitations because of the multi-attribute and top-k features of the deep Web. In this paper we propose a novel approach for siphoning structured data with structured queries. First, in order to retrieve all the data in a hidden database without repetition, we model the hidden database as a hierarchy tree. Under this theoretical framework, data retrieval is transformed into a tree traversal problem. We also propose techniques to narrow the query space by using a heuristic rule, based on mutual information, to guide the traversal process. We conduct extensive experiments over real deep Web sites and controlled databases to illustrate the coverage and efficiency of our techniques.
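The tree view described in this abstract can be illustrated with a small sketch. It is not the authors' algorithm: the `TopKInterface` class, the `crawl` function, the toy attributes, and the fixed attribute order are all assumptions made for illustration, and the mutual-information heuristic mentioned above is not implemented. Each tree node is a structured query (a partial assignment of attribute values). If the interface reports that the result was truncated at k, the crawler descends to the children by fixing one more attribute; otherwise it collects the rows.

```python
from itertools import product


class TopKInterface:
    """Hypothetical stand-in for a hidden database behind a top-k form."""

    def __init__(self, rows, k):
        self.rows = rows
        self.k = k
        self.queries_issued = 0

    def query(self, predicate):
        """Return (at most k matching rows, overflow flag) for an equality predicate."""
        self.queries_issued += 1
        matches = [r for r in self.rows
                   if all(r[a] == v for a, v in predicate.items())]
        return matches[:self.k], len(matches) > self.k


def crawl(interface, domains, attrs, predicate=None, depth=0, found=None):
    """Depth-first traversal of the query tree (attribute order is fixed here)."""
    predicate = {} if predicate is None else predicate
    found = [] if found is None else found
    rows, overflow = interface.query(predicate)
    if not overflow or depth == len(attrs):
        # Complete answer, or no attribute left to refine on:
        # keep whatever the interface returned.
        found.extend(rows)
        return found
    attr = attrs[depth]
    for value in domains[attr]:
        # Sibling queries are disjoint, so no row is collected twice.
        crawl(interface, domains, attrs,
              {**predicate, attr: value}, depth + 1, found)
    return found


# Toy data: one row per combination of attribute values (assumed schema).
domains = {"make": ["A", "B"], "color": ["red", "blue"], "year": [2019, 2020]}
attrs = list(domains)
table = [dict(zip(attrs, combo), id=i)
         for i, combo in enumerate(product(*domains.values()))]

db = TopKInterface(table, k=3)
crawled = crawl(db, domains, attrs)
```

In this toy setup the root query and the two `make` queries overflow, so the crawler only collects rows at the `make`/`color` level. If several rows share identical values on every attribute and exceed k, a leaf query still overflows and this sketch cannot recover the remainder.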

Dual Layer Privacy Model for Hidden Databases

International Journal of Computer Applications ◽

10.5120/287-449 ◽

2010 ◽

Vol 1 (13) ◽

pp. 22-25

Author(s):

Richa Jindal ◽

Chander Kiran

Keyword(s):

Privacy Model ◽

Hidden Databases

Leveraging COUNT Information in Sampling Hidden Databases

2009 IEEE 25th International Conference on Data Engineering ◽

10.1109/icde.2009.112 ◽

2009 ◽

Cited By ~ 22

Author(s):

Arjun Dasgupta ◽

Nan Zhang ◽

Gautam Das

Keyword(s):

Hidden Databases

Privacy preservation of aggregates in hidden databases

Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD '09 ◽

10.1145/1559845.1559863 ◽

2009 ◽

Cited By ~ 10

Author(s):

Arjun Dasgupta ◽

Nan Zhang ◽

Gautam Das ◽

Surajit Chaudhuri

Keyword(s):

Privacy Preservation ◽

Hidden Databases

Probability Model Based Hidden Databases Sampling Approach

2008 4th International Conference on Wireless Communications, Networking and Mobile Computing ◽

10.1109/wicom.2008.2575 ◽

2008 ◽

Author(s):

Jian-Wei Tian ◽

Shi-Jun Li ◽

Qi Lu

Keyword(s):

Probability Model ◽

Model Based ◽

Sampling Approach ◽

Hidden Databases

hidden databases
Recently Published Documents

Publisher Correction to: A third-party replication service for dynamic hidden databases

A third-party replication service for dynamic hidden databases

Optimized Processing of a Batch of Aggregate Queries over Hidden Databases

Breaking the Top-k Restriction of the kNN Hidden Databases

Aggregate Estimation in Hidden Databases with Checkbox Interfaces

RETRIEVING DEEP WEB DATA THROUGH MULTI-ATTRIBUTES INTERFACES WITH STRUCTURED QUERIES

Dual Layer Privacy Model for Hidden Databases

Leveraging COUNT Information in Sampling Hidden Databases

Privacy preservation of aggregates in hidden databases

Probability Model Based Hidden Databases Sampling Approach
