Natural Cities Generated from All Building Locations in America

Bin Jiang

doi:10.3390/data4020059

Data ◽

10.3390/data4020059 ◽

2019 ◽

Vol 4 (2) ◽

pp. 59 ◽

Cited By ~ 2

Author(s):

Bin Jiang

Keyword(s):

Big Data ◽

Visualization Tool ◽

Top Down ◽

Human Settlements ◽

Valuable Data ◽

Related Research ◽

Heavy Tailed Distribution ◽

Data Source ◽

Heavy Tailed ◽

Natural Cities

Authorities define cities, or human settlements in general, by imposing top-down rules on whether buildings belong to cities. Emerging geospatial big data makes it possible to define cities from the bottom up: buildings themselves determine whether they belong to a city, using the notion of natural cities and head/tail breaks, a classification and visualization tool for data with a heavy-tailed distribution. In this paper, we used 125 million building locations (all building footprints of mainland America, or more precisely their centroids) to generate 2.1 million natural cities in the country (see the URL shown in the note of Figure 1). In contrast to government-defined city boundaries, these natural cities constitute a valuable data source for city-related research.
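
The abstract names head/tail breaks without spelling out the procedure. The following is a minimal sketch of the general idea, not the paper's implementation: values are split at their mean, the values above the mean form the "head", and the split is repeated on the head as long as it remains a minority. The function names, the 40% minority threshold (`head_limit`), and the sample values are assumptions made for illustration only.

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return the mean values at which heavy-tailed data is recursively split.

    head_limit is an assumed cutoff: splitting stops once the head
    (values above the mean) is no longer a small minority of the data.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        m = sum(data) / len(data)           # arithmetic mean of the current data
        head = [v for v in data if v > m]   # values above the mean
        # Stop when nothing lies above the mean or the head is too large.
        if not head or len(head) / len(data) > head_limit:
            break
        breaks.append(m)
        data = head                         # recurse into the head only
    return breaks


def classify(value, breaks):
    """Class index of a value: the number of breaks it exceeds."""
    return sum(value > b for b in breaks)


# Hypothetical heavy-tailed sample: many small values, few large ones.
sizes = [1] * 100 + [10] * 10 + [100] * 2
levels = head_tail_breaks(sizes)
print(levels, [classify(s, levels) for s in (1, 10, 100)])
```

How the paper turns 125 million building centroids into the values being classified is not described in the abstract, so the sketch deliberately works on a plain list of numbers.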

Natural Cities Generated from All Building Locations in America

10.20944/preprints201904.0283.v1 ◽

2019 ◽

Author(s):

Bin Jiang

Keyword(s):

Big Data ◽

Visualization Tool ◽

Top Down ◽

Human Settlements ◽

Valuable Data ◽

Related Research ◽

Heavy Tailed Distribution ◽

Data Source ◽

Heavy Tailed ◽

Natural Cities

Authorities define cities, or human settlements in general, by imposing top-down rules on whether buildings belong to cities. Emerging geospatial big data makes it possible to define cities from the bottom up: buildings themselves determine whether they belong to a city, based on the notion of natural cities, which is in turn defined using head/tail breaks, a classification and visualization tool for data with a heavy-tailed distribution. In this paper, we used 125 million building locations (all building footprints of mainland America, or more precisely their centroids) to derive 2.1 million natural cities in the country (http://lifegis.hig.se/uscities/). In contrast to government-defined city boundaries, these natural cities constitute a valuable data source for city-related research.

Big Data Privacy Preservation Using Two Phase Top-Down Specialization Algorithm with Multidimensional Map Reduce Framework on Hadoop

International Journal of Distributed and Cloud Computing ◽

10.21863/ijdcc/2015.3.2.009 ◽

2015 ◽

Vol 3 (2) ◽

Author(s):

Shalin Eliabeth S. ◽

Sarju S.

Keyword(s):

Big Data ◽

Data Privacy ◽

Privacy Preservation ◽

Experimental Result ◽

Map Reduce ◽

Distributed Environment ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Big Data Privacy

Big data privacy preservation is one of the most troubling issues in industry today. Data privacy problems often go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing an input dataset in a distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability, execution time, and related factors. At present, many cloud applications that anonymize big data face the same kinds of problems. To address them, we introduce a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input data consisted of 45,222 records of adult information with 15 attribute values. The proposed TPTDS algorithm was implemented in Hadoop with multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments ran TPTDS on Hadoop in both one-dimensional and multidimensional MapReduce frameworks, and multidimensional anonymization gave the better result on the input adult dataset. Data sets are generalized in a top-down manner, and the multidimensional MapReduce framework produced the better IGPL values. Anonymization was performed with specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These experimental results should find wide application in distributed environments.

Personalized Augmented Reality Based Tourism System: Big Data and User Demographic Contexts

Applied Sciences ◽

10.3390/app11136047 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6047

Author(s):

Soheil Rezaee ◽

Abolghasem Sadeghi-Niaraki ◽

Maryam Shakeri ◽

Soo-Mi Choi

Keyword(s):

Big Data ◽

Augmented Reality ◽

User Study ◽

Spatial Information ◽

Ease Of Use ◽

Tourist Attraction ◽

Gender And Education ◽

Time Distance ◽

Data Source ◽

The Right

A lack of required data resources is one of the challenges to adopting Augmented Reality (AR) to provide the right services to users, even though the amount of spatial information produced by people is increasing daily. This research aims to design a personalized AR-based tourism system that retrieves big data according to users’ demographic contexts in order to enrich the AR data source in tourism. The research is conducted in two main steps. First, the type of tourist attraction that interests the user is predicted from the user’s demographic context (age, gender, and education level) using a machine learning method. Second, the appropriate data for the user are extracted from the big data by considering time, distance, popularity, and the neighborhood of the tourist places, using the VIKOR and SWAR decision-making methods. The results show that the decision tree performs about 6% better than the SVM method in predicting the type of tourist attraction. In addition, the user study of the system shows overall participant satisfaction of about 55% in terms of ease of use and about 56% in terms of the system’s usefulness.
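
The second step relies on VIKOR, a compromise-ranking method for multi-criteria decisions. The sketch below shows the standard VIKOR scores under stated assumptions; it is not the authors' system. The candidate places, the criterion values, the weights (which in the paper would presumably come from the weighting method), and the trade-off parameter `v` are all hypothetical.

```python
def vikor(matrix, weights, benefit, v=0.5):
    """Compute VIKOR scores S (group utility), R (individual regret), and Q.

    matrix  : one row per alternative, one column per criterion
    weights : criterion weights (assumed to sum to 1)
    benefit : True if larger is better for that criterion, False if smaller is
    v       : weight given to group utility versus individual regret
    Lower Q means a better compromise alternative.
    """
    n_crit = len(weights)
    best, worst = [], []
    for j in range(n_crit):
        col = [row[j] for row in matrix]
        best.append(max(col) if benefit[j] else min(col))
        worst.append(min(col) if benefit[j] else max(col))

    S, R = [], []
    for row in matrix:
        terms = []
        for j in range(n_crit):
            span = best[j] - worst[j]
            # Weighted, normalized distance from the best value on criterion j.
            terms.append(0.0 if span == 0 else weights[j] * (best[j] - row[j]) / span)
        S.append(sum(terms))
        R.append(max(terms))

    def scale(x, lo, hi):
        return 0.0 if hi == lo else (x - lo) / (hi - lo)

    Q = [v * scale(s, min(S), max(S)) + (1 - v) * scale(r, min(R), max(R))
         for s, r in zip(S, R)]
    return S, R, Q


# Hypothetical places scored on travel time (min), distance (km), popularity (1-5).
places = ["A", "B", "C"]
scores = [[10, 1, 5], [30, 5, 1], [20, 3, 3]]
S, R, Q = vikor(scores, weights=[0.4, 0.3, 0.3], benefit=[False, False, True])
ranking = [p for _, p in sorted(zip(Q, places))]
print(ranking)
```

The neighborhood criterion mentioned in the abstract is omitted here because the abstract does not say how it is measured.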

Data Source Selection in Big Data Context

Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services ◽

10.1145/3366030.3366121 ◽

2019 ◽

Author(s):

Hicham Moad Safhi ◽

Bouchra Frikh ◽

Brahim Ouhbi

Keyword(s):

Big Data ◽

Source Selection ◽

Data Source ◽

Data Context

Mapping the United Nations Fundamental Principles of Official Statistics against new and big data sources

Statistical Journal of the IAOS ◽

10.3233/sji-210789 ◽

2021 ◽

Vol 37 (1) ◽

pp. 161-169

Author(s):

Dominik Rozkrut ◽

Olga Świerkot-Strużewska ◽

Gemma Van Halderen

Keyword(s):

Big Data ◽

Public Information ◽

Fundamental Principle ◽

Data Sources ◽

Official Statistics ◽

Development Agenda ◽

Data Gaps ◽

Data Source ◽

Exciting Time ◽

Statistical Systems

Never has there been a more exciting time to be an official statistician. The data revolution is responding to the demands of the COVID-19 pandemic and a complex sustainable development agenda to improve how data is produced and used, to close data gaps to prevent discrimination, to build capacity and data literacy, to modernize data collection systems and to liberate data to promote transparency and accountability. But can all data be liberated in the production and communication of official statistics? This paper explores the UN Fundamental Principles of Official Statistics in the context of eight new and big data sources. The paper concludes that each data source can be used for the production of official statistics in adherence with the Fundamental Principles, and argues that these data sources should be used if National Statistical Systems are to adhere to the first Fundamental Principle of compiling and making available official statistics that honor citizens’ entitlement to public information.

Big data actionable intelligence architecture

Journal Of Big Data ◽

10.1186/s40537-020-00378-7 ◽

2020 ◽

Vol 7 (1) ◽

Author(s):

Tian J. Ma ◽

Rudy J. Garcia ◽

Forest Danford ◽

Laura Patrizi ◽

Jennifer Galasso ◽

...

Keyword(s):

Big Data ◽

Real Time ◽

Digital Media ◽

Decision Makers ◽

Time Data ◽

Human In The Loop ◽

Course Of Action ◽

Insufficient Time ◽

Data Source ◽

Different Sources

The amount of data produced by sensors, social and digital media, and the Internet of Things (IoT) is increasing rapidly each day. Decision makers often need to sift through a sea of Big Data to utilize information from a variety of sources in order to determine a course of action. This can be a very difficult and time-consuming task. For each data source encountered, the information can be redundant, conflicting, and/or incomplete. For near-real-time applications, there is insufficient time for a human to interpret all the information from different sources. In this project, we have developed a near-real-time, data-agnostic software architecture that is capable of using several disparate sources to autonomously generate Actionable Intelligence with a human in the loop. We demonstrated our solution through a traffic prediction exemplar problem.

High Resolution Petrophysics – Wellbore Images as Big Data Source for Reservoir Engineering and Production Technology

10.3997/2214-4609.201900980 ◽

2019 ◽

Author(s):

G. Burmester

Keyword(s):

Big Data ◽

High Resolution ◽

Production Technology ◽

Reservoir Engineering ◽

Data Source

Multi-Level Relationships between Satellite-Derived Nighttime Lighting Signals and Social Media–Derived Human Population Dynamics

Remote Sensing ◽

10.3390/rs10071128 ◽

2018 ◽

Vol 10 (7) ◽

pp. 1128 ◽

Cited By ~ 17

Author(s):

Ting Ma

Keyword(s):

Social Media ◽

Population Dynamics ◽

Big Data ◽

Human Activities ◽

Human Population ◽

City Size ◽

Human Settlements ◽

Good Opportunity ◽

Nighttime Light ◽

Multi Level

Satellite-based measurements of the artificial nighttime light brightness (NTL) have been extensively used for studying urbanization and socioeconomic dynamics in a temporally consistent and spatially explicit manner. The increasing availability of geo-located big data detailing human population dynamics provides a good opportunity to explore the association between anthropogenic nocturnal luminosity and corresponding human activities, especially at fine time/space scales. In this study, we used Visible Infrared Imaging Radiometer Suite (VIIRS) day/night band (DNB)–derived nighttime light images and the gridded number of location requests (NLR) from China’s largest social media platform to investigate the quantitative relationship between nighttime light radiances and human population dynamics across China at four levels: the provincial, city, county, and pixel levels. Our results show that the linear relationship between the NTL and NLR might vary with the observation level and magnitude. The dispersion between the two variables likely increases with the observation scale, especially at the pixel level. The effect of spatial autocorrelation and other socioeconomic factors on the relationship should be taken into account for nighttime light-based measurements of human activities. Furthermore, the bivariate relationship between the NTL and NLR was employed to generate a partition of human settlements based on the combined features of nighttime lights and human population dynamics. Cross-regional comparisons of the partitioned results indicate a diverse co-distribution of the NTL and NLR across various types of human settlements, which could be related to the city size/form and urbanization level. Our findings may provide new insights into the multi-level responses of nighttime light signals to human activity and the potential application of nighttime light data in association with geo-located big data for investigating the spatial patterns of human settlement.

A Bias-Reduced Estimator for the Mean of a Heavy-Tailed Distribution with an Infinite Second Moment

SSRN Electronic Journal ◽

10.2139/ssrn.2013083 ◽

2012 ◽

Author(s):

Brahim Brahimi ◽

Djamel Meraghni ◽

Necir Abdelhakim ◽

Yahia Djabrane

Keyword(s):

Second Moment ◽

Heavy Tailed Distribution ◽

The Mean ◽

Heavy Tailed

A Systematic Study for Big Data Stream Processing Frameworks

Journal on Advances in Theoretical and Applied Informatics ◽

10.26729/jadi.v2i2.1914 ◽

2016 ◽

Vol 2 (2) ◽

pp. 4

Author(s):

Ali Yazici ◽

Ziya Karakaya ◽

Mohammed Alayyoub

Keyword(s):

Big Data ◽

Systematic Study ◽

Data Stream ◽

Stream Processing ◽

Systematic Mapping ◽

Related Research ◽

Research Outcomes ◽

Research Questions ◽

State Of Art ◽

Processing Framework

The choice of the most effective stream processing framework (SPF) for Big Data has been an important issue among researchers and practitioners. Each SPF uses different cutting-edge technologies in its steps for processing data in motion, which gives it an advantage over the others. Even though the technologies used in each SPF might improve it, it is still difficult to conclude which framework performs better under different scenarios and conditions. In this paper, we aim to show trends and differences among several SPFs for Big Data by applying the so-called Systematic Mapping (SM) approach to the related research outcomes. To achieve this objective, nine research questions (RQs) were raised, against which 91 studies conducted between 2010 and 2015 were evaluated. We present the trends by classifying the research on SPFs with respect to the proposed RQs, which can guide researchers in getting a state-of-the-art overview of the field.
