Challenges of Big Data analysis

2014 ◽  
Vol 1 (2) ◽  
pp. 293-314 ◽  
Author(s):  
Jianqing Fan ◽  
Fang Han ◽  
Han Liu

Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that cannot be detected with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This paper gives an overview of the salient features of Big Data and of how these features drive paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogenous assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. Such unvalidated assumptions can lead to wrong statistical inferences and consequently wrong scientific conclusions.
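Spurious correlation, one of the challenges named in this abstract, can be illustrated with a small simulation. The following is a minimal sketch, not the authors' experiment: the function names, the sample size of 50, the feature counts and the seed are all assumptions chosen for illustration. It draws a response and many features that are independent of it, then reports the largest absolute sample correlation found.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_spurious_correlation(n_samples, n_features, seed=0):
    """Largest |correlation| between a response and features that are
    generated independently of it, so the true correlation is zero."""
    rng = random.Random(seed)  # hypothetical fixed seed for repeatability
    response = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        best = max(best, abs(pearson(response, feature)))
    return best


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the paper.
    for p in (10, 100, 1000):
        print(p, round(max_spurious_correlation(n_samples=50, n_features=p), 3))
```

Because the same seed is reused, the larger feature set contains the smaller one, so the reported maximum cannot decrease as the dimension grows. This is the mechanism by which a purely irrelevant variable can appear strongly associated with the response in high dimensions.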

Author(s):  
M. Govindarajan

Big data brings new opportunities to modern society and challenges to data scientists. On the one hand, big data holds great promise for discovering subtle population patterns and heterogeneities that cannot be detected with small-scale data. On the other hand, the massive sample size and high dimensionality of big data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. Prior to data analysis, data must be well constructed. However, given the variety of datasets in big data, the efficient representation, access, and analysis of unstructured or semi-structured data remain challenging. Understanding how data can be preprocessed is important for improving data quality and the analysis results. The purpose of this chapter is to highlight the big data challenges and to provide a brief description of each.


2021 ◽  
Vol 33 (6) ◽  
pp. 1-18
Author(s):  
Jianfei Li ◽  
Juxing Li ◽  
Jin Ji ◽  
Shengjun Meng

The coronavirus disease 2019 (COVID-19) epidemic that began in early 2020 quickly spread worldwide, bringing unprecedented shocks to the trade economies of many countries and to the global trade economy as a whole. Big data is the main feature of the Internet era; it has transformed the industrial development pattern of modern society and now flourishes in the field of trade economy. It is therefore of great significance to apply big data analysis technology to study the impact of the COVID-19 epidemic on the global trade economy. On the basis of summarizing and analyzing previous research, this paper expounds the research status and significance of the impact of the COVID-19 epidemic on the global trade economy and elaborates the development background. The study results provide a reference for further research on the impact of the COVID-19 epidemic on the global trade economy based on big data analysis.


Author(s):  
Rupali Ahuja

The data generated today has outgrown the storage and computing capabilities of traditional software frameworks. Large volumes of data, if aggregated and analyzed properly, may provide useful insights to predict human behavior, increase revenues, gain or retain customers, improve operations, combat crime, cure diseases, and so on. The results of effective Big Data analysis can thus be used to provide actionable intelligence for humans as well as for machine consumption. New tools, techniques, technologies and methods are being developed to store, retrieve, manage, aggregate, correlate and analyze Big Data. Hadoop is a popular software framework for handling Big Data needs. It provides a distributed framework for processing and storage of large datasets. This chapter discusses in detail the Hadoop framework, its features, applications and popular distributions, and its storage and visualization tools.
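The abstract does not say which processing model the chapter covers. Assuming it is MapReduce, the model classically associated with Hadoop, the idea can be sketched in a single Python process. This sketch does not use any Hadoop API: the function names are hypothetical, and a real job would run the map and reduce phases on many machines over data stored in a distributed file system.

```python
from collections import defaultdict


def map_phase(document):
    """Emit (word, 1) pairs for one input record."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce over an in-memory list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big tools"]))
```

Each map call sees only one record and each reduce call sees only one key. That independence is what allows a framework to spread the work across a cluster.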


2021 ◽  
Vol 16 (1) ◽  
pp. 9-32
Author(s):  
Mario Gómez ◽  
Narciso Salvador Tinoco Guerrero ◽  
Luis Manuel Tinoco Guerrero

The main objective of this paper is to analyze the influence that use of the Airbnb platform had on hotel occupancy in Mexico during the 2007-2018 period. The Hotel Classification System is used to determine whether this influence differs by hotel category. To obtain the information from Airbnb, an application was created that extracted the public information of each lodging published on the website. Results were estimated using panel data econometric methodology. They show that the only negative impact of Airbnb usage on hotel occupancy is in 4-star hotels, and that an increase in the price of Airbnb lodgings produces a rise in hotel occupancy. In other hotel categories there is no negative effect. An implication is that the usage of platforms like the one studied can be moderately regulated in Mexico.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Lei Hu ◽  
Xianling Xia

5G Internet of Things (IoT) technology and big data analysis technology are being applied ever more widely and deeply, bringing opportunities for the development of traditional enterprises and providing technological innovation support for the development of new enterprises. Based on 5G IoT technology and big data technology, this paper designs and studies an intelligent agricultural monitoring platform. We collect crop growth data and monitor crop growth status through this platform to study the 5G-oriented IoT big data analysis method system. This paper studies the data collection and storage issues involved in the huge agricultural IoT data environment. It analyzes the specific sources of agricultural big data, the specific methods of data collection, and the methods of various database storage technologies. Combining wireless sensor network technology, large-source data processing technology, and distributed data storage technology, a method is proposed to solve the problem of rural Internet data collection and storage in the big data environment. This paper proposes a spatiotemporal block processing method, TSBPS, to store the first detection data. The method uses spatiotemporal preblocking, data compression, and caching to significantly improve the recording speed of near real-time storage and microdetection data. In the experimental part, experiments are carried out on the key parts of the IOT-HSQM system model that may limit storage or query performance. Experiments comparing TSBPS with direct writing show that the maximum write speed increased by 79% and the average write speed increased by 42%. The IOT-HSQM system model can meet the requirements of compiling and query performance and statistical analysis.
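The abstract names three ingredients of TSBPS (spatiotemporal preblocking, compression, and caching) without giving details. The following is a minimal sketch of how those three ideas could fit together and should not be read as the authors' implementation. The class name, the field names `t`, `lat` and `lon`, the 60-second window, the 0.01-degree grid cell and the choice of JSON plus zlib are all assumptions made for illustration.

```python
import json
import zlib
from collections import defaultdict


def block_key(reading, window_s=60, cell_deg=0.01):
    """Map a reading to a (time window, lat cell, lon cell) block.
    Window and cell sizes are hypothetical defaults."""
    return (
        int(reading["t"] // window_s),
        int(reading["lat"] // cell_deg),
        int(reading["lon"] // cell_deg),
    )


class BlockBuffer:
    """Cache readings per block and emit one compressed payload per block."""

    def __init__(self):
        self._cache = defaultdict(list)

    def add(self, reading):
        """Buffer a reading in memory under its spatiotemporal block."""
        self._cache[block_key(reading)].append(reading)

    def flush(self):
        """Return {block_key: compressed bytes} and clear the cache."""
        blocks = {
            key: zlib.compress(json.dumps(rows).encode("utf-8"))
            for key, rows in self._cache.items()
        }
        self._cache.clear()
        return blocks
```

Writing one compressed payload per block, rather than one record per sensor reading, reduces the number of write operations. This is one plausible reason a blocked approach could outperform direct writing.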


2019 ◽  
Vol 9 (1) ◽  
pp. 01-12 ◽  
Author(s):  
Kristy F. Tiampo ◽  
Javad Kazemian ◽  
Hadi Ghofrani ◽  
Yelena Kropivnitskaya ◽  
Gero Michel
