Storage and Database Management for Big Data

Big Data ◽  
2016 ◽  
pp. 15-41 ◽  
Author(s):  
Vijay Gadepally ◽  
Jeremy Kepner ◽  
Albert Reuther
Keyword(s):  
Big Data ◽  
Big Data ◽  
2016 ◽  
pp. 1495-1518
Author(s):  
Mohammad Alaa Hussain Al-Hamami

Big Data is comprised systems, to remain competitive by techniques emerging due to Big Data. Big Data includes structured, semi-structured, and unstructured data. Structured data are data formatted for use in a database management system. Semi-structured and unstructured data include all types of unformatted data, including multimedia and social media content. Among practitioners and applied researchers, the reaction to data available through blogs, Twitter, Facebook, or other social media can be described as a “data rush” promising new insights about consumers' choices and behavior and many other issues. In the past, Big Data was used only by very large organizations, governments, and large enterprises able to build their own infrastructure for hosting and mining large amounts of data. This chapter shows the requirements for protecting Big Data environments with the same rigorous security strategies applied to traditional database systems.


Author(s):  
H. Wu ◽  
K. Fu

Abstract. As an information carrier with high capacity, remarkable reliability, and easy availability, among other features, remote sensing image data is widely used in natural resources survey, monitoring, planning, disaster prevention, and other fields (Huang, Jie, et al., 2008). Considering the daily application scenarios for remote sensing images in professional departments, this paper analyses the demands of using and managing remote sensing big data. Combining the professional-department scenario with the requirements for using and managing remote sensing data, and respecting existing user habits, the paper proposes a time-serialized management method for big data that takes the remote sensing image metadata standard as its reference index and is built on remote sensing image files and a database management system. It also describes how the method realizes the design of metadata standard products and the indexed storage of standard metadata content for managing massive remote sensing image databases.


Big Data is traditionally associated with distributed systems. This is understandable, given that the volume dimension of Big Data appears to be best accommodated by continuously adding resources over a distributed network rather than continuously upgrading a central storage resource. In this implementation context, non-distributed relational database models are considered volume-inefficient, and the database community has contemplated a departure from their use. Distributed systems depend on data partitioning to determine chunks of related data and where in storage they can be accommodated. In existing Database Management Systems (DBMS), data partitioning is automated, which in the opinion of this paper does not give the best results, since partitioning is an NP-hard problem in terms of algorithmic time complexity. The NP-hardness is shown to be reduced by a partitioning strategy that relies on the discretion of the programmer, which is more effective and flexible though it requires extra coding effort. NP-hard problems are solved more effectively by incorporating discretion rather than relying on full automation. In this paper, the partitioning process is reviewed and a programmer-based partitioning strategy is implemented for an application with a relational DBMS backend. This makes the relational DBMS adaptive in the volume dimension of Big Data. The ACID properties (atomicity, consistency, isolation, and durability) of the relational database model, which constitute a major attraction especially for applications that process transactions, are thus harnessed. More generally, the results of this research suggest that databases can be made adaptive in their areas of weakness, as a one-size-fits-all database management system may no longer be feasible.
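To make the idea of programmer-directed partitioning concrete, here is a minimal Python sketch. It is not the strategy implemented in the paper, which the abstract does not detail. It assumes a hypothetical `orders` table, two illustrative partitions named `north` and `south`, and in-memory SQLite databases standing in for separate relational stores. The routing rule is written by hand rather than chosen by the DBMS.

```python
import sqlite3

# Hypothetical partitioning rule chosen by the programmer: route each order
# to a partition by region. Partition names and schema are illustrative only.
PARTITIONS = ("north", "south")


def open_partitions():
    """Open one independent relational store per partition (in-memory here)."""
    stores = {}
    for name in PARTITIONS:
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
        )
        stores[name] = conn
    return stores


def partition_for(region):
    """Programmer-defined routing: an explicit mapping, no automatic balancing."""
    if region not in PARTITIONS:
        raise ValueError(f"no partition defined for region {region!r}")
    return region


def insert_order(stores, order_id, region, amount):
    """Write one row to the partition selected by the routing rule."""
    conn = stores[partition_for(region)]
    # 'with conn' commits on success and rolls back if the statement raises.
    with conn:
        conn.execute(
            "INSERT INTO orders (id, region, amount) VALUES (?, ?, ?)",
            (order_id, region, amount),
        )


def count_in(stores, region):
    """Single-partition query: only the relevant store is consulted."""
    conn = stores[partition_for(region)]
    (count,) = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
    return count


def total_amount(stores):
    """Scatter-gather query: ask every partition, then combine the results."""
    total = 0.0
    for conn in stores.values():
        (subtotal,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders"
        ).fetchone()
        total += subtotal
    return total


stores = open_partitions()
insert_order(stores, 1, "north", 10.0)
insert_order(stores, 2, "south", 5.5)
insert_order(stores, 3, "north", 4.5)
```

In this sketch each write is transactional within its own partition. A transaction spanning several partitions would need extra coordination, which is part of the coding effort the abstract mentions.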


Author(s):  
Satyaveer Singh ◽  
Mahendra Singh Aswal

We live in a digital world where large amounts of data are generated by diverse sources at an unprecedented rate. The term Big Data was coined to describe such large volumes of data. However, Big Data cannot be processed and analysed by traditional database management systems. Enterprises face a number of challenges, such as data heterogeneity and diversity, due to the high volume, variety, and velocity of Big Data. Over the past few years, research efforts have attempted to integrate semantic web technologies, such as ontologies, with Big Data. This integration is paving the way for dealing with various issues related to the processing of Big Data. This chapter first covers the fundamentals of Big Data: its characteristics, opportunities, challenges, and related current tools and technologies. Second, it highlights the integration of Big Data with semantic web technologies. Promising research is under way to tackle the volume and velocity of Big Data using semantic technologies.


It is well known that a huge amount of information, often referred to as Big Data, is now processed in every domain of modern society. Big Data is commonly characterized by seven dimensions: Volume, Velocity, Variety, Variability, Veracity, Visualization, and Value. Traditional database management systems cannot meet the requirements for high availability, scalability, and reliability that have emerged with Big Data. The good news is that we are now in the age of NoSQL databases. NoSQL databases do not have a fixed structure; their flexible structure suits them to storing unstructured data produced at large scale in various fields. This work outlines the four main types of NoSQL databases and presents some representative solutions for each.


Author(s):  
Patil N. S. ◽  
Kiran P ◽  
Kiran N. P. ◽  
Naresh Patel K. M.

Over the last decade, data analysis, data management, and big data have played a major role from both social and business perspectives. Graph databases are currently a prominent and fast-growing research topic. A graph database is preferred for handling the dynamic and complex relationships in connected data and can offer better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node whose properties include name, age, likes, and dislikes, and nodes are connected by relationships represented as edges. Graph databases are expected to be beneficial in business and on social networking sites, which generate huge volumes of unstructured data, since such Big Data requires proper and efficient computational techniques to handle. This paper reviews existing graph data computational techniques and research work in order to outline future research directions in graph database management.
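The node, property, and edge model described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the API of any particular graph database. The class name `PropertyGraph`, the `FOLLOWS` relationship label, and the sample people are all hypothetical.

```python
class PropertyGraph:
    """Minimal in-memory property graph: nodes with properties, labeled directed edges."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of properties
        self.edges = {}  # source id -> list of (label, target id)

    def add_node(self, node_id, **properties):
        """Store a node together with its properties."""
        self.nodes[node_id] = properties

    def add_edge(self, source, label, target):
        """Connect two existing nodes with a labeled, directed relationship."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.setdefault(source, []).append((label, target))

    def neighbors(self, source, label):
        """Return ids of nodes reachable from source over edges with this label."""
        return [t for (lbl, t) in self.edges.get(source, []) if lbl == label]

    def two_hop(self, source, label):
        """Follow the same relationship twice, skipping the start node and duplicates."""
        found = []
        for middle in self.neighbors(source, label):
            for target in self.neighbors(middle, label):
                if target != source and target not in found:
                    found.append(target)
        return found


graph = PropertyGraph()
# A person is a node; name, age, likes, and dislikes are its properties.
graph.add_node("alice", name="Alice", age=30, likes=["hiking"], dislikes=["spam"])
graph.add_node("bob", name="Bob", age=25, likes=["chess"], dislikes=[])
graph.add_node("carol", name="Carol", age=41, likes=["music"], dislikes=["ads"])

# Relationships between people are edges.
graph.add_edge("alice", "FOLLOWS", "bob")
graph.add_edge("bob", "FOLLOWS", "carol")
```

Here `graph.two_hop("alice", "FOLLOWS")` walks from Alice through Bob to Carol by following edges directly, the kind of relationship traversal that connected data calls for.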


2020 ◽  
Vol 8 (6) ◽  
pp. 1609-1615

The constant innovations and rapid developments in the IT industry have revolutionized the thinking and mindset of people throughout the world. Government departments have also been computerized to provide transparent, efficient, and responsible government through e-governance. Governments provide access to various websites and portals for filing complaints and for uploading or downloading forms, pictures, data, or PDFs in order to use government services. Citizens frequently use these portals to access government services. As a result, the size and volume of data that government departments must manage have been increasing drastically under e-governance. Traditional database management systems are not designed to deal with such mixed types of data. Moreover, the speed at which e-governance data needs to be processed is another major challenge for traditional database systems. These concerns can be addressed with an emerging technology: Big Data analytics. Big Data analytics techniques can make government more efficient and transparent by processing structured, unstructured, or mixed types of data at great speed. This paper examines the need for, and emergence of, Big Data analytics in e-governance and introduces Apache Spark. It proposes a practical approach to integrating Big Data analytics with e-governance using Apache Spark. It also shows how the major issues of traditional database management systems (mixed-type datasets, speed, and accuracy) can be resolved through the integration of Big Data analytics and e-governance.


2020 ◽  
Vol 1 (1) ◽  
pp. 12-20
Author(s):  
Jeffry Jeffry

Information technology and data are growing rapidly in the current era of big data. The Database Management System is a key component for controlling the flow of data. This study compares the performance of a web server using two different open-source RDBMSs, MySQL and MariaDB. Testing was carried out on Oracle Virtual Machine Virtualbox using ApacheBench to measure web server performance on the SIM Manajemen Diklat (training management information system) of Poltekpel Sorong. The experimental results show that the web server using MySQL tends to perform fairly stably when there are up to 300 concurrent web access requests: at 100, 200, and 300 concurrent requests, the results were 7.764/ms, 16.386/ms, and 30.025/ms, respectively. However, when concurrent web access requests exceed 300, MariaDB performs better. At 400 and 500 concurrent requests, its response time is faster than with MySQL, which recorded 51.877/ms and 54.702/ms, respectively, whereas MariaDB recorded 14.213/ms, 25.642/ms, 40.831/ms, 48.021/ms, and 51.630/ms at 100, 200, 300, 400, and 500 concurrent requests, respectively.
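The reported figures can be lined up to see where the two engines cross over. The short Python sketch below uses only the numbers given in the abstract. It assumes that a lower value means a faster response, and it keeps the values as plain numbers because the source writes the unit only as "/ms".

```python
# Response-time figures as reported in the abstract, per concurrency level.
# Assumption: lower is faster. The unit is given in the source only as "/ms".
concurrent_requests = [100, 200, 300, 400, 500]
mysql = [7.764, 16.386, 30.025, 51.877, 54.702]
mariadb = [14.213, 25.642, 40.831, 48.021, 51.630]

# For each concurrency level, record which engine has the lower value.
winners = [
    (n, "MySQL" if a < b else "MariaDB")
    for n, a, b in zip(concurrent_requests, mysql, mariadb)
]

# First concurrency level at which MariaDB reports the lower value.
crossover = next(n for n, winner in winners if winner == "MariaDB")

for n, winner in winners:
    print(f"{n} concurrent requests: {winner} is faster")
print(f"MariaDB overtakes MySQL at {crossover} concurrent requests")
```

On these figures MySQL is ahead at 100, 200, and 300 concurrent requests and MariaDB at 400 and 500, which matches the abstract's conclusion.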


2015 ◽  
Vol 6 (1) ◽  
pp. 1-11 ◽  
Author(s):  
M Misbachul Huda ◽  
Dian Rahma Latifa Hayun ◽  
Zhin Martun

Today, the rapid growth of the internet and the massive use of data have increased the requirements for CPU capacity, data retrieval speed, schemas for managing more complex data structures, and the reliability and integrity of the available data. This kind of data is called Large-scale Data or Big Data. Big Data involves high volume, high velocity, high veracity, and high variety. Big Data has to deal with two key issues: the growing size of datasets and increasing data complexity. To overcome these issues, current research is devoted to the kinds of database management system that can be used optimally for Big Data management. There are two kinds of database management system: relational and non-relational. This paper reviews these two kinds of database management system, including the description, advantages, structure, and application of each DBMS. Index Terms - Big Data, DBMS, Large-scale Data, Non-relational Database, Relational Database.

