Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies

BioMed Research International ◽

10.1155/2015/904541 ◽

2015 ◽

Vol 2015 ◽

pp. 1-15 ◽

Cited By ~ 15

Author(s):

Alexandre G. de Brevern ◽

Jean-Philippe Meyniel ◽

Cécile Fairhead ◽

Cécile Neuvéglise ◽

Alain Malpertuy

Keyword(s):

Big Data ◽

Human Genome ◽

Business Intelligence ◽

Relational Databases ◽

Complete Sequence ◽

NoSQL Databases ◽

Technology Innovations ◽

Data Formats ◽

IT Innovation ◽

High Processing

Sequencing the human genome began in 1994, and 10 years of work were necessary in order to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways, as they are faced with data management issues and analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way of managing and analysing data. We present how biologists are confronted with an abundance of methods, tools, and data formats. To overcome these problems, we focus on Big Data Information Technology innovations from web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases. Since Big Data leads to the loss of interactivity with data during analysis due to high processing time, we describe solutions from Business Intelligence that allow one to regain interactivity whatever the volume of data. We illustrate this point with a focus on the Amadea platform. Finally, we discuss visualization challenges posed by Big Data and present the latest innovations with JavaScript graphic libraries.

Formalizing the Mapping of UML Conceptual Schemas to Column-Oriented Databases

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2018070103 ◽

2018 ◽

Vol 14 (3) ◽

pp. 44-68 ◽

Cited By ~ 1

Author(s):

Fatma Abdelhedi ◽

Amal Ait Brahim ◽

Gilles Zurfluh

Keyword(s):

Big Data ◽

Data Warehouse ◽

Relational Databases ◽

Traditional Approach ◽

Physical Models ◽

Decision Making Process ◽

NoSQL Databases ◽

Care Field ◽

Sufficient Degree

Nowadays, most organizations need to improve their decision-making process using Big Data. To achieve this, they have to store Big Data, perform an analysis, and transform the results into useful and valuable information. Doing so means dealing with new challenges in designing and creating a data warehouse. Traditionally, creating a data warehouse followed a well-governed process based on relational databases. The influence of Big Data challenged this traditional approach, primarily due to the changing nature of data. As a result, using NoSQL databases has become a necessity to handle Big Data challenges. In this article, the authors show how to create a data warehouse on NoSQL systems. They propose the Object2NoSQL process, which generates column-oriented physical models starting from a UML conceptual model. To ensure efficient automatic transformation, they propose a logical model that exhibits a sufficient degree of independence to enable its mapping to one or more column-oriented platforms. The authors report experiments with their approach using a case study in the health care field.
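
The idea of deriving a column-oriented model from a conceptual class description can be illustrated with a minimal Python sketch. This is not the Object2NoSQL process itself, whose rules the abstract does not give. The `UmlClass` structure, the `to_column_family` function, the `data`/`refs` family names, and the Patient/Doctor example are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UmlClass:
    """Hypothetical, simplified stand-in for a UML class."""
    name: str
    attributes: dict  # attribute name -> type name
    associations: dict = field(default_factory=dict)  # role -> target class name


def to_column_family(cls):
    """Map one class to a generic, platform-independent column-oriented table.

    Assumed rules (illustrative only): one table per class, attributes go
    into a 'data' column family, association ends become key references
    in a 'refs' column family.
    """
    table = cls.name.lower()
    return {
        "table": table,
        "row_key": f"{table}_id",
        "families": {
            "data": dict(cls.attributes),
            "refs": {
                role: f"{target.lower()}_id"
                for role, target in cls.associations.items()
            },
        },
    }
```

Calling `to_column_family(UmlClass("Patient", {"name": "string"}, {"doctor": "Doctor"}))` yields a plain dictionary that a later step could translate into the DDL of a specific column-oriented platform, which is the kind of separation between logical and physical models the abstract describes.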

When Relational-Based Applications Go to NoSQL Databases: A Survey

Information ◽

10.3390/info10070241 ◽

2019 ◽

Vol 10 (7) ◽

pp. 241

Author(s):

Geomar A. Schreiner ◽

Denio Duarte ◽

Ronaldo dos S. Melo

Keyword(s):

Big Data ◽

Comparative Analysis ◽

Relational Databases ◽

Research Area ◽

Relational Data ◽

Data Sets ◽

System Architectures ◽

NoSQL Databases ◽

State Of Art ◽

Open Issues

Several data-centric applications today produce and manipulate a large volume of data, the so-called Big Data. Traditional databases, in particular relational databases, are not suitable for Big Data management. As a consequence, some approaches that allow the definition and manipulation of large relational data sets stored in NoSQL databases through an SQL interface have been proposed, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has some open issues, given that most approaches deal with only a subset of SQL operations or support only specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state of the art in this research area and also to provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.

Security Analysis of MongoDB

10.31219/osf.io/c3w7y ◽

2020 ◽

Author(s):

Sahib Singh

Keyword(s):

Cloud Computing ◽

Big Data ◽

Relational Databases ◽

Security Analysis ◽

NoSQL Databases ◽

Computing Platforms

NoSQL databases are a form of non-relational database whose primary purpose is to store and retrieve data. Due to recent advancements in cloud computing platforms and the emergence of Big Data, NoSQL databases are becoming more popular than ever. In this paper we examine and analyze the fundamental security features and vulnerabilities of MongoDB and how it performs compared to relational databases on these fronts.

Big Data

Pattern and Data Analysis in Healthcare Settings - Advances in Medical Technologies and Clinical Practice ◽

10.4018/978-1-5225-0536-5.ch009 ◽

2017 ◽

pp. 158-179

Author(s):

Vinod Kumar ◽

Ramjeevan Singh Thakur

Keyword(s):

Big Data ◽

Database System ◽

Smart Phones ◽

Data Handling ◽

Data Generation ◽

Business Organizations ◽

NoSQL Databases ◽

Data Formats ◽

NoSQL Database ◽

Tools And Techniques

With every passing day, data generation is increasing exponentially; its volume, variety, and velocity make it quite challenging to analyze, interpret, and visualize in order to gain greater insights from the available data. Billions of networked sensors are being embedded in devices such as smart phones, automobiles, social media sites, laptops, PCs, and industrial machines that operate on, generate, and communicate data. Thus, the data obtained from these various sources exists in structured, semi-structured, and unstructured form. The traditional database system is not suitable for handling these data formats. Therefore, new tools and techniques have been developed to work with these data, and NoSQL is one of them. Currently, many NoSQL databases are available on the market, each designed to solve a specific type of data-handling problem; most NoSQL databases are developed with special attention to the problems of business organizations and enterprises. The chapter focuses on various aspects of NoSQL as a tool for handling big data.

Towards Convergence in Information Systems Design

Novel Approaches to Information Systems Design - Advances in Computer and Electrical Engineering ◽

10.4018/978-1-7998-2975-1.ch011 ◽

2020 ◽

pp. 247-263

Author(s):

Deepika Prakash

Keyword(s):

Machine Learning ◽

Big Data ◽

Information Systems ◽

Data Warehouse ◽

Business Intelligence ◽

Systems Design ◽

Information Model ◽

Information Systems Design ◽

NoSQL Databases ◽

NoSQL Database

Three technologies (business intelligence, big data, and machine learning) developed independently and address different types of problems. Data warehouses have been used as systems for business intelligence, and NoSQL databases are used for big data. In this chapter, the authors explore the convergence of business intelligence and big data. Traditionally, a data warehouse is implemented on a ROLAP or MOLAP platform. Whereas MOLAP suffers from having a proprietary architecture, ROLAP suffers from the inherent disadvantages of an RDBMS. To mitigate the drawbacks of ROLAP, the authors propose implementing a data warehouse on a NoSQL database, and they choose Cassandra as that database. They start by identifying a generic information model that captures the requirements of the system to be built. They then propose mapping rules that map the components of the information model to the Cassandra data model. Finally, they show a small implementation using an example.

Which Way to Go for the Future

Bridging Relational and NoSQL Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-3385-6.ch008 ◽

2018 ◽

pp. 311-328

Keyword(s):

Big Data ◽

Relational Database ◽

Data Model ◽

Driving Force ◽

Relational Databases ◽

High Availability ◽

NoSQL Databases ◽

The Future ◽

Data Requirements ◽

Open Issues

The chapter presents how relational databases respond to typical NoSQL features and, vice versa, how NoSQL databases respond to typical relational features. Open issues related to the integration of relational and NoSQL databases, as well as features of the next generation of databases, are discussed. The big relational database vendors have continuously worked to incorporate NoSQL features into their databases, while NoSQL vendors are trying to make their products more like relational databases. The convergence of these two groups of databases has been a driving force in the evolution of the database market, in establishing a new level of focus on resolving big data requirements, and in enabling users to fully use the potential of their data wherever it is stored, in relational or NoSQL databases. In turn, the database of choice in the future will likely be one that provides the best of both worlds: a flexible data model, high availability, and enterprise reliability.

Automatic NoSQL to Relational Database Transformation with Dynamic Schema Mapping

Scientific Programming ◽

10.1155/2020/8813350 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Zain Aftab ◽

Waheed Iqbal ◽

Khaled Mohamad Almustafa ◽

Faisal Bukhari ◽

Muhammad Abdullah

Keyword(s):

Relational Database ◽

Business Intelligence ◽

Relational Databases ◽

State Of The Art ◽

Schema Mapping ◽

Excellent Performance ◽

NoSQL Databases ◽

NoSQL Database ◽

Domain Expertise ◽

Database Transformation

Recently, the use of NoSQL databases has grown to manage unstructured data for applications to ensure performance and scalability. However, many organizations prefer to transfer data from an operational NoSQL database to a SQL-based relational database for using existing tools for business intelligence, analytics, decision making, and reporting. The existing methods of NoSQL to relational database transformation require manual schema mapping, which requires domain expertise and consumes noticeable time. Therefore, an efficient and automatic method is needed to transform an unstructured NoSQL database into a structured database. In this paper, we proposed and evaluated an efficient method to transform a NoSQL database into a relational database automatically. In our experimental evaluation, we used MongoDB as a NoSQL database, and MySQL and PostgreSQL as relational databases to perform transformation tasks for different dataset sizes. We observed excellent performance, compared to the existing state-of-the-art methods, in transforming data from a NoSQL database into a relational database.
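
A rough feel for what automatic schema mapping involves can be given with a short Python sketch. It is not the authors' method: it assumes documents arrive as plain Python dictionaries, uses the standard-library `sqlite3` module in place of MySQL or PostgreSQL, and the helper names `flatten`, `infer_schema`, and `load` are hypothetical. Nested objects are flattened into underscore-joined column names, lists are stored as JSON text, and each column type is inferred from the first non-null value seen.

```python
import json
import sqlite3

# Assumed mapping from Python value types to SQLite column types.
SQL_TYPES = {bool: "INTEGER", int: "INTEGER", float: "REAL", str: "TEXT"}


def flatten(doc, prefix=""):
    """Flatten nested dicts into one level; lists are serialized as JSON text."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "_"))
        elif isinstance(value, (list, tuple)):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def infer_schema(rows):
    """Collect every column seen and type it from its first non-null value."""
    schema = {}
    for row in rows:
        for col, value in row.items():
            if col not in schema and value is not None:
                schema[col] = SQL_TYPES.get(type(value), "TEXT")
    return schema


def load(conn, table, docs):
    """Create a table from the inferred schema and insert all documents."""
    rows = [flatten(d) for d in docs]
    schema = infer_schema(rows)
    cols = list(schema)
    col_defs = ", ".join(f'"{c}" {schema[c]}' for c in cols)
    conn.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in cols)
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(r.get(c) for c in cols) for r in rows],
    )
    return schema
```

Fields missing from a document become NULL in its row. A production transformation would also need to handle type conflicts between documents and split arrays into child tables, which this sketch deliberately leaves out.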

Comparative study of NoSQL databases for big data storage

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.6.10072 ◽

2018 ◽

Vol 7 (2.6) ◽

pp. 83

Author(s):

Gourav Bathla ◽

Rinkle Rani ◽

Himanshu Aggarwal

Keyword(s):

Big Data ◽

Data Storage ◽

Relational Databases ◽

Large Scale ◽

Technology Selection ◽

Business Organizations ◽

NoSQL Databases ◽

Real Solutions ◽

Adequate Knowledge ◽

Long Time

Big data is a large-scale collection of structured, semi-structured, and unstructured data. It is generated by social networks, business organizations, and the interactions and views of socially connected users. It is used for important decision making in business and research organizations. Storage that can process this large scale of data efficiently and extract important information with low response time is a current competitive need. Relational databases, which have ruled storage technology for such a long time, seem unsuitable for mixed types of data. Data cannot be represented just in the form of rows and columns in tables. NoSQL (Not only SQL) is complementary to SQL technology and can provide various storage formats that are compatible with high-velocity, large-volume, and varied data. NoSQL databases fall into four categories: column-oriented, key-value, graph, and document-oriented databases. There are approximately 120 real solutions in these categories; the most commonly used solutions are elaborated in the Introduction section. Several research works have been carried out to analyze these NoSQL technology solutions. These studies have not mentioned the situations in which a particular data storage technique should be chosen. In this study and analysis, we have tried our best to give the reader an answer on technology selection based on specific requirements. In previous research, comparisons among NoSQL data storage techniques have been described using real examples such as MongoDB and Neo4j. Our observation is that if users have adequate knowledge of the NoSQL categories and how they compare, it is easy for them to choose the most suitable category, and real solutions can then be selected from that category.

An adaptive spark-based framework for querying large-scale NoSQL and relational databases

PLoS ONE ◽

10.1371/journal.pone.0255562 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0255562

Author(s):

Eman Khashan ◽

Ali Eldesouky ◽

Sally Elghamrawy

Keyword(s):

Big Data ◽

Data Storage ◽

Relational Databases ◽

Large Scale ◽

Query Languages ◽

Heterogeneous Data ◽

Query Execution ◽

Database Queries ◽

NoSQL Databases ◽

Complex Queries

The growing popularity of big data analysis and cloud computing has created new big data management standards. Programmers may have to interact with a number of heterogeneous data stores, both SQL and NoSQL, depending on the information they are responsible for. Interacting with heterogeneous data models via numerous APIs and query languages imposes challenging tasks on multi-data processing developers. Indeed, complex queries concerning homogenous data structures cannot currently be performed in a declarative manner when found in single data storage applications and therefore require additional development effort. Many models have been presented to address complex queries via multistore applications. Some of these models implemented a complex, unified, and fast model, while others are not efficient enough to solve this type of complex database query. This paper provides an automated, fast, and easy unified architecture to solve simple and complex SQL and NoSQL queries over heterogeneous data stores (CQNS). The proposed framework can be used in cloud environments or for any big data application to automatically help developers manage basic and complicated database queries. CQNS consists of three layers: a matching selector layer, a processing layer, and a query execution layer. The matching selector layer is the heart of this architecture, in which five of the user queries are examined to see whether they match another five queries stored in a single engine in the architecture library. This is achieved through a proposed algorithm that directs the query to the right SQL or NoSQL database engine. Furthermore, CQNS deals with many NoSQL databases, such as MongoDB, Cassandra, Riak, CouchDB, and Neo4j. This paper presents a Spark framework that can handle both SQL and NoSQL databases. Benchmark datasets from four scenarios are used to evaluate the proposed CQNS for querying different NoSQL databases in terms of optimization process performance and query execution time. The results show that CQNS achieves the best latency and throughput in less time among the compared systems.

An Empirical Study of NoSQL Databases for Big Data

Effective Big Data Management and Opportunities for Implementation - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-0182-4.ch004 ◽

2016 ◽

pp. 60-76

Author(s):

Wen-Chen Hu ◽

Naima Kaabouch ◽

Hongyu Guo ◽

Hung-Jen Yang

Keyword(s):

Cloud Computing ◽

Big Data ◽

Empirical Study ◽

Theoretical Approach ◽

Relational Databases ◽

High Volume ◽

Inventory Systems ◽

NoSQL Databases ◽

NoSQL Database ◽

The One

Relational databases have dominated the database market for decades because they perform extremely well for traditional applications like electronic commerce and inventory systems. However, relational databases do not suit some contemporary applications, such as big data and cloud computing, for reasons that include their low scalability and their inability to handle a high volume of data. NoSQL (not only SQL) databases are part of the solution for developing those newer applications. The approach they use is different from the one used by relational databases. This chapter discusses NoSQL databases using an empirical rather than a theoretical approach. Besides introducing the types and features of generic NoSQL databases, it shows practical NoSQL database programming and usage with MongoDB, a NoSQL database. A summary of this research is given at the end of the chapter.
