Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

2015 ◽  
Vol 2015 ◽  
pp. 1-7 ◽  
Author(s):  
Rodrigo Aniceto ◽  
Rene Xavier ◽  
Valeria Guimarães ◽  
Fernanda Hondo ◽  
Maristela Holanda ◽  
...  

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them is the management of the massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly the storage and analysis of these large-scale processed data. Finding an alternative to the frequently considered relational database model has become a compelling task. Other data models may be more effective when dealing with very large amounts of nonconventional data, especially for write and retrieval operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.

2012 ◽  
Vol 29 (2) ◽  
pp. 290-291 ◽  
Author(s):  
Nozomu Sakurai ◽  
Takeshi Ara ◽  
Shigehiko Kanaya ◽  
Yukiko Nakamura ◽  
Yoko Iijima ◽  
...  

2019 ◽  
Vol 7 (7) ◽  
pp. 351-359
Author(s):  
Yashraj Sharma ◽  
Yashasvi Sharma

Relational models are useful in terms of reliability, but less so for systems that involve huge amounts of data; in such cases, non-relational models are much more useful. NoSQL databases are used to store large volumes of data. They are scalable and wide-ranging because they are non-relational and distributed. Relational databases could not manage the data involved in a very large number of Big Data applications; hence the concept of the NoSQL database was introduced. NoSQL offers many advantages, which involve not only its own features but also some features of relational database management systems. A major benefit of the NoSQL database is that it is an open-source system, which helps in adapting many features for newly developed applications. This paper focuses on comparing the concepts of non-relational database system architecture with those of relational database system architecture and on identifying the advantages and disadvantages of both.


1996 ◽  
Vol 8 (3) ◽  
pp. 160-168 ◽  
Author(s):  
Janet Burt ◽  
Tom Beaumont James

This article discusses the different approaches to the treatment of historical databases: the relational database system and κλειω, a source-oriented approach.


2020 ◽  
Vol 63 (8) ◽  
pp. 93-101
Author(s):  
Shangyu Luo ◽  
Zekai J. Gao ◽  
Michael Gubanov ◽  
Luis L. Perez ◽  
Dimitrije Jankov ◽  
...  

Viruses ◽  
2019 ◽  
Vol 11 (9) ◽  
pp. 806
Author(s):  
Shambhu G. Aralaguppe ◽  
Anoop T. Ambikan ◽  
Manickam Ashokkumar ◽  
Milner M. Kumar ◽  
Luke Elizabeth Hanna ◽  
...  

The detection of drug resistance mutations (DRMs) in minor viral populations is of potential clinical importance. However, sophisticated computational infrastructure and competence for the analysis of high-throughput sequencing (HTS) data are lacking at most diagnostic laboratories. Thus, we have proposed a new pipeline, MiDRMpol, to quantify DRMs from the HIV-1 pol region. The gag-vpu region of 87 plasma samples from HIV-infected individuals from three cohorts was amplified and sequenced by Illumina HiSeq2500. The sequence reads were adapter-trimmed, followed by analysis using in-house scripts. Samples from Swedish and Ethiopian cohorts were also sequenced by Sanger sequencing. The pipeline was validated against the online tool PASeq (Polymorphism Analysis by Sequencing). Based on an error rate of <1%, a value of >1% was set as reliable to consider a minor variant. Both pipelines detected the mutations in the dominant viral populations, while discrepancies were observed in minor viral populations. In five HIV-1 subtype C samples, minor mutations were detected at the <5% level by MiDRMpol but not by PASeq. MiDRMpol is a computationally efficient and labor-efficient bioinformatics pipeline for the detection of DRMs from HTS data. It identifies minor viral populations (<20%) of DRMs. Our method can be incorporated into large-scale surveillance of HIV-1 DRMs.


1991 ◽  
Vol 20 (3) ◽  
pp. 62-72 ◽  
Author(s):  
Tina M. Harvey ◽  
Craig W. Schnepf ◽  
Mark A. Roth

1987 ◽  
pp. 539-546
Author(s):  
Toshihisa Takagi ◽  
Fumihiro Matsuo ◽  
Shooichi Futamura ◽  
Kazuo Ushijima

2001 ◽  
Vol 6 (1-2) ◽  
pp. 100-109
Author(s):  
Sun Yong-qiang ◽  
Xu Shu-ting ◽  
Zhu Feng-hua ◽  
Lai Shu-hua
