The NIH BD2K center for big data in translational genomics

2015 ◽  
Vol 22 (6) ◽  
pp. 1143-1147 ◽  
Author(s):  
Benedict Paten ◽  
Mark Diekhans ◽  
Brian J Druker ◽  
Stephen Friend ◽  
Justin Guinney ◽  
...  

Abstract The world’s genomics data will never be stored in a single repository – rather, it will be distributed among many sites in many countries. No one site will have enough data to explain genotype to phenotype relationships in rare diseases; therefore, sites must share data. To accomplish this, the genetics community must forge common standards and protocols to make sharing and computing data among many sites a seamless activity. Through the Global Alliance for Genomics and Health, we are pioneering the development of shared application programming interfaces (APIs) to connect the world’s genome repositories. In parallel, we are developing an open source software stack (ADAM) that uses these APIs. This combination will create a cohesive genome informatics ecosystem. Using containers, we are facilitating the deployment of this software in a diverse array of environments. Through benchmarking efforts and big data driver projects, we are ensuring ADAM’s performance and utility.

2011 ◽  
Vol 2011 ◽  
pp. 1-21 ◽  
Author(s):  
Pavel Segec ◽  
Tatiana Kovacikova

The Session Initiation Protocol (SIP) is a multimedia signalling protocol that has evolved into a widely adopted communication standard. The integration of SIP into existing IP networks has fostered IP networks becoming a convergence platform for both real-time and non-real-time multimedia communications. This converged platform integrates data, voice, video, presence, messaging, and conference services into a single network that offers new communication experiences for users. The open source community has contributed to SIP adoption through the development of open source software for both SIP clients and servers. In this paper, we provide a survey on open SIP systems that can be built using publicly available software. We identify SIP features for service development and programming, services and applications of a SIP-converged platform, and the most important technologies supporting SIP functionalities. We propose an advanced converged IP communication platform that uses SIP for service delivery. The platform supports audio and video calls, along with media services such as audio conferences, voicemail, presence, and instant messaging. Using SIP Application Programming Interfaces (APIs), the platform allows the deployment of advanced integrated services. The platform is implemented with open source software. Architecture components run on standardized hardware with no need for special purpose investments.


2021 ◽  
Author(s):  
Fabian Kovacs ◽  
Max Thonagel ◽  
Marion Ludwig ◽  
Alexander Albrecht ◽  
Manuel Hegner ◽  
...  

BACKGROUND Big data in healthcare must be exploited to achieve a substantial increase in efficiency and competitiveness. In particular, the analysis of patient-related data has huge potential to improve decision-making processes. However, most analytical approaches used today are highly time- and resource-consuming. OBJECTIVE The presented software solution Conquery is an open-source software tool providing advanced but intuitive data analysis without the need for specialized statistical training. Conquery aims to simplify big data analysis for novice database users in the medical sector. METHODS Conquery is a document-oriented distributed time-series database and analysis platform. Its main application is the analysis of per-person medical records by non-technical medical professionals. Complex analyses are realized in the Conquery frontend by dragging tree nodes into the query editor. Queries are evaluated by a bespoke distributed query engine for medical records in a column-oriented fashion. We present a custom compression scheme that uses metadata and data statistics, both calculated online and precomputed, to facilitate low response times. RESULTS Conquery allows for easy navigation through the hierarchy and enables complex study cohort construction whilst reducing the demand on time and resources. The UI of Conquery and a query output are exemplified by the construction of a relevant clinical cohort. CONCLUSIONS Conquery is an efficient and intuitive open-source software tool for performant and secure data analysis and aims at supporting decision-making processes in the healthcare sector.
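The role of precomputed statistics in column-oriented query evaluation can be illustrated with a minimal sketch. This is not Conquery's engine or API: the `Block` class, its min/max statistics, and the `count_in_range` function are hypothetical names chosen for illustration, and the example only assumes that per-block statistics let a range filter skip blocks that cannot contain a match.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """One non-empty chunk of a single column (hypothetical layout)."""
    values: list[int]
    lo: int = field(init=False)  # precomputed minimum of the block
    hi: int = field(init=False)  # precomputed maximum of the block

    def __post_init__(self) -> None:
        self.lo = min(self.values)
        self.hi = max(self.values)


def count_in_range(blocks: list[Block], start: int, end: int) -> tuple[int, int]:
    """Count values in [start, end]; return (matches, blocks_scanned)."""
    matches = 0
    scanned = 0
    for block in blocks:
        # The statistics show this block cannot contain a match, so skip it.
        if block.hi < start or block.lo > end:
            continue
        scanned += 1
        matches += sum(1 for v in block.values if start <= v <= end)
    return matches, scanned


# Illustrative data: a column of event years split into three blocks.
blocks = [Block([2001, 2003, 2004]), Block([2010, 2012]), Block([2018, 2019, 2021])]
result = count_in_range(blocks, 2010, 2018)
```

In this toy data set the first block is skipped entirely because its maximum lies below the queried range, so only two of the three blocks are scanned.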


2019 ◽  
Author(s):  
Raphael Scheible ◽  
Dennis Kadioglu ◽  
Stephan Ehl ◽  
Marco Blum ◽  
Martin Boeker ◽  
...  

BACKGROUND The German Network on Primary Immunodeficiency Diseases (PID-NET) utilizes the European Society for Immunodeficiencies (ESID) registry as a platform for collecting data. In the context of PID-NET data, we show how registries based on custom software can be made interoperable for better collaborative access to precollected data. The Open Source Registry System for Rare Diseases (Open-Source-Registersystem für Seltene Erkrankungen [OSSE], in German) provides patient organizations, physicians, scientists, and other parties with open source software for the creation of patient registries. In addition, the necessary interoperability between different registries based on the OSSE, as well as existing registries, is supported, which allows those registries to be confederated at both the national and international levels. OBJECTIVE Data from the PID-NET registry should be made available in an interoperable manner without losing data sovereignty by extending the existing custom software of the registry using the OSSE registry framework. METHODS This paper describes the following: (1) the installation and configuration of the OSSE bridgehead, (2) an approach using a free toolchain to set up the required interfaces to connect a registry with the OSSE bridgehead, and (3) the decentralized search, which allows the formulation of inquiries that are sent to a selected set of registries of interest. RESULTS PID-NET uses the established and highly customized ESID registry software. By setting up a so-called OSSE bridgehead, PID-NET data are made interoperable according to a federated approach, and centrally formulated inquiries for data can be received. Because PID-NET is the first registry to use the OSSE bridgehead, the authors introduce an approach using a free toolchain to efficiently implement and maintain the required interfaces. Finally, to test and demonstrate the system, two inquiries are realized using the graphical query builder. By establishing and interconnecting an OSSE bridgehead with the underlying ESID registry, confederated queries for data can be received and, if desired, the inquirer can be contacted to further discuss any requirements for cooperation. CONCLUSIONS The OSSE offers an infrastructure that provides the possibility of more collaborative and transparent research. The decentralized search functionality integrates registries into one search application while still maintaining data sovereignty. The OSSE bridgehead enables any registry software to be integrated into the OSSE network. The proposed toolchain to set up the required interfaces consists of freely available software components that are well documented. The decentralized search is uncomplicated to use and offers a well-structured, yet still improvable, graphical user interface to formulate queries.
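The idea of a decentralized search that preserves data sovereignty can be sketched as follows. This is a minimal illustration, not the OSSE bridgehead interface: the `Registry` class, the `decentralized_search` function, and the choice to return only match counts are all assumptions made for the example. The point it shows is that an inquiry is formulated once, sent to a selected set of registries, and evaluated locally, so that records never leave their registry.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Record = dict[str, object]
Inquiry = Callable[[Record], bool]


@dataclass
class Registry:
    """A registry that keeps its records local (hypothetical model)."""
    name: str
    records: list[Record]
    accepts_inquiries: bool = True  # the registry owner stays in control

    def answer(self, inquiry: Inquiry) -> Optional[int]:
        """Evaluate the inquiry locally; return a match count, or None if declined."""
        if not self.accepts_inquiries:
            return None
        return sum(1 for record in self.records if inquiry(record))


def decentralized_search(
    registries: list[Registry], selected: set[str], inquiry: Inquiry
) -> dict[str, int]:
    """Send one inquiry to the selected registries and collect their counts."""
    results: dict[str, int] = {}
    for registry in registries:
        if registry.name not in selected:
            continue
        count = registry.answer(inquiry)
        if count is not None:  # registries that decline are simply left out
            results[registry.name] = count
    return results


# Illustrative data only; names and fields are invented for the example.
registries = [
    Registry("A", [{"diagnosis": "CVID", "age": 34}, {"diagnosis": "SCID", "age": 2}]),
    Registry("B", [{"diagnosis": "CVID", "age": 51}, {"diagnosis": "CVID", "age": 17}]),
    Registry("C", [{"diagnosis": "CVID", "age": 40}], accepts_inquiries=False),
]


def adult_cvid(record: Record) -> bool:
    """Example inquiry: adult patients with a CVID diagnosis."""
    return record["diagnosis"] == "CVID" and int(record["age"]) >= 18


hits = decentralized_search(registries, {"A", "B", "C"}, adult_cvid)
```

Returning only counts is one possible design; a real deployment could instead let the registry owner review each inquiry and contact the inquirer, as the abstract describes.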


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Lingbin Zeng ◽  
Xin Guo ◽  
Cheng Yang ◽  
Yao Lu ◽  
Xiao Li

With the vigorous development of open-source software, a huge number of open-source projects and a large body of open-source code have accumulated as open-source big data, which contains a wealth of code resources. However, effectively and efficiently retrieving the relevant code snippets from such a large amount of open-source big data is an extremely difficult problem. There are usually large gaps between the user’s natural language description and the open-source code snippets. In this paper, we propose a novel code tag generation and code retrieval approach named TagNN, which combines software engineering empirical knowledge and a deep learning algorithm. The experimental results show that our method performs well on code tag generation and code snippet retrieval.


Author(s):  
Andrew McCullum

In 2015, Central Asia made some vital improvements in the environment for cross-border e-commerce: Kazakhstan's accession to the World Trade Organization (WTO) will help business transparency, while the Kyrgyz Republic's membership in the Eurasian Customs Union expands its consumer base. Why e-commerce? There are two reasons. First, e-commerce reduces the cost of distance. Central Asia is the highest trade cost region in the world: vast distances from major markets make finding buyers challenging, shipping goods slow, and export costs high. Second, e-commerce can draw in populations that are traditionally under-represented in export markets, such as women, small businesses and rural entrepreneurs.


This chapter deals with an ambitious Management Information System goal: the creation of open source supply chains. It starts with some basics and background for open (source) supply chains, discusses relevant architectures and modelling work, proceeds to an analysis of real-world business cases and the related application scenarios, and presents an open source reference model. In current e-commerce frameworks, the issue of dynamic supply chain establishment and supply chain life cycle management is still misrepresented and not addressed adequately. Registration, advertisement, and change management for complex products and services rely heavily on proprietary application programming interfaces and protocols as well as emerging and partially competing (pseudo)standards.


Author(s):  
Richard S. Segall

This chapter discusses what open source software is, its relationship to Big Data, and how it and its software development cycle differ from those of other types of software. Open source software (OSS) is a type of computer software whose source code is released under a license in which the copyright holder grants users the rights to study, change, and distribute the software to anyone and for any purpose. Big Data are data sets so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big Data can be discrete or a continuous data stream and are accessible using many types of computing devices, ranging from supercomputers and personal workstations to mobile devices and tablets. The chapter discusses how fog computing can be performed with cloud computing for visualization of Big Data. It also presents a summary of additional web-based Big Data visualization software.

