CoolBox: an interactive genomic data explorer for Jupyter Notebook

2019 ◽  
Author(s):  
Weize Xu ◽  
Da Lin ◽  
Ping Hong ◽  
Liang Yi ◽  
Rohit Tyagi ◽  
...  

Abstract. Summary: CoolBox is a Python package for interactive genomic data exploration based on Jupyter notebook. It provides a ggplot2-like Application Programming Interface (API) for genomic data visualization, and a Jupyter/ipywidgets based Graphical User Interface (GUI) for interactive data exploration. CoolBox is a versatile multi-omics explorer supporting most types of data formats generated by various sequencing technologies like RNA-Seq, ChIP-Seq, ChIA-PET and Hi-C. Availability and implementation: CoolBox is purely implemented with Python, and the GUI widget in Jupyter notebook is based on the ipywidgets package. It is open-source and available under GPLv3 license at https://github.com/GangCaoLab/CoolBox.
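A "ggplot2-like" interface generally means that a figure is composed by adding components together with the `+` operator. The following is a minimal sketch of that composition pattern in plain Python. The `Track` and `Frame` classes, the file names, and the region string are hypothetical illustrations and are not CoolBox's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    """A hypothetical named data track (not a CoolBox class)."""
    name: str

    def __add__(self, other):
        # Track + Track starts a new frame that holds both tracks.
        return Frame([self]) + other


@dataclass
class Frame:
    """A hypothetical ordered stack of tracks."""
    tracks: List[Track] = field(default_factory=list)

    def __add__(self, other):
        # Adding returns a new Frame and leaves both operands unchanged.
        if isinstance(other, Track):
            return Frame(self.tracks + [other])
        if isinstance(other, Frame):
            return Frame(self.tracks + other.tracks)
        return NotImplemented

    def describe(self, region: str) -> str:
        # Stand-in for rendering: list the tracks that would be drawn.
        names = ", ".join(t.name for t in self.tracks)
        return f"{region}: {names}"


# Compose three illustrative tracks in display order.
frame = Track("hic.cool") + Track("chipseq.bw") + Track("genes.gtf")
summary = frame.describe("chr1:1000000-2000000")
print(summary)
```

Returning a new `Frame` from each addition keeps partial compositions reusable, which is one reason this style suits interactive notebook work.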

2020 ◽  
Author(s):  
Nowlan H Freese ◽  
Karthik Raveendran ◽  
Chaitanya Kintali ◽  
Srishti Tiwari ◽  
Pawan Bole ◽  
...  

Abstract. Background: Visualization of genomic data is a key step in validating methods and results. Web-based science gateways such as CyVerse provide storage and analysis tools for genomic data but often lack visualization capability. Desktop visualization tools like Integrated Genome Browser (IGB) enable highly interactive data visualization but are difficult to deploy in science gateways. Developing ways for gateways to interoperate with pre-existing external tools like IGB would enhance their value to users. Results: We developed BioViz Connect, a new web application that connects CyVerse and IGB using the CyVerse Terrain Application Programming Interface (API). Using BioViz Connect, users can (i) stream their CyVerse data to IGB for visualization, (ii) add IGB specific metadata such as genome version and track appearance to CyVerse data, and (iii) run compute-intensive visual analytics functions to create new visualizations for IGB. To demonstrate BioViz Connect, we present an example visual analysis of RNA-Seq data from Arabidopsis thaliana plants undergoing heat and desiccation stresses. The example shows how researchers can seamlessly analyze and visualize their CyVerse data in IGB. BioViz Connect is accessible from https://bioviz.org. Conclusions: BioViz Connect demonstrates a new way to integrate science gateways with desktop applications using APIs.


2014 ◽  
Author(s):  
Simon Anders ◽  
Paul Theodor Pyl ◽  
Wolfgang Huber

Motivation: A large choice of tools exists for many standard tasks in the analysis of high-throughput sequencing (HTS) data. However, once a project deviates from standard work flows, custom scripts are needed. Results: We present HTSeq, a Python library to facilitate the rapid development of such scripts. HTSeq offers parsers for many common data formats in HTS projects, as well as classes to represent data such as genomic coordinates, sequences, sequencing reads, alignments, gene model information, variant calls, and provides data structures that allow for querying via genomic coordinates. We also present htseq-count, a tool developed with HTSeq that preprocesses RNA-Seq data for differential expression analysis by counting the overlap of reads with genes. Availability: HTSeq is released as open-source software under the GNU General Public Licence and available from http://www-huber.embl.de/HTSeq or from the Python Package Index, https://pypi.python.org/pypi/HTSeq
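The idea of counting the overlap of reads with genes can be illustrated with a minimal sketch in plain Python. It does not use the HTSeq API. The `Interval` type, the `count_reads` helper, the bucket labels, and the toy coordinates are assumptions made for illustration, and a read that overlaps more than one gene is simply set aside as ambiguous.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_reads(genes: Dict[str, Interval], reads: List[Interval]) -> Counter:
    """Assign each read to the single gene it overlaps, if there is exactly one."""
    counts = Counter({name: 0 for name in genes})
    for read in reads:
        hits = [name for name, gene in genes.items() if overlaps(read, gene)]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["__no_feature"] += 1
        else:
            counts["__ambiguous"] += 1
    return counts


genes = {
    "geneA": Interval("chr1", 100, 200),
    "geneB": Interval("chr1", 180, 300),
}
reads = [
    Interval("chr1", 90, 120),   # overlaps geneA only
    Interval("chr1", 150, 185),  # overlaps geneA and geneB
    Interval("chr1", 250, 290),  # overlaps geneB only
    Interval("chr2", 100, 150),  # no gene on chr2
]
counts = count_reads(genes, reads)
print(dict(counts))
```

The linear scan over all genes for every read is only suitable for toy input. Querying by genomic coordinate at scale needs an indexed data structure, which is the kind of facility the abstract attributes to the library.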


2019 ◽  
Vol 39 (06) ◽  
pp. 280-289 ◽  
Author(s):  
Raj Kumar Bhardwaj

The study aims to trace the development of Indian research data repositories (RDRs) and explore their content with a view to identifying prospects and possibilities. Further, it analyses the distribution of data repositories on the basis of content coverage, types of content, author identification system followed, software and application programming interface used, subject-wise number of repositories, etc. The study is based on data repositories listed on the registry of data repositories accessible at http://www.re3data.org. The dataset was exported in Microsoft Excel format for analysis. A simple percentage method was followed in the data analysis, and results are presented through tables and figures. The study found a total of 2829 data repositories in existence worldwide, covering 72 countries. Of these, 1526 (53.9 %) are open and 924 (32.4 %) are restricted data repositories. There are also 225 (8.0 %) embargoed data repositories and 154 (5.4 %) closed ones. The study found that out of a total of 45 Indian RDRs, only 30 (67 %) are open, followed by 12 (27 %) that are restricted and 3 (6 %) that are closed. The majority of Indian RDRs (20) were developed in the year 2014. The study found that the majority of Indian RDRs (17) are ‘disciplinary’. Further, the study revealed that statistical data formats are available in a maximum of 31 (68.9 %) Indian RDRs. It was also seen that the majority of Indian RDRs (28) have datasets relating to ‘Life Sciences’. Only 20% of data repositories use metadata standards; the remaining 80% do not use any standard for metadata entry. This study covered only the research data repositories in India registered on the registry of data repositories. RDRs not listed in the registry are left out.


2021 ◽  
Vol 2069 (1) ◽  
pp. 012135
Author(s):  
N D Svane ◽  
A Pranskunas ◽  
L B Lindgren ◽  
R L Jensen

Abstract. The architecture, engineering, and construction (AEC) industry experiences a growing need for building performance simulations (BPS) as facilitators in the design process. However, inconsistent modelling practice and the varying quality of export/import functions make interoperability with the IFC and gbXML data formats error-prone. Consequently, repeated manual modelling is still necessary. This paper presents a coupling module that enables semi-automated extraction of geometry data from the BIM software Revit and its translation to a BPS input file, using the Revit Application Programming Interface (API) and visual programming in Dynamo. The module is tested with three test cases, which show promising results for fast and structured semi-automatic geometry modelling designed to fit today’s practice.


2020 ◽  
Vol 10 (18) ◽  
pp. 6367
Author(s):  
Eleonora Cappelli ◽  
Fabio Cumbo ◽  
Anna Bernasconi ◽  
Arif Canakoglu ◽  
Stefano Ceri ◽  
...  

Next Generation Sequencing technologies have produced a substantial increase in publicly available genomic data and related clinical/biospecimen information. New models and methods to easily access, integrate and search them effectively are needed. An effort was made by the Genomic Data Commons (GDC), which defined strict procedures for harmonizing genomic and clinical data of cancer, and created the GDC data portal with its application programming interface (API). In this work, we enhance GDC harmonization by applying a state-of-the-art data model (called Genomic Data Model) made of two components: the genomic data, in Browser Extensible Data (BED) format, and the related metadata, in a tab-delimited key-value format. Furthermore, we extend the GDC genomic data with information extracted from other public genomic databases (e.g., GENCODE, HGNC and miRBase). For metadata, we implemented automatic procedures to extract and normalize them, recognizing and eliminating redundant ones, from both the Clinical/Biospecimen Supplements and the GDC Data Model, which are present on the two sources of GDC (i.e., data portal and API). We developed and released the OpenGDC software, which is able to extract, integrate, extend, and standardize genomic and clinical data of The Cancer Genome Atlas (TCGA) from the GDC. Additionally, we created a publicly accessible repository containing such homogenized and enhanced TCGA data (resulting in about 1.3 TB). Our approach, implemented in the OpenGDC software, provides a step forward to the effective and efficient management of big genomic and clinical data of cancer. The strong usability of our data model and the utility of our work are demonstrated through the application of the GenoMetric Query Language (GMQL) on the transformed TCGA data from the GDC, achieving promising results and facilitating information retrieval and knowledge discovery analyses.
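As a rough illustration of the two-component layout described above, the sketch below parses one region file in BED format and one tab-delimited key-value metadata file. The sample contents, attribute names, and helper functions are hypothetical and are not taken from OpenGDC or its repository.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # BED start coordinates are 0-based
    end: int    # BED end coordinates are exclusive
    name: str   # optional fourth BED column, empty if absent


def parse_bed(text: str) -> List[Region]:
    """Parse the first three or four columns of tab-separated BED lines."""
    regions = []
    for line in text.splitlines():
        # Skip blank lines and the header/comment lines BED files may carry.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else ""
        regions.append(Region(fields[0], int(fields[1]), int(fields[2]), name))
    return regions


def parse_metadata(text: str) -> Dict[str, List[str]]:
    """Parse 'key<TAB>value' lines; a key may repeat, so values are lists."""
    meta: Dict[str, List[str]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("\t")
        meta.setdefault(key, []).append(value)
    return meta


# Hypothetical sample content for one sample's region and metadata files.
bed_text = "chr1\t1000\t5000\tfeatureA\nchr2\t200\t800\n"
meta_text = "sample_id\tS-0001\ndisease\texample carcinoma\ndisease\tsecond label\n"

regions = parse_bed(bed_text)
metadata = parse_metadata(meta_text)
print(regions[0], metadata["disease"])
```

Keeping regions and metadata in separate files means the metadata can be searched without reading the much larger region data.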


2020 ◽  
Vol 36 (8) ◽  
pp. 2592-2594 ◽  
Author(s):  
Deren A R Eaton ◽  
Isaac Overcast

Abstract. Summary: ipyrad is a free and open source tool for assembling and analyzing restriction site-associated DNA sequence datasets using de novo and/or reference-based approaches. It is designed to be massively scalable to hundreds of taxa and thousands of samples, and can be efficiently parallelized on high performance computing clusters. It is available both as a command line interface and as a Python package with an application programming interface, the latter of which can be used interactively to write complex, reproducible scripts and implement a suite of downstream analysis tools. Availability and implementation: ipyrad is a free and open source program written in Python. Source code is available from the GitHub repository (https://github.com/dereneaton/ipyrad/), and Linux and MacOS installs are distributed through the conda package manager. Complete documentation, including numerous tutorials, and Jupyter notebooks demonstrating example assemblies and applications of downstream analysis tools are available online: https://ipyrad.readthedocs.io/.


2014 ◽  
Vol 7 (6) ◽  
pp. 3135-3151 ◽  
Author(s):  
M. Bavay ◽  
T. Egger

Abstract. Using numerical models which require large meteorological data sets is sometimes difficult and problems can often be traced back to the Input/Output functionality. Complex models are usually developed by the environmental sciences community with a focus on the core modelling issues. As a consequence, the I/O routines that are costly to properly implement are often error-prone, lacking flexibility and robustness. With the increasing use of such models in operational applications, this situation ceases to be simply uncomfortable and becomes a major issue. The MeteoIO library has been designed for the specific needs of numerical models that require meteorological data. The whole task of data preprocessing has been delegated to this library, namely retrieving, filtering and resampling the data if necessary as well as providing spatial interpolations and parameterizations. The focus has been to design an Application Programming Interface (API) that (i) provides a uniform interface to meteorological data in the models, (ii) hides the complexity of the processing taking place, and (iii) guarantees a robust behaviour in the case of format errors, erroneous or missing data. Moreover, in an operational context, this error handling should avoid unnecessary interruptions in the simulation process. A strong emphasis has been put on simplicity and modularity in order to make it extremely easy to support new data formats or protocols and to allow contributors with diverse backgrounds to participate. This library is also regularly evaluated for computing performance and further optimized where necessary. Finally, it is released under an Open Source license and is available at http://models.slf.ch/p/meteoio. This paper gives an overview of the MeteoIO library from the point of view of conceptual design, architecture, features and computational performance. 
A scientific evaluation of the produced results is not given here since the scientific algorithms that are used have already been published elsewhere.
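Two of the preprocessing steps named in this abstract, filtering and resampling, can be sketched in a few lines. The sketch is plain Python and does not use MeteoIO's API. The `range_filter` and `resample_linear` helpers, the plausibility bounds, and the sample air-temperature values are assumptions made for illustration, and the input is assumed to be sorted by time.

```python
from typing import List, Optional, Tuple

# (time in seconds, value or None when the measurement is missing)
Sample = Tuple[float, Optional[float]]


def range_filter(samples: List[Sample], lo: float, hi: float) -> List[Sample]:
    """Mark values outside [lo, hi] as missing instead of raising an error."""
    return [
        (t, v if v is not None and lo <= v <= hi else None)
        for t, v in samples
    ]


def resample_linear(samples: List[Sample], t: float) -> Optional[float]:
    """Interpolate linearly at time t between the nearest valid neighbours.

    Returns None when t cannot be bracketed by valid data, so the caller
    can decide how to proceed without the run being interrupted.
    """
    valid = [(ts, v) for ts, v in samples if v is not None]
    before = [p for p in valid if p[0] <= t]
    after = [p for p in valid if p[0] >= t]
    if not before or not after:
        return None
    (t0, v0), (t1, v1) = before[-1], after[0]
    if t1 == t0:
        return v0
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Hypothetical air temperatures in degrees Celsius; 999.0 is a sensor glitch.
raw = [(0.0, 1.0), (600.0, 999.0), (1200.0, 3.0), (1800.0, None)]
clean = range_filter(raw, lo=-60.0, hi=60.0)
value_at_600 = resample_linear(clean, 600.0)
value_at_1800 = resample_linear(clean, 1800.0)
print(value_at_600, value_at_1800)
```

Returning `None` for unrecoverable points, rather than raising, mirrors the robustness goal stated in the abstract: erroneous or missing data should not stop a simulation.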


2018 ◽  
Vol 9 (1) ◽  
pp. 24-31
Author(s):  
Rudianto Rudianto ◽  
Eko Budi Setiawan

The availability of Application Programming Interfaces (APIs) for third-party applications on Android devices provides an opportunity for Android devices to monitor one another. This is used to create an application that helps parents supervise their children through their Android devices. In this study, features are added for classifying image content on Android devices as negative content, using the Clarifai API. The result of this research is a system that reports the image files found on the target smartphone and can delete them, receives browser history reports whose entries can be visited directly from the application, and receives reports of the child's location, with the child contactable directly via the application. The application works well on Android Lollipop (API Level 22). Index Terms: Application Programming Interface (API), Monitoring, Negative Content, Children, Parent.

