chromoMap: An R package for Interactive Visualization and Annotation of Chromosomes

Mapping Intimacies ◽

10.1101/605600 ◽

2019 ◽

Cited By ~ 10

Author(s):

Lakshay Anand ◽

Carlos M. Rodriguez Lopez

Keyword(s):

Open Source ◽

Interactive Visualization ◽

Living Organism ◽

R Package ◽

Supplementary Information ◽

Supplementary Data ◽

Comprehensive Understanding ◽

Single Function ◽

Interactive Visualizations ◽

Supplementary Material

AbstractSummarychromoMap is an R package for constructing interactive visualizations of chromosomes/chromosomal regions, and mapping of chromosomal elements (like genes) onto them, of any living organism. The package takes separate tab-delimited files (BED like) to specify the genomic co-ordinates of the chromosomes and the elements to annotate. Each rendered chromosome is composed of continuous loci of specific ranges where each locus, on hover, displays detailed information about the elements annotated within that locus range. By just tweaking parameters of a single function, users can generate a variety of plots that can either be saved as static image or shared as HTML documents. Users can utilize the various prominent features of chromoMap including, but not limited to, visualizing polyploidy, creating chromosome heatmaps, mapping groups of elements, adding hyperlinks to elements, multi-species chromosome visualization.Availability and implementationThe R package chromoMap is available under the GPL-3 Open Source license. It is included with a vignette for comprehensive understanding of its various features, and is freely available from: https://CRAN.R-project.org/[email protected] informationSupplementary data are available online.

Download Full-text

mmgenome: a toolbox for reproducible genome extraction from metagenomes

10.1101/059121 ◽

2016 ◽

Cited By ~ 42

Author(s):

Søeren M. Karst ◽

Rasmus H. Kirkegaard ◽

Mads Albertsen

Keyword(s):

Optimal Strategy ◽

R Package ◽

Supplementary Information ◽

Data Generation ◽

Supplementary Data ◽

High Quality ◽

Standard Analysis ◽

Specific Population ◽

The Core ◽

Supplementary Material

ABSTRACTSummaryRecovery of population genomes is becoming a standard analysis in metagenomics and a multitude of different approaches exists. However, the workflows are complex, requiring data generation, binning, validation and finishing to generate high quality population genome bins. In addition, several different approaches are often used on the same dataset as the optimal strategy to extract a specific population genome varies. Here we introduce mmgenome: a toolbox for reproducible genome extraction from metagenomes. At the core of mmgenome is an R package that facilitates effortless integration of different binning strategies by collecting information on scaffolds. Genome binning is facilitated through integrated tools that support effortless visualizations, validation and calculation of key statistics. Full reproducibility and transparency is obtained through Rmarkdown, whereby every step can be recreated.Availability and implementationThe binning framework of mmge-nome is implemented in R. Wrapper scripts for data generation and finishing is written in Perl. The mmgenome toolbox and associated step-by-step guides are available at http://madsal-bertsen.github.io/mmgenome/[email protected] informationSupplementary data are available at Bioinformatics online.

Download Full-text

ccNetViz: a WebGL-based JavaScript library for visualization of large networks

Bioinformatics ◽

10.1093/bioinformatics/btaa559 ◽

2020 ◽

Vol 36 (16) ◽

pp. 4527-4529

Author(s):

Ales Saska ◽

David Tichy ◽

Robert Moore ◽

Achilles Rasquinha ◽

Caner Akdas ◽

...

Keyword(s):

Systems Biology ◽

Complex Networks ◽

Open Source ◽

High Speed ◽

A Priori ◽

Supplementary Information ◽

Network Visualization ◽

Supplementary Data ◽

Web Based ◽

Flow Of Information

Abstract Summary Visualizing a network provides a concise and practical understanding of the information it represents. Open-source web-based libraries help accelerate the creation of biologically based networks and their use. ccNetViz is an open-source, high speed and lightweight JavaScript library for visualization of large and complex networks. It implements customization and analytical features for easy network interpretation. These features include edge and node animations, which illustrate the flow of information through a network as well as node statistics. Properties can be defined a priori or dynamically imported from models and simulations. ccNetViz is thus a network visualization library particularly suited for systems biology. Availability and implementation The ccNetViz library, demos and documentation are freely available at http://helikarlab.github.io/ccNetViz/. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

DEsingle for detecting three types of differential expression in single-cell RNA-seq data

10.1101/173997 ◽

2017 ◽

Cited By ~ 1

Author(s):

Zhun Miao ◽

Ke Deng ◽

Xiaowo Wang ◽

Xuegong Zhang

Keyword(s):

Single Cell ◽

Differential Expression ◽

Negative Binomial ◽

Single Cells ◽

R Package ◽

Supplementary Information ◽

Binomial Model ◽

Supplementary Data ◽

Rna Seq ◽

Real Zeros

AbstractSummaryThe excessive amount of zeros in single-cell RNA-seq data include “real” zeros due to the on-off nature of gene transcription in single cells and “dropout” zeros due to technical reasons. Existing differential expression (DE) analysis methods cannot distinguish these two types of zeros. We developed an R package DEsingle which employed Zero-Inflated Negative Binomial model to estimate the proportion of real and dropout zeros and to define and detect 3 types of DE genes in single-cell RNA-seq data with higher accuracy.Availability and ImplementationThe R package DEsingle is freely available at https://github.com/miaozhun/DEsingle and is under Bioconductor’s consideration [email protected] informationSupplementary data are available at bioRxiv online.

Download Full-text

ngsReports: a Bioconductor package for managing FastQC reports and other NGS related log files

Bioinformatics ◽

10.1093/bioinformatics/btz937 ◽

2019 ◽

Vol 36 (8) ◽

pp. 2587-2588 ◽

Cited By ~ 10

Author(s):

Christopher M Ward ◽

Thu-Hien To ◽

Stephen M Pederson

Keyword(s):

Quality Control ◽

R Package ◽

Supplementary Information ◽

Bioconductor Package ◽

Supplementary Data ◽

Large Sample ◽

Log Files ◽

Shiny App ◽

Next Generation Sequencing Ngs ◽

Generation Sequencing

Abstract Motivation High throughput next generation sequencing (NGS) has become exceedingly cheap, facilitating studies to be undertaken containing large sample numbers. Quality control (QC) is an essential stage during analytic pipelines and the outputs of popular bioinformatics tools such as FastQC and Picard can provide information on individual samples. Although these tools provide considerable power when carrying out QC, large sample numbers can make inspection of all samples and identification of systemic bias a challenge. Results We present ngsReports, an R package designed for the management and visualization of NGS reports from within an R environment. The available methods allow direct import into R of FastQC reports along with outputs from other tools. Visualization can be carried out across many samples using default, highly customizable plots with options to perform hierarchical clustering to quickly identify outlier libraries. Moreover, these can be displayed in an interactive shiny app or HTML report for ease of analysis. Availability and implementation The ngsReports package is available on Bioconductor and the GUI shiny app is available at https://github.com/UofABioinformaticsHub/shinyNgsreports. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

CluMSID: an R package for similarity-based clustering of tandem mass spectra to aid feature annotation in metabolomics

Bioinformatics ◽

10.1093/bioinformatics/btz005 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3196-3198 ◽

Cited By ~ 5

Author(s):

Tobias Depke ◽

Raimo Franke ◽

Mark Brönstrup

Keyword(s):

Mass Spectra ◽

Neutral Loss ◽

Metabolite Identification ◽

R Package ◽

Supplementary Information ◽

Tandem Mass ◽

Compound Identification ◽

Feature Identification ◽

Tandem Mass Spectra ◽

Interactive Visualizations

Abstract Summary Compound identification is one of the most eminent challenges in the untargeted analysis of complex mixtures of small molecules by mass spectrometry. Similarity of tandem mass spectra can provide valuable information on putative structural similarities between known and unknown analytes and hence aids feature identification in the bioanalytical sciences. We have developed CluMSID (Clustering of MS2 spectra for metabolite identification), an R package that enables researchers to make use of tandem mass spectra and neutral loss pattern similarities as a part of their metabolite annotation workflow. CluMSID offers functions for all analysis steps from import of raw data to data mining by unsupervised multivariate methods along with respective (interactive) visualizations. A detailed tutorial with example data is provided as supplementary information. Availability and implementation CluMSID is available as R package from https://github.com/tdepke/CluMSID/and from https://bioconductor.org/packages/CluMSID/. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

CPS analysis: self-contained validation of biomedical data clustering

Bioinformatics ◽

10.1093/bioinformatics/btaa165 ◽

2020 ◽

Vol 36 (11) ◽

pp. 3516-3521 ◽

Cited By ~ 1

Author(s):

Lixiang Zhang ◽

Lin Lin ◽

Jia Li

Keyword(s):

Data Clustering ◽

State Of The Art ◽

R Package ◽

Research Community ◽

Supplementary Information ◽

Biomedical Data ◽

Data Generation ◽

Supplementary Data ◽

Point Set ◽

Class Labels

Abstract Motivation Cluster analysis is widely used to identify interesting subgroups in biomedical data. Since true class labels are unknown in the unsupervised setting, it is challenging to validate any cluster obtained computationally, an important problem barely addressed by the research community. Results We have developed a toolkit called covering point set (CPS) analysis to quantify uncertainty at the levels of individual clusters and overall partitions. Functions have been developed to effectively visualize the inherent variation in any cluster for data of high dimension, and provide more comprehensive view on potentially interesting subgroups in the data. Applying to three usage scenarios for biomedical data, we demonstrate that CPS analysis is more effective for evaluating uncertainty of clusters comparing to state-of-the-art measurements. We also showcase how to use CPS analysis to select data generation technologies or visualization methods. Availability and implementation The method is implemented in an R package called OTclust, available on CRAN. Contact [email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

RMTL: an R library for multi-task learning

Bioinformatics ◽

10.1093/bioinformatics/bty831 ◽

2018 ◽

Vol 35 (10) ◽

pp. 1797-1798 ◽

Cited By ~ 2

Author(s):

Han Cao ◽

Jiayu Zhou ◽

Emanuel Schwarz

Keyword(s):

Biological Networks ◽

Simulated Data ◽

R Package ◽

Low Rank ◽

Supplementary Information ◽

Supplementary Data ◽

Software Environment ◽

Machine Learning Technique ◽

Task Learning ◽

Learning Technique

Abstract Motivation Multi-task learning (MTL) is a machine learning technique for simultaneous learning of multiple related classification or regression tasks. Despite its increasing popularity, MTL algorithms are currently not available in the widely used software environment R, creating a bottleneck for their application in biomedical research. Results We developed an efficient, easy-to-use R library for MTL (www.r-project.org) comprising 10 algorithms applicable for regression, classification, joint predictor selection, task clustering, low-rank learning and incorporation of biological networks. We demonstrate the utility of the algorithms using simulated data. Availability and implementation The RMTL package is an open source R package and is freely available at https://github.com/transbioZI/RMTL. RMTL will also be available on cran.r-project.org. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

schex avoids overplotting for large single-cell RNA-sequencing datasets

Bioinformatics ◽

10.1093/bioinformatics/btz907 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2291-2292 ◽

Cited By ~ 1

Author(s):

Saskia Freytag ◽

Ryan Lister

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

R Package ◽

Supplementary Information ◽

Supplementary Data ◽

Sequencing Data ◽

Single Cell Rna Sequencing

Abstract Summary Due to the scale and sparsity of single-cell RNA-sequencing data, traditional plots can obscure vital information. Our R package schex overcomes this by implementing hexagonal binning, which has the additional advantages of improving speed and reducing storage for resulting plots. Availability and implementation schex is freely available from Bioconductor via http://bioconductor.org/packages/release/bioc/html/schex.html and its development version can be accessed on GitHub via https://github.com/SaskiaFreytag/schex. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

DrawGlycan-SNFG and gpAnnotate: rendering glycans and annotating glycopeptide mass spectra

Bioinformatics ◽

10.1093/bioinformatics/btz819 ◽

2019 ◽

Cited By ~ 4

Author(s):

Kai Cheng ◽

Gabrielle Pawlowski ◽

Xinheng Yu ◽

Yusen Zhou ◽

Sriram Neelamegham

Keyword(s):

Mass Spectrometry ◽

Open Source ◽

Mass Spectra ◽

Supplementary Information ◽

Supplementary Data ◽

International Union ◽

Open Source Program ◽

Source Program ◽

Wide Range ◽

Peptide Modifications

Abstract Summary This manuscript describes an open-source program, DrawGlycan-SNFG (version 2), that accepts IUPAC (International Union of Pure and Applied Chemist)-condensed inputs to render Symbol Nomenclature For Glycans (SNFG) drawings. A wide range of local and global options enable display of various glycan/peptide modifications including bond breakages, adducts, repeat structures, ambiguous identifications etc. These facilities make DrawGlycan-SNFG ideal for integration into various glycoinformatics software, including glycomics and glycoproteomics mass spectrometry (MS) applications. As a demonstration of such usage, we incorporated DrawGlycan-SNFG into gpAnnotate, a standalone application to score and annotate individual MS/MS glycopeptide spectrum in different fragmentation modes. Availability and implementation DrawGlycan-SNFG and gpAnnotate are platform independent. While originally coded using MATLAB, compiled packages are also provided to enable DrawGlycan-SNFG implementation in Python and Java. All programs are available from https://virtualglycome.org/drawglycan; https://virtualglycome.org/gpAnnotate. Contact [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text

GladiaTOX: GLobal Assessment of Dose-IndicAtor in TOXicology

Bioinformatics ◽

10.1093/bioinformatics/btz187 ◽

2019 ◽

Vol 35 (20) ◽

pp. 4190-4192 ◽

Cited By ~ 4

Author(s):

Vincenzo Belcastro ◽

Stephane Cano ◽

Diego Marescotti ◽

Stefano Acali ◽

Carine Poussin ◽

...

Keyword(s):

Quality Control ◽

Data Processing ◽

Web Service ◽

Biomedical Research ◽

Global Assessment ◽

R Package ◽

Supplementary Information ◽

High Content Screening ◽

Supplementary Data ◽

Severity Scores

Abstract Summary GladiaTOX R package is an open-source, flexible solution to high-content screening data processing and reporting in biomedical research. GladiaTOX takes advantage of the ‘tcpl’ core functionalities and provides a number of extensions: it provides a web-service solution to fetch raw data; it computes severity scores and exports ToxPi formatted files; furthermore it contains a suite of functionalities to generate PDF reports for quality control and data processing. Availability and implementation GladiaTOX R package (bioconductor). Also available via: git clone https://github.com/philipmorrisintl/GladiaTOX.git. Supplementary information Supplementary data are available at Bioinformatics online.

Download Full-text