SgTiler: a fast method to design tiling sgRNAs for CRISPR/Cas9 mediated screening

Mapping Intimacies ◽

10.1101/217166 ◽

2017 ◽

Author(s):

Musaddeque Ahmed ◽

Housheng Hansen He

Keyword(s):

Source Code ◽

Regions Of Interest ◽

Fast Method ◽

Command Line ◽

Regulatory Regions ◽

Guide Rnas ◽

Command Line Tool ◽

Genomic Regions ◽

The Web ◽

Target Effects

Abstract Summary Screening of genomic regions of interest using CRISPR/Cas9 is getting increasingly popular. The system requires designing of single guide RNAs (sgRNAs) that can efficiently guide the Cas9 endonuclease to the targeted region with minimal off-target effects. Tiling sgRNAs is the most effective way to perturb regulatory regions, such as promoters and enhancers. sgTiler is the first tool that provides a fast method for designing tiling sgRNAs. Availability and Implementation sgTiler is a command line tool that requires only one command to execute. Its source code is freely available on the web at https://github.com/HansenHeLab/sgTiler. sgTiler is implemented in Python and supported on any platform with Python and Bowtie.
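
To make the idea of tiling concrete, the sketch below enumerates every candidate protospacer in a region that is followed by an NGG PAM, on both strands. It is an illustration only and not sgTiler's method: the function names, the 20-nt guide length and the NGG PAM are assumptions, and no off-target or efficiency scoring is attempted.

```python
import re

# Hypothetical helper names; none of this is taken from the sgTiler source code.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidates(region, guide_len=20):
    """Return sorted (start, strand, protospacer) tuples for guides followed by NGG.

    `start` is the 0-based forward-strand coordinate of the leftmost base
    covered by the protospacer, whichever strand the guide lies on.
    """
    region = region.upper()
    n = len(region)
    # A lookahead lets overlapping sites match, which is what tiling needs.
    pattern = re.compile(r"(?=([ACGT]{%d})[ACGT]GG)" % guide_len)
    hits = []
    for m in pattern.finditer(region):
        hits.append((m.start(), "+", m.group(1)))
    for m in pattern.finditer(revcomp(region)):
        # Map the reverse-strand offset back to forward-strand coordinates.
        hits.append((n - m.start() - guide_len, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    for hit in find_candidates("A" * 20 + "TGG" + "CCA" + "T" * 20):
        print(hit)
```

A real design step would still need to filter these candidates, for example by aligning them against the genome to estimate off-target matches.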

Alview: Portable Software for Viewing Sequence Reads in BAM Formatted Files

Cancer Informatics ◽

10.4137/cin.s26470 ◽

2015 ◽

Vol 14 ◽

pp. CIN.S26470 ◽

Cited By ~ 2

Author(s):

Richard P. Finney ◽

Qing-Rong Chen ◽

Cu V. Nguyen ◽

Chih Hao Hsu ◽

Chunhua Yan ◽

...

Keyword(s):

Graphical User Interface ◽

Reference Genome ◽

Source Code ◽

Software Tool ◽

Command Line ◽

Sequencing Data ◽

Genome Data ◽

Command Line Tool ◽

Portable Software ◽

Microsoft Windows

The name Alview is a contraction of the term Alignment Viewer. Alview is a compiled to native architecture software tool for visualizing the alignment of sequencing data. Inputs are files of short-read sequences aligned to a reference genome in the SAM/BAM format and files containing reference genome data. Outputs are visualizations of these aligned short reads. Alview is written in portable C with optional graphical user interface (GUI) code written in C, C++, and Objective-C. The application can run in three different ways: as a web server, as a command line tool, or as a native, GUI program. Alview is compatible with Microsoft Windows, Linux, and Apple OS X. It is available as a web demo at https://cgwb.nci.nih.gov/cgi-bin/alview . The source code and Windows/Mac/Linux executables are available via https://github.com/NCIP/alview .

Bedshift: perturbation of genomic interval sets

10.1101/2020.11.11.378554 ◽

2020 ◽

Author(s):

Aaron Gu ◽

Hyun Jae Cho ◽

Nathan C. Sheffield

Keyword(s):

Euclidean Distance ◽

Source Code ◽

Similarity Metrics ◽

Random Perturbations ◽

Command Line ◽

Genomic Interval ◽

Command Line Tool ◽

Evaluation Dataset ◽

Original File ◽

Coverage Score

Results of functional genomics experiments such as ChIP-Seq or ATAC-Seq produce data summarized as a region set. Many tools have been developed to analyze region sets, including computing similarity metrics to compare them. However, there is no way to objectively evaluate the effectiveness of region set similarity metrics. In this paper we present bedshift, a command-line tool and Python API to generate new BED files by making random perturbations to an original BED file. Perturbed files have known similarity to the original file and are therefore useful to benchmark similarity metrics. To demonstrate, we used bedshift to create an evaluation dataset of 3,600 perturbed files generated by shifting, adding, and dropping regions from a reference BED file. Then, we compared four similarity metrics: Jaccard score, coverage score, Euclidean distance, and cosine similarity. The results show that the Jaccard score is most sensitive to detecting adding and dropping regions, while the coverage score is more sensitive to shifted regions. Availability BSD2-licensed source code and documentation can be found at https://bedshift.databio.org.
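
The benchmarking idea in this abstract can be sketched in a few lines: perturb a region set by a known amount, then check how a similarity metric responds. The code below is a minimal illustration and does not use the bedshift API. The function names, the parameters and the base-pair definition of the Jaccard score are all assumptions made for the example.

```python
import random


def perturb(regions, shift_rate=0.0, drop_rate=0.0, max_shift=100, seed=0):
    """Randomly drop and shift (chrom, start, end) regions.

    Hypothetical stand-in for a perturbation tool, not bedshift itself.
    """
    rng = random.Random(seed)
    out = []
    for chrom, start, end in regions:
        if rng.random() < drop_rate:
            continue  # region dropped
        if rng.random() < shift_rate:
            delta = rng.randint(-max_shift, max_shift)
            delta = max(delta, -start)  # keep coordinates non-negative
            start, end = start + delta, end + delta
        out.append((chrom, start, end))
    return out


def covered_bases(regions):
    """Set of (chrom, position) pairs covered by half-open intervals."""
    bases = set()
    for chrom, start, end in regions:
        bases.update((chrom, pos) for pos in range(start, end))
    return bases


def jaccard(a, b):
    """Base-pair Jaccard score: shared bases divided by bases in either set."""
    bases_a, bases_b = covered_bases(a), covered_bases(b)
    union = bases_a | bases_b
    return len(bases_a & bases_b) / len(union) if union else 1.0


if __name__ == "__main__":
    original = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 10, 60)]
    shifted = perturb(original, shift_rate=1.0, max_shift=50, seed=1)
    print(jaccard(original, shifted))
```

Expanding every interval into individual bases is only practical for small examples; a sweep-line over sorted intervals would be the usual choice for real BED files.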

BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences

Bioinformatics ◽

10.1093/bioinformatics/btaa928 ◽

2020 ◽

Author(s):

Aziz Khan ◽

Rafael Riudavets Puig ◽

Paul Boddie ◽

Anthony Mathelier

Keyword(s):

Dna Sequences ◽

Source Code ◽

Web Server ◽

Enrichment Analysis ◽

Nucleotide Composition ◽

Supplementary Information ◽

Command Line ◽

Sequence Composition ◽

Command Line Tool ◽

Gc Bias

Abstract Motivation Accurate motif enrichment analyses depend on the choice of background DNA sequences used, which should ideally match the sequence composition of the foreground sequences. It is important to avoid false positive enrichment due to sequence biases in the genome, such as GC-bias. Therefore, relying on an appropriate set of background sequences is crucial for enrichment analysis. Results We developed BiasAway, a command line tool and its dedicated easy-to-use web server to generate synthetic sequences matching any k-mer nucleotide composition or select genomic DNA sequences matching the mononucleotide composition of the foreground sequences through four different models. For genomic sequences, we provide precomputed partitions of genomes from nine species with five different bin sizes to generate appropriate genomic background sequences. Availability and implementation BiasAway source code is freely available from Bitbucket (https://bitbucket.org/CBGR/biasaway) and can be easily installed using bioconda or pip. The web server is available at https://biasaway.uio.no and a detailed documentation is available at https://biasaway.readthedocs.io. Supplementary information Supplementary data are available at Bioinformatics online.
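
As a small illustration of composition matching, the sketch below builds a synthetic background by shuffling each foreground sequence, which preserves its mononucleotide composition (the k = 1 case) and therefore its GC content. It is not BiasAway's code and does not reproduce its models; the function names are hypothetical.

```python
import random
from collections import Counter


def shuffle_sequence(seq, rng):
    """Return a random permutation of seq; base counts are unchanged."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def gc_fraction(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


def make_background(foreground, seed=0):
    """One shuffled background sequence per foreground sequence."""
    rng = random.Random(seed)
    return [shuffle_sequence(seq, rng) for seq in foreground]


if __name__ == "__main__":
    fg = ["ACGTGGGCCC", "ATATATATCG"]
    for original, shuffled in zip(fg, make_background(fg)):
        print(original, shuffled, gc_fraction(original), gc_fraction(shuffled))
```

Matching higher-order k-mer composition needs a different procedure, since a plain shuffle destroys dinucleotide and longer patterns.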

PrimeDesign software for rapid and simplified design of prime editing guide RNAs

10.1101/2020.05.04.077750 ◽

2020 ◽

Cited By ~ 4

Author(s):

Jonathan Y. Hsu ◽

Andrew V. Anzalone ◽

Julian Grünewald ◽

Kin Chung Lam ◽

Max W. Shen ◽

...

Keyword(s):

Genetic Variants ◽

Web Application ◽

Saturation Mutagenesis ◽

Command Line ◽

Searchable Database ◽

Guide Rna ◽

Guide Rnas ◽

Genome Wide ◽

Command Line Tool ◽

User Friendly

Abstract Prime editing (PE) is a versatile genome editing technology, but design of the required guide RNAs is more complex than for standard CRISPR-based nucleases or base editors. Here we describe PrimeDesign, a user-friendly, end-to-end web application and command-line tool for the design of PE experiments. PrimeDesign can be used for single and combination editing applications, as well as genome-wide and saturation mutagenesis screens. Using PrimeDesign, we also constructed PrimeVar, the first comprehensive and searchable database for prime editing guide RNA (pegRNA) and nicking sgRNA (ngRNA) combinations to install or correct >68,500 pathogenic human genetic variants from the ClinVar database.

ped_draw: pedigree drawing with ease

BMC Bioinformatics ◽

10.1186/s12859-020-03917-4 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Matt Velinder ◽

Dillon Lee ◽

Gabor Marth

Keyword(s):

Ease Of Use ◽

Command Line ◽

Drawing Tool ◽

Text File ◽

Web Tool ◽

Family Structures ◽

Visual Patterns ◽

Image File ◽

Command Line Tool ◽

The Web

Abstract Background Pedigree files are ubiquitously used within bioinformatics and genetics studies to convey critical information about relatedness, sex and affected status of study samples. While the text based format of ped files is efficient for computational methods, it is not immediately intuitive to a bioinformatician or geneticist trying to understand family structures, many of which encode the affected status of individuals across multiple generations. The visualization of pedigrees into connected nodes with descriptive shapes and shading provides a far more interpretable format to recognize visual patterns and intuit family structures. Despite these advantages of a visual pedigree, it remains difficult to quickly and accurately visualize a pedigree given a pedigree text file. Results Here we describe ped_draw a command line and web tool as a simple and easy solution to pedigree visualization. Ped_draw is capable of drawing complex multi-generational pedigrees and conforms to the accepted standards for depicting pedigrees visually. The command line tool can be used as a simple one liner command, utilizing graphviz to generate an image file. The web tool, https://peddraw.github.io, allows the user to either: paste a pedigree file, type to construct a pedigree file in the text box or upload a pedigree file. Users can save the generated image file in various formats. Conclusions We believe ped_draw is a useful pedigree drawing tool that improves on current methods due to its ease of use and approachability. Ped_draw allows users with various levels of expertise to quickly and easily visualize pedigrees.
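
The conversion from a pedigree text file to a drawable graph can be sketched as follows. This is not ped_draw's implementation: it assumes the common six-column PED layout (family, individual, father, mother, sex, phenotype, with 0 for a missing parent, 1/2 for male/female and 2 for affected) and emits Graphviz DOT text that the `dot` program could render. The function name and styling choices are illustrative.

```python
def ped_to_dot(ped_text):
    """Convert six-column PED text into a Graphviz DOT string.

    Males are drawn as boxes, females as circles, unknown sex as diamonds;
    affected individuals are filled black.
    """
    nodes, edges = [], []
    for row in ped_text.splitlines():
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comments
        _family, ind, father, mother, sex, pheno = row.split()[:6]
        shape = {"1": "box", "2": "circle"}.get(sex, "diamond")
        affected = pheno == "2"
        fill = "black" if affected else "white"
        font = "white" if affected else "black"
        nodes.append(
            f'  "{ind}" [shape={shape}, style=filled, '
            f"fillcolor={fill}, fontcolor={font}];"
        )
        for parent in (father, mother):
            if parent != "0":
                edges.append(f'  "{parent}" -> "{ind}";')
    return "\n".join(["digraph pedigree {"] + nodes + edges + ["}"])


if __name__ == "__main__":
    example = """
    fam1 dad 0 0 1 1
    fam1 mom 0 0 2 1
    fam1 kid dad mom 1 2
    """
    print(ped_to_dot(example))
```

The printed text can be saved to a file and rendered with Graphviz, for example `dot -Tpng pedigree.dot -o pedigree.png`. Standard pedigree charts also join mates with a horizontal line, which this simple parent-to-child layout does not attempt.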

PrimeDesign software for rapid and simplified design of prime editing guide RNAs

10.21203/rs.3.rs-111349/v1 ◽

2020 ◽

Author(s):

Jonathan Hsu ◽

Julian Grünewald ◽

Regan Szalay ◽

Justine Shih ◽

Andrew Anzalone ◽

...

Keyword(s):

Genetic Variants ◽

Web Application ◽

Saturation Mutagenesis ◽

Command Line ◽

Guide Rna ◽

Guide Rnas ◽

Pathogenic Variants ◽

Genome Wide ◽

Command Line Tool ◽

User Friendly

Abstract Prime editing (PE) is a versatile genome editing technology, but design of the required guide RNAs is more complex than for standard CRISPR-based nucleases or base editors. Here we describe PrimeDesign, a user-friendly, end-to-end web application and command-line tool for the design of PE experiments. PrimeDesign can be used for single and combination editing applications, as well as genome-wide and saturation mutagenesis screens. Using PrimeDesign, we constructed PrimeVar, a comprehensive and searchable database that includes candidate prime editing guide RNA (pegRNA) and nicking sgRNA (ngRNA) combinations for installing or correcting >68,500 pathogenic human genetic variants from the ClinVar database. Finally, we used PrimeDesign to design pegRNAs/ngRNAs to install a variety of human pathogenic variants in human cells.

era5cli: The command line tool to download ERA5 data

10.5194/egusphere-egu2020-21619 ◽

2020 ◽

Author(s):

Jaro Camphuijsen ◽

Ronald van Haren ◽

Yifat Dzigan ◽

Niels Drost ◽

Fakhareh Alidoost ◽

...

Keyword(s):

Source Code ◽

Reanalysis Data ◽

Command Line ◽

Climate Data ◽

Web Interface ◽

Command Line Interface ◽

Short Introduction ◽

Data Store ◽

Command Line Tool ◽

Advanced Knowledge

With the release of the ERA5 dataset, worldwide high resolution reanalysis data became available with open access for public use. The Copernicus CDS (Climate Data Store) offers two options for accessing the data: a web interface and a Python API. Consequently, automated downloading of the data requires advanced knowledge of Python and a lot of work. To make this process easier, we developed era5cli. The command line interface tool era5cli enables automated downloading of ERA5 using a single command. All variables and options available in the CDS web form are now available for download in an efficient way. Both the monthly and hourly dataset are supported. Besides automation, era5cli adds several useful functionalities to the download pipeline. One of the key options in era5cli is to spread one download command over multiple CDS requests, resulting in higher download speeds. Files can be saved in both GRIB and NETCDF format with automatic, yet customizable file names. The `info` command lists correct names of the available variables and pressure levels for 3D variables. For debugging purposes and testing the `--dryrun` option can be selected to return only the CDS request. An overview of all available options, including instructions on how to configure your CDS account, is available in our documentation. Source code is available on https://github.com/eWaterCycle/era5cli. In this PICO presentation we will provide an overview of era5cli, as well as a short introduction on how to use era5cli.

A decoupled, modular and scriptable architecture for tools to curate data platforms

10.1101/2020.09.28.282699 ◽

2020 ◽

Author(s):

Moritz Langenstein ◽

Henning Hermjakob ◽

Manuel Bernal Llinares

Keyword(s):

Web Application ◽

Production Systems ◽

Source Code ◽

Black Box ◽

Command Line ◽

Web Interface ◽

Link Type ◽

Data Platform ◽

The Web

Abstract Motivation Curation is essential for any data platform to maintain the quality of the data it provides. Existing databases, which require maintenance, and the amount of newly published information that needs to be surveyed, are growing rapidly. More efficient curation is often vital to keep up with this growth, requiring modern curation tools. However, curation interfaces are often complex and difficult to further develop. Furthermore, opportunities for experimentation with curation workflows may be lost due to a lack of development resources, or a reluctance to change sensitive production systems. Results We propose a decoupled, modular and scriptable architecture to build curation tools on top of existing platforms. Instead of modifying the existing infrastructure, our architecture treats the existing platform as a black box and relies only on its public APIs and web application. As a decoupled program, the tool’s architecture gives more freedom to developers and curators. This added flexibility allows for quickly prototyping new curation workflows as well as adding all kinds of analysis around the data platform. The tool can also streamline and enhance the curator’s interaction with the web interface of the platform. We have implemented this design in cmd-iaso, a command-line curation tool for the identifiers.org registry. Availability The cmd-iaso curation tool is implemented in Python 3.7+ and supports Linux, macOS and Windows. Its source code and documentation are freely available from https://github.com/identifiers-org/cmd-iaso. It is also published as a Docker container at https://hub.docker.com/r/identifiersorg/[email protected]

PrimeDesign software for rapid and simplified design of prime editing guide RNAs

Nature Communications ◽

10.1038/s41467-021-21337-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Jonathan Y. Hsu ◽

Julian Grünewald ◽

Regan Szalay ◽

Justine Shih ◽

Andrew V. Anzalone ◽

...

Keyword(s):

Genetic Variants ◽

Web Application ◽

Saturation Mutagenesis ◽

Command Line ◽

Guide Rna ◽

Guide Rnas ◽

Pathogenic Variants ◽

Genome Wide ◽

Command Line Tool ◽

User Friendly

Abstract Prime editing (PE) is a versatile genome editing technology, but design of the required guide RNAs is more complex than for standard CRISPR-based nucleases or base editors. Here we describe PrimeDesign, a user-friendly, end-to-end web application and command-line tool for the design of PE experiments. PrimeDesign can be used for single and combination editing applications, as well as genome-wide and saturation mutagenesis screens. Using PrimeDesign, we construct PrimeVar, a comprehensive and searchable database that includes candidate prime editing guide RNA (pegRNA) and nicking sgRNA (ngRNA) combinations for installing or correcting >68,500 pathogenic human genetic variants from the ClinVar database. Finally, we use PrimeDesign to design pegRNAs/ngRNAs to install a variety of human pathogenic variants in human cells.

GenMap: ultra-fast computation of genome mappability

Bioinformatics ◽

10.1093/bioinformatics/btaa222 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3687-3692 ◽

Cited By ~ 1

Author(s):

Christopher Pockrandt ◽

Mai Alzamel ◽

Costas S Iliopoulos ◽

Knut Reinert

Keyword(s):

Source Code ◽

Probe Design ◽

Fast Method ◽

Biological Applications ◽

Fast Computation ◽

Genomic Position ◽

Guide Rna ◽

Binary Output ◽

A Genome ◽

Reciprocal Value

Abstract Motivation Computing the uniqueness of k-mers for each position of a genome while allowing for up to e mismatches is computationally challenging. However, it is crucial for many biological applications such as the design of guide RNA for CRISPR experiments. More formally, the uniqueness or (k, e)-mappability can be described for every position as the reciprocal value of how often this k-mer occurs approximately in the genome, i.e. with up to e mismatches. Results We present a fast method GenMap to compute the (k, e)-mappability. We extend the mappability algorithm, such that it can also be computed across multiple genomes where a k-mer occurrence is only counted once per genome. This allows for the computation of marker sequences or finding candidates for probe design by identifying approximate k-mers that are unique to a genome or that are present in all genomes. GenMap supports different formats such as binary output, wig and bed files as well as csv files to export the location of all approximate k-mers for each genomic position. Availability and implementation GenMap can be installed via bioconda. Binaries and C++ source code are available on https://github.com/cpockrandt/genmap.
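
The definition of (k, e)-mappability given in this abstract can be written out directly. The sketch below is a naive quadratic-time reference on a single strand of a single sequence, intended only to make the definition concrete. It is not GenMap's algorithm, which is designed to be far faster, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mappability(genome, k, e):
    """Naive (k, e)-mappability for every k-mer start position.

    The score at position i is 1 / c, where c is the number of positions
    whose k-mer is within e mismatches of the k-mer starting at i. Every
    k-mer matches itself, so c is at least 1.
    """
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    scores = []
    for kmer in kmers:
        count = sum(1 for other in kmers if hamming(kmer, other) <= e)
        scores.append(1.0 / count)
    return scores


if __name__ == "__main__":
    print(mappability("ACGTACGTTT", k=4, e=0))
    print(mappability("ACGTACGTTT", k=4, e=1))
```

A score of 1.0 marks a k-mer that is unique under the chosen mismatch budget; lower scores indicate repeated or near-repeated sequence.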
