ChassiDex: A microbial database useful for synthetic biology applications

Mapping Intimacies ◽

10.1101/703033 ◽

2019 ◽

Author(s):

B P Kailash ◽

D Karthik ◽

Mousami Shinde ◽

Nikhita Damaraju ◽

Anantha Barathi Muthukrishnan ◽

...

Keyword(s):

Synthetic Biology ◽

Model Organism ◽

Genetically Engineered ◽

Model Organisms ◽

Host Organism ◽

Transformation Protocol ◽

Non Profit ◽

Link Type ◽

User Friendly ◽

Biobrick Parts

ChassiDex is an open-source, non-profit online host organism database that houses a repository of molecular, biological and genetic data for model organisms with applications in synthetic biology. The structured, user-friendly environment makes it easy to browse information. The database consists of a page for each model organism, subdivided into sections such as Growth Characteristics, Strain diversity, Culture sources, Maintenance protocol, Transformation protocol, BioBrick parts and commonly used vectors. With tools such as CUTE, a codon usage table generator, it is also easy to generate and download accurate novel codon tables for unconventional hosts in suitable formats. This database was built as a project for the International Genetically Engineered Machine Competition in 2017 with the mission of making it easy for any researcher in synthetic biology to shift from working with one host organism to another, unconventional host organism. The code, along with instructions for using the database and tools, is publicly available on the GitHub page. We encourage the synthetic biology community to contribute to the database by adding data for any additional or existing host organism. https://chassidex.org; https://github.com/ChassiDex
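
To make the idea of a codon usage table concrete, the following is a minimal sketch of how such a table could be derived from a set of in-frame coding sequences. It is not the CUTE implementation, whose internals the abstract does not describe; the function name `codon_usage` and the per-thousand output convention are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product


def codon_usage(cds_sequences):
    """Count codons over in-frame coding sequences (hypothetical helper).

    Returns a dict mapping each of the 64 codons to a tuple
    (raw count, frequency per thousand codons).
    """
    all_codons = ["".join(p) for p in product("ACGT", repeat=3)]
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper().replace("U", "T")  # accept RNA input as well
        usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if set(codon) <= set("ACGT"):    # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    return {
        codon: (counts[codon], 1000.0 * counts[codon] / total if total else 0.0)
        for codon in all_codons
    }


if __name__ == "__main__":
    table = codon_usage(["ATGGCTGCTTAA"])
    for codon in ("ATG", "GCT", "TAA"):
        print(codon, table[codon])
```

A real tool would also need to parse annotated genome files and apply the correct genetic code for the host; this sketch only covers the counting step.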

BioMaster: An Integrated Database and Analytic Platform to Provide Comprehensive Information About BioBrick Parts

Frontiers in Microbiology ◽

10.3389/fmicb.2021.593979 ◽

2021 ◽

Vol 12 ◽

Author(s):

Beibei Wang ◽

Huayi Yang ◽

Jianan Sun ◽

Chuhao Dou ◽

Jian Huang ◽

...

Keyword(s):

Synthetic Biology ◽

Biological Systems ◽

Relevant Information ◽

Genetically Engineered ◽

Major Obstacle ◽

Genetic Circuit ◽

Related Literature ◽

Comprehensive Information ◽

User Friendly ◽

Biobrick Parts

Synthetic biology seeks to create new biological parts, devices, and systems, and to reconfigure existing natural biological systems for custom-designed purposes. Standardized BioBrick parts are the foundation of synthetic biology. The incomplete and flawed metadata of BioBrick parts, however, are a major obstacle to designing genetic circuits easily, quickly, and accurately. Here, a database termed BioMaster (http://www.biomaster-uestc.cn) was developed to extensively complement information about BioBrick parts. It includes 47,934 BioBrick parts from the international Genetically Engineered Machine (iGEM) Registry, with more comprehensive information integrated from 10 databases, providing corresponding information about functions, activities, interactions, and related literature. Moreover, BioMaster is also a user-friendly platform for retrieval and analysis of relevant information on BioBrick parts.

RACS: rapid analysis of ChIP-Seq data for contig based genomes

BMC Bioinformatics ◽

10.1186/s12859-019-3100-2 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Alejandro Saettone ◽

Marcelo Ponce ◽

Syed Nabeel-Shah ◽

Jeffrey Fillingham

Keyword(s):

Open Source ◽

Genome Sequence ◽

Dna Sequences ◽

Model Organism ◽

Rapid Analysis ◽

Model Organisms ◽

Published Data ◽

Computational Pipeline ◽

Data Set ◽

Link Type

Background: Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that are difficult to process and analyze, particularly for organisms with contig-based sequenced genomes that typically have minimal annotation on their associated set of genes other than coordinates primarily predicted by gene finding programs. Poorly annotated genome sequence makes comprehensive analysis of ChIP-Seq data difficult and, as such, standardized analysis pipelines are lacking. Results: We present a one-stop computational pipeline, “Rapid Analysis of ChIP-Seq data” (RACS), that utilizes traditional High-Performance Computing (HPC) techniques in association with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS. RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, to aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation. Conclusions: The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequence. Because RACS segregates the found read accumulations between genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.
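
The genic versus intergenic segregation mentioned above can be illustrated with a small interval-overlap sketch. This is not code from RACS; the function name `classify_peaks`, the tuple layout `(contig, start, end)` and the use of inclusive coordinates are all assumptions made for the example.

```python
from collections import defaultdict


def classify_peaks(peaks, genes):
    """Split read accumulations into genic and intergenic sets (hypothetical helper).

    peaks, genes: iterables of (contig, start, end) with inclusive coordinates.
    A peak is called genic if it overlaps at least one gene on the same contig.
    """
    genes_by_contig = defaultdict(list)
    for contig, g_start, g_end in genes:
        genes_by_contig[contig].append((g_start, g_end))

    result = {"genic": [], "intergenic": []}
    for contig, p_start, p_end in peaks:
        overlaps = any(
            p_start <= g_end and g_start <= p_end
            for g_start, g_end in genes_by_contig.get(contig, [])
        )
        result["genic" if overlaps else "intergenic"].append((contig, p_start, p_end))
    return result


if __name__ == "__main__":
    genes = [("contig_1", 100, 200)]
    peaks = [("contig_1", 150, 160), ("contig_1", 300, 320), ("contig_2", 150, 160)]
    print(classify_peaks(peaks, genes))
```

Grouping genes by contig keeps the comparison local to each contig, which matters for contig-based assemblies where the same coordinates recur on many contigs.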

A novel workflow to improve multi-locus genotyping of wildlife species: an experimental set-up with a known model system

10.1101/638288 ◽

2019 ◽

Cited By ~ 2

Author(s):

Mark A.F. Gillingham ◽

B. Karina Montero ◽

Kerstin Wilhelm ◽

Kara Grudzus ◽

Simone Sommer ◽

...

Keyword(s):

Sequence Data ◽

Model Organism ◽

Model Systems ◽

Model Organisms ◽

Amplification Efficiency ◽

Amplification Bias ◽

Link Type ◽

Multiple Loci ◽

Number Variation ◽

Set Up

Genotyping novel complex multigene systems is particularly challenging in non-model organisms. Target primers frequently amplify multiple loci simultaneously, leading to high PCR and sequencing artefacts such as chimeras and allele amplification bias. Most next-generation sequencing genotyping pipelines have been validated in non-model systems whereby the real genotype is unknown and the generation of artefacts may be highly repeatable. Further hindering accurate genotyping, the relationship between artefacts and copy number variation (CNV) within a PCR remains poorly described. Here we investigate the latter by experimentally combining multiple known major histocompatibility complex (MHC) haplotypes of a model organism (chicken, Gallus gallus, 43 artificial genotypes with 2-13 alleles per amplicon). In addition to well defined “optimal” primers, we simulated a non-model species situation by designing “naive” primers, with sequence data from closely related Galliform species. We applied a novel open-source genotyping pipeline (ACACIA) to the data, and compared its performance with another, previously published, pipeline. ACACIA yielded very high allele calling accuracy (>98%). Non-chimeric artefacts increased linearly with increasing CNV but chimeric artefacts leveled when amplifying more than 4-6 alleles. As expected, we found heterogeneous amplification efficiency of allelic variants when co-amplifying multiple loci. Using our validated ACACIA pipeline and the example data of this study, we discuss in detail the pitfalls researchers should avoid in order to reliably genotype complex multigene systems. ACACIA and the datasets used in this study are publicly available at GitLab and FigShare (https://gitlab.com/psc_santos/ACACIA and https://figshare.com/projects/ACACIA/66485).

Pedigree and Pedigree Import Wizard

HortScience ◽

10.21273/hortsci.33.3.552g ◽

1998 ◽

Vol 33 (3) ◽

pp. 552g-553

Author(s):

Shahrokh Khandizadeh

Keyword(s):

Additional Data ◽

File Format ◽

Fruit Crops ◽

Operating Environment ◽

Agronomic Characteristics ◽

Link Type ◽

Plant Characteristics ◽

User Friendly

Pedigree for Windows is a user-friendly program that allows the user to trace agronomic characteristics, draw pedigrees, and view images of several fruit crops, including more than 1400 apple, 800 strawberry, 800 almond, 100 blackberry, 80 blueberry, 790 pear, 200 raspberry examples. Pedigree Import Wizard®© for Windows is an add-on software for users who are interested in importing their research or breeding data records of fruit, flower, and plant characteristics and any related images into Pedigree for Windows. Pedigree for Windows and Pedigree Import Wizard have been designed so that a user familiar with the Windows operating environment should have little need to refer to the documentation provided with the program. Pedigree Import Wizard uses a comma-separated value (csv) file format under the MS Excel environment. This option allows the user to add or import additional data to the existing database that are already stored in other software such as Lotus, Excel, Access, QuattroPro, WordPerfect, and MS Word tables, etc., as long as they work under the Windows environment. A free demo version of Pedigree and Pedigree Import Wizard for Windows is available from http://www.pgris.com.

In Search of Species-Specific SNPs in a Non-Model Animal (European Bison (Bison bonasus))—Comparison of De Novo and Reference-Based Integrated Pipeline of STACKS Using Genotyping-by-Sequencing (GBS) Data

Animals ◽

10.3390/ani11082226 ◽

2021 ◽

Vol 11 (8) ◽

pp. 2226

Author(s):

Sazia Kunvar ◽

Sylwia Czarnomska ◽

Cino Pertoldi ◽

Małgorzata Tokarska

Keyword(s):

Reference Genome ◽

De Novo ◽

Bos Taurus ◽

Model Organism ◽

Genotyping By Sequencing ◽

Model Organisms ◽

European Bison ◽

Model Animal ◽

Pcr Duplicates ◽

Species Specific

The European bison is a non-model organism; thus, most of its genetic and genomic analyses have been performed using cattle-specific resources, such as BovineSNP50 BeadChip or Illumina Bovine 800 K HD Bead Chip. The problem with non-specific tools is the potential loss of evolutionarily diversified information (ascertainment bias) and species-specific markers. Here, we have used a genotyping-by-sequencing (GBS) approach for genotyping 256 samples from the European bison population in Bialowieza Forest (Poland) and performed an analysis using two integrated pipelines of the STACKS software: one is de novo (without reference genome) and the other is a reference pipeline (with reference genome). Moreover, we used the reference pipeline with two different genomes, i.e., Bos taurus and European bison. GBS is a useful tool for SNP genotyping in non-model organisms due to its cost effectiveness. Our results support GBS with a reference pipeline without PCR duplicates as a powerful approach for studying the population structure and genotyping data of non-model organisms. We found more polymorphic markers in the reference pipeline than in the de novo pipeline. The decreased number of SNPs from the de novo pipeline could be due to the extremely low level of heterozygosity in European bison. We confirmed that all SNPs obtained from the de novo/Bos taurus and Bos taurus reference pipelines were unique and not included in the 800 K BovineHD BeadChip.

BREC: an R package/Shiny app for automatically identifying heterochromatin boundaries and estimating local recombination rates along chromosomes

BMC Bioinformatics ◽

10.1186/s12859-021-04233-1 ◽

2021 ◽

Vol 22 (S6) ◽

Author(s):

Yasmine Mansour ◽

Annie Chateau ◽

Anna-Sophie Fiston-Lavier

Keyword(s):

Data Quality ◽

Data Science ◽

Fruit Fly ◽

R Package ◽

Model Organisms ◽

Data Quality Control ◽

Recombination Rates ◽

Functional Dynamics ◽

Shiny App ◽

User Friendly

Background: Meiotic recombination is a vital biological process playing an essential role in the genome's structural and functional dynamics. Genomes exhibit highly variable recombination profiles along chromosomes, associated with several chromatin states. However, eu-heterochromatin boundaries are neither available nor easily provided for non-model organisms, especially for newly sequenced ones. Hence, we lack the accurate local recombination rates necessary to address evolutionary questions. Results: Here, we propose an automated computational tool, based on the Marey maps method, that identifies heterochromatin boundaries along chromosomes and estimates local recombination rates. Our method, called BREC (heterochromatin Boundaries and RECombination rate estimates), is non-genome-specific, running even on non-model genomes as long as genetic and physical maps are available. BREC is based on pure statistics and is data-driven, implying that good input data quality remains a strong requirement. Therefore, a data pre-processing module (data quality control and cleaning) is provided. Experiments show that BREC handles different marker density and distribution issues. Conclusions: BREC's heterochromatin boundaries have been validated with cytological equivalents experimentally generated on the fruit fly Drosophila melanogaster genome, for which BREC returns congruent corresponding values. Also, BREC's recombination rates have been compared with previously reported estimates. Based on the promising results, we believe our tool has the potential to help bring data science into the service of genome biology and evolution. We introduce BREC within an R-package and a Shiny web-based user-friendly application yielding a fast, easy-to-use, and broadly accessible resource. The BREC R-package is available at the GitHub repository https://github.com/GenomeStructureOrganization.
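
A Marey map plots each marker's genetic position (cM) against its physical position (Mb), and the local recombination rate is the slope of that curve. The sketch below estimates the slope by simple finite differences between adjacent markers. This is a deliberately simplified stand-in and not BREC's estimator, which the abstract does not specify; it is written in Python rather than R purely for illustration, and the function name `local_recombination_rates` is hypothetical.

```python
def local_recombination_rates(markers):
    """Finite-difference slope of a Marey map (hypothetical helper).

    markers: iterable of (physical_position_mb, genetic_position_cm).
    Returns a list of (interval_midpoint_mb, rate_cm_per_mb), one entry
    per pair of adjacent markers with distinct physical positions.
    """
    ordered = sorted(markers)  # order markers along the chromosome
    rates = []
    for (p0, g0), (p1, g1) in zip(ordered, ordered[1:]):
        if p1 == p0:
            continue  # co-located markers give no slope information
        rates.append(((p0 + p1) / 2, (g1 - g0) / (p1 - p0)))
    return rates


if __name__ == "__main__":
    marey_map = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0)]
    for midpoint, rate in local_recombination_rates(marey_map):
        print(f"{midpoint:.2f} Mb: {rate:.2f} cM/Mb")
```

Raw finite differences are sensitive to marker noise and misplaced markers, which is one reason the abstract stresses data quality control before estimation; smoothed fits are generally preferred in practice.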

Alliance of Genome Resources Portal: unified model organism research platform

Nucleic Acids Research ◽

10.1093/nar/gkz813 ◽

2019 ◽

Vol 48 (D1) ◽

pp. D650-D658 ◽

Cited By ~ 36

Author(s):

Julie Agapite ◽

Laurent-Philippe Albou ◽

Suzi Aleksander ◽

Joanna Argasinska ◽

...

Keyword(s):

Gene Ontology ◽

Model Organism ◽

Model Organisms ◽

Data Types ◽

Primary Model ◽

Genomic Studies ◽

Health And Disease ◽

Extensive Body ◽

Access To Data ◽

Model Organism Databases

The Alliance of Genome Resources (Alliance) is a consortium of the major model organism databases and the Gene Ontology that is guided by the vision of facilitating exploration of related genes in human and well-studied model organisms by providing a highly integrated and comprehensive platform that enables researchers to leverage the extensive body of genetic and genomic studies in these organisms. Initiated in 2016, the Alliance is building a central portal (www.alliancegenome.org) for access to data for the primary model organisms along with gene ontology data and human data. All data types represented in the Alliance portal (e.g. genomic data and phenotype descriptions) have common data models and workflows for curation. All data are open and freely available via a variety of mechanisms. Long-term plans for the Alliance project include a focus on coverage of additional model organisms including those without dedicated curation communities, and the inclusion of new data types with a particular focus on providing data and tools for the non-model-organism researcher that support enhanced discovery about human health and disease. Here we review current progress and present immediate plans for this new bioinformatics resource.

Properties of alternative microbial hosts used in synthetic biology: towards the design of a modular chassis

Essays in Biochemistry ◽

10.1042/ebc20160015 ◽

2016 ◽

Vol 60 (4) ◽

pp. 303-313 ◽

Cited By ~ 30

Author(s):

Juhyun Kim ◽

Manuel Salvador ◽

Elizabeth Saunders ◽

Jaime González ◽

Claudio Avignone-Rossa ◽

...

Keyword(s):

Synthetic Biology ◽

Genetic Information ◽

Essential Element ◽

Model Organisms ◽

Biotechnological Applications ◽

Cell Models ◽

Genetic Circuits ◽

Metabolic Diversity ◽

Translational Machinery ◽

Metabolic Fluxes

The chassis is the cellular host used as a recipient of engineered biological systems in synthetic biology. It is required to propagate the genetic information and to express the genes encoded in it. Despite being an essential element for the appropriate function of genetic circuits, the chassis is rarely considered in their design phase. Consequently, the circuits are transferred to model organisms commonly used in the laboratory, such as Escherichia coli, that may be suboptimal for a required function. In this review, we discuss some of the properties desirable in a versatile chassis and summarize some examples of alternative hosts for synthetic biology that are amenable to engineering. These properties include a suitable lifestyle, a robust cell wall, good knowledge of its regulatory network as well as of the interplay of the host components with the exogenous circuits, and the possibility of developing whole-cell models and tuneable metabolic fluxes that could allow a better distribution of cellular resources (metabolites, ATP, nucleotides, amino acids, transcriptional and translational machinery). We highlight Pseudomonas putida, widely used in many different biotechnological applications, as a prominent organism for synthetic biology due to its metabolic diversity, robustness and ease of manipulation.

Transgenic Epigenetics: Using Transgenic Organisms to Examine Epigenetic Phenomena

Genetics Research International ◽

10.1155/2012/689819 ◽

2012 ◽

Vol 2012 ◽

pp. 1-14 ◽

Cited By ~ 2

Author(s):

Lori A. McEachern

Keyword(s):

Molecular Mechanisms ◽

Model Organism ◽

Evolutionary Conservation ◽

Model Organisms ◽

Valuable Insight ◽

Epigenetic Control ◽

Transgenic Organisms ◽

Epigenetic Analysis ◽

Insight Into ◽

Epigenetic Processes

Non-model organisms are generally more difficult and/or time consuming to work with than model organisms. In addition, epigenetic analysis of model organisms is facilitated by well-established protocols, and commercially-available reagents and kits that may not be available for, or previously tested on, non-model organisms. Given the evolutionary conservation and widespread nature of many epigenetic mechanisms, a powerful method to analyze epigenetic phenomena from non-model organisms would be to use transgenic model organisms containing an epigenetic region of interest from the non-model. Interestingly, while transgenic Drosophila and mice have provided significant insight into the molecular mechanisms and evolutionary conservation of the epigenetic processes that target epigenetic control regions in other model organisms, this method has so far been under-exploited for non-model organism epigenetic analysis. This paper details several experiments that have examined the epigenetic processes of genomic imprinting and paramutation, by transferring an epigenetic control region from one model organism to another. These cross-species experiments demonstrate that valuable insight into both the molecular mechanisms and evolutionary conservation of epigenetic processes may be obtained via transgenic experiments, which can then be used to guide further investigations and experiments in the species of interest.

PathScore: a web tool for identifying altered pathways in cancer data

10.1101/067090 ◽

2016 ◽

Cited By ~ 2

Author(s):

Stephen G. Gaffney ◽

Jeffrey P. Townsend

Keyword(s):

Web Application ◽

Somatic Mutations ◽

Supplementary Information ◽

Web Tool ◽

Cancer Data ◽

Link Type ◽

Novel Approach ◽

Supplementary Material ◽

User Friendly ◽

Pathway Effect

Summary: PathScore quantifies the level of enrichment of somatic mutations within curated pathways, applying a novel approach that identifies pathways enriched across patients. The application provides several user-friendly, interactive graphic interfaces for data exploration, including tools for comparing pathway effect sizes, significance, gene-set overlap and enrichment differences between projects. Availability and Implementation: Web application available at pathscore.publichealth.yale.edu. Site implemented in Python and MySQL, with all major browsers supported. Source code available at github.com/sggaffney/pathscore with a GPLv3 license. Supplementary Information: Additional documentation can be found at http://pathscore.publichealth.yale.edu/faq.
