OPTIMIZER: a web server for optimizing the codon usage of DNA sequences

P. Puigbo; E. Guzman; A. Romeu; S. Garcia-Vallve

doi:10.1093/nar/gkm219

Codon harmonization reduces amino acid misincorporation in bacterially expressed P. falciparum proteins and improves their immunogenicity

AMB Express ◽

10.1186/s13568-019-0890-6 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Neeraja Punde ◽

Jennifer Kooken ◽

Dagmar Leary ◽

Patricia M. Legler ◽

Evelina Angov

Keyword(s):

Protein Structure ◽

Amino Acid ◽

Codon Usage ◽

Dna Sequences ◽

Structural Integrity ◽

Host Cells ◽

Loss Of Function ◽

Species Specific ◽

And Function ◽

The Impact

Abstract: Codon usage frequency influences protein structure and function. The frequency with which codons are used potentially impacts primary, secondary and tertiary protein structure. Poor expression, loss of function, insolubility, or truncation can result from species-specific differences in codon usage. “Codon harmonization” more closely aligns native codon usage frequencies with those of the expression host, particularly within putative inter-domain segments where slower rates of translation may play a role in protein folding. Heterologous expression of Plasmodium falciparum genes in Escherichia coli has been a challenge due to their AT-rich codon bias and highly repetitive DNA sequences. Here, codon harmonization was applied to the malarial antigen CelTOS (Cell-traversal protein for ookinetes and sporozoites). CelTOS is a highly conserved P. falciparum protein involved in cellular traversal through mosquito and vertebrate host cells. It reversibly refolds after thermal denaturation, making it a desirable malarial vaccine candidate. Protein expressed in E. coli from a codon-harmonized sequence of P. falciparum CelTOS (CH-PfCelTOS) was compared with protein expressed from the native codon sequence (N-PfCelTOS) to assess the impact of codon usage on protein expression levels, solubility, yield, stability, structural integrity, recognition with CelTOS-specific mAbs and immunogenicity in mice. While the translated proteins were expected to be identical, the translated products produced from the codon-harmonized sequence differed in helical content and showed a smaller distribution of polypeptides in mass spectra, indicating lower heterogeneity of the codon-harmonized version and fewer amino acid misincorporations. Hydrophobic-to-hydrophobic amino acid substitutions were observed more commonly than any other. CH-PfCelTOS induced significantly higher antibody levels compared with N-PfCelTOS; however, no significant differences in either IFN-γ or IL-4 cellular responses were detected between the two antigens.


RepEx: A web server to extract sequence repeats from protein and DNA sequences

Computational Biology and Chemistry ◽

10.1016/j.compbiolchem.2018.12.015 ◽

2019 ◽

Vol 78 ◽

pp. 424-430 ◽

Cited By ~ 1

Author(s):

Daliah Michael ◽

M. Gurusaran ◽

R. Santhosh ◽

Md. Khaja Hussain ◽

S.N. Satheesh ◽

...

Keyword(s):

Dna Sequences ◽

Web Server


A novel framework for evaluating the performance of codon usage bias metrics

Journal of The Royal Society Interface ◽

10.1098/rsif.2017.0667 ◽

2018 ◽

Vol 15 (138) ◽

pp. 20170667 ◽

Cited By ~ 3

Author(s):

Sophia S. Liu ◽

Adam J. Hockenberry ◽

Michael C. Jewett ◽

Luís A. N. Amaral

Keyword(s):

Codon Usage ◽

Dna Sequences ◽

Codon Usage Bias ◽

False Negative ◽

Gc Content ◽

Sequence Length ◽

Protein Coding ◽

Cellular Processes ◽

Negative Findings ◽

Measured Effect

The unequal utilization of synonymous codons affects numerous cellular processes including translation rates, protein folding and mRNA degradation. In order to understand the biological impact of variable codon usage bias (CUB) between genes and genomes, it is crucial to be able to accurately measure CUB for a given sequence. A large number of metrics have been developed for this purpose, but there is currently no way of systematically testing the accuracy of individual metrics or knowing whether metrics provide consistent results. This lack of standardization can result in false-positive and false-negative findings if underpowered or inaccurate metrics are applied as tools for discovery. Here, we show that the choice of CUB metric impacts both the significance and measured effect sizes in numerous empirical datasets, raising questions about the generality of findings in published research. To bring about standardization, we developed a novel method to create synthetic protein-coding DNA sequences according to different models of codon usage. We use these benchmark sequences to identify the most accurate and robust metrics with regard to sequence length, GC content and amino acid heterogeneity. Finally, we show how our benchmark can aid the development of new metrics by providing feedback on its performance compared to the state of the art.
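To make the idea of "measuring CUB for a given sequence" concrete, the following is a minimal sketch of one simple, widely used quantity, relative synonymous codon usage (RSCU). The abstract does not name the metrics it benchmarks, so the choice of RSCU is an assumption made here for illustration, and the names `CODON_TABLE` and `rscu` are hypothetical. The sketch assumes the standard genetic code and an in-frame coding sequence.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq):
    """Return RSCU per codon for every amino acid observed in seq.

    RSCU = observed codon count / mean count over that amino acid's
    synonymous codons. Stop codons and unobserved amino acids are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        mean = total / len(family)
        for c in family:
            values[c] = counts[c] / mean
    return values


if __name__ == "__main__":
    # Four leucine codons: three CTG and one TTA.
    for codon, value in sorted(rscu("CTGCTGCTGTTA").items()):
        print(codon, round(value, 2))
```

An RSCU of 1 means a codon is used as often as expected under uniform synonymous usage; values above 1 indicate over-representation. Like many simple measures, it becomes noisy for short sequences, which is the kind of sensitivity the abstract's benchmark is designed to expose.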


FOLDNA, a Web Server for Self-Assembled DNA Nanostructure Autoscaffolds and Autostaples

Journal of Nanotechnology ◽

10.1155/2012/453953 ◽

2012 ◽

Vol 2012 ◽

pp. 1-5 ◽

Cited By ~ 3

Author(s):

Chensheng Zhou ◽

Heng Luo ◽

Xiaolu Feng ◽

Xingwang Li ◽

Jie Zhu ◽

...

Keyword(s):

Drug Delivery ◽

Dna Sequences ◽

Self Assembly ◽

Web Server ◽

Automatic Design ◽

Dna Nanostructures ◽

Complementary Dna ◽

Dna Nanostructure ◽

Comprehensive Information ◽

Self Assembled

DNA self-assembly is a nanotechnology that folds DNA into desired shapes. Self-assembled DNA nanostructures, also known as origami, are increasingly valuable in nanomaterial and biosensing applications. Two ways to use DNA nanostructures in medicine are to form nanoarrays, and to work as vehicles in drug delivery. The DNA nanostructures perform well as a biomaterial in these areas because they have spatially addressable and size controllable properties. However, manually designing complementary DNA sequences for self-assembly is a technically demanding and time consuming task, which makes it advantageous for computers to do this job instead. We have developed a web server, FOLDNA, which can automatically design 2D self-assembled DNA nanostructures according to custom pictures and scaffold sequences provided by the users. It is the first web server to provide an entirely automatic design of self-assembled DNA nanostructure, and it takes merely a second to generate comprehensive information for molecular experiments including: scaffold DNA pathways, staple DNA directions, and staple DNA sequences. This program could save as much as several hours in the designing step for each DNA nanostructure. We randomly selected some shapes and corresponding outputs from our server and validated its performance in molecular experiments.


Codon usage patterns distort phylogenies from or of DNA sequences

American Journal of Botany ◽

10.3732/ajb.92.8.1221 ◽

2005 ◽

Vol 92 (8) ◽

pp. 1221-1233 ◽

Cited By ~ 13

Author(s):

M. L. Christianson

Keyword(s):

Codon Usage ◽

Dna Sequences ◽

Usage Patterns


Codon-Optimized Fluorescent Proteins Designed for Expression in Low-GC Gram-Positive Bacteria

Applied and Environmental Microbiology ◽

10.1128/aem.02066-08 ◽

2009 ◽

Vol 75 (7) ◽

pp. 2099-2110 ◽

Cited By ~ 46

Author(s):

Inka Sastalla ◽

Kannie Chim ◽

Gordon Y. C. Cheung ◽

Andrei P. Pomerantsev ◽

Stephen H. Leppla

Keyword(s):

Codon Usage ◽

Dna Sequences ◽

Fluorescent Protein ◽

Protective Antigen ◽

Fluorescent Proteins ◽

Yellow Fluorescent Protein ◽

Gc Content ◽

Virulence Plasmid ◽

Positive Bacterium ◽

Gram Positive

ABSTRACT Fluorescent proteins have wide applications in biology. However, not all of these proteins are properly expressed in bacteria, especially if the codon usage and genomic GC content of the host organism are not ideal for high expression. In this study, we analyzed the DNA sequences of multiple fluorescent protein genes with respect to codons and GC content and compared them to a low-GC gram-positive bacterium, Bacillus anthracis. We found high discrepancies for cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and the photoactivatable green fluorescent protein (PAGFP), but not GFP, with regard to GC content and codon usage. Concomitantly, when the proteins were expressed in B. anthracis, CFP- and YFP-derived fluorescence was undetectable microscopically, a phenomenon caused not by lack of gene transcription or degradation of the proteins but by lack of protein expression. To improve expression in bacteria with low genomic GC contents, we synthesized a codon-optimized gfp and constructed optimized photoactivatable pagfp, cfp, and yfp, which were in contrast to nonoptimized genes highly expressed in B. anthracis and in another low-GC gram-positive bacterium, Staphylococcus aureus. Using optimized GFP as a reporter, we were able to monitor the activity of the protective antigen promoter of B. anthracis and confirm its dependence on bicarbonate and regulators present on virulence plasmid pXO1.


AmpliconDesign – An interactive web server for the design of high-throughput targeted DNA methylation assays

10.1101/2020.05.23.043448 ◽

2020 ◽

Author(s):

Maximilian Schönung ◽

Jana Hess ◽

Pascal Bawidamann ◽

Sina Stäble ◽

Joschka Hey ◽

...

Keyword(s):

Dna Methylation ◽

Quality Control ◽

High Throughput ◽

Dna Sequences ◽

Web Server ◽

Dna Amplification ◽

Pcr Primers ◽

Primer Design ◽

Link Type ◽

Targeted Analysis

Abstract: Targeted analysis of DNA methylation patterns based on bisulfite-treated genomic DNA (BT-DNA) is considered a gold standard for epigenetic biomarker development. Existing software tools facilitate primer design, primer quality control or visualization of primer localization. However, high-throughput design of primers for BT-DNA amplification is hampered by limits in throughput and functionality of existing tools, requiring users to repeatedly perform specific tasks manually. Consequently, the design of PCR primers for BT-DNA remains a tedious and time-consuming process. To bridge this gap, we developed AmpliconDesign, a webserver providing a scalable and user-friendly platform for the design and analysis of targeted DNA methylation studies based on BT-DNA, e.g. deep amplicon bisulfite sequencing (ampBS-seq), EpiTYPER MassArray, or pyrosequencing. Core functionality of the web server includes high-throughput primer design and binding site validation based on in silico bisulfite-converted DNA sequences, prediction of fragmentation patterns for EpiTYPER MassArray, an interactive quality control as well as a streamlined analysis workflow for ampBS-seq. Availability and Implementation: The AmpliconDesign webserver is freely available online at: https://amplicondesign.dkfz.de/. AmpliconDesign has been implemented using the R Shiny framework (Chang et al., 2018). The source code is publicly available under the GNU General Public License v3.0 (https://github.com/MaxSchoenung/AmpliconDesign). Contact: Daniel B. Lipka ([email protected]) & Maximilian Schönung ([email protected])
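The "in silico bisulfite-converted DNA sequences" mentioned in this abstract can be illustrated with a minimal sketch. It is not AmpliconDesign's implementation (the tool is written in R Shiny); the function name `bisulfite_convert` and the `keep_cpg` flag are hypothetical. The sketch assumes the usual simplification that unmethylated cytosines read as T after conversion, and optionally treats every CpG cytosine as methylated and therefore protected. Only the given strand is converted.

```python
def bisulfite_convert(seq, keep_cpg=True):
    """Simulate bisulfite conversion of one DNA strand.

    Every C becomes T, except (when keep_cpg is True) a C directly
    followed by G, which is assumed methylated and left unchanged.
    """
    seq = seq.upper()
    converted = []
    for i, base in enumerate(seq):
        if base == "C":
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            converted.append("C" if keep_cpg and in_cpg else "T")
        else:
            converted.append(base)
    return "".join(converted)


if __name__ == "__main__":
    template = "ACGTCCA"
    print(bisulfite_convert(template))                  # CpG cytosines kept
    print(bisulfite_convert(template, keep_cpg=False))  # fully converted
```

Comparing the two outputs shows which positions a primer would have to avoid or degenerate if methylation status is unknown.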


Universal Features for the Classification of Coding and Non-coding DNA Sequences

Bioinformatics and Biology Insights ◽

10.4137/bbi.s2236 ◽

2009 ◽

Vol 3 ◽

pp. BBI.S2236 ◽

Cited By ~ 5

Author(s):

Nicolas Carels ◽

Ramon Vidal ◽

Diego Frías

Keyword(s):

Codon Usage ◽

Success Rate ◽

Dna Sequences ◽

Stop Codon ◽

Chemical Properties ◽

Natural Consequence ◽

Coding Sequences ◽

Physico Chemical ◽

Simple Features

In this report, we revisited simple features that allow the classification of coding sequences (CDS) from non-coding DNA. The spectrum of codon usage of our sequence sample is large and suggests that these features are universal. The features that we investigated combine (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of Cytosine, Guanine, Adenine probabilities in 1st, 2nd, 3rd position of triplets, respectively, (iv) the product of G and C probabilities in 1st and 2nd position of triplets. These features are a natural consequence of the physico-chemical properties of proteins and their combination is successful in classifying CDS and non-coding DNA (introns) with a success rate >95% above 350 bp. The coding strand and coding frame are implicitly deduced when the sequences are classified as coding.
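The four features listed in this abstract can be sketched for a single reading frame as follows. This is one reading of the abstract's wording, not the authors' code: the function name `triplet_features`, the dictionary keys, and the use of an in-frame stop-codon fraction as a stand-in for "stop codon distribution" are all assumptions made for illustration.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def triplet_features(seq):
    """Compute simple position-specific triplet features for frame 0."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("need at least one complete triplet")
    n_triplets = usable // 3

    # Nucleotide probabilities at the 1st, 2nd and 3rd triplet positions.
    counts = [Counter(seq[pos:usable:3]) for pos in range(3)]
    prob = [{b: c[b] / n_triplets for b in "ACGT"} for c in counts]

    purine = [p["A"] + p["G"] for p in prob]
    stops = sum(seq[i:i + 3] in STOP_CODONS for i in range(0, usable, 3))

    return {
        # (i) fraction of in-frame triplets that are stop codons
        "stop_fraction": stops / n_triplets,
        # (ii) product of purine probabilities over the three positions
        "purine_product": purine[0] * purine[1] * purine[2],
        # (iii) P(C at 1st) * P(G at 2nd) * P(A at 3rd)
        "cga_product": prob[0]["C"] * prob[1]["G"] * prob[2]["A"],
        # (iv) P(G at 1st) * P(C at 2nd)
        "gc_product": prob[0]["G"] * prob[1]["C"],
    }


if __name__ == "__main__":
    print(triplet_features("ATGGCCGCAGGTTAA"))
```

To use such features for frame and strand detection, one would evaluate all three frames on both strands and compare the resulting values; the decision rule itself is not given in the abstract and is left out here.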


Similarity analysis of DNA sequences based on codon usage

Chemical Physics Letters ◽

10.1016/j.cplett.2008.05.039 ◽

2008 ◽

Vol 459 (1-6) ◽

pp. 172-174 ◽

Cited By ~ 7

Author(s):

Chun Li ◽

Xiaoqing Yu ◽

Nadia Helal

Keyword(s):

Codon Usage ◽

Dna Sequences ◽

Similarity Analysis


BiasAway: command-line and web server to generate nucleotide composition-matched DNA background sequences

Bioinformatics ◽

10.1093/bioinformatics/btaa928 ◽

2020 ◽

Author(s):

Aziz Khan ◽

Rafael Riudavets Puig ◽

Paul Boddie ◽

Anthony Mathelier

Keyword(s):

Dna Sequences ◽

Source Code ◽

Web Server ◽

Enrichment Analysis ◽

Nucleotide Composition ◽

Supplementary Information ◽

Command Line ◽

Sequence Composition ◽

Command Line Tool ◽

Gc Bias

Abstract. Motivation: Accurate motif enrichment analyses depend on the choice of background DNA sequences used, which should ideally match the sequence composition of the foreground sequences. It is important to avoid false positive enrichment due to sequence biases in the genome, such as GC-bias. Therefore, relying on an appropriate set of background sequences is crucial for enrichment analysis. Results: We developed BiasAway, a command line tool and its dedicated easy-to-use web server to generate synthetic sequences matching any k-mer nucleotide composition or select genomic DNA sequences matching the mononucleotide composition of the foreground sequences through four different models. For genomic sequences, we provide precomputed partitions of genomes from nine species with five different bin sizes to generate appropriate genomic background sequences. Availability and implementation: BiasAway source code is freely available from Bitbucket (https://bitbucket.org/CBGR/biasaway) and can be easily installed using bioconda or pip. The web server is available at https://biasaway.uio.no and a detailed documentation is available at https://biasaway.readthedocs.io. Supplementary information: Supplementary data are available at Bioinformatics online.
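The simplest case of a composition-matched synthetic background, the mononucleotide (k = 1) case, can be sketched by shuffling each foreground sequence. This is an illustration of the general idea only and does not use or reproduce BiasAway's code or command-line interface; the function name `shuffled_background` and its `seed` parameter are hypothetical. Preserving higher-order k-mer composition requires a different algorithm and is not attempted here.

```python
import random
from collections import Counter


def shuffled_background(seq, seed=None):
    """Return a random permutation of seq.

    The result has exactly the same length and mononucleotide counts
    as the input, so GC content is preserved by construction.
    """
    rng = random.Random(seed)  # local generator, leaves global state alone
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    foreground = "GGGCGCGCATATATGCGC"
    background = shuffled_background(foreground, seed=1)
    print(background)
    print(Counter(foreground) == Counter(background))
```

Passing a fixed `seed` makes the background reproducible between runs, which helps when an enrichment result has to be re-derived later.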
