CodonGenie: optimised ambiguous codon design tools

10.7287/peerj.preprints.2797v1 ◽

2017 ◽

Author(s):

Neil Swainston ◽

Andrew Currin ◽

Lucy Green ◽

Rainer Breitling ◽

Philip J Day ◽

...

Keyword(s):

Amino Acids ◽

Codon Usage ◽

Directed Evolution ◽

Web Application ◽

Design Tools ◽

Host Organism ◽

Stop Codons ◽

DNA Library ◽

Protein Mutagenesis

CodonGenie, freely available from http://codon.synbiochem.co.uk, is a simple web application for designing ambiguous codons to support protein mutagenesis applications. Ambiguous codons are derived from specific heterogeneous nucleotide mixtures, which create sequence degeneracy when synthesised in a DNA library. In directed evolution studies, such codons are carefully selected to encode multiple amino acids. For example, the codon NTN, where the code N denotes a mixture of all four nucleotides, will encode a mixture of phenylalanine, leucine, isoleucine, methionine and valine. Given a user-defined target collection of amino acids matched to an intended host organism, CodonGenie designs and analyses all ambiguous codons that encode the required amino acids. The codons are ranked according to their efficiency in encoding the required amino acids while minimising the inclusion of additional amino acids and stop codons. Organism-specific codon usage is also considered.
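The NTN example can be checked with a short Python sketch that expands an ambiguous codon into its concrete codons and translates each with the standard genetic code. This is only an illustration of the idea: the function names (`expand`, `encoded_amino_acids`, `on_target_fraction`) are hypothetical, the scoring is a simple fraction of on-target codons, and neither should be read as CodonGenie's actual ranking method, which also takes organism-specific codon usage into account.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expand(ambiguous_codon):
    """Return every concrete codon represented by an ambiguous codon."""
    choices = (IUPAC[base] for base in ambiguous_codon.upper())
    return ["".join(codon) for codon in product(*choices)]


def encoded_amino_acids(ambiguous_codon):
    """Count how many concrete codons encode each amino acid ('*' = stop)."""
    counts = {}
    for codon in expand(ambiguous_codon):
        aa = CODON_TABLE[codon]
        counts[aa] = counts.get(aa, 0) + 1
    return counts


def on_target_fraction(ambiguous_codon, targets):
    """Hypothetical score: fraction of concrete codons encoding a target."""
    counts = encoded_amino_acids(ambiguous_codon)
    total = sum(counts.values())
    return sum(n for aa, n in counts.items() if aa in targets) / total


if __name__ == "__main__":
    print(encoded_amino_acids("NTN"))
    print(on_target_fraction("NTN", set("FLIMV")))
```

Under these assumptions, NTN expands to 16 codons covering exactly phenylalanine (F), leucine (L), isoleucine (I), methionine (M) and valine (V), with no stop codons.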

AB0210 ACREULAR: AN R PACKAGE FOR THE CALCULATION AND VISUALISATION OF ACR/EULAR RELATED RHEUMATOID ARTHRITIS MEASURES

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-eular.2326 ◽

2020 ◽

Vol 79 (Suppl 1) ◽

pp. 1405.1-1406

Author(s):

F. Morton ◽

J. Nijjar ◽

C. Goodyear ◽

D. Porter

Keyword(s):

Rheumatoid Arthritis ◽

Functional Status ◽

Rheumatic Diseases ◽

Web Application ◽

R Package ◽

Diagnostic Classification ◽

Microsoft Excel ◽

Link Type ◽

Large Joint ◽

Programming Skills

Background: The American College of Rheumatology (ACR) and the European League Against Rheumatism (EULAR) have, individually and collaboratively, produced or recommended diagnostic classification, response and functional status criteria for a range of rheumatic diseases. While a number of resources are available for performing these calculations individually, we are not aware of any tools that easily calculate these values for whole patient cohorts.

Objectives: To develop a new software tool that enables data analysts, as well as researchers and clinicians without programming skills, to calculate ACR/EULAR related measures for a number of different rheumatic diseases.

Methods: Criteria developed by ACR and/or EULAR and approved for diagnostic classification, measurement of treatment response and functional status in patients with rheumatoid arthritis were identified. Methods were created using the R programming language to allow the calculation of these criteria, and these were incorporated into an R package. Additionally, an R/Shiny web application was developed to enable the calculations to be performed via a web browser using data presented as CSV or Microsoft Excel files.

Results: acreular is a freely available, open source R package (downloadable from https://github.com/fragla/acreular) that facilitates the calculation of ACR/EULAR related RA measures for whole patient cohorts. Measures, such as the ACR/EULAR (2010) RA classification criteria, can be determined using precalculated values for each component (small/large joint counts, duration in days, normal/abnormal acute-phase reactants, negative/low/high serology classification) or by providing "raw" data (small/large joint counts, onset/assessment dates, ESR/CRP and CCP/RF laboratory values). Other measures, including EULAR response and ACR20/50/70 response, can also be calculated by providing the required information. The accompanying web application is included as part of the R package but is also externally hosted at https://fragla.shinyapps.io/shiny-acreular. This enables researchers and clinicians without any programming skills to easily calculate these measures by uploading either a Microsoft Excel or CSV file containing their data. Furthermore, the web application allows the incorporation of additional study covariates, enabling the automatic calculation of multigroup comparative statistics and the visualisation of the data through a number of different plots, both of which can be downloaded.

Figure 1. The Data tab following the upload of data. Criteria are calculated by selecting the appropriate checkbox.

Figure 2. A density plot of DAS28 scores grouped by ACR/EULAR 2010 RA classification. Statistical analysis has been performed and shows a significant difference in DAS28 score between the two groups.

Conclusion: The acreular R package facilitates the easy calculation of ACR/EULAR RA related disease measures for whole patient cohorts. Calculations can be performed either from within R or by using the accompanying web application, which also enables the graphical visualisation of data and the calculation of comparative statistics. We plan to further develop the package by adding additional RA related criteria and by adding ACR/EULAR related measures for other rheumatic disorders.

Disclosure of Interests: Fraser Morton: None declared, Jagtar Nijjar Shareholder of: GlaxoSmithKline plc, Consultant of: Janssen Pharmaceuticals UK, Employee of: GlaxoSmithKline plc, Paid instructor for: Janssen Pharmaceuticals UK, Speakers bureau: Janssen Pharmaceuticals UK, AbbVie, Carl Goodyear: None declared, Duncan Porter: None declared
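To make the "precalculated values for each component" route concrete, the following Python sketch scores the ACR/EULAR 2010 RA classification from the four components named in the abstract. It reflects a reading of the published 2010 criteria (joint involvement 0 to 5 points, serology 0 to 3, acute-phase reactants 0 to 1, duration 0 to 1, with a total of 6 or more classifying as definite RA). It is not the acreular API: the package itself is written in R, and every function name and argument below is hypothetical. The sketch also ignores the criteria's eligibility conditions and is not intended for clinical use.

```python
# Points for the serology component (RF / ACPA classification).
SEROLOGY_POINTS = {"negative": 0, "low": 2, "high": 3}


def joint_points(small, large):
    """Joint-involvement points from small and large joint counts."""
    if small + large > 10 and small >= 1:
        return 5  # more than 10 joints, at least one of them small
    if small >= 4:
        return 3  # 4-10 small joints
    if small >= 1:
        return 2  # 1-3 small joints
    if large >= 2:
        return 1  # 2-10 large joints
    return 0      # at most one large joint


def ra_2010_score(small, large, serology, apr_abnormal, duration_days):
    """Total score (0-10) from precalculated component values."""
    return (
        joint_points(small, large)
        + SEROLOGY_POINTS[serology]
        + (1 if apr_abnormal else 0)           # abnormal CRP or ESR
        + (1 if duration_days >= 42 else 0)    # symptoms for 6 weeks or more
    )


def is_definite_ra(score):
    """A score of 6 or more out of 10 classifies as definite RA."""
    return score >= 6


if __name__ == "__main__":
    # A small, made-up cohort: (small, large, serology, APR abnormal, days).
    cohort = [
        (5, 0, "high", True, 60),
        (0, 3, "low", True, 41),
    ]
    for patient in cohort:
        score = ra_2010_score(*patient)
        print(patient, score, is_definite_ra(score))
```

Looping over rows in this way mirrors the cohort-level calculation the abstract describes, although the real package reads such rows from data frames or uploaded CSV and Excel files.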

The effect of expression levels on codon usage in Plasmodium falciparum

Parasitology ◽

10.1017/s0031182003004517 ◽

2004 ◽

Vol 128 (3) ◽

pp. 245-251 ◽

Cited By ~ 26

Author(s):

L. PEIXOTO ◽

V. FERNÁNDEZ ◽

H. MUSTO

Keyword(s):

Amino Acids ◽

Plasmodium Falciparum ◽

Natural Selection ◽

Codon Usage ◽

Complete Sequence ◽

Expression Data ◽

Expression Levels ◽

Synonymous Codons ◽

Translational Selection ◽

Highly Expressed Genes

The usage of alternative synonymous codons in the completely sequenced, extremely A+T-rich parasite Plasmodium falciparum was studied. Confirming previous studies obtained with less than 3% of the total genes recently described, we found that A- and U-ending triplets predominate but translational selection increases the frequency of a subset of codons in highly expressed genes. However, some new results come from the analysis of the complete sequence. First, there is more variation in GC3 than previously described; second, the effect of natural selection acting at the level of translation has been analysed with real expression data at 4 different stages; and third, we found that highly expressed proteins increment the frequency of energetically less expensive amino acids. The implications of these results are discussed.
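GC3, the G+C content at third codon positions, is the quantity this abstract uses to describe variation in codon usage. A minimal Python sketch of how it and per-codon frequencies might be computed for a single coding sequence follows. The function names are hypothetical, and the sketch makes a simplifying assumption: every codon is counted, whereas published analyses often exclude methionine, tryptophan and stop codons.

```python
from collections import Counter


def codons(cds):
    """Split a coding sequence into in-frame triplets (trailing bases dropped)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def gc3(cds):
    """Fraction of codons whose third position is G or C."""
    triplets = codons(cds)
    if not triplets:
        return 0.0
    return sum(codon[2] in "GC" for codon in triplets) / len(triplets)


def codon_frequencies(cds):
    """Relative frequency of each codon observed in the sequence."""
    counts = Counter(codons(cds))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    example = "ATGAAATTTGGC"  # codons ATG, AAA, TTT, GGC
    print(gc3(example))
    print(codon_frequencies(example))
```

In an A+T-rich genome such as that of P. falciparum, one would expect low GC3 values overall, with the comparison between highly and lowly expressed genes carrying the signal of translational selection.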

RiboA: a web application to identify ribosome A-site locations in ribosome profiling data

BMC Bioinformatics ◽

10.1186/s12859-021-04068-w ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Danying Shao ◽

Nabeel Ahmed ◽

Nishant Soni ◽

Edward P. O’Brien

Keyword(s):

Integer Programming ◽

Web Application ◽

Stop Codon ◽

Ribosome Profiling ◽

Programming Method ◽

Analysis Tool ◽

Site Location ◽

Link Type ◽

Wide Range ◽

A Site

Background: Translation is a fundamental process in gene expression. Ribosome profiling is a method that enables the study of transcriptome-wide translation. A fundamental technical challenge in analyzing Ribo-Seq data is identifying the A-site location on ribosome-protected mRNA fragments. Identification of the A-site is essential, as it is at this location on the ribosome that a codon is translated into an amino acid. Incorrect assignment of a read to the A-site can lead to a lower signal-to-noise ratio and loss of the correlations necessary to understand the molecular factors influencing translation. Therefore, an easy-to-use and accurate analysis tool is needed to identify the A-site locations.

Results: We present RiboA, a web application that identifies the most accurate A-site location on a ribosome-protected mRNA fragment and generates the A-site read density profiles. It uses an Integer Programming method that reflects the biological fact that the A-site of actively translating ribosomes is generally located between the second codon and stop codon of a transcript, and utilizes a wide range of mRNA fragment sizes in and around the coding sequence (CDS). The web application is containerized with Docker, and it can be easily ported across platforms.

Conclusions: The Integer Programming method that RiboA utilizes is the most accurate in identifying the A-site on Ribo-Seq mRNA fragments compared to other methods. RiboA makes it easier for the community to use this method via a user-friendly and portable web application. In addition, RiboA supports reproducible analyses by tracking all the input datasets and parameters, and it provides enhanced visualization to facilitate scientific exploration. RiboA is available as a web service at https://a-site.vmhost.psu.edu/. The code is publicly available at https://github.com/obrien-lab/aip_web_docker under the MIT license.

Determination of amino acids misincorporation in recombinant protein by mass spectrometry

Bangladesh Journal of Animal Science ◽

10.3329/bjas.v42i1.15760 ◽

2013 ◽

Vol 42 (1) ◽

pp. 11-19 ◽

Cited By ~ 1

Author(s):

MZ Alam ◽

L Regioneiri ◽

MAS Santos

Keyword(s):

Mass Spectrometry ◽

Amino Acids ◽

Recombinant Protein ◽

Gene Fusion ◽

Aminoglycoside Antibiotic ◽

Protein Functionality ◽

lacZ Gene ◽

Host Organism ◽

E. coli

The synthesis of protein according to the genetic code of a gene determines the basis of life, and a stable proteome is necessary for cell homeostasis. However, errors occur naturally during translation of a protein from its mRNA, at rates ranging from 10⁻³ to 10⁻⁴ per codon. These errors are more frequent in recombinant proteins overexpressed in heterologous hosts and affect protein functionality. The increasing amount of nonfunctional protein is often related to mistranslation of a gene under stress. In the present study, Saccharomyces cerevisiae was used as a host organism to overexpress an E. coli lacZ gene fusion with GST in order to quantify misincorporation of amino acids in the GST-β-galactosidase recombinant protein. The yeast was treated with various stressors, such as ethanol, chromium (CrO3) and the aminoglycoside antibiotic geneticin (G418), to induce protein aggregation. The misincorporation of amino acids was studied in soluble protein fractions by mass spectrometry to determine how much misincorporation occurs. We found that under experimental stress conditions the misincorporation of amino acids ranges from 5.6 × 10⁻³ to 8 × 10⁻³, which is 60-80 fold higher than the reported level. DOI: http://dx.doi.org/10.3329/bjas.v42i1.15760 Bang. J. Anim. Sci. 2013. 42 (1): 11-19

Identification of novel translated small ORFs in Escherichia coli using complementary ribosome profiling approaches

Journal of Bacteriology ◽

10.1128/jb.00352-21 ◽

2021 ◽

Author(s):

Anne Stringer ◽

Carol Smith ◽

Kyle Mangano ◽

Joseph T. Wade

Keyword(s):

Escherichia Coli ◽

Amino Acids ◽

High Sensitivity ◽

Purifying Selection ◽

Ribosome Profiling ◽

Small Subset ◽

Stop Codons ◽

Small Proteins ◽

Domains Of Life ◽

Short ORFs

Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Ribosome profiling has been used to infer the existence of small proteins by detecting the translation of the corresponding open reading frames (ORFs). Detection of translated short ORFs by ribosome profiling can be improved by treating cells with drugs that stall ribosomes at specific codons. Here, we combine the analysis of ribosome profiling data for Escherichia coli cells treated with antibiotics that stall ribosomes at either start or stop codons. Thus, we identify ribosome-occupied start and stop codons with high sensitivity for ∼400 novel putative ORFs. The newly discovered ORFs are mostly short, with 365 encoding proteins of <51 amino acids. We validate translation of several selected short ORFs, and show that many likely encode unstable proteins. Moreover, we present evidence that most of the newly identified short ORFs are not under purifying selection, suggesting they do not impact cell fitness, although a small subset have the hallmarks of functional ORFs.

IMPORTANCE: Small proteins of <51 amino acids are abundant across all domains of life but are often overlooked because their small size makes them difficult to predict computationally, and they are refractory to standard proteomic approaches. Recent studies have discovered small proteins by mapping the location of translating ribosomes on RNA using a technique known as ribosome profiling. Discovery of translated sORFs using ribosome profiling can be improved by treating cells with drugs that trap initiating ribosomes. Here, we show that combining these data with equivalent data for cells treated with a drug that stalls terminating ribosomes facilitates the discovery of small proteins. We use this approach to discover 365 putative genes that encode small proteins in Escherichia coli.

LandScape: a web application for interactive genomic summary visualization

10.1101/866087 ◽

2019 ◽

Author(s):

Wenlong Jia ◽

Hechen Li ◽

Shiying Li ◽

Shuaicheng Li

Keyword(s):

Genetic Information ◽

Web Application ◽

Genomic Research ◽

File Format ◽

Data Types ◽

Web Based ◽

Link Type ◽

Level Data ◽

Real Time Visualization ◽

Information Landscape

Summary: Visualizing integrated-level data from genomic research remains a challenge, as it requires sufficient coding skills and experience. Here, we present LandScapeoviz, a web-based application for interactive and real-time visualization of summarized genetic information. LandScape utilizes a well-designed file format that is capable of handling various data types, and offers a series of built-in functions to customize the appearance, explore results, and export high-quality diagrams that are available for publication.

Availability and implementation: LandScape is deployed at bio.oviz.org/demo-project/analyses/landscape for online use. Documentation and demo data are freely available on this website and GitHub (github.com/Nobel-Justin/Oviz-Bio-demo).

Contact: [email protected]

PathScore: a web tool for identifying altered pathways in cancer data

10.1101/067090 ◽

2016 ◽

Cited By ~ 2

Author(s):

Stephen G. Gaffney ◽

Jeffrey P. Townsend

Keyword(s):

Web Application ◽

Somatic Mutations ◽

Supplementary Information ◽

Web Tool ◽

Cancer Data ◽

Link Type ◽

Novel Approach ◽

Supplementary Material ◽

User Friendly ◽

Pathway Effect

Summary: PathScore quantifies the level of enrichment of somatic mutations within curated pathways, applying a novel approach that identifies pathways enriched across patients. The application provides several user-friendly, interactive graphic interfaces for data exploration, including tools for comparing pathway effect sizes, significance, gene-set overlap and enrichment differences between projects.

Availability and Implementation: Web application available at pathscore.publichealth.yale.edu. Site implemented in Python and MySQL, with all major browsers supported. Source code available at github.com/sggaffney/pathscore with a GPLv3 license.

Contact: [email protected]

Supplementary Information: Additional documentation can be found at http://pathscore.publichealth.yale.edu/faq.

ORFhunteR: an accurate approach for the automatic identification and annotation of open reading frames in human mRNA molecules

10.1101/2021.02.05.429963 ◽

2021 ◽

Author(s):

Vasily V. Grinev ◽

Mikalai M. Yatskou ◽

Victor V. Skakun ◽

Maryna K. Chepeleva ◽

Petr V. Nazarov

Keyword(s):

Single Molecule ◽

Web Application ◽

R Package ◽

Nucleotide Sequences ◽

Open Reading Frames ◽

Classification Model ◽

Automatic Identification ◽

Large Set ◽

Link Type ◽

Reading Frames

Motivation: Modern methods of whole transcriptome sequencing accurately recover nucleotide sequences of RNA molecules present in cells and allow for determining their quantitative abundances. The coding potential of such molecules can be estimated using open reading frame (ORF) finding algorithms, implemented in a number of software packages. However, these algorithms show somewhat limited accuracy, are intended for single-molecule analysis and do not allow selecting proper ORFs in the case of long mRNAs containing multiple ORF candidates.

Results: We developed a computational approach, a corresponding machine learning model and a package dedicated to automatic identification of the ORFs in large sets of human mRNA molecules. It is based on vectorization of nucleotide sequences into features, followed by classification using a random forest. The predictive model was validated on sets of human mRNA molecules from the NCBI RefSeq and Ensembl databases and demonstrated almost 95% accuracy in detecting true ORFs. The developed methods and pre-trained classification model were implemented in a powerful ORFhunteR computational tool that performs automatic identification of true ORFs among large sets of human mRNA molecules.

Availability and implementation: The developed open-source R package ORFhunteR is available for the community at a GitHub repository (https://github.com/rfctbio-bsu/ORFhunteR), from Bioconductor (https://bioconductor.org/packages/devel/bioc/html/ORFhunteR.html) and as a web application (http://orfhunter.bsu.by).
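The problem of "multiple ORF candidates" can be illustrated with a naive candidate enumerator in Python. This is a sketch under simple assumptions (ATG as the only start codon, the first in-frame stop ends the ORF, forward strand only), and the function name `find_orfs` is hypothetical. It is not ORFhunteR's method: the package is written in R, and its contribution is the random-forest classification that selects the true ORF among candidates like these, which is not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """List (start, end) pairs of candidate ORFs on the forward strand.

    Each candidate begins at an ATG and ends at the first in-frame stop
    codon; `end` is exclusive and includes the stop codon. `min_codons`
    counts codons before the stop, including the start codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon in the frame set by this ATG.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break
    return orfs


if __name__ == "__main__":
    print(find_orfs("GGATGAAATTTTAACC"))
    # Two start codons sharing one stop give two overlapping candidates.
    print(find_orfs("ATGATGTAA"))
```

Even on the nine-base second example the enumerator returns two overlapping candidates, which is the ambiguity a trained classifier is meant to resolve on real mRNAs.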
