Universal Features for the Classification of Coding and Non-coding DNA Sequences

2009 ◽  
Vol 3 ◽  
pp. BBI.S2236 ◽  
Author(s):  
Nicolas Carels ◽  
Ramon Vidal ◽  
Diego Frías

In this report, we revisited simple features that allow coding sequences (CDS) to be discriminated from non-coding DNA. Our sequence sample spans a wide range of codon usage, which suggests that these features are universal. The features that we investigated combine (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of cytosine, guanine, and adenine probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, and (iv) the product of G and C probabilities in the 1st and 2nd positions of triplets. These features are a natural consequence of the physico-chemical properties of proteins, and their combination classifies CDS and non-coding DNA (introns) with a success rate >95% for sequences above 350 bp. The coding strand and coding frame are implicitly deduced when a sequence is classified as coding.
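
The four features listed above can be sketched for a single reading frame as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, feature (iv) is read here as P(G in 1st position) × P(C in 2nd position), which is one possible interpretation of the abstract, and no scoring or classification threshold is attempted because the abstract does not give one.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
PURINES = {"A", "G"}


def split_codons(seq, frame):
    """Split a DNA string into complete triplets starting at offset `frame` (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def position_prob(codons, pos, bases):
    """Fraction of codons whose nucleotide at `pos` (0-based) belongs to `bases`."""
    if not codons:
        return 0.0
    return sum(c[pos] in bases for c in codons) / len(codons)


def frame_features(seq, frame):
    """Compute the four features for one frame (hypothetical helper, see assumptions above)."""
    codons = split_codons(seq, frame)
    return {
        # (i) number of stop codons found in this frame
        "stops": sum(c in STOP_CODONS for c in codons),
        # (ii) product of purine probabilities in positions 1, 2 and 3
        "purine_product": (
            position_prob(codons, 0, PURINES)
            * position_prob(codons, 1, PURINES)
            * position_prob(codons, 2, PURINES)
        ),
        # (iii) P(C in 1st) * P(G in 2nd) * P(A in 3rd)
        "cga_product": (
            position_prob(codons, 0, {"C"})
            * position_prob(codons, 1, {"G"})
            * position_prob(codons, 2, {"A"})
        ),
        # (iv) assumed reading: P(G in 1st) * P(C in 2nd)
        "gc_product": (
            position_prob(codons, 0, {"G"})
            * position_prob(codons, 1, {"C"})
        ),
    }
```

Calling `frame_features(seq, f)` for `f` in `range(3)` on both the sequence and its reverse complement gives one feature set per candidate frame, which is the kind of comparison from which a coding strand and frame could be deduced.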

2014 ◽  
Vol 33 (4) ◽  
Author(s):  
Pavel Samec ◽  
Aleš Kučera ◽  
Klement Rejšek

Abstract: Soil environment characteristics naturally affect the biogeographical classification of forests in central Europe. However, even at the same localities, different systems of vegetation classification describe the forest types according to the naturally dominant tree species with different accuracy. A set of 20 representative natural beech stands in the borderland between the Bohemian Massif (Hercynian biogeographical subprovince) and the Outer Western Carpathians (Westcarpathian subprovince) was selected in order to compare textural, hydrostatic, physico-chemical and chemical properties of soils between the included geomorphological regions, bioregions and biotopes. Differences in the soils of the surveyed beech stands were mainly due to volume weight and specific weight, maximum capillary capacity (MCC), porosity, base saturation (BS), total soil nitrogen (N


2015 ◽  
Vol 4 (2) ◽  
pp. 156-170 ◽  
Author(s):  
Anne Karuma ◽  
Charles Gachene ◽  
Balthazar Msanya ◽  
Peter Mtakwa ◽  
Nyambilila Amuri ◽  
...  

2016 ◽  
Vol 683 ◽  
pp. 596-600
Author(s):  
Aleksey Zarubin ◽  
Natalia Chukhareva

Significant attention is currently paid to the production of peat-based materials. The study explores the influence of thermal modification of natural peat on its properties by applying class-modeling techniques. Different types of peat are modified by heating at 250 °C. A set of peat properties, namely component composition, g-factor and IR spectra, is used to build the data matrix. It is shown that class-modeling techniques, such as partial least-squares discriminant analysis (PLS-DA) and soft independent modeling of class analogy (SIMCA), allow the peat class (natural or modified) to be estimated from this set of properties without prediction errors when three latent variables are used. According to the classification results, thermal modification can be considered a means of regulating the composition and physico-chemical properties of natural peats as a raw material.


2019 ◽  
Vol 70 (11) ◽  
pp. 3783-3787
Author(s):  
Mioara Sebesan ◽  
Gabriela Elena Badea ◽  
Radu Sebesan ◽  
Ilona Katalin Fodor ◽  
Simona Bungau ◽  
...  

This paper presents a study of the physico-chemical properties of geothermal fluids from wells in Sacuieni, Bihor County, Romania. The thermal energy of the geothermal waters studied is used for heating industrial buildings, greenhouses, and administrative buildings. Continuous monitoring of the physical and chemical characteristics of geothermal waters is needed. On this basis, the waters were classified according to their chemical composition. Using a silica-enthalpy thermodynamic model, it was possible to estimate the deep reservoir temperature and compare it with the temperatures at depth calculated by the silica (quartz and chalcedony) and Na+/K+ geothermometers. The WATCH program is used to estimate the mineral deposits that may accumulate due to boiling and cooling of the geothermal fluid when it is used in heat exchangers. The results are confirmed by XRD spectrometric and thermogravimetric analyses.


2009 ◽  
Vol 3 ◽  
pp. BBI.S3030 ◽  
Author(s):  
Nicolas Carels ◽  
Diego Frías

In this report, we compared the success rate of classification of coding sequences (CDS) vs. introns by Codon Structure Factor (CSF) and by a method that we called Universal Feature Method (UFM). UFM is based on the scoring of purine bias (Rrr) and stop codon frequency. We show that the success rate of CDS/intron classification by UFM is higher than by CSF. UFM classifies ORFs as coding or non-coding through a score based on (i) the stop codon distribution, (ii) the product of purine probabilities in the three positions of nucleotide triplets, (iii) the product of cytosine (C), guanine (G), and adenine (A) probabilities in the 1st, 2nd, and 3rd positions of triplets, respectively, (iv) the probabilities of G in the 1st and 2nd positions of triplets, and (v) the distance of their GC3 vs. GC2 levels to the regression line of the universal correlation. More than 80% of CDSs (true positives) of Homo sapiens (>250 bp), Drosophila melanogaster (>250 bp) and Arabidopsis thaliana (>200 bp) are successfully classified with a false positive rate lower than or equal to 5%. The method returns coding sequences in their coding strand and coding frame, which allows their automatic translation into protein sequences with 95% confidence. The method is a natural consequence of the compositional bias of nucleotides in coding sequences.
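
Two ingredients of the score described above, the stop codon distribution over the six reading frames and the distance of a (GC2, GC3) point to a regression line, can be sketched as below. This is an illustration under stated assumptions rather than the UFM itself: the helper names are hypothetical, GC2 is taken as the x axis and GC3 as the y axis, and the slope and intercept of the "universal correlation" are left as caller-supplied parameters because the abstract does not report them.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def split_codons(seq, frame):
    """Split a DNA string into complete triplets starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def stop_counts_six_frames(seq):
    """Count stop codons in each of the six frames, keyed by (strand, frame)."""
    seq = seq.upper()
    counts = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            codons = split_codons(s, frame)
            counts[(strand, frame)] = sum(c in STOP_CODONS for c in codons)
    return counts


def gc_level(codons, pos):
    """Fraction of G or C at codon position `pos` (0-based): pos=1 gives GC2, pos=2 gives GC3."""
    if not codons:
        return 0.0
    return sum(c[pos] in "GC" for c in codons) / len(codons)


def distance_to_line(gc2, gc3, slope, intercept):
    """Perpendicular distance from (gc2, gc3) to the line gc3 = slope * gc2 + intercept.

    `slope` and `intercept` are placeholders for the regression parameters,
    which are not given in the abstract.
    """
    return abs(slope * gc2 - gc3 + intercept) / math.hypot(slope, 1.0)
```

A frame with few or no internal stop codons and a small distance to the regression line would be a more plausible coding candidate under this reading; how the individual terms are weighted into a single score is not specified in the abstract and is not attempted here.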

