The statistical analysis of direct repeats in nucleic acid sequences

1985 ◽  
Vol 22 (1) ◽  
pp. 15-24 ◽  
Author(s):  
Rakesh Shukla ◽  
R. C. Srivastava

Sequence symmetries in DNA and RNA are being discovered at an increasing rate. Conjectures and hypotheses are being proposed for their possible structural and functional role in the nucleic acid. In this paper a probability model is studied which evaluates the probabilities of various repeats occurring by chance alone. Expressions are derived for the mean and variance of the statistics employed. The central limit theorem for dependent trials is used to obtain the asymptotic distributions. An indication is given of how to use the model to search for various gene amplification events in the evolutionary history of the sequences.


1984 ◽  
Vol 16 (1) ◽  
pp. 29-29
Author(s):  
Rakesh K. Shukla ◽  
R. C. Srivastava

A predominance of certain ‘sequence symmetries’ in DNA/RNA has led to various conjectures about the possible structural/functional role these symmetries might play in nucleic acid sequences. De Wachter employed a binomial probability model to compare the observed number of ‘direct repeats’ with those expected in a random sequence. Counting of direct repeats essentially leads to a sequence of m-dependent trials. We develop a stochastic model for studying various types of symmetries. Expressions for means and variances of the statistics employed are derived. The asymptotic distributions are obtained using the central limit theorem for m-dependent random variables. It is proposed that each sequence pattern be examined separately for its chance occurrence as opposed to what de Wachter suggests, i.e., clumping of all patterns together. It is also shown how our model can be used to detect various gene-amplification events, if any, in nucleic acid sequences. Finally, for certain types of patterns, it is indicated how the theory of recurrent events can be used to get a better handle on the analysis of direct repeats.
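
As a rough illustration of comparing observed direct repeats with chance expectation, the sketch below counts pairs of positions that carry the same length-k word and computes the expected number of such pairs. It is not the authors' model: it assumes independent, equiprobable bases, pools all patterns of one length (the approach the abstract argues against), and gives only the mean, not the variance or the asymptotic distribution. The function names are hypothetical.

```python
from collections import Counter
from math import comb


def count_repeat_pairs(seq: str, k: int) -> int:
    """Count position pairs i < j whose length-k words are identical."""
    words = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # A word seen c times contributes c-choose-2 matching pairs.
    return sum(comb(c, 2) for c in words.values())


def expected_repeat_pairs(n: int, k: int, alphabet_size: int = 4) -> float:
    """Expected matching pairs in a random sequence of length n.

    Assumes independent, equiprobable letters. Under that assumption any
    two distinct start positions match with probability alphabet_size**-k,
    even when the two words overlap.
    """
    positions = n - k + 1
    if positions < 2:
        return 0.0
    return comb(positions, 2) * alphabet_size ** (-k)


if __name__ == "__main__":
    seq = "ACGTACGT"
    print(count_repeat_pairs(seq, 4), expected_repeat_pairs(len(seq), 4))
```

Because overlapping words make the individual match indicators dependent, the mean alone does not give a significance test; that dependence is what the m-dependent central limit theorem in the abstract addresses.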


PLoS Biology ◽  
2019 ◽  
Vol 17 (1) ◽  
pp. e3000122 ◽  
Author(s):  
Pierre Raia ◽  
Marta Carroni ◽  
Etienne Henry ◽  
Gérard Pehau-Arnaudet ◽  
Sébastien Brûlé ◽  
...  

Author(s):  
David Thompson ◽  
Philippe Pébay

Observed failures, rather than first principles, are used to estimate fatigue rates probabilistically conditioned on operating conditions. The method developed assumes that a normal random variable may be used to approximate the damage limit (remaining lifetime) of components subjected to cumulative damage and that when a component fails, its damage limit has vanished at a rate proportional to the amount of time spent at each operating condition experienced during its lifetime. By considering differences in cumulative damage between pairs of failed components, we obtain the relative rates at which damage is accumulated for each observed operating condition. When the differences in component lifetimes are dominated by variations in experienced conditions, it is possible to estimate absolute rates. Otherwise, variations in initial damage limits dominate and it is only possible to estimate the mean and variance of this distribution. We demonstrate the procedure on synthetic data, including a test for the dominant source of lifetime variations.


Psych ◽  
2019 ◽  
Vol 1 (1) ◽  
pp. 35-43
Author(s):  
James Flynn

Rushton believed not only that East Asians, whites, and blacks could be ranked in that order for desirable traits but also that the black/white IQ gap is predominantly genetic in origin. Concerning the first, he relied on the “ice ages hypothesis” to show that the evolutionary history of the three races had varied as East Asians were subjected to the most demanding environment (north of the Himalayas), whites to the next most demanding (north of the Alps), and blacks to the least demanding (Africa). As to the second, he appealed to arguments based on the method of correlated vectors (Jensen effects) and regression to the mean. To assess his contribution I argue: (1) That the racial ranking for desirable traits is not as tidy as it seems; (2) That the ice ages hypothesis has been falsified; (3) That the black/white IQ gap is more likely to be environmental, with black American subculture as the culprit; and (4) That appeals to correlated vectors and regression cannot disentangle genetic and environmental causes.


2001 ◽  
Vol 82 (5) ◽  
pp. 1061-1067 ◽  
Author(s):  
Christine M. Jonassen ◽  
Tom Ø. Jonassen ◽  
Yehia M. Saif ◽  
David R. Snodgrass ◽  
Hiroshi Ushijima ◽  
...  

We have sequenced the genomic 3′-end, including the structural gene, of human astrovirus (HAstV) serotype 7 and morphologically related viruses infecting pig (PAstV), sheep (OAstV) and turkey (TAstV-1). These sequences were compared with corresponding astrovirus sequences available in the nucleic acid databases, including sequences of the seven other HAstV serotypes, two other avian astroviruses (TAstV-2 and avian nephritis virus) and astrovirus from cat (FAstV). A 35 nt stem–loop motif near the 3′-end of the genome, previously described as being highly conserved, was present in all of the astroviruses except TAstV-2. In the N-terminal half of the capsid precursor protein, there were several short conserved peptide motifs. Otherwise the capsid proteins of astroviruses infecting different hosts were highly divergent. Calculation of genetic distances revealed that the distance between FAstV and HAstV is comparable to the largest distances between different HAstV serotypes. Higher similarities between the HAstV, FAstV and PAstV capsid sequences suggest interspecies transmissions involving humans, cats and pigs relatively recently in the evolutionary history of astroviruses.


Genetics ◽  
1985 ◽  
Vol 111 (1) ◽  
pp. 147-164 ◽  
Author(s):  
Richard R Hudson ◽  
Norman L Kaplan

Some statistical properties of samples of DNA sequences are studied under an infinite-site neutral model with recombination. The two quantities of interest are R, the number of recombination events in the history of a sample of sequences, and R_M, the number of recombination events that can be parsimoniously inferred from a sample of sequences. Formulas are derived for the mean and variance of R. In contrast to R, R_M can be determined from the sample. Since no formulas are known for the mean and variance of R_M, they are estimated with Monte Carlo simulations. It is found that R_M is often much less than R; therefore, the number of recombination events may be greatly underestimated in a parsimonious reconstruction of the history of a sample. The statistic R_M can be used to estimate the product of the recombination rate and the population size or, if the recombination rate is known, to estimate the population size. To illustrate this, DNA sequences from the Adh region of Drosophila melanogaster are used to estimate the effective population size of this species.
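
The abstract does not spell out how the parsimoniously inferred count is obtained from a sample. One common way to get a lower bound of this kind is the four-gamete test, sketched below as an assumption rather than as the paper's stated procedure: under an infinite-site model, a pair of biallelic sites showing all four allele combinations implies at least one recombination event between them, and the bound is the largest set of such intervals that do not overlap. The function names and the 0/1 string encoding of haplotypes are hypothetical.

```python
from itertools import combinations


def four_gamete_intervals(haplotypes):
    """Return site pairs (i, j), i < j, at which all four gametes occur.

    Each haplotype is a string over two alleles, e.g. "0110"; all
    haplotypes are assumed to have the same length.
    """
    n_sites = len(haplotypes[0])
    intervals = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            intervals.append((i, j))
    return intervals


def min_recombination_events(haplotypes):
    """Lower bound on recombination events from incompatible site pairs.

    Greedy selection by right endpoint picks a maximum set of intervals
    that share at most an endpoint; each needs its own event.
    """
    intervals = sorted(four_gamete_intervals(haplotypes), key=lambda ij: ij[1])
    count = 0
    last_right = -1
    for left, right in intervals:
        if left >= last_right:
            count += 1
            last_right = right
    return count


if __name__ == "__main__":
    sample = ["000", "011", "101", "110"]
    print(four_gamete_intervals(sample), min_recombination_events(sample))
```

A bound of this type counts only events that leave a detectable incompatibility in the sample, which is consistent with the abstract's observation that the inferred number is often much smaller than the true number of events.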


2018 ◽  
Vol 41 ◽  
Author(s):  
Kevin Arceneaux

Intuitions guide decision-making, and looking to the evolutionary history of humans illuminates why some behavioral responses are more intuitive than others. Yet a place remains for cognitive processes to second-guess intuitive responses – that is, to be reflective – and individual differences abound in automatic, intuitive processing as well.


Author(s):  
B.A. Hamkalo ◽  
S. Narayanswami ◽  
A.P. Kausch

The availability of nonradioactive methods to label nucleic acids and the resultant rapid and greater sensitivity of detection has catapulted the technique of in situ hybridization to become the method of choice to locate specific DNA and RNA sequences on chromosomes and in whole cells in cytological preparations in many areas of biology. It is being applied to problems of fundamental interest to basic cell and molecular biologists such as the organization of the interphase nucleus in the context of putative functional domains; it is making major contributions to genome mapping efforts; and it is being applied to the analysis of clinical specimens. Although fluorescence detection of nucleic acid hybrids is routinely used, certain questions require greater resolution. For example, very closely linked sequences may not be separable using fluorescence; the precise location of sequences with respect to chromosome structures may be below the resolution of light microscopy (LM); and the relative positions of sequences on very small chromosomes may not be feasible.

