Thermal age, cytosine deamination and the veracity of 8,000 year old wheat DNA from sediments

2015
Author(s):
Logan Kistler
Oliver Smith
Roselyn Ware
Garry Momber
Richard Bates
...  

Recently, the finding of 8,000 year old wheat DNA from submerged marine sediments (1) was challenged on the basis of a lack of signal of cytosine deamination relative to three other data sets generated from young samples of herbarium and museum specimens, and a 7,000 year old human skeleton preserved in a cave environment (2). The study used a new approach for low coverage data sets to which tools such as mapDamage cannot be applied to infer chemical damage patterns. Here we show from the analysis of 148 palaeogenomic data sets that the rate of cytosine deamination is a thermally correlated process, and that organellar DNA generally shows higher rates of deamination than nuclear DNA in comparable environments. We categorize four clusters of deamination rates (alpha, beta, gamma, epsilon) that are associated with cold stable environments, cool but thermally fluctuating environments, and progressively warmer environments. These correlations show that the expected level of deamination in the sedaDNA would be extremely low. The low coverage approach to detect DNA damage by Weiss et al. (2) fails to identify damage in samples from the cold class of deamination rates. Finally, different enzymes used in library preparation exhibit varying capability to report cytosine deamination damage in the 5′ region of fragments. The PCR enzyme used in the sedaDNA study would not have been able to report 5′ cytosine deamination, as it does not read over uracil residues, and signatures of damage would better have been sought at the 3′ end. The 8,000 year old sedaDNA matches both the thermal age prediction of fragmentation and the expected level of cytosine deamination for the preservation environment. Given these facts and the use of rigorous controls, these data meet the criteria of authentic ancient DNA to an extremely stringent level.
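
For readers unfamiliar with the thermal age concept invoked here, the sketch below illustrates the underlying Arrhenius scaling: chronological years are converted into equivalent years at a common reference temperature. The activation energy, the 10 °C reference, and the example temperatures are assumptions chosen purely for illustration, not values taken from the study.

```python
import math

# Illustrative sketch of the "thermal age" idea: chronological years are
# rescaled by an Arrhenius factor so that samples from different burial
# temperatures can be compared at a common reference temperature.
# The activation energy and 10 degC reference below are assumptions chosen
# for illustration only, not values taken from the study.

R = 8.314          # gas constant, J/(mol*K)
EA = 127_000.0     # assumed activation energy for DNA decay, J/mol
T_REF = 283.15     # reference temperature, K (10 degC)

def thermal_age(chronological_years: float, burial_temp_c: float) -> float:
    """Years at 10 degC that would cause the same amount of damage."""
    t_kelvin = burial_temp_c + 273.15
    # Ratio of reaction rates at the burial and reference temperatures.
    rate_ratio = math.exp(-EA / R * (1.0 / t_kelvin - 1.0 / T_REF))
    return chronological_years * rate_ratio

# A cold, stable marine sediment ages "slowly" in thermal terms:
print(round(thermal_age(8000, 8.0)))   # 8 kyr sample held at 8 degC
print(round(thermal_age(8000, 20.0)))  # the same sample at 20 degC is far "older"
```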

2021
pp. 000276422110216
Author(s):
Kazimierz M. Slomczynski
Irina Tomescu-Dubrow
Ilona Wysmulek

This article proposes a new approach to analyzing protest participation measured in surveys of uneven quality. Because single international survey projects cover only a fraction of the world’s nations in specific periods, researchers increasingly turn to ex-post harmonization of different survey data sets not a priori designed as comparable. However, very few scholars systematically examine the impact of survey data quality on substantive results. We argue that the variation in source data, especially deviations from the standards of survey documentation, data processing, and computer files proposed by methodologists of Total Survey Error, Survey Quality Monitoring, and Fitness for Intended Use, is important for analyzing protest behavior. In particular, we apply the Survey Data Recycling framework to investigate the extent to which indicators of attending demonstrations and signing petitions in 1,184 national survey projects are associated with measures of data quality, controlling for variability in the questionnaire items. We demonstrate that the null hypothesis of no impact of measures of survey quality on indicators of protest participation must be rejected. Measures of survey documentation, data processing, and computer records, taken together, explain over 5% of the intersurvey variance in the proportions of the populations attending demonstrations or signing petitions.
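
As a rough illustration of the kind of check described, the following sketch regresses survey-level protest proportions on a few quality indicators and reports the share of intersurvey variance they explain. The variable names and toy data are assumptions; this is not the Survey Data Recycling pipeline itself.

```python
import numpy as np

# Minimal sketch: regress the proportion of respondents reporting
# demonstration attendance in each national survey on a set of survey-quality
# indicators, and report the share of intersurvey variance they explain.
# The toy data and variable names are assumptions for illustration only.

rng = np.random.default_rng(0)
n_surveys = 1184

# Hypothetical quality indicators: documentation, processing, computer-record scores.
quality = rng.normal(size=(n_surveys, 3))
# Hypothetical outcome: proportion attending demonstrations in each survey.
protest = 0.15 + quality @ np.array([0.01, -0.02, 0.005]) + rng.normal(0, 0.05, n_surveys)

X = np.column_stack([np.ones(n_surveys), quality])      # add intercept
beta, *_ = np.linalg.lstsq(X, protest, rcond=None)      # OLS fit
fitted = X @ beta

r_squared = 1 - np.sum((protest - fitted) ** 2) / np.sum((protest - protest.mean()) ** 2)
print(f"Share of intersurvey variance explained: {r_squared:.1%}")
```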


Radiocarbon
2013
Vol 55 (2)
pp. 720-730
Author(s):
Christopher Bronk Ramsey
Sharen Lee

OxCal is a widely used software package for the calibration of radiocarbon dates and the statistical analysis of 14C and other chronological information. The program aims to make statistical methods easily available to researchers and students working in a range of different disciplines. This paper will look at the recent and planned developments of the package. The recent additions to the statistical methods are primarily aimed at providing more robust models, in particular through model averaging for deposition models and through different multiphase models. The paper will look at how these new models have been implemented and explore the implications for researchers who might benefit from their use. In addition, a new approach to the evaluation of marine reservoir offsets will be presented. As the quantity and complexity of chronological data increase, it is also important to have efficient methods for the visualization of such extensive data sets. Methods for the presentation of spatial and geographical data embedded within planned future versions of OxCal will also be discussed.
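
For context, the sketch below shows the standard probabilistic calibration step that packages such as OxCal automate: a measured 14C age is compared against a calibration curve with a Gaussian likelihood, yielding an unnormalized posterior over calendar ages. The toy curve and measurement are invented for illustration; real analyses use the IntCal/Marine curves and OxCal's own models.

```python
import numpy as np

# Minimal sketch of probabilistic radiocarbon calibration of the kind OxCal
# automates: a measured 14C age (with lab error) is compared against a
# calibration curve with a Gaussian likelihood, giving an unnormalized
# posterior over calendar ages. The linear toy "curve" below is invented
# purely for illustration; real work uses the IntCal/Marine curves.

cal_bp    = np.arange(8900, 9101)                # calendar ages, yr BP
curve_c14 = 8000 + 0.9 * (cal_bp - 8900)         # fake curve: 14C age at each cal age
curve_err = np.full_like(cal_bp, 25.0, dtype=float)

measured, lab_err = 8100.0, 30.0                 # hypothetical determination

sigma2 = lab_err**2 + curve_err**2
likelihood = np.exp(-0.5 * (measured - curve_c14) ** 2 / sigma2) / np.sqrt(sigma2)
posterior = likelihood / likelihood.sum()        # flat prior over calendar ages

# Report a simple summary of the calibrated age.
best = cal_bp[np.argmax(posterior)]
print(f"Mode of calibrated age: {best} cal BP")
```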


Geophysics
2011
Vol 76 (4)
pp. F239-F250
Author(s):
Fernando A. Monteiro Santos
Hesham M. El-Kaliouby

Joint or sequential inversion of direct current resistivity (DCR) and time-domain electromagnetic (TDEM) data is commonly performed for individual soundings assuming layered earth models. DCR and TDEM have different and complementary sensitivity to resistive and conductive structures, making them suitable methods for the application of joint inversion techniques. Joint inversion of DCR and TDEM data has been used by several authors to reduce the ambiguities of the models calculated from each method separately. A new approach for the joint inversion of these data sets, based on a laterally constrained algorithm, is presented. The method was developed for the interpretation of soundings collected along a line over 1D or 2D geology. The inversion algorithm was tested on two synthetic data sets, as well as on field data from Saudi Arabia. The results show that the algorithm is efficient and stable in producing quasi-2D models from DCR and TDEM data acquired in relatively complex environments.
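
The following schematic sketch shows the general shape of one laterally constrained joint-inversion step: DCR and TDEM misfits for all soundings are stacked into a single least-squares system, with a lateral difference operator tying parameters of neighboring soundings together. The Jacobians and weights are placeholders for linearized forward operators and are not the authors' algorithm.

```python
import numpy as np

# Schematic sketch of one Gauss-Newton step of a laterally constrained joint
# inversion: DCR and TDEM residuals for all soundings are stacked into a
# single system, and a lateral difference operator penalizes differences of
# the same layer parameter between adjacent soundings. The Jacobians here are
# random placeholders standing in for linearized forward operators.

n_soundings, n_layers = 10, 4
n_par = n_soundings * n_layers
rng = np.random.default_rng(1)

J_dcr  = rng.normal(size=(n_soundings * 8, n_par))   # placeholder DCR Jacobian
J_tdem = rng.normal(size=(n_soundings * 12, n_par))  # placeholder TDEM Jacobian
d_dcr  = rng.normal(size=J_dcr.shape[0])             # residuals (observed - predicted)
d_tdem = rng.normal(size=J_tdem.shape[0])

# Lateral constraint: same layer parameter should vary smoothly between
# adjacent soundings along the line.
rows = []
for layer in range(n_layers):
    for s in range(n_soundings - 1):
        r = np.zeros(n_par)
        r[s * n_layers + layer] = 1.0
        r[(s + 1) * n_layers + layer] = -1.0
        rows.append(r)
L = np.array(rows)

J = np.vstack([J_dcr, J_tdem])
d = np.concatenate([d_dcr, d_tdem])
lam = 1.0                                            # lateral-constraint weight

# Damped least-squares update: (J^T J + lam * L^T L) dm = J^T d
dm = np.linalg.solve(J.T @ J + lam * (L.T @ L), J.T @ d)
print(dm.shape)  # (40,) model update for all soundings jointly
```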


Author(s):  
Sean Moran
Bruce MacFadden
Michelle Barboza

Over the past several decades, thousands of stable isotope analyses (δ13C, δ18O) published in the peer-reviewed literature have advanced understanding of the ecology and evolution of fossil mammals in Deep Time. These analyses typically have come from sampling vouchered museum specimens. However, the individual stable isotope data are typically disconnected from the vouchered specimens, and there likewise is no central repository for this information. This paper describes the status, potential, and value of integrating stable isotope data into museum fossil collections. A pilot study in the Vertebrate Paleontology collection at the Florida Museum of Natural History has repatriated, within Specify, more than 1,000 legacy stable isotope records (mined from the literature) to the vouchered specimens by using ancillary non-Darwin Core (DwC) data fields. As this database grows, we hope both to validate previous studies that were done using smaller data sets and to ask new questions of the data that can only be addressed with larger, aggregated data sets. Additionally, we envision that as the community gains a better understanding of the importance of these kinds of ancillary data for adding value to vouchered museum specimens, workflows, data fields, and protocols can be standardized.
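
A hypothetical sketch of what such a repatriated record might look like is shown below: standard Darwin Core terms plus ancillary, non-DwC fields holding the isotope values and their literature source. The field names are invented for illustration and do not reflect Specify's actual schema.

```python
# Hypothetical sketch of attaching a legacy isotope measurement to a vouchered
# specimen record: standard Darwin Core terms plus ancillary, non-DwC fields
# for the isotope values and their literature source. Field names are invented
# for illustration and do not reflect Specify's actual schema.

specimen_record = {
    # Darwin Core terms
    "catalogNumber": "UF 123456",            # hypothetical voucher number
    "scientificName": "Equus simplicidens",
    "basisOfRecord": "FossilSpecimen",
    # Ancillary (non-DwC) isotope fields
    "ancillary": {
        "delta13C_permil": -8.7,             # hypothetical value
        "delta18O_permil": 27.4,
        "materialSampled": "tooth enamel",
        "sourceReference": "mined from published literature",
    },
}

print(specimen_record["ancillary"]["delta13C_permil"])
```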


Web Mining
2011
pp. 253-275
Author(s):
Xiaodi Huang
Wei Lai

This chapter presents a new approach to clustering graphs, and applies it to Web graph display and navigation. The proposed approach takes advantage of the linkage patterns of graphs, and utilizes an affinity function in conjunction with the k-nearest-neighbor method. This chapter uses Web graph clustering as an illustrative example, and offers a potentially more widely applicable method for mining structural information from data sets, with the hope of informing readers of another aspect of data mining and its applications.
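
A minimal sketch of the general idea, clustering by linkage patterns, follows: node pairs are scored with a neighborhood-overlap affinity, each node keeps its k strongest links, and clusters are read off the connected components of the pruned graph. The Jaccard affinity and k = 2 used here are illustrative stand-ins for the chapter's own affinity function and k-nearest-neighbor rule.

```python
# Minimal sketch of linkage-pattern clustering: score node pairs with a
# neighborhood-overlap (Jaccard) affinity, keep only each node's k strongest
# positive links, and take connected components of the pruned graph as
# clusters. The affinity and k = 2 are illustrative stand-ins for the
# chapter's own affinity function and k-nearest-neighbor rule.

web_graph = {            # adjacency list of a tiny hypothetical Web graph
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"},
    "D": {"E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}

def affinity(u, v):
    nu, nv = web_graph[u] | {u}, web_graph[v] | {v}
    return len(nu & nv) / len(nu | nv)       # Jaccard overlap of neighborhoods

k = 2
kept = set()
for u in web_graph:
    ranked = sorted((v for v in web_graph if v != u),
                    key=lambda v: affinity(u, v), reverse=True)
    for v in ranked[:k]:
        if affinity(u, v) > 0:
            kept.add(frozenset((u, v)))       # keep u's k strongest positive links

# Connected components of the pruned graph are the clusters.
clusters, seen = [], set()
for start in web_graph:
    if start in seen:
        continue
    stack, component = [start], set()
    while stack:
        node = stack.pop()
        if node in component:
            continue
        component.add(node)
        stack.extend(v for v in web_graph if frozenset((node, v)) in kept)
    clusters.append(component)
    seen |= component

print(clusters)   # two clusters: {A, B, C} and {D, E, F}
```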


Geophysics
2020
Vol 85 (2)
pp. V223-V232
Author(s):
Zhicheng Geng
Xinming Wu
Sergey Fomel
Yangkang Chen

The seislet transform uses the wavelet-lifting scheme and local slopes to analyze seismic data. In its definition, the design of prediction operators specifically for seismic images and data is an important issue. We have developed a new formulation of the seislet transform based on the relative time (RT) attribute. This method uses the RT volume to construct multiscale prediction operators. With the new prediction operators, the seislet transform is accelerated because distant traces can be predicted directly. We apply our method to synthetic and real data to demonstrate that the new approach reduces computational cost and obtains an excellent sparse representation on test data sets.
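
To make the lifting idea concrete, the toy sketch below performs one level of a lifting-scheme decomposition: traces are split into even and odd sets, the odd traces are predicted from the even ones, and the residual becomes the detail coefficients. In the actual seislet transform the predict and update operators follow local slopes or the RT volume; here they are trivial placeholders.

```python
import numpy as np

# Toy sketch of one level of the wavelet-lifting scheme underlying the seislet
# transform: split traces into even/odd, predict the odd traces from the even
# ones, store the prediction residual as detail coefficients, then update the
# even traces to form the coarse approximation. In the real seislet transform
# the predict/update operators follow local slopes (or the RT volume); here
# they are simple placeholders.

def lifting_forward(traces: np.ndarray):
    """traces: 2D array (n_traces, n_samples) with an even number of traces."""
    even, odd = traces[0::2], traces[1::2]
    predicted = even                      # placeholder prediction operator
    detail = odd - predicted              # residual = detail coefficients
    coarse = even + 0.5 * detail          # update step -> coarse approximation
    return coarse, detail

def lifting_inverse(coarse: np.ndarray, detail: np.ndarray) -> np.ndarray:
    even = coarse - 0.5 * detail
    odd = detail + even
    traces = np.empty((even.shape[0] * 2, even.shape[1]))
    traces[0::2], traces[1::2] = even, odd
    return traces

data = np.random.default_rng(2).normal(size=(8, 128))
coarse, detail = lifting_forward(data)
assert np.allclose(lifting_inverse(coarse, detail), data)   # perfect reconstruction
```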


2020
Vol 95
pp. 106564
Author(s):
Chun-Cheng Peng
Cheng-Jung Tsai
Ting-Yi Chang
Jen-Yuan Yeh
Po-Wei Hua

2020
Vol 173 (1)
pp. 21-33
Author(s):
Lu Yao
Kelsey Witt
Hongjie Li
Jonathan Rice
Nelson R. Salinas
...  

2018
Vol 14 (4)
pp. 20-37
Author(s):
Yinglei Song
Yongzhong Li
Junfeng Qu

This article develops a new approach to supervised dimensionality reduction. The approach considers both the global and local structures of a labelled data set and maximizes a new objective that combines the effects of both. The objective can be approximately optimized by solving an eigenvalue problem. The approach is evaluated on several benchmark data sets and image databases, and its performance is compared with that of several existing approaches to dimensionality reduction. Test results show that, on average, the new approach achieves more accurate dimensionality reduction than existing approaches.
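
As a rough illustration of this family of methods, the sketch below combines a global between-class scatter with a local neighborhood scatter and solves a generalized eigenvalue problem for the projection. The particular scatter matrices, the neighbor count, and the weight mu are assumptions for illustration, not the paper's exact objective.

```python
import numpy as np
from scipy.linalg import eigh

# Minimal sketch of supervised dimensionality reduction via an eigenvalue
# problem: combine a global between-class scatter with a local within-
# neighborhood scatter and keep the leading eigenvectors of the resulting
# generalized problem. The scatter matrices and the weight mu are
# illustrative assumptions, not the paper's exact objective.

def reduce_dim(X, y, n_components=2, n_neighbors=5, mu=0.5):
    n, d = X.shape
    mean = X.mean(axis=0)

    # Global structure: between-class scatter.
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        S_b += len(Xc) * (diff @ diff.T)

    # Local structure: scatter of each point around its nearest same-class neighbors.
    S_w = np.zeros((d, d))
    for i in range(n):
        same = np.flatnonzero(y == y[i])
        dists = np.linalg.norm(X[same] - X[i], axis=1)
        nbrs = same[np.argsort(dists)[1:n_neighbors + 1]]
        for j in nbrs:
            diff = (X[i] - X[j])[:, None]
            S_w += diff @ diff.T

    # Maximize global separation while keeping local neighborhoods compact:
    # solve S_b w = lambda (S_w + mu I) w and keep the leading eigenvectors.
    evals, evecs = eigh(S_b, S_w + mu * np.eye(d))
    W = evecs[:, np.argsort(evals)[::-1][:n_components]]
    return X @ W

X = np.random.default_rng(3).normal(size=(60, 10))
y = np.repeat([0, 1, 2], 20)
print(reduce_dim(X, y).shape)   # (60, 2)
```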


2020
Vol 2020
pp. 1-18
Author(s):
Qianru Zhai
Ye Tian
Jingyue Zhou

Twin support vector regression (TSVR) generates two nonparallel hyperplanes by solving a pair of smaller-sized problems instead of a single larger-sized problem as in the standard SVR. Due to its efficiency, TSVR is frequently applied in various areas. In this paper, we propose a new version of TSVR named Linear Twin Quadratic Surface Support Vector Regression (LTQSSVR), which directly uses two quadratic surfaces in the original space for regression. Our new approach not only avoids the notoriously difficult and time-consuming task of searching for a suitable kernel function and its corresponding parameters in traditional SVR-based methods but also achieves better generalization performance. In addition, to further improve the efficiency and robustness of the model, we introduce the 1-norm to measure the error. The linear programming structure of the new model avoids the matrix inverse operation and makes the model solvable for huge-sized problems, a capability that is especially important in the big-data era. Finally, to verify the effectiveness and efficiency of our model, we compare it with some well-known methods. Numerical experiments on 2 artificial data sets and 12 benchmark data sets demonstrate the validity and applicability of the proposed method.
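
The sketch below illustrates, in simplified single-surface form, the ingredients described in the abstract: inputs are expanded to quadratic monomials (a quadratic surface in the original space) and fitted under a 1-norm error by linear programming, with no kernel and no matrix inversion. It is not the twin (two-surface) LTQSSVR formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Simplified single-surface sketch of the ingredients described in the
# abstract: expand inputs to quadratic monomials (a quadratic surface in the
# original space) and fit them under a 1-norm error by linear programming,
# with no kernel and no matrix inversion. Illustration only, not the twin
# (two-surface) LTQSSVR formulation.

def quad_features(X):
    """[x1, x2, ..., xi*xj, xi^2] monomials defining a quadratic surface."""
    n, d = X.shape
    cross = [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack([X] + cross)

def fit_l1_quadratic(X, y):
    P = quad_features(X)
    n, p = P.shape
    # Variables: [w (p), b (1), t (n)]; minimize sum of t (the absolute errors).
    c = np.concatenate([np.zeros(p + 1), np.ones(n)])
    ones, I = np.ones((n, 1)), np.eye(n)
    A_ub = np.vstack([np.hstack([ P,  ones, -I]),    #  P w + b - t <= y
                      np.hstack([-P, -ones, -I])])   # -P w - b - t <= -y
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p], res.x[p]

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 2))
y = 1.0 + X[:, 0] - 2 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1] + rng.normal(0, 0.1, 200)
w, b = fit_l1_quadratic(X, y)
print(np.round(b, 2), np.round(w, 2))   # recovers the intercept and surface coefficients
```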

