Using the Intermediate Data Structure (IDS) to Construct Files for Statistical Analysis

2015 · Vol 2 · pp. 86-107
Author(s): Luciana Quaranta

The use of longitudinal historical micro-level demographic data for research presents many challenges. The Intermediate Data Structure (IDS) was developed to address some of these challenges by facilitating the storage and sharing of such data. This article proposes an extension to the IDS that allows constructed variables to be standardized and stored. It also describes how to produce a rectangular episodes file for statistical analysis from data stored in the IDS and presents programs developed for that purpose.
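
To illustrate what such a rectangular episodes file looks like, here is a minimal sketch in Python/pandas, not Quaranta's actual programs: it assumes an invented, simplified long-format layout (columns id, attribute, value, date) rather than the real IDS tables, and splits each person's history into spans over which all covariates are constant.

```python
# A minimal sketch (not the article's programs) of turning long-format
# records into a rectangular episodes file. All names are illustrative.
import pandas as pd

# Long format: one row per recorded attribute value, timestamped.
records = pd.DataFrame({
    "id":        [1, 1, 1, 1],
    "attribute": ["civil_status", "civil_status", "occupation", "occupation"],
    "value":     ["unmarried", "married", "farmhand", "farmer"],
    "date":      pd.to_datetime(["1850-01-01", "1856-05-10",
                                 "1850-01-01", "1860-03-02"]),
})
end_of_observation = pd.Timestamp("1865-12-31")

def episodes_for(person: pd.DataFrame) -> pd.DataFrame:
    """Split one person's history into episodes: spans over which
    every time-varying attribute is constant."""
    cuts = sorted(person["date"].unique()) + [end_of_observation]
    rows = []
    for start, stop in zip(cuts[:-1], cuts[1:]):
        # Value of each attribute in force at `start` = last change <= start.
        current = (person[person["date"] <= start]
                   .sort_values("date")
                   .groupby("attribute")["value"].last())
        rows.append({"start": start, "stop": stop, **current.to_dict()})
    return pd.DataFrame(rows)

episodes = records.groupby("id").apply(episodes_for).reset_index(level=0)
print(episodes)  # one rectangular row per episode, ready for survival models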

2016 · Vol 3 · pp. 1-19
Author(s): Luciana Quaranta

The Intermediate Data Structure (IDS) provides a common structure for storing and sharing historical demographic data. The structure also facilitates the construction of different open-access software to extract information from these tables and construct new variables. The article Using the Intermediate Data Structure (IDS) to Construct Files for Statistical Analysis (Quaranta 2015) presented a series of concepts and programs that allow the user to construct a rectangular episodes file for longitudinal statistical analysis from data stored in the IDS. The current article discusses each of these programs in detail, describing their structure and syntax and explaining how they can be used.
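
For readers unfamiliar with the IDS layout, the sketch below shows the kind of extraction step such programs automate: pivoting the long INDIVIDUAL table, which stores one Type/Value pair per row, into one column per variable. The column names Id_I, Type and Value follow the published IDS convention, but the data and the pandas code are illustrative, not the article's programs.

```python
# A hedged sketch of extracting variables from an IDS-style long table.
import pandas as pd

individual = pd.DataFrame({
    "Id_I":  [1, 1, 2, 2],
    "Type":  ["BIRTH_YEAR", "SEX", "BIRTH_YEAR", "SEX"],
    "Value": ["1832", "Male", "1840", "Female"],
})

# One row per individual, one column per IDS Type.
wide = (individual
        .pivot(index="Id_I", columns="Type", values="Value")
        .reset_index())
wide["BIRTH_YEAR"] = wide["BIRTH_YEAR"].astype(int)
print(wide)
```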


2021 · Vol 10 · pp. 76-80
Author(s): Luciana Quaranta

The Intermediate Data Structure (IDS) was developed as a strategy for standardizing the dissemination of micro-level historical demographic data. The structure provides a common and clear data model, which facilitates studies that draw on several databases as well as the development and exchange of software. Based on my own experience of working with the IDS, in this article I offer reflections on using the IDS to create datasets for analysis and to conduct comparative demographic research.


2018 · Vol 7 · pp. 11-27
Author(s): Luciana Quaranta

Studies conducted in historical populations and developing countries have shown that infant deaths cluster within families, which could be related to genetic inheritance and/or to social and cultural factors such as education, socioeconomic status, or parental care. Death clustering has also been found to be transmitted across generations. One way of expanding our knowledge of intergenerational transfers in infant mortality is to conduct comparable studies across different populations. The Intermediate Data Structure (IDS) was developed as a strategy to simplify the collection, storage, and sharing of historical demographic data. The current work presents two programs, developed in Stata, that construct a dataset for analysis and run statistical models to study intergenerational transfers in infant mortality using databases stored in the IDS. The programs read information stored in the IDS tables, process it, and produce Excel files with the results. They can be used with any longitudinal database constructed from church books, civil registers, or population registers.
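
A hedged sketch of the central linkage step behind such an analysis (in Python rather than the article's Stata programs): counting infant deaths per woman in one generation and attaching each grandmother's count to her daughters' births as the exposure variable. All table and column names here are invented.

```python
# Illustrative two-generation linkage; not the published programs.
import pandas as pd

# Generation 1: births to each grandmother.
g1 = pd.DataFrame({
    "mother_id":    [100, 100, 101, 101],   # the grandmother
    "died_infancy": [1, 1, 0, 0],           # death before age 1
})
# Generation 2: births to each mother, with her own mother recorded.
g2 = pd.DataFrame({
    "mother_id":      [10, 10, 11],
    "grandmother_id": [100, 100, 101],
    "died_infancy":   [1, 0, 0],
})

# The grandmother's infant-death count becomes the exposure for generation 2.
gm_deaths = (g1.groupby("mother_id")["died_infancy"]
               .sum().rename("grandmother_infant_deaths"))
analysis = g2.merge(gm_deaths, left_on="grandmother_id", right_index=True)
print(analysis)  # feed into e.g. a logit of died_infancy on the exposure
```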


2014 · Vol 1 · pp. 27-46
Author(s): Finn Hedefalk, Lars Harrie, Patrick Svensson

The Intermediate Data Structure (IDS) is a standardised database structure for longitudinal historical databases. Such a common structure facilitates data sharing and comparative research. In this study, we propose an extended version of IDS, named IDS-Geo, that also includes geographic data. The geographic data stored in IDS-Geo are primarily buildings and/or property units, and their main purpose is to link individuals to places in space. When assigning such detailed spatial locations to individuals (in times before detailed house addresses were available), we often have to create tailored geographic datasets. In those cases, there are benefits to storing geographic data in the same structure as the demographic data. Moreover, we propose exporting data from IDS-Geo using an eXtensible Markup Language (XML) Schema. IDS-Geo is implemented in a case study using historical property units, for the period 1804 to 1913, stored in a geographically extended version of the Scanian Economic Demographic Database (SEDD). To fit the IDS-Geo data structure, we included an object-lifeline representation of all property units (based on the snapshot time representation of single historical maps and poll-tax registers). The case study verifies that the IDS-Geo model can handle geographic data that can be linked to demographic data.
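
To make the proposed XML export concrete, here is a minimal sketch that serialises an object lifeline for one property unit. The element and attribute names are invented for illustration and do not reproduce the actual IDS-Geo schema.

```python
# A hedged sketch of an object-lifeline XML export; invented element names.
import xml.etree.ElementTree as ET

unit = ET.Element("PropertyUnit", attrib={"id": "SEDD-0042"})
life = ET.SubElement(unit, "Lifeline")
ET.SubElement(life, "Start").text = "1804"
ET.SubElement(life, "End").text = "1913"
geom = ET.SubElement(unit, "Geometry", attrib={"srs": "EPSG:3006"})
# A polygon footprint digitised from a historical map, as x,y pairs.
ET.SubElement(geom, "Polygon").text = "0,0 0,10 10,10 10,0 0,0"

print(ET.tostring(unit, encoding="unicode"))
```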


Author(s): Dov H. Levin

This book examines why partisan electoral interventions occur as well as their effects on the election results in countries in which the great powers intervened. A new dataset shows that the U.S. and the USSR/Russia intervened in one out of every nine elections held between 1946 and 2000 in other countries in order to help or hinder one of the candidates or parties; the Russian intervention in the 2016 U.S. elections is just the latest example. Nevertheless, electoral interventions receive scant scholarly attention. This book develops a new theoretical model to answer both questions. It argues that electoral interventions are usually "inside jobs," occurring only if a significant domestic actor within the target wants it. Likewise, electoral interventions won't happen unless the intervening country fears its interests are endangered by another significant party or candidate with very different and inflexible preferences. As for the effects, it argues that such meddling usually gives a significant boost to the preferred side, with overt interventions being more effective than covert ones in this regard. However, unlike in later elections, electoral interventions in founding elections usually harm the aided side. A multi-method framework is used to study these questions, including in-depth archival research into six cases in which the U.S. seriously considered intervening, statistical analysis of the aforementioned dataset (PEIG), and a micro-level analysis of election surveys from three intervention cases. It also includes a preliminary analysis of the Russian intervention in the 2016 U.S. elections and the cyber-future of such meddling in general.


2021 · Vol 10 · pp. 9-12
Author(s): Kris Inwood, Hamish Maxwell-Stewart

Kees Mandemakers has enriched historical databases in the Netherlands and internationally through the development of the Historical Sample of the Netherlands, the Intermediate Data Structure, and a practical implementation of rule-based record linking (LINKS), as well as through his personal encouragement of high-quality longitudinal data in a number of countries.


2021 · Vol 28 · pp. 146-150
Author(s): L. A. Atramentova

Using data obtained in a cytogenetic study as an example, we consider typical errors made when performing statistical analysis. Widespread but flawed statistical practice inevitably produces biased results and increases the likelihood of incorrect scientific conclusions. Errors occur when the study design and the structure of the analyzed data are not taken into account. The article shows how numerical imbalance in a data set biases the results and explains, using an example dataset, how to balance the design. It demonstrates the advantage of presenting sample estimates with confidence intervals rather than standard errors, and draws attention to the need to take the size of the analyzed proportions into account when choosing a statistical method. It also shows how the same data set can be analyzed in different ways depending on the purpose of the study. An algorithm for correct statistical analysis and a tabular form for presenting the results are described.

Keywords: data structure, numerically unbalanced complex, confidence interval.
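
As a concrete illustration of the recommendation to report confidence intervals rather than bare standard errors, here is a short sketch for a proportion, using invented counts and the normal (Wald) approximation.

```python
# Report a proportion with a confidence interval, not just "p +/- SE".
# Counts are invented for illustration.
from math import sqrt

aberrant, total = 23, 180          # e.g. cells with chromosome aberrations
p = aberrant / total
se = sqrt(p * (1 - p) / total)     # standard error of the proportion
z = 1.96                           # two-sided 95% level
lo, hi = p - z * se, p + z * se

print(f"p = {p:.3f}, SE = {se:.3f}")
print(f"95% CI: ({lo:.3f}, {hi:.3f})")
# For proportions near 0 or 1 (small "shares"), the Wald interval is poor;
# Wilson or exact (Clopper-Pearson) intervals are preferable there.
```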


2018 · Vol 13 (6) · pp. 48-65
Author(s): V. A. Vasenin

Russian technical standards contain no criteria for the degree of disturbance of the natural structure of laboratory samples of cohesive dispersed soils. At the same time, such soils are widely represented in various regions of the country, in particular in St. Petersburg. The paper discusses various criteria for estimating the degree of disturbance of the natural structure of laboratory samples and considers methods for restoring sample strength. The main attention is paid to assessing the degree of disturbance of the natural structure of laboratory samples in oedometer tests. Statistical results of such an assessment, based on deformation criteria, are given for more than 3,000 oedometer tests of Quaternary soils of different genesis. Sample quality was assessed at 130 sites where engineering-geological surveys were performed (by various organizations) in St. Petersburg between 2003 and 2018. The statistical analysis shows that, by the criterion of the relative change in the void ratio at the effective overburden stress, sample quality corresponds to "poor" or "very poor" (on the scale proposed by T. Lunne and others). The main causes of disturbance of the natural structure of the samples are described: sampling without soil samplers, violation of sampling and storage rules, and transportation of the samples. Based on the statistical analysis of the deformation parameters of laboratory soil samples obtained in geological surveys in St. Petersburg, it is concluded that the test results of these samples cannot be used in geotechnical calculations with modern soil mechanics models without special correction procedures.
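
The quality criterion referred to above can be made concrete with a short sketch: classifying a specimen by the relative change in void ratio on reloading to the in-situ effective overburden stress. The thresholds below follow one commonly cited version of the Lunne et al. scale for lightly overconsolidated clays and should be checked against the original source.

```python
# A hedged sketch of the de/e0 sample-quality criterion (Lunne et al.).
def sample_quality(e0: float, delta_e: float) -> str:
    """Classify specimen disturbance from the void-ratio change measured
    in the oedometer at the effective overburden stress."""
    ratio = delta_e / e0
    if ratio < 0.04:
        return "very good to excellent"
    if ratio < 0.07:
        return "good to fair"
    if ratio < 0.14:
        return "poor"
    return "very poor"

# Example: initial void ratio 1.10, void-ratio change 0.12 on reloading.
print(sample_quality(e0=1.10, delta_e=0.12))  # -> "poor" (ratio = 0.109)
```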


2020 · Vol 39 (4) · pp. 5027-5036
Author(s): You Lu, Qiming Fu, Xuefeng Xi, Zhenping Chen

Data outsourcing has gradually become a mainstream solution, but once data are outsourced the owner no longer controls the hardware on which they are stored, so the integrity of the data may be compromised. Many current studies achieve low-network-overhead verification of cloud data sets by designing algorithmic structures (e.g., hashing, Merkle verification trees); however, cloud service providers may refuse to acknowledge the incompleteness of cloud data in order to avoid liability, or for business reasons. There is therefore a need for a secure, reliable, tamper-proof, and non-forgeable verification system that supports accountability. Blockchain is a chain-like data structure constructed using digital signatures, timestamps, hash functions, and proof-of-work mechanisms; an integrity verification system built on blockchain technology can assign fault accountability. This paper uses the Hadoop framework to implement data collection and storage in an HBase system based on a big data architecture. In summary, building on research into blockchain-based cloud data collection and storage and on existing big data storage middleware, a high-throughput, highly concurrent, and highly available data collection and processing system has been realized.
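
As an illustration of the Merkle-verification-tree idea mentioned above (a sketch, not this paper's system), the following computes a single root hash over data blocks; anchoring that root, for example in a blockchain transaction, makes any later tampering with a block detectable without re-reading all the data.

```python
# Minimal Merkle-root computation for data-block integrity checking.
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(blocks: list[bytes]) -> bytes:
    """Pairwise-hash leaves upward until one root remains."""
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:            # duplicate last node on odd levels
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

root = merkle_root([b"chunk-0", b"chunk-1", b"chunk-2"])
print(root.hex())  # anchor on-chain; changing any chunk changes the root
```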

