An analysis of data paper templates and guidelines: types of contextual information described by data journals

2020 ◽  
Vol 7 (1) ◽  
pp. 16-23 ◽  
Author(s):  
Jihyun Kim

Purpose: Data papers are a promising genre of scholarly communication in which research data are described, shared, and published. Rich documentation of data, including adequate contextual information, enhances the potential for data reuse. This study investigated the extent to which the components of data papers specified by journals represented the types of contextual information necessary for data reuse.

Methods: A content analysis of 15 data paper templates/guidelines from 24 data journals indexed by the Web of Science was performed. A coding scheme was developed based on previous studies, consisting of four categories: general data set properties, data production information, repository information, and reuse information.

Results: Only a few types of contextual information were commonly requested by the journals. Except for data format information and file names, general data set properties were specified less often than the other categories of contextual information. Researchers were frequently asked to provide data production information, such as information on the data collection, data producer, and related project. Repository information focused on data identifiers, while information about repository reputation and curation practices was rarely requested. Reuse information mostly involved advice on the reuse of data and terms of use.

Conclusion: These findings imply that data journals should provide a more standardized set of data paper components to inform reusers of relevant contextual information in a consistent manner. Information about repository reputation and curation could also be provided by data journals to complement the repository information supplied by the authors of data papers and to help researchers evaluate the reusability of data.
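A coding exercise like the one described reduces, in the end, to tallying which contextual-information types each template requests. The sketch below illustrates that tally with entirely hypothetical journal names and coding results (the study's actual data are not reproduced here):

```python
from collections import Counter

# Hypothetical coding results: for each data journal's template, the list
# of contextual-information types it explicitly requests from authors.
# Journal names and codes are illustrative, not the study's data.
coded_templates = {
    "Journal A": ["data format", "file names", "data collection", "data identifier"],
    "Journal B": ["data collection", "data producer", "related project", "terms of use"],
    "Journal C": ["data format", "data collection", "data identifier", "reuse advice"],
}

# Tally how many templates request each type of contextual information.
tally = Counter(t for types in coded_templates.values() for t in types)

for info_type, count in tally.most_common():
    print(f"{info_type}: requested by {count} of {len(coded_templates)} templates")
```

Sorting by frequency immediately surfaces which types are commonly requested and which are rare, mirroring the study's category-level comparison.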

2021 ◽  
Vol 14 (3) ◽  
pp. 99
Author(s):  
Marc Peter Radke ◽  
Manuel Rupprecht

In this paper, we present a newly generated data set on real returns of households’ aggregated asset holdings, which adds additional and more sophisticated information to existing relevant datasets in the literature. To do this, we draw on various datasets from public and private sources and then transform and combine them in a consistent manner that allows for international comparative and intertemporal analyses. Based on this, we address two current debates on the development of household wealth in the euro area that have been triggered by the low-interest environment. The first debate refers to the development of real yields on household wealth from 2000 to 2018, whereas the second debate deals with the mean-variance efficiency of household portfolios. Contrary to widespread belief, we find that yields on total wealth, which were largely dominated by non-financial assets’ yields, were mostly positive, although they exhibit a declining trend. Moreover, on average, overall real yields were significantly lower after 2008. Referring to portfolio efficiency, we find that current portfolios seem to be comparatively close to mean-variance efficiency. If households were to optimize their portfolios despite limited room for improvement, holdings of equity and investment fund shares should be reduced, contradicting common recommendations of financial advisors.
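The mean-variance comparison described above can be illustrated with the textbook two-asset case, for which the minimum-variance weight has a closed form. All numbers below are illustrative assumptions, not the paper's data set:

```python
# Two-asset mean-variance sketch with illustrative numbers (not the
# paper's data): compare a household's hypothetical equity share with the
# minimum-variance weight.

def min_variance_weight(var_a: float, var_b: float, cov_ab: float) -> float:
    """Closed-form weight on asset A that minimizes portfolio variance
    in a two-asset portfolio (the remainder goes to asset B)."""
    return (var_b - cov_ab) / (var_a + var_b - 2 * cov_ab)

def portfolio_stats(w_a, mean_a, mean_b, var_a, var_b, cov_ab):
    """Portfolio mean return and variance for weight w_a on asset A."""
    mean = w_a * mean_a + (1 - w_a) * mean_b
    var = w_a**2 * var_a + (1 - w_a)**2 * var_b + 2 * w_a * (1 - w_a) * cov_ab
    return mean, var

# Asset A: equity and investment fund shares; asset B: non-financial assets.
mean_a, mean_b = 0.05, 0.03   # illustrative real yields
var_a, var_b = 0.04, 0.01     # illustrative return variances
cov_ab = 0.005

w_star = min_variance_weight(var_a, var_b, cov_ab)
actual_w = 0.30               # hypothetical current equity share

print(f"minimum-variance equity weight: {w_star:.3f}")
print(f"hypothetical actual weight:     {actual_w:.3f}")
```

With these illustrative inputs the minimum-variance equity weight (0.125) lies below the assumed actual share, consistent in spirit with the paper's finding that optimization would reduce holdings of equity and investment fund shares.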


Author(s):  
Valentin Raileanu

The article briefly describes the history and fields of application of the theory of extreme values, including climatology. The data format, the Generalized Extreme Value (GEV) probability distribution with Block Maxima, the Generalized Pareto (GP) distribution with Peaks Over Threshold (POT), and the analysis methods are presented. Estimation of the distribution parameters is done using the Maximum Likelihood Estimation (MLE) method. Installation of the free R software, the minimum set of required commands, and the in2extRemes graphical GUI package are described. As an example, the results of a GEV analysis of a simulated data set in in2extRemes are presented.
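The article's workflow runs in R with in2extRemes; as a language-agnostic illustration of the MLE step, the sketch below fits the Gumbel case (GEV with shape ξ = 0) to simulated block maxima using the standard fixed-point iteration for the maximum-likelihood scale parameter. This is a simplified stand-in, not the article's code, and it does not estimate the shape parameter:

```python
import math
import random

def gumbel_mle(maxima, iters=200):
    """Fit a Gumbel distribution (GEV with shape 0) to block maxima by
    maximum likelihood: iterate the standard fixed point for the scale
    beta, then solve for the location mu in closed form."""
    n = len(maxima)
    xbar = sum(maxima) / n
    beta = (max(maxima) - min(maxima)) / 4  # crude starting value
    for _ in range(iters):
        w = [math.exp(-x / beta) for x in maxima]
        beta = xbar - sum(x * wi for x, wi in zip(maxima, w)) / sum(w)
    w = [math.exp(-x / beta) for x in maxima]
    mu = -beta * math.log(sum(w) / n)
    return mu, beta

# Simulate annual maxima from a known Gumbel(mu=30, beta=5) by inverse CDF.
random.seed(0)
true_mu, true_beta = 30.0, 5.0
maxima = [true_mu - true_beta * math.log(-math.log(random.random()))
          for _ in range(2000)]

mu_hat, beta_hat = gumbel_mle(maxima)
print(f"mu_hat = {mu_hat:.2f}, beta_hat = {beta_hat:.2f}")
```

With 2,000 simulated maxima the estimates land close to the true parameters; a full GEV fit (shape included), as in2extRemes performs, requires numerical optimization of the three-parameter likelihood.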


Author(s):  
Rhodri Saunders ◽  
Rafael Torrejon Torres ◽  
Maximilian Blüher

Introduction: Real-world evidence (RWE) is a useful supplement to a product's evidence base, especially for medical devices, which are often unsuitable for randomized controlled trials. Generally, RWE is analyzed retrospectively (for example, from healthcare records), and such sources lack the granularity needed for health-economic analysis. Prospective collection of RWE in hospitals can support device-specific endpoint assessment. The advent of the General Data Protection Regulation (GDPR) requires a privacy-by-design approach. This work describes a workflow for GDPR-compliant, device-specific RWE collection as part of quality improvement initiatives (QIIs).

Methods: A literature review identifies relevant clinical and quality markers as endpoints for the investigated technology. A panel of experts grades these endpoints on their clinical significance, privacy sensitivity, analytic value, and feasibility of collection. Endpoints meeting a predefined cut-off are considered quality markers for the QII. Finally, an RWE data collection app is designed to collect the quality markers using either longitudinal, pseudonymized data or single time-point anonymized data to ensure data protection by design.

Results: Using this approach, relevant clinical markers were identified in a GDPR-compliant manner. The data collection app design ensured that patient data were protected while maintaining minimum requirements on patient information and consent. The pilot QII collected data on over 5,000 procedures, which represents the largest single data set available for the tested technology. Due to its prospective nature, this programme was the first to collect patient outcomes in sufficient quantity for analysis, whereas previous studies only recorded adverse events.

Conclusions: GDPR and RWE can co-exist in harmony. A design approach that has data protection in mind from the start can combine high-quality RWE collection of efficacy and safety data with maximum patient privacy.
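The split between longitudinal pseudonymized records and single time-point anonymized records might look as follows in code. The field names, key handling, and record shapes are illustrative assumptions, not the programme's implementation:

```python
import hashlib
import hmac

# Illustrative sketch of the two collection modes described above:
# longitudinal records carry a keyed pseudonym so visits can be linked,
# while single time-point records drop identifiers entirely.
# Field names and the key value are hypothetical.

SECRET_KEY = b"site-held-secret"  # would be held by the hospital, not the analyst

def pseudonymize(patient_id: str) -> str:
    """Derive a stable pseudonym with HMAC-SHA256 so records from the
    same patient can be linked without storing the raw identifier."""
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()[:16]

def longitudinal_record(patient_id: str, procedure: str, outcome: str) -> dict:
    return {"pseudonym": pseudonymize(patient_id),
            "procedure": procedure, "outcome": outcome}

def anonymized_record(procedure: str, outcome: str) -> dict:
    # Single time-point mode: no identifier at all, not even a pseudonym.
    return {"procedure": procedure, "outcome": outcome}

r1 = longitudinal_record("MRN-1234", "device insertion", "no adverse event")
r2 = longitudinal_record("MRN-1234", "device follow-up", "healed")
assert r1["pseudonym"] == r2["pseudonym"]  # linkable across visits
assert "patient_id" not in r1              # raw identifier never stored
```

Keeping the HMAC key with the data controller means the pseudonyms cannot be reversed by downstream analysts, which is the essence of the privacy-by-design split the abstract describes.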


2021 ◽  
Author(s):  
Ernesto Gomez ◽  
Ebikebena Ombe ◽  
Brennan Goodkey ◽  
Rafael Carvalho

Abstract In the current oil and gas drilling industry, the modernization of rig fleets has been shifting toward high mobility, artificial intelligence, and computerized systems. Part of this shift includes a move toward automation. This paper summarizes the successful application of a fully automated workflow to drill a stand, from slips out to slips back in, in a complex onshore gas drilling environment. Repeatable processes with adherence to plans and operating practices are a key requirement in the implementation of drilling procedures and vital for optimizing operations in a systematic way. A drilling automation solution has been deployed on two rigs, enabling the automation of both pre-connection and post-connection activities as well as rotary drilling of an interval equivalent to a typical drillpipe stand (approximately 90 ft), while optimizing the rate of penetration (ROP) and managing drilling dysfunctions, such as stick-slip and drillstring vibrations, in a consistent manner. So far, a total of nine wells have been drilled using this solution. The automation system is configured with the outputs of the drilling program, including the drilling parameters roadmap, bottomhole assembly tools, and subsurface constraints. Before drilling each stand, the driller is presented with the planned configuration and can adjust settings whenever necessary. Once a goal is specified, the system directs the rig control system to command the surface equipment (drawworks, auto-driller, top drive, and pumps). Everything is undertaken in the context of a workflow that reflects standard operating procedures. This solution runs with minimal intervention from the driller, and contextual information for each workflow is continuously displayed, giving the driller the best capacity to monitor and supervise the operational sequence.
If drilling conditions change, the system responds by automatically changing the sequence of activities to execute mitigation procedures and achieve the desired goal. At all times, the driller has the option to override the automation system and assume control with a simple touch on the rig controls. Prior to deployment, key performance indicators (KPIs), including automated rig-state-based measures, were selected. These KPIs were then monitored while drilling each well with the automation system to compare performance against a pre-deployment baseline. The solution was used to drill almost 60,000 ft of hole section with the system in control, and the results showed a 20% improvement in ROP with increased adherence to pre-connection and post-connection operations. Additionally, many lessons learned from the use and observation of the automation workflow were used to drive continuous improvement in efficiency and performance over the course of the project. This deployment was the first in the region, and the system is part of a comprehensive digital well construction solution that is continuously enriched with new capabilities. This adaptive automated drilling solution delivered a step change in performance, safety, and consistency in drilling operations.
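The ROP comparison described is, in essence, footage drilled per on-bottom hour, compared between automated stands and a baseline. The sketch below uses made-up per-stand records chosen to illustrate a 20% improvement; they are not the deployment's data:

```python
# KPI sketch with illustrative numbers (not the deployment's data):
# rate of penetration (ROP) as footage drilled per on-bottom hour, and
# the percentage improvement of automated stands over a manual baseline.

def avg_rop(stands):
    """Aggregate ROP over a list of (footage_ft, on_bottom_hours) stands."""
    total_ft = sum(ft for ft, _ in stands)
    total_hr = sum(hr for _, hr in stands)
    return total_ft / total_hr

# Hypothetical per-stand records: (footage in ft, hours on bottom).
baseline_stands = [(90, 1.2), (90, 1.1), (90, 1.3)]
automated_stands = [(90, 1.0), (90, 0.95), (90, 1.05)]

baseline = avg_rop(baseline_stands)
automated = avg_rop(automated_stands)
improvement = 100 * (automated - baseline) / baseline

print(f"baseline ROP {baseline:.1f} ft/h, automated {automated:.1f} ft/h, "
      f"improvement {improvement:.0f}%")
```

Aggregating total footage over total on-bottom time, rather than averaging per-stand ROP values, weights each stand by its drilling time and avoids the bias of a simple mean of ratios.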


Author(s):  
Saulius Daukantas ◽  
Vaidotas Marozas ◽  
George Drosatos ◽  
Eleni Kaldoudi ◽  
Arunas Lukosevicius

2018 ◽  
Vol 7 (04) ◽  
pp. 871-888 ◽  
Author(s):  
Sophie J. Lee ◽  
Howard Liu ◽  
Michael D. Ward

Improving geolocation accuracy in text data has long been a goal of automated text processing. We depart from the conventional method and introduce a two-stage supervised machine-learning algorithm that evaluates each location mention to be either correct or incorrect. We extract contextual information from texts, i.e., N-gram patterns for location words, mention frequency, and the context of sentences containing location words. We then estimate model parameters using a training data set and use this model to predict whether a location word in the test data set accurately represents the location of an event. We demonstrate these steps by constructing customized geolocation event data at the subnational level using news articles collected from around the world. The results show that the proposed algorithm outperforms existing geocoders even in a case added post hoc to test the generality of the developed algorithm.
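The per-mention classification idea can be sketched with a tiny Naive Bayes model over contextual features of each location word. The feature design, training examples, and model are simplified illustrations, not the paper's algorithm or data:

```python
import math
from collections import defaultdict

# Sketch (not the paper's code): each location mention is turned into
# contextual features (neighboring words, mention length) and scored by
# a small Naive Bayes classifier as correct (1) or incorrect (0).

def features(sentence: str, mention: str) -> list:
    words = sentence.lower().split()
    m = mention.lower()
    feats = ["len=" + str(len(m))]
    if m in words:
        i = words.index(m)
        if i > 0:
            feats.append("prev=" + words[i - 1])   # left-context word
        if i + 1 < len(words):
            feats.append("next=" + words[i + 1])   # right-context word
    return feats

def train(examples):
    """examples: list of (sentence, mention, label) with label in {0, 1}."""
    counts = {0: defaultdict(int), 1: defaultdict(int)}
    totals = {0: 0, 1: 0}
    for sent, mention, label in examples:
        for f in features(sent, mention):
            counts[label][f] += 1
        totals[label] += 1
    return counts, totals

def predict(model, sentence, mention):
    counts, totals = model
    n = totals[0] + totals[1]
    scores = {}
    for label in (0, 1):
        score = math.log(totals[label] / n)  # class prior
        for f in features(sentence, mention):
            # add-one smoothing over feature counts
            score += math.log((counts[label][f] + 1) / (totals[label] + 2))
        scores[label] = score
    return max(scores, key=scores.get)

train_set = [
    ("protest erupted in cairo yesterday", "cairo", 1),
    ("clashes reported in aleppo today", "aleppo", 1),
    ("the reporter paris smith filed the story", "paris", 0),
    ("analyst berlin jones commented on it", "berlin", 0),
]
model = train(train_set)
print(predict(model, "fighting broke out in cairo overnight", "cairo"))
```

Even this toy model picks up that a preceding "in" signals a genuine event location, while a mention embedded in a person's name does not, which is the intuition behind the paper's contextual features.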


2019 ◽  
Vol 2 (2) ◽  
pp. 169-187 ◽  
Author(s):  
Ruben C. Arslan

Data documentation in psychology lags behind not only many other disciplines, but also basic standards of usefulness. Psychological scientists often prefer to invest the time and effort that would be necessary to document existing data well in other duties, such as writing and collecting more data. Codebooks therefore tend to be unstandardized and stored in proprietary formats, and they are rarely properly indexed in search engines. This means that rich data sets are sometimes used only once—by their creators—and left to disappear into oblivion. Even if they can find an existing data set, researchers are unlikely to publish analyses based on it if they cannot be confident that they understand it well enough. My codebook package makes it easier to generate rich metadata in human- and machine-readable codebooks. It uses metadata from existing sources and automates some tedious tasks, such as documenting psychological scales and reliabilities, summarizing descriptive statistics, and identifying patterns of missingness. The codebook R package and Web app make it possible to generate a rich codebook in a few minutes and just three clicks. Over time, its use could lead to psychological data becoming findable, accessible, interoperable, and reusable, thereby reducing research waste and benefiting both its users and the scientific community as a whole.
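At its core, a generated codebook records per-variable descriptive statistics and missingness. The sketch below shows that minimal core with hypothetical variables; the actual codebook R package does far more (scales, reliabilities, machine-readable metadata for search engines):

```python
import statistics

# Minimal codebook-style summary (hypothetical data): per-variable
# sample size, missingness, and descriptive statistics.

data = {
    "age": [25, 31, None, 44, 29, None, 38],
    "extraversion": [3.5, 4.0, 2.5, None, 3.0, 4.5, 3.5],
}

def describe(name, values):
    present = [v for v in values if v is not None]
    return {
        "variable": name,
        "n": len(present),
        "n_missing": len(values) - len(present),
        "mean": round(statistics.mean(present), 2),
        "sd": round(statistics.stdev(present), 2),
    }

codebook = [describe(name, vals) for name, vals in data.items()]
for row in codebook:
    print(row)
```

Emitting this table in a standardized, machine-readable format (rather than a proprietary document) is what makes the resulting data set findable and reusable by others.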


Author(s):  
JIANLONG ZHOU ◽  
ZHIYAN WANG ◽  
KLAUS D. TÖNNIES

In this paper, a new approach named focal region-based volume rendering for visualizing internal structures of volumetric data is presented. This approach integrates contextual information, derived from a structure analysis of the data set, with a lens-like focal region rendering that shows more detailed information. The feature-based approach contains three main components: (i) a feature extraction model using 3D image processing techniques to explore the structure of objects and provide contextual information; (ii) an efficient ray-bounded volume ray casting renderer to provide detailed information about the volume of interest in the focal region; and (iii) tools for manipulating focal regions to make the approach more flexible. The approach provides a powerful framework for producing detailed information from volumetric data. Presenting contextual information and focal region renditions at the same time makes the volume information easier for scientists to understand and comprehend. The interaction techniques provided make focal region-based volume rendering more flexible and easier to use.
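The focal-region idea can be reduced to a toy ray caster: sample finely inside a spherical focal region and coarsely outside it, compositing opacity front-to-back. The synthetic field, step sizes, and geometry below are illustrative assumptions, not the paper's renderer:

```python
import math

# Toy sketch of focal-region sampling (not the paper's implementation):
# a ray through a synthetic scalar field takes fine steps inside a
# spherical focal region and coarse steps outside it.

def field(x, y, z):
    """Synthetic density: a soft blob centred at the origin."""
    return math.exp(-(x * x + y * y + z * z))

def cast_ray(origin, direction, focal_centre, focal_radius,
             length=4.0, fine=0.02, coarse=0.2):
    t, alpha = 0.0, 0.0
    n_fine = n_coarse = 0
    while t < length and alpha < 0.99:      # early ray termination
        p = [o + t * d for o, d in zip(origin, direction)]
        if math.dist(p, focal_centre) < focal_radius:
            step = fine                      # detail only inside the focus
            n_fine += 1
        else:
            step = coarse
            n_coarse += 1
        a = min(1.0, field(*p) * step)       # opacity from density
        alpha += (1 - alpha) * a             # front-to-back compositing
        t += step
    return alpha, n_fine, n_coarse

# Ray through the blob, with the focal region centred on the blob.
alpha, n_fine, n_coarse = cast_ray((-2, 0, 0), (1, 0, 0), (0, 0, 0), 0.5)
print(f"alpha={alpha:.3f}, fine samples={n_fine}, coarse samples={n_coarse}")
```

Scaling opacity by the step size keeps the composited result consistent across the two sampling rates, so the focal region gains detail without visibly changing overall density.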


2015 ◽  
Vol 10 (1) ◽  
pp. 82-94 ◽  
Author(s):  
Tiffany Chao

Understanding the methods and processes implemented by data producers to generate research data is essential for fostering data reuse. Yet, producing the metadata that describes these methods remains a time-intensive activity that data producers do not readily undertake. In particular, researchers in the long tail of science often lack the financial support or tools for metadata generation, thereby limiting future access and reuse of data produced. The present study investigates research journal publications as a potential source for identifying descriptive metadata about methods for research data. Initial results indicate that journal articles provide rich descriptive content that can be sufficiently mapped to existing metadata standards with methods-related elements, resulting in a mapping of the data production process for a study. This research has implications for enhancing the generation of robust metadata to support the curation of research data for new inquiry and innovation.
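The mapping from article text to methods-related metadata elements can be sketched as cue-word matching over methods-section sentences. The element names, cue lists, and sample text below are illustrative, not drawn from any particular metadata standard or study:

```python
import re

# Toy sketch of mapping methods-section sentences to methods-related
# metadata elements via cue words. Element names and cues are
# illustrative, not a published standard's terms.

ELEMENT_CUES = {
    "instrument": ["spectrometer", "sensor", "questionnaire", "camera"],
    "sampling": ["sampled", "collected", "recruited"],
    "processing": ["normalized", "filtered", "calibrated"],
}

def map_sentences(text: str) -> dict:
    mapping = {element: [] for element in ELEMENT_CUES}
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        for element, cues in ELEMENT_CUES.items():
            if any(cue in lowered for cue in cues):
                mapping[element].append(sentence.strip())
    return mapping

methods_text = ("Water samples were collected weekly at three sites. "
                "Each sample was filtered and normalized against a blank. "
                "Absorbance was measured with a UV spectrometer.")

for element, sentences in map_sentences(methods_text).items():
    print(element, "->", sentences)
```

Real mappings would need richer NLP than keyword matching, but the output shape (metadata element to supporting sentences) mirrors the kind of methods description the study extracts from journal articles.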

