A multi-decade record of high-quality fCO2 data in version 3 of the Surface Ocean CO2 Atlas (SOCAT)

2016 ◽  
Vol 8 (2) ◽  
pp. 383-413 ◽  
Author(s):  
Dorothee C. E. Bakker ◽  
Benjamin Pfeil ◽  
Camilla S. Landa ◽  
Nicolas Metzl ◽  
Kevin M. O'Brien ◽  
...  

Abstract. The Surface Ocean CO2 Atlas (SOCAT) is a synthesis of quality-controlled fCO2 (fugacity of carbon dioxide) values for the global surface oceans and coastal seas with regular updates. Version 3 of SOCAT has 14.7 million fCO2 values from 3646 data sets covering the years 1957 to 2014. This latest version has an additional 4.6 million fCO2 values relative to version 2 and extends the record from 2011 to 2014. Version 3 also significantly increases the data availability for 2005 to 2013. SOCAT has an average of approximately 1.2 million surface water fCO2 values per year for the years 2006 to 2012. The quality and documentation of the data have improved. A new feature is the data set quality control (QC) flag of E for data from alternative sensors and platforms. The accuracy of surface water fCO2 has been defined for all data set QC flags. Automated range checking has been carried out for all data sets during their upload into SOCAT. The upgrade of the interactive Data Set Viewer (previously known as the Cruise Data Viewer) allows better interrogation of the SOCAT data collection and rapid creation of high-quality figures for scientific presentations. Automated data upload has been launched for version 4 and will enable more frequent SOCAT releases in the future. High-profile scientific applications of SOCAT include quantification of the ocean sink for atmospheric carbon dioxide and its long-term variation, detection of ocean acidification, as well as evaluation of coupled-climate and ocean-only biogeochemical models. Users of SOCAT data products are urged to acknowledge the contribution of data providers, as stated in the SOCAT Fair Data Use Statement. This ESSD (Earth System Science Data) "living data" publication documents the methods and data sets used for the assembly of this new version of the SOCAT data collection and compares these with those used for earlier versions of the data collection (Pfeil et al., 2013; Sabine et al., 2013; Bakker et al., 2014).
Individual data set files, included in the synthesis product, can be downloaded here: doi:10.1594/PANGAEA.849770. The gridded products are available here: doi:10.3334/CDIAC/OTG.SOCAT_V3_GRID.
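The gridded products cited above reduce the individual fCO2 observations to averages on a regular latitude-longitude grid. The following is a hedged illustration of that idea only; the function, cell size, and toy values are invented for the sketch and are not the SOCAT gridding recipe, which is documented by the project:

```python
from collections import defaultdict

def grid_mean(observations, cell_deg=1.0):
    """Average point observations onto a regular lat/lon grid.

    observations: iterable of (lat, lon, value) tuples.
    Returns a dict mapping (lat_index, lon_index) -> mean value,
    where indices are floor(coordinate / cell_deg).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in observations:
        cell = (int(lat // cell_deg), int(lon // cell_deg))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Toy fCO2-like values (uatm); the first two fall in the same 1-degree cell.
obs = [(50.2, -4.3, 360.0), (50.7, -4.9, 380.0), (12.1, 45.5, 400.0)]
gridded = grid_mean(obs)
# The two observations near 50 N, 5 W average to 370.0 uatm.
```

A real gridded product additionally tracks sampling counts, uncertainties, and several temporal resolutions rather than a single mean per cell.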


2021 ◽  
Vol 4 (1) ◽  
pp. 251524592092800
Author(s):  
Erin M. Buchanan ◽  
Sarah E. Crain ◽  
Ari L. Cunningham ◽  
Hannah R. Johnson ◽  
Hannah Stash ◽  
...  

As researchers embrace open and transparent data sharing, they will need to provide information about their data that effectively helps others understand their data sets’ contents. Without proper documentation, data stored in online repositories such as OSF will often be rendered unfindable and unreadable by other researchers and indexing search engines. Data dictionaries and codebooks provide a wealth of information about variables, data collection, and other important facets of a data set. This information, called metadata, provides key insights into how the data might be further used in research and facilitates search-engine indexing to reach a broader audience of interested parties. This Tutorial first explains terminology and standards relevant to data dictionaries and codebooks. Accompanying information on OSF presents a guided workflow of the entire process from source data (e.g., survey answers on Qualtrics) to an openly shared data set accompanied by a data dictionary or codebook that follows an agreed-upon standard. Finally, we discuss freely available Web applications to assist this process of ensuring that psychology data are findable, accessible, interoperable, and reusable.
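To make the notion of a data dictionary concrete, the sketch below derives a minimal machine-readable codebook from tabular records. All variable names and descriptions here are invented for illustration; a real codebook should follow an agreed-upon metadata standard, as the Tutorial recommends:

```python
import json

# A toy data set: survey-like records (invented for illustration).
rows = [
    {"participant_id": 1, "age": 34, "condition": "control"},
    {"participant_id": 2, "age": 28, "condition": "treatment"},
]

# Hand-written variable descriptions, as a researcher would supply them.
descriptions = {
    "participant_id": "Unique identifier assigned at enrollment",
    "age": "Self-reported age in years",
    "condition": "Experimental condition (control/treatment)",
}

def build_codebook(rows, descriptions):
    """Derive a simple data dictionary: name, inferred type, example, description."""
    codebook = []
    for name in rows[0]:
        values = [r[name] for r in rows]
        codebook.append({
            "variable": name,
            "type": type(values[0]).__name__,
            "example": values[0],
            "description": descriptions.get(name, ""),
        })
    return codebook

# Serialize alongside the data so repositories and search engines can index it.
print(json.dumps(build_codebook(rows, descriptions), indent=2))
```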


2017 ◽  
Vol 9 (1) ◽  
pp. 211-220 ◽  
Author(s):  
Amelie Driemel ◽  
Eberhard Fahrbach ◽  
Gerd Rohardt ◽  
Agnieszka Beszczynska-Möller ◽  
Antje Boetius ◽  
...  

Abstract. Measuring temperature and salinity profiles in the world's oceans is crucial to understanding ocean dynamics and its influence on the heat budget, the water cycle, the marine environment and on our climate. Since 1983 the German research vessel and icebreaker Polarstern has been the platform of numerous CTD (conductivity, temperature, depth instrument) deployments in the Arctic and the Antarctic. We report on a unique data collection spanning 33 years of polar CTD data. In total 131 data sets (1 data set per cruise leg) containing data from 10 063 CTD casts are now freely available at doi:10.1594/PANGAEA.860066. During this long period five CTD types with different characteristics and accuracies have been used. Therefore the instruments and processing procedures (sensor calibration, data validation, etc.) are described in detail. This compilation is special not only with regard to the quantity but also the quality of the data – the latter indicated for each data set using defined quality codes. The complete data collection includes a number of repeated sections for which the quality code can be used to investigate and evaluate long-term changes. Beginning with 2010, the salinity measurements presented here are of the highest quality possible in this field owing to the introduction of the OPTIMARE Precision Salinometer.


Author(s):  
Avinash Navlani ◽  
V. B. Gupta

In the last couple of decades, clustering has become a crucial research problem in the data mining community. Clustering refers to the partitioning of data objects, such as records and documents, into groups or clusters of similar characteristics. Because clustering is unsupervised learning, there is no unique solution for all problems. Complex data sets often require explanation through multiple clusterings, yet traditional clustering approaches generate a single clustering. A data set may contain more than one pattern, and each pattern can be interesting from a different perspective. Alternative clustering aims to find all distinct groupings of a data set such that each grouping is of high quality and distinct from the others. This chapter gives an overall view of alternative clustering: its various approaches, related work, comparisons with easily confused terms such as subspace, multi-view, and ensemble clustering, applications, issues, and challenges.
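A toy example makes the premise of alternative clustering concrete: the same data can admit more than one internally coherent grouping. The helper below is a deliberately simple stand-in (a threshold on one coordinate), invented for illustration and not one of the surveyed algorithms:

```python
def cluster_by_axis(points, axis, threshold):
    """Partition 2-D points into two clusters by thresholding one coordinate.

    A minimal stand-in for a clustering algorithm, used only to show that
    distinct, equally valid groupings of the same data can coexist.
    """
    return [frozenset(p for p in points if p[axis] < threshold),
            frozenset(p for p in points if p[axis] >= threshold)]

# Corners of a rectangle: separated both horizontally and vertically.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]

by_x = cluster_by_axis(points, axis=0, threshold=5)    # left vs right pairs
by_y = cluster_by_axis(points, axis=1, threshold=0.5)  # bottom vs top pairs

# Both partitions cover the data, but they group the points differently;
# alternative clustering methods search for such dissimilar solutions.
```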


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 879 ◽  
Author(s):  
Uwe Köckemann ◽  
Marjan Alirezaie ◽  
Jennifer Renoux ◽  
Nicolas Tsiftes ◽  
Mobyen Uddin Ahmed ◽  
...  

As research in smart homes and activity recognition increases, it is ever more important to have benchmark systems and data upon which researchers can compare methods. While synthetic data can be useful for certain method developments, real data sets that are open and shared are equally important. This paper presents the E-care@home system, its installation in a real home setting, and a series of data sets that were collected using the E-care@home system. Our first contribution, the E-care@home system, is a collection of software modules for data collection, labeling, and various reasoning tasks such as activity recognition, person counting, and configuration planning. It supports a heterogeneous set of sensors that can be extended easily and connects collected sensor data to higher-level Artificial Intelligence (AI) reasoning modules. Our second contribution is a series of open data sets which can be used to recognize activities of daily living. In addition to these data sets, we describe the technical infrastructure that we have developed to collect the data and the physical environment. Each data set is annotated with ground-truth information, making it relevant for researchers interested in benchmarking different algorithms for activity recognition.


Geophysics ◽  
1993 ◽  
Vol 58 (9) ◽  
pp. 1281-1296 ◽  
Author(s):  
V. J. S. Grauch

The magnetic data set compiled for the Decade of North American Geology (DNAG) project presents an important digital data base that can be used to examine the North American crust. The data represent a patchwork from many individual airborne and marine magnetic surveys. However, the portion of data for the conterminous U.S. has problems that limit the resolution and use of the data. Now that the data are available in digital form, it is important to describe the data limitations more specifically than before. The primary problem is caused by datum shifts between individual survey boundaries. In the western U.S., the DNAG data are generally shifted less than 100 nT. In the eastern U.S., the DNAG data may be shifted by as much as 300 nT and contain regionally shifted areas with wavelengths on the order of 800 to 1400 km. The worst case is the artificial low centered over Kentucky and Tennessee produced by a series of datum shifts. A second significant problem is lack of anomaly resolution that arises primarily from using survey data that is too widely spaced compared to the flight heights above magnetic sources. Unfortunately, these are the only data available for much of the U.S. Another problem is produced by the lack of common observation surface between individual pieces of the U.S. DNAG data. The height disparities introduce variations in spatial frequency content that are unrelated to the magnetization of rocks. The spectral effects of datum shifts and the variation of spatial frequency content due to height disparities were estimated for the DNAG data for the conterminous U.S. As a general guideline for digital filtering, the most reliable features in the U.S. DNAG data have wavelengths roughly between 170 and 500 km, or anomaly half‐widths between 85 and 250 km. High‐quality, large‐region magnetic data sets have become increasingly important to meet exploration and scientific objectives. 
The acquisition of a new national magnetic data set with higher quality at a greater range of wavelengths is clearly in order. The best approach is to refly much of the U.S. with common specifications and reduction procedures. At the very least, magnetic data sets should be remerged digitally using available or newly flown long‐distance flight‐line data to adjust survey levels. In any case, national coordination is required to produce a consistent, high‐quality national magnetic map.


2012 ◽  
Vol 5 (2) ◽  
pp. 735-780 ◽  
Author(s):  
B. Pfeil ◽  
A. Olsen ◽  
D. C. E. Bakker ◽  
S. Hankin ◽  
H. Koyuk ◽  
...  

Abstract. A well documented, publicly available, global data set of surface ocean carbon dioxide (CO2) parameters has been called for by international groups for nearly two decades. The Surface Ocean CO2 Atlas (SOCAT) project was initiated by the international marine carbon science community in 2007 with the aim of providing a comprehensive, publicly available, regularly updated, global data set of marine surface CO2, which had been subject to quality control (QC). Many additional CO2 data, not yet made public via the Carbon Dioxide Information Analysis Center (CDIAC), were retrieved from data originators, public websites and other data centres. All data were put in a uniform format following a strict protocol. Quality control was carried out according to clearly defined criteria. Regional specialists performed the quality control, using state-of-the-art web-based tools, specially developed for accomplishing this global team effort. SOCAT version 1.5 was made public in September 2011 and holds 6.3 million quality controlled surface CO2 data points from the global oceans and coastal seas, spanning four decades (1968–2007). Three types of data products are available: individual cruise files, a merged complete data set and gridded products. With the rapid expansion of marine CO2 data collection and the importance of quantifying net global oceanic CO2 uptake and its changes, sustained data synthesis and data access are priorities.


2019 ◽  
Vol 16 (3) ◽  
pp. 705-731
Author(s):  
Haoze Lv ◽  
Zhaobin Liu ◽  
Zhonglian Hu ◽  
Lihai Nie ◽  
Weijiang Liu ◽  
...  

With the advent of the big data era, data releasing has become a hot topic in the database community, and data privacy is drawing increasing attention from users. Among the privacy protection models that have been proposed, the differential privacy model is widely utilized because of its many advantages over other models. However, for the private release of multi-dimensional data sets, existing algorithms usually publish data with low availability, because the noise in the released data grows rapidly as the number of dimensions increases. In view of this issue, we propose algorithms based on regular and irregular marginal tables of frequent item sets to protect privacy and promote availability. The main idea is to reduce the dimension of the data set and to achieve differential privacy protection with Laplace noise. First, we propose a marginal table cover algorithm based on frequent items that considers the effectiveness of query cover combinations, and obtain a regular marginal table cover set of smaller size but higher data availability. Then, a differential privacy model with irregular marginal tables is proposed for application scenarios with low data availability and high cover rate. Next, we derive an approximate optimal marginal table cover algorithm to obtain the query cover set that satisfies the multi-level query policy constraint. Thus, a balance between privacy protection and data availability is achieved. Finally, extensive experiments on synthetic and real databases demonstrate that the proposed method performs better than state-of-the-art methods in most cases.
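The Laplace-noise step mentioned above is the standard Laplace mechanism. The sketch below illustrates that generic primitive only, not the authors' marginal-table algorithms; the function names and toy table are invented, and a count query is assumed to have sensitivity 1 (one individual changes any cell count by at most 1):

```python
import math
import random

def laplace_sample(scale, rng):
    """Draw one Laplace(0, scale) variate by inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def privatize_marginal(counts, epsilon, sensitivity=1.0, seed=None):
    """Add Laplace(sensitivity / epsilon) noise to each cell of a marginal table."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [c + laplace_sample(scale, rng) for c in counts]

# A toy one-way marginal table of item-set frequencies.
marginal = [120, 45, 30, 5]
noisy = privatize_marginal(marginal, epsilon=1.0, seed=42)
# Each noisy cell deviates from the true count by Laplace(0, 1) noise.
```

Note that when several marginal tables are released, the privacy budget ε must be split among them, since differential privacy guarantees compose additively across queries.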


Data ◽  
2019 ◽  
Vol 4 (1) ◽  
pp. 26 ◽  
Author(s):  
Collin Gros ◽  
Jeremy Straub

Facial recognition, as well as other types of human recognition, has found uses in identification, security, and the study of behavior, among other areas. Because of the high cost of data collection for training purposes, logistical challenges, and other impediments, mirroring images has frequently been used to increase the size of data sets. However, while these larger data sets have been shown to be beneficial, their benefit relative to collecting additional similar data has not been assessed. This paper presents a data set collected and prepared for this and related research purposes. The data set includes both non-occluded and occluded data for mirroring assessment.


2022 ◽  
Vol 163 (2) ◽  
pp. 62
Author(s):  
E. Spalding ◽  
K. M. Morzinski ◽  
P. Hinz ◽  
J. Males ◽  
M. Meyer ◽  
...  

Abstract The Large Binocular Telescope (LBT) has two 8.4 m primary mirrors that produce beams that can be combined coherently in a “Fizeau” interferometric mode. In principle, the Fizeau point-spread function (PSF) enables the probing of structure at a resolution up to three times better than that of the adaptive-optics-corrected PSF of a single 8.4 m telescope. In this work, we examined the nearby star Altair (5.13 pc, type A7V, hundreds of Myr to ≈1.4 Gyr) in the Fizeau mode with the LBT at Brα (4.05 μm) and carried out angular differential imaging to search for companions. This work presents the first filled-aperture LBT Fizeau science data set to benefit from a correcting mirror that provides active phase control. In the analysis of the λ/D angular regime, the sensitivity of the data set is down to ≈0.5 M⊙ at 1″ for a 1.0 Gyr system. This sensitivity remains limited by the small amount of integration time, which is in turn limited by the instability of the Fizeau PSF. However, in the Fizeau fringe regime we attain sensitivities of Δm ≈ 5 at 0.″2 and put constraints on companions of 1.3 M⊙ down to an inner angle of ≈0.″15, closer than any previously published direct imaging of Altair. This analysis is a pathfinder for future data sets of this type, and represents some of the first steps to unlocking the potential of the first Extremely Large Telescope. Fizeau observations will be able to reach dimmer targets with upgrades to the instrument, in particular the phase detector.

