Individual vs. Collaborative Methods of Crowdsourced Transcription

2019 ◽  
Vol Special Issue on Collecting,... ◽  
Author(s):  
Samantha Blickhan ◽  
Coleman Krawczyk ◽  
Daniel Hanson ◽  
Amy Boyer ◽  
Andrea Simenstad ◽  
...  

International audience While online crowdsourced text transcription projects have proliferated in the last decade, there is a need within the broader field to understand differences in project outcomes as they relate to task design, as well as to experiment with different models of online crowdsourced transcription that have not yet been explored. The experiment discussed in this paper involves the evaluation of newly-built tools on the Zooniverse.org crowdsourcing platform, attempting to answer the research question: "Does the current Zooniverse methodology of multiple independent transcribers and aggregation of results render higher-quality outcomes than allowing volunteers to see previous transcriptions and/or markings by other users? How does each methodology impact the quality and depth of analysis and participation?" To answer these questions, the Zooniverse team ran an A/B experiment on the project Anti-Slavery Manuscripts at the Boston Public Library. This paper will share results of this study, and also describe the process of designing the experiment and the metrics used to evaluate each transcription method. These include the comparison of aggregate transcription results with ground truth data; evaluation of annotation methods; the time it took for volunteers to complete transcribing each dataset; and the level of engagement with other project elements such as posting on the message board or reading supporting documentation. Particular focus will be given to the (at times) competing goals of data quality, efficiency, volunteer engagement, and user retention, all of which are of high importance for projects that focus on data from galleries, libraries, archives and museums. Ultimately, this paper aims to provide a model for impactful, intentional design and study of online crowdsourcing transcription methods, as well as shed light on the associations between project design, methodology and outcomes.
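One of the evaluation metrics above, comparison of aggregate transcription results with ground truth, is commonly computed as a character error rate based on edit distance. As an illustration only (the actual Zooniverse aggregation pipeline is not described here, and the helper names are assumptions), a minimal sketch:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def char_error_rate(transcription: str, ground_truth: str) -> float:
    # Edit distance normalised by ground-truth length; 0.0 is a perfect match.
    return levenshtein(transcription, ground_truth) / len(ground_truth)
```

A lower character error rate for one transcription method would indicate higher-quality outcomes under this metric.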

2020 ◽  
Vol 41 (8/9) ◽  
pp. 669-676
Author(s):  
Nathalie Colasanti ◽  
Valerio Fiori ◽  
Rocco Frondizi

Purpose – The aim of the paper is to investigate the impact of nudges and considerations stemming from behavioural economics on the promotion and enhancement of knowledge circulation in public libraries. In fact, literature indicates that an approach based on nudging individuals towards desired behaviours may be more effective than top-down policy actions that may be perceived as excessive. Design/methodology/approach – In order to answer the research question, the paper analyses an exploratory case study regarding the network of public libraries in Rome, called Biblioteche di Roma (BdR). BdR launched its online platform in 2009, but it was never able to create a strong connection with offline activities, and contributions by readers (such as comments and book ratings) remained very low. In 2018, BdR introduced a gamification section on its website, with the goal of increasing users' interactions and book circulation. Data resulting from the use of gamification, both at city level and within different neighbourhoods, are presented and analysed. Findings – Results indicate that the introduction of gamification was successful in increasing users' interactions and engagement, both online and offline. Originality/value – The paper is valuable as it explores the introduction of nudge theory and gamification in the public library system.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Gila Prebor

Purpose – The purpose of this study is to examine how different feminist Facebook groups in Israel operate, in order to better understand the main issues in their discussions about feminism in Israel and to identify the variances between the different subgroups. A secondary research question was whether Voyant Tools can serve as an effective tool for content analysis of text in general, and of Hebrew in particular. Design/methodology/approach – The study analyzes the content of Facebook posts using the Voyant Tools online toolkit for quantitative text mining and data visualization. The sample consists of the texts of posts from three groups representing different currents in Israeli feminism, gathered over a period of three months. Findings – The results show that while some high-frequency words occur in all groups, each group has unique words that distinguish it from the others. The Feminist and Halachic Feminist groups had few words in common, while the Religious Feminist group shared more words with both, and more so with the latter. While all groups discussed violence against women, especially sexual violence, the degree of engagement varied greatly between the groups, and there were clear differences in the prominent issues concerning each group. Originality/value – This paper demonstrates the possibility of using Voyant Tools for text mining and analysis, shedding light on common concepts and their location and prevalence in the text.
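The frequency comparison reported in the findings, shared high-frequency words versus words unique to each group, can be sketched in a few lines. This is an illustrative approximation, not Voyant Tools' actual implementation; the `group_vocabularies` helper and its naive whitespace tokenization are assumptions:

```python
from collections import Counter

def group_vocabularies(groups):
    # groups: {group_name: concatenated post text for that group}.
    counts = {name: Counter(text.lower().split()) for name, text in groups.items()}
    # Words occurring in every group's posts.
    shared = set.intersection(*(set(c) for c in counts.values()))
    # Words found in only one group, distinguishing it from the others.
    unique = {name: {w for w in c
                     if all(w not in other for g, other in counts.items() if g != name)}
              for name, c in counts.items()}
    return shared, unique
```

Real analysis of Hebrew text would additionally require tokenization and stop-word handling suited to the language.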


2014 ◽  
Vol 115 (1/2) ◽  
pp. 65-67
Author(s):  
Mike Freeman

Purpose – The purpose of this report is to provide an account of the UK's new public library of Birmingham. Design/methodology/approach – Details the construction, exterior and interior design, contents and location, including special collections, and describes the official opening. Findings – A large public library building which moves away from old conceptions of libraries, accommodating a variety of functions in a striking and accessible building. Originality/value – Provides a description of the new Library of Birmingham, the largest public library in Europe.


2021 ◽  
Vol 13 (10) ◽  
pp. 1966
Author(s):  
Christopher W Smith ◽  
Santosh K Panda ◽  
Uma S Bhatt ◽  
Franz J Meyer ◽  
Anushree Badola ◽  
...  

In recent years, there have been rapid improvements in both remote sensing methods and satellite image availability that have the potential to massively improve burn severity assessments of the Alaskan boreal forest. In this study, we utilized recent pre- and post-fire Sentinel-2 satellite imagery of the 2019 Nugget Creek and Shovel Creek burn scars located in Interior Alaska to both assess burn severity across the burn scars and test the effectiveness of several remote sensing methods for generating accurate map products: Normalized Difference Vegetation Index (NDVI), Normalized Burn Ratio (NBR), and Random Forest (RF) and Support Vector Machine (SVM) supervised classification. We used 52 Composite Burn Index (CBI) plots from the Shovel Creek burn scar and 28 from the Nugget Creek burn scar for training classifiers and product validation. For the Shovel Creek burn scar, the RF and SVM machine learning (ML) classification methods outperformed the traditional spectral indices that use linear regression to separate burn severity classes (RF and SVM accuracy, 83.33%, versus NBR accuracy, 73.08%). However, for the Nugget Creek burn scar, the NDVI product (accuracy: 96%) outperformed the other indices and ML classifiers. We demonstrated that ML classifiers can be very effective for reliable mapping of burn severity in the Alaskan boreal forest when sufficient ground truth data are available. Because classifier performance depends on the quantity of ground truth data, the ML methods are better suited to assessing burn severity when ground truth data are plentiful, whereas the traditional spectral indices are better suited when ground truth data are limited. We also examined the relationship between burn severity, fuel type, and topography (aspect and slope) and found that it is site-dependent.
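The NBR index mentioned above has a standard formula, NBR = (NIR − SWIR) / (NIR + SWIR), with burn severity usually derived from the pre-fire minus post-fire difference (dNBR). A minimal per-pixel sketch, assuming Sentinel-2 band 8 (NIR) and band 12 (SWIR) reflectances; the class thresholds shown are one commonly cited set (after Key and Benson), not necessarily those used in this study:

```python
def nbr(nir: float, swir: float) -> float:
    # Normalized Burn Ratio for a single pixel.
    return (nir - swir) / (nir + swir)

def dnbr(pre_nir, pre_swir, post_nir, post_swir) -> float:
    # Differenced NBR: pre-fire minus post-fire; larger values mean more severe burns.
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)

def severity_class(d: float) -> str:
    # Illustrative dNBR thresholds; in practice these are calibrated
    # against CBI field plots like those used in the study.
    if d < 0.1:
        return "unburned"
    if d < 0.27:
        return "low"
    if d < 0.66:
        return "moderate"
    return "high"
```

The RF and SVM classifiers, by contrast, learn the mapping from spectral bands to severity classes directly from the CBI training plots instead of thresholding a single index.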


2020 ◽  
Vol 13 (1) ◽  
pp. 26
Author(s):  
Wen-Hao Su ◽  
Jiajing Zhang ◽  
Ce Yang ◽  
Rae Page ◽  
Tamas Szinyei ◽  
...  

In many regions of the world, wheat is vulnerable to severe yield and quality losses from the fungal disease Fusarium head blight (FHB). The development of resistant cultivars is one means of ameliorating the devastating effects of this disease, but the breeding process requires the evaluation of hundreds of lines each year for reaction to the disease. These field evaluations are laborious, expensive, time-consuming, and prone to rater error. A phenotyping cart that can quickly capture images of the spikes of wheat lines and their level of FHB infection would greatly benefit wheat breeding programs. In this study, a mask region convolutional neural network (Mask-RCNN) allowed for reliable identification of the symptom location and disease severity of wheat spikes. Within a wheat line planted in the field, color images of individual wheat spikes and their corresponding diseased areas were labeled and segmented into sub-images. Images with annotated spikes and sub-images of individual spikes with labeled diseased areas were used as ground truth data to train Mask-RCNN models for automatic image segmentation of wheat spikes and FHB diseased areas, respectively. A feature pyramid network (FPN) based on the ResNet-101 network was used as the backbone of Mask-RCNN for constructing the feature pyramid and extracting features. After generating mask images of wheat spikes from full-size images, Mask-RCNN was applied to predict diseased areas on each individual spike. This protocol enabled the rapid recognition of wheat spikes and diseased areas with detection rates of 77.76% and 98.81%, respectively. A prediction accuracy of 77.19% was achieved by calculating the ratio of the predicted FHB severity to the ground truth value. This study demonstrates the feasibility of rapidly determining levels of FHB in wheat spikes, which will greatly facilitate the breeding of resistant cultivars.


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4050
Author(s):  
Dejan Pavlovic ◽  
Christopher Davison ◽  
Andrew Hamilton ◽  
Oskar Marko ◽  
Robert Atkinson ◽  
...  

Monitoring cattle behaviour is core to the early detection of health and welfare issues and to optimising the fertility of large herds. Accelerometer-based sensor systems that provide activity profiles are now used extensively on commercial farms and have evolved to identify behaviours such as the time spent ruminating and eating at an individual animal level. Acquiring this information at scale is central to informing on-farm management decisions. The paper presents the development of a Convolutional Neural Network (CNN) that classifies cattle behavioural states ('rumination', 'eating' and 'other') using data generated from neck-mounted accelerometer collars. During three farm trials in the United Kingdom (Easter Howgate Farm, Edinburgh, UK), 18 steers were monitored to provide raw acceleration measurements, with ground truth data provided by muzzle-mounted pressure sensor halters. A range of neural network architectures is explored and rigorous hyper-parameter searches are performed to optimise the network. The computational complexity and memory footprint of CNN models are not readily compatible with deployment on low-power processors, which are both memory and energy constrained. Thus, progressive reductions of the CNN were executed with minimal loss of performance in order to address the practical implementation challenges, defining the trade-off between model performance and computational complexity and memory footprint to permit deployment on micro-controller architectures. The proposed methodology achieves a 14.30× compression relative to the unpruned architecture yet is still able to accurately classify cattle behaviours, with an overall F1 score of 0.82 at both FP32 and FP16 precision, while achieving a battery lifetime in excess of 5.7 years.
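The overall F1 score reported for the three behavioural classes can be computed as a macro-average of per-class F1 scores. A minimal sketch with labels as plain strings (standard metric definitions, not the authors' evaluation code):

```python
def f1_per_class(y_true, y_pred, label):
    # Precision/recall/F1 for one class in a multi-class problem.
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

def macro_f1(y_true, y_pred, labels=("rumination", "eating", "other")):
    # Unweighted mean of per-class F1 scores.
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)
```

Whether the paper's "overall F1" is a macro- or class-weighted average is not stated in the abstract; the macro variant is shown here as one common choice.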


2021 ◽  
pp. 0021955X2110210
Author(s):  
Alejandro E Rodríguez-Sánchez ◽  
Héctor Plascencia-Mora

Traditional modeling of mechanical energy absorption under compressive loading in expanded polystyrene foams involves mathematical descriptions derived from stress/strain continuum mechanics models. Nevertheless, most of those models are either constrained to using strain as the only variable at large deformation regimes, or neglect parameters that are important for energy absorption properties, such as the material density or the rate of the applied load. This work presents a neural-network-based approach that produces models capable of mapping the compressive stress response and energy absorption parameters of an expanded polystyrene foam by considering its deformation, compressive loading rates, and different densities. The models are trained with ground-truth data obtained in compressive tests. Two methods to select neural network architectures are also presented, one of which is based on a Design of Experiments strategy. The results show that it is possible to obtain a single artificial neural network model that can abstract the stress and energy absorption solution spaces for the conditions studied in the material. Additionally, such a model is compared with a phenomenological model, and the results show that the neural network model outperforms it in terms of prediction capability, with errors around 2% of the experimental data. In this sense, it is demonstrated that the presented approach makes it possible to obtain a model capable of reproducing compressive polystyrene foam stress/strain data and, consequently, of simulating its energy absorption parameters.
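The energy absorption parameters referred to here are conventionally obtained as the area under the compressive stress-strain curve. A minimal sketch using the trapezoidal rule (the choice of numerical integration method is an assumption; the abstract does not specify one):

```python
def energy_absorbed(strain, stress):
    # Energy absorbed per unit volume: area under the stress-strain curve,
    # approximated with the trapezoidal rule over sampled points.
    return sum((stress[i] + stress[i + 1]) / 2 * (strain[i + 1] - strain[i])
               for i in range(len(strain) - 1))
```

Given a trained network that predicts stress from (strain, loading rate, density), the same integration applied to its predicted curve yields the simulated energy absorption the abstract mentions.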


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Elena Basso ◽  
Federica Pozzi ◽  
Jessica Keister ◽  
Elizabeth Cronin

Abstract – In the late 19th and early 20th centuries, original photographs were sent to publishers so that they could be reproduced in print. The photographs often needed to be reworked with overpainting and masking, and such modifications were especially necessary for low-contrast photographs to be reproduced as letterpress halftones. As altered objects, many of these marked-up photographs were simply discarded after use. An album at The New York Public Library, however, contains 157 such photographs, all relating to the Jackson–Harmsworth expedition to Franz Josef Land, from 1894 to 1897. Received as gifts from publishers, the photographs are heavily retouched with overpainting and masking, as well as drawn and collaged elements. The intense level of overpainting on many of the photographs, but not on others, raised questions about their production and alteration. Jackson's accounts attested to his practice of developing and printing photographs on site, testing different materials and techniques, including platino-bromide and silver-gelatin papers, to overcome the harsh environmental conditions. In this context, sixteen photographs from the album were analyzed through a combination of non-invasive and micro-invasive techniques, including X-ray fluorescence (XRF) spectroscopy, fiber optics reflectance spectroscopy (FORS), Raman and Fourier-transform infrared (FTIR) spectroscopies, and scanning electron microscopy with energy-dispersive X-ray spectroscopy (SEM/EDS). This analytical campaign aimed to evaluate the possible residual presence of silver halides in any of the preliminary and improved photographs. The detection of these compounds would be one of several factors supporting the hypothesis that some of the photographs in the album were indeed printed on site, in the Arctic, and, as a result, may have been affected by the extreme environment. Additional goals of the study included evaluating the extent of retouching, providing a full characterization of the pigments and dyes used in overpainted prints, and comparing the results with contemporaneous photographic publications that indicate which coloring materials were available at the time. Further analyses shed light on the organic components present in the binders and photographic emulsions. This research has increased our knowledge of photographic processes undertaken in a hostile environment such as the Arctic and illuminated the technical aspects of photographically illustrating books during the late 19th and early 20th centuries.


2021 ◽  
Vol 13 (9) ◽  
pp. 5274
Author(s):  
Xinyang Yu ◽  
Younggu Her ◽  
Xicun Zhu ◽  
Changhe Lu ◽  
Xuefei Li

Development of a high-accuracy method to extract arable land from effective data sources is crucial to detecting and monitoring arable land dynamics, serving land protection and sustainable development. In this study, a new arable land extraction index (ALEI) based on spectral analysis was proposed, examined against ground truth data, and then applied to the Hexi Corridor in northwest China. The arable land and its change patterns during 1990–2020 were extracted and identified using 40 Landsat TM/OLI images acquired in 1990, 2000, 2010, and 2020. The results demonstrated that the proposed method can distinguish arable land areas accurately, with the User's (Producer's) accuracy and overall accuracy (kappa coefficient) exceeding 0.90 (0.88) and 0.89 (0.87), respectively. The mean relative error calculated using field survey data obtained in 2012 and 2020 was 0.169 and 0.191, respectively, indicating the feasibility of the ALEI method for arable land extraction. The study found that the arable land area in the Hexi Corridor was 13,217.58 km2 in 2020, an increase of 25.33% over 1990. At 10-year intervals, the arable land experienced different change patterns. The results indicate that the ALEI is a promising tool for effectively extracting arable land in this arid area.
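The overall accuracy and kappa coefficient reported above are both derived from a confusion matrix of classified pixels versus ground truth. A minimal sketch using the standard definitions (illustrative, not the authors' code):

```python
def overall_accuracy(cm):
    # cm[i][j]: count of pixels with true class i classified as j.
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def kappa(cm):
    # Cohen's kappa: observed agreement corrected for chance agreement
    # expected from the row (truth) and column (prediction) totals.
    n = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(len(cm))) / n
    p_expected = sum(sum(cm[i]) * sum(row[i] for row in cm)
                     for i in range(len(cm))) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)
```

Kappa is typically lower than overall accuracy, matching the paired figures quoted in the abstract (e.g. 0.89 accuracy against 0.87 kappa).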


2020 ◽  
Vol 3 (S1) ◽  
Author(s):  
Andreas Weigert ◽  
Konstantin Hopf ◽  
Nicolai Weinig ◽  
Thorsten Staake

Abstract – Heat pumps heat or cool buildings effectively and sustainably, with zero emissions at the place of installation. As they place a significant load on the power grid, knowledge of their existence is crucial for grid operators, e.g., to forecast load and plan grid operation. Further details, such as the thermal reservoir (ground or air source) or the age of a heat pump installation, enable energy-related services that utility companies can offer in the future (e.g., detecting wrongly calibrated installations, household energy efficiency checks). This study investigates the prediction of heat pump installations, their thermal reservoir, and their age. For this, we obtained a dataset of 397 households in Switzerland, all equipped with smart meters, collected ground truth data on installed heat pumps, and enriched this data with weather data and geographical information. Our investigation replicates the state of the art in heat pump detection and goes beyond it, with three major findings: first, machine learning can detect the existence of heat pumps with an AUC performance metric of 0.82, their heat reservoir with an AUC of 0.86, and their age with an AUC of 0.73. Second, heat pump existence can be detected better using data from the heating period than from summer. Third, the number of training samples needed to detect the existence of heat pumps need not be large, in terms of either training instances or observation period.
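The AUC metric used throughout these findings can be computed directly from classifier scores as the probability that a randomly chosen positive example outranks a randomly chosen negative one, with ties counted as half. A minimal sketch (standard definition, not the authors' pipeline):

```python
def auc(y_true, scores):
    # y_true: 0/1 labels (e.g. heat pump absent/present);
    # scores: classifier confidence for the positive class.
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    # Count positive-vs-negative pairs the classifier ranks correctly.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUC of 0.82 for heat pump existence thus means the classifier ranks a household with a heat pump above one without it 82% of the time.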

