Vocabulary challenges with invasive species data sharing

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25642 ◽

2018 ◽

Vol 2 ◽

pp. e25642

Author(s):

Annie Simpson

Keyword(s):

Invasive Species ◽

Native Species ◽

Value Added ◽

Application Programming Interface ◽

The United States ◽

United States Geological Survey ◽

Global Biodiversity Information Facility ◽

Darwin Core ◽

Biodiversity Information ◽

The U.S

Biodiversity Information Serving Our Nation (BISON, bison.usgs.gov) is the U.S. node of the Global Biodiversity Information Facility (gbif.org) and contains more than 375 million documented locations for all species in the U.S. It is hosted by the United States Geological Survey (USGS) and includes a website and an application programming interface that apps and other websites can use for free. With this massive database one can see not only the 15 million records for nearly 10 thousand non-native species in the U.S. and its territories, but also their relationship to all of the other species in the country as well as their full national range. Leveraging this huge resource and its enterprise-level cyberinfrastructure, USGS BISON staff have created a value-added feature by labeling non-native species records, even where contributing datasets have not provided such labels. Drawing on our ongoing four-year compilation of non-native species scientific names from the literature, we will share specific examples of the ambiguity and evolution of the terms we have encountered as they relate to invasiveness, impact, dispersal, and management. The idea of incorporating these terms into an invasive species extension to Darwin Core has been discussed by Biodiversity Information Standards (TDWG) working group participants since at least 2005. One roadblock to implementing this extension has been the diverse terminology used to describe the characteristics of biological invasions, terminology that has evolved significantly over the past decade.

DiSSCo, iDigBio and the Future of Global Collaboration

Biodiversity Information Science and Standards ◽

10.3897/biss.3.37896 ◽

2019 ◽

Vol 3 ◽

Cited By ~ 1

Author(s):

Gil Nelson ◽

Deborah L Paul

Keyword(s):

Working Group ◽

Application Programming Interface ◽

The United States ◽

Common Source ◽

Data Generation ◽

Biodiversity Data ◽

Global Biodiversity Information Facility ◽

Data Accessibility ◽

The Us ◽

Biodiversity Information

Integrated Digitized Biocollections (iDigBio) is the United States’ (US) national resource and coordinating center for biodiversity specimen digitization and mobilization. It was established in 2011 through the US National Science Foundation’s (NSF) Advancing Digitization of Biodiversity Collections (ADBC) program, an initiative that grew from a working group of museum-based and other biocollections professionals working in concert with NSF to make collections' specimen data accessible for science, education, and public consumption. The working group, Network Integrated Biocollections Alliance (NIBA), released two reports (Beach et al. 2010, American Institute of Biological Sciences 2013) that provided the foundation for iDigBio and ADBC. iDigBio is restricted in focus to the ingestion of data generated by public, non-federal museum and academic collections. Its focus is on specimen-based (as opposed to observational) occurrence records. iDigBio currently serves about 118 million transcribed specimen-based records and 29 million specimen-based media records from approximately 1600 datasets. These digital objects have been contributed by about 700 collections representing nearly 400 institutions, and iDigBio is the most comprehensive biodiversity data aggregator in the US. Currently, iDigBio, DiSSCo (Distributed System of Scientific Collections), GBIF (Global Biodiversity Information Facility), and the Atlas of Living Australia (ALA) are collaborating on a global framework to harmonize technologies towards standardizing and synchronizing ingestion strategies, data models and standards, cyberinfrastructure, APIs (application programming interfaces), specimen record identifiers, etc., in service of a developing consolidated global data product that can provide a common source for the world’s digital biodiversity data.
The collaboration strives to harness and combine the unique strengths of its partners in ways that ensure the individual needs of each partner’s constituencies are met, design pathways for accommodating existing and emerging aggregators, simultaneously strengthen and enhance access to the world’s biodiversity data, and underscore the scope and importance of worldwide biodiversity informatics activities. Collaborators will share technology strategies and outputs, align conceptual understandings, and establish and draw from an international knowledge base. These collaborators, along with Biodiversity Information Standards (TDWG), will join iDigBio and the Smithsonian National Museum of Natural History as they host Biodiversity 2020 in Washington, DC. Biodiversity 2020 will combine an international celebration of the worldwide progress made in biodiversity data accessibility in the 21st century with a biodiversity data conference that extends the life of Biodiversity Next. It will provide a venue for the GBIF governing board meeting, TDWG annual meeting, and the annual iDigBio Summit as well as three days of plenary and concurrent sessions focused on the present and future of biodiversity data generation, mobilization, and use.

SPECIES: Supporting big-data-driven research

Biodiversity Information Science and Standards ◽

10.3897/biss.3.36095 ◽

2019 ◽

Vol 3 ◽

Cited By ~ 1

Author(s):

Raul Sierra-Alcocer ◽

Christopher Stephens ◽

Juan Barrios ◽

Constantino González‐Salazar ◽

Juan Carlos Salazar Carrillo ◽

...

Keyword(s):

Web Application ◽

Application Programming Interface ◽

Data Driven ◽

Spatial Correlations ◽

Species Occurrence ◽

Global Biodiversity Information Facility ◽

Application Programming ◽

Abiotic Variables ◽

Programming Interface ◽

Biodiversity Information

SPECIES (Stephens et al. 2019) is a tool to explore spatial correlations in biodiversity occurrence databases. The main idea behind the SPECIES project is that the geographical correlations between the distributions of taxa records carry useful information. The problem, however, is that if we have thousands of species (Mexico's National System of Biodiversity Information has records of around 70,000 species) then we have millions of potential associations, and exploring them is far from easy. Our goal with SPECIES is to facilitate the discovery and application of meaningful relations hiding in our data. The main variables in SPECIES are the geographical distributions of species occurrence records. Other types of variables, like the climatic variables from WorldClim (Hijmans et al. 2005), are explanatory data that serve for modeling. The system offers two modes of analysis. In the first, the user defines a target species and a selection of species and abiotic variables; the system then computes the spatial correlations between the target species and each of the other species and abiotic variables. The request from the user can be as small as comparing one species to another, or as large as comparing one species to all the species in the database. A user may wonder, for example, which species are usual neighbors of the jaguar; this mode could help answer that question. The second mode of analysis gives a network perspective: the user defines two groups of taxa (and/or environmental variables), and the output is a correlation network in which the weight of a link between two nodes represents the spatial correlation between the variables that the nodes represent. For example, one group of taxa could be hummingbirds (Trochilidae family) and the second flowers of the Lamiaceae family. This output would help the user analyze which pairs of hummingbird and flower are highly correlated in the database.
SPECIES data architecture is optimized to support fast hypothesis prototyping and testing with the analysis of thousands of biotic and abiotic variables. It has a visualization web interface that presents descriptive results to the user at different levels of detail. The methodology in SPECIES is relatively simple: it partitions the geographical space with a regular grid and treats a species occurrence distribution as a present/not present boolean variable over the cells. Given two species (or one species and one abiotic variable), it measures whether the number of co-occurrences between the two is more (or less) than expected. If it is more than expected, that signals a positive relation, whereas if it is less, it is evidence of disjoint distributions. SPECIES provides an open web application programming interface (API) to request the computation of correlations and statistical dependencies between variables in the database. Users can create applications that consume this 'statistical web service' or use it directly to further analyze the results in frameworks like R or Python. The project includes an interactive web application that does exactly that: it requests analysis from the web service and lets the user experiment and visually explore the results. We believe this approach can be used, on one side, to augment the services provided by data repositories and, on the other, to facilitate the creation of specialized applications that are clients of these services. This scheme supports big-data-driven research for a wide range of backgrounds because end users need neither the technical know-how nor the infrastructure to handle large databases. Currently, SPECIES hosts all records from Mexico's National Biodiversity Information System (CONABIO 2018) and a subset of Global Biodiversity Information Facility data that covers the contiguous USA (GBIF.org 2018b) and Colombia (GBIF.org 2018a).
It also includes discretizations of environmental variables from WorldClim, from the Environmental Rasters for Ecological Modeling project (Title and Bemmels 2018), from CliMond (Kriticos et al. 2012), and topographic variables (USGS EROS Center 1997b, USGS EROS Center 1997a). The long-term plan, however, is to incrementally include more data, especially all data from the Global Biodiversity Information Facility. The code of the project is open source, and the repositories are available online (Front-end, Web Services Application Programming Interface, Database Building scripts). This presentation is a demonstration of SPECIES' functionality and its overall design.
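
The grid-based co-occurrence idea described in this abstract can be illustrated with a minimal sketch. It is not the SPECIES implementation: the function names, the one-degree cell size, and the binomial z-score used to compare observed with expected co-occurrences are all assumptions made for illustration, since the abstract does not state which statistic the system uses.

```python
import math


def to_cell(lon, lat, cell_deg=1.0):
    """Map a coordinate to the (column, row) index of a regular grid cell."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def occupied_cells(points, cell_deg=1.0):
    """Reduce occurrence points to a presence/absence set of grid cells."""
    return {to_cell(lon, lat, cell_deg) for lon, lat in points}


def cooccurrence_score(cells_a, cells_b, n_cells):
    """Compare observed co-occurrences with the count expected by chance.

    Assumption: under independence, each of the cells occupied by A also
    holds B with probability |B| / n_cells, so the expected count and its
    spread follow a binomial distribution. A positive z suggests the two
    co-occur more than expected, a negative z less than expected.
    """
    observed = len(cells_a & cells_b)
    p_b = len(cells_b) / n_cells
    expected = len(cells_a) * p_b
    sd = math.sqrt(len(cells_a) * p_b * (1.0 - p_b))
    z = (observed - expected) / sd if sd > 0 else 0.0
    return observed, expected, z


# Hypothetical toy grid of 100 cells: A occupies 10 cells, B occupies 20,
# and they share 8 cells.
cells_a = {(i, 0) for i in range(10)}
cells_b = {(i, 0) for i in range(2, 22)}
observed, expected, z = cooccurrence_score(cells_a, cells_b, n_cells=100)
```

In this toy case the expected overlap by chance is 2 cells while 8 are observed, so the score is positive. Two sets with no shared cells would give a negative score, matching the "disjoint distributions" reading in the abstract.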

Best practices for connecting genetic records with specimen data

Biodiversity Information Science and Standards ◽

10.3897/biss.2.26369 ◽

2018 ◽

Vol 2 ◽

pp. e26369

Author(s):

Michael Trizna

Keyword(s):

Best Practices ◽

Good News ◽

Sequencing Technology ◽

Global Biodiversity Information Facility ◽

A Value ◽

Darwin Core ◽

High Quality Sequence ◽

Voucher Specimens ◽

Tight Connection ◽

Biodiversity Information

As rapid advances in sequencing technology result in more branches of the tree of life being illuminated, there has actually been a decrease in the percentage of sequence records that are backed by voucher specimens (Trizna 2018b). The good news is that there are tools (Trizna 2017, NCBI 2005, Biocode LLC 2014) that use well-databased museum vouchers to automatically validate and format specimen and collection metadata for high-quality sequence records. Another problem is that there are millions of existing sequence records that are known to contain either incorrect or incomplete specimen data. I will show an end-to-end example of sequencing specimens from a museum, depositing their sequence records in NCBI's (National Center for Biotechnology Information) GenBank database, and then providing updates to GenBank as the museum database revises identifications. I will also talk about linking records from specimen databases. Over one million records in the Global Biodiversity Information Facility (GBIF; Trizna 2018a) contain a value in the Darwin Core term "associatedSequences", and I will examine what is currently contained in these entries, and how best to format them to ensure that a tight connection is made to sequence records.

The Fall of the Labor Share and the Rise of Superstar Firms*

The Quarterly Journal of Economics ◽

10.1093/qje/qjaa004 ◽

2020 ◽

Vol 135 (2) ◽

pp. 645-709 ◽

Cited By ~ 61

Author(s):

David Autor ◽

David Dorn ◽

Lawrence F Katz ◽

Christina Patterson ◽

John Van Reenen

Keyword(s):

Value Added ◽

The United States ◽

Market Concentration ◽

Technological Changes ◽

Labor Share ◽

Economic Census ◽

New Interpretation ◽

Labor's Share ◽

The U.S ◽

Number Of Firms

Abstract The fall of labor’s share of GDP in the United States and many other countries in recent decades is well documented but its causes remain uncertain. Existing empirical assessments typically rely on industry or macro data, obscuring heterogeneity among firms. In this article, we analyze micro panel data from the U.S. Economic Census since 1982 and document empirical patterns to assess a new interpretation of the fall in the labor share based on the rise of “superstar firms.” If globalization or technological changes push sales toward the most productive firms in each industry, product market concentration will rise as industries become increasingly dominated by superstar firms, which have high markups and a low labor share of value added. We empirically assess seven predictions of this hypothesis: (i) industry sales will increasingly concentrate in a small number of firms; (ii) industries where concentration rises most will have the largest declines in the labor share; (iii) the fall in the labor share will be driven largely by reallocation rather than a fall in the unweighted mean labor share across all firms; (iv) the between-firm reallocation component of the fall in the labor share will be greatest in the sectors with the largest increases in market concentration; (v) the industries that are becoming more concentrated will exhibit faster growth of productivity; (vi) the aggregate markup will rise more than the typical firm’s markup; and (vii) these patterns should be observed not only in U.S. firms but also internationally. We find support for all of these predictions.

An audit of some processing effects in aggregated occurrence records

ZooKeys ◽

10.3897/zookeys.751.24791 ◽

2018 ◽

Vol 751 ◽

pp. 129-146 ◽

Cited By ~ 7

Author(s):

Robert Mesibov

Keyword(s):

Data Loss ◽

Global Biodiversity Information Facility ◽

Australian Museum ◽

Darwin Core ◽

Species Groups ◽

Processing Effects ◽

Global Biodiversity ◽

Name Changes ◽

Biodiversity Information ◽

Occurrence Records

A total of ca 800,000 occurrence records from the Australian Museum (AM), Museums Victoria (MV) and the New Zealand Arthropod Collection (NZAC) were audited for changes in selected Darwin Core fields after processing by the Atlas of Living Australia (ALA; for AM and MV records) and the Global Biodiversity Information Facility (GBIF; for AM, MV and NZAC records). Formal taxon names in the genus- and species-groups were changed in 13–21% of AM and MV records, depending on dataset and aggregator. There was little agreement between the two aggregators on processed names, with names changed in two to three times as many records by one aggregator alone compared to records with names changed by both aggregators. The type status of specimen records did not change with name changes, resulting in confusion as to the name with which a type was associated. Data losses of up to 100% were found after processing in some fields, apparently due to programming errors. The taxonomic usefulness of occurrence records could be improved if aggregators included both original and the processed taxonomic data items for each record. It is recommended that end-users check original and processed records for data loss and name replacements after processing by aggregators.
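
The check recommended to end-users at the end of this abstract, comparing original and processed records for data loss and name replacements, could be sketched as follows. This is an illustrative assumption, not the audit procedure used in the paper: the record identifiers and values are invented, and records are assumed to be available as dictionaries keyed by Darwin Core term names such as `scientificName` and `typeStatus`.

```python
def audit_field(original, processed, field):
    """Count value changes and losses in one field after aggregator processing.

    `original` and `processed` map a record identifier to a dict of
    Darwin Core terms. Only records present in both are compared.
    """
    changed = 0
    lost = 0
    shared = original.keys() & processed.keys()
    for rec_id in shared:
        before = (original[rec_id].get(field) or "").strip()
        after = (processed[rec_id].get(field) or "").strip()
        if before and not after:
            # A value existed in the source but is empty after processing.
            lost += 1
        elif before != after:
            # The value was replaced (or filled in) during processing.
            changed += 1
    return {"records": len(shared), "changed": changed, "lost": lost}


# Invented example records, for illustration only.
original = {
    "r1": {"scientificName": "Aus bus", "typeStatus": "holotype"},
    "r2": {"scientificName": "Cus dus", "typeStatus": ""},
    "r3": {"scientificName": "Eus fus", "typeStatus": "paratype"},
}
processed = {
    "r1": {"scientificName": "Aus bus", "typeStatus": "holotype"},
    "r2": {"scientificName": "Xus dus", "typeStatus": ""},
    "r3": {"scientificName": "Eus fus", "typeStatus": ""},
}

name_audit = audit_field(original, processed, "scientificName")
type_audit = audit_field(original, processed, "typeStatus")
```

Running the same function over each field of interest gives a per-field tally, which is enough to spot a field that was emptied wholesale or a dataset with an unusually high share of replaced names.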

Economic Contributions of the Green Industry in the United States in 2007–08

HortTechnology ◽

10.21273/horttech.21.5.628 ◽

2011 ◽

Vol 21 (5) ◽

pp. 628-638 ◽

Cited By ~ 12

Author(s):

Alan W. Hodges ◽

Charles R. Hall ◽

Marco A. Palma

Keyword(s):

United States ◽

Building Materials ◽

Value Added ◽

The United States ◽

Total Output ◽

Full Time ◽

Regional Economic ◽

Green Industry ◽

Total Sales ◽

The U.S

Economic contributions of the green industry in each state of the United States were estimated for 2007–08 using regional economic multipliers, together with information on horticulture product sales, employment, and payroll reported by the U.S. Economic Census and a nursery industry survey. Total sales revenues for all sectors were $176.11 billion, direct output was $117.40 billion, and total output impacts, including indirect and induced regional economic multiplier effects of nonlocal output, were $175.26 billion. The total value added impact was $107.16 billion, including employee compensation, proprietor (business owner) income, other property income, and indirect business taxes paid to state/local and federal governments. The industry had direct employment of 1.20 million full-time and part-time jobs and total employment impacts of 1.95 million jobs in the broader economy. The largest individual industry sectors in terms of employment and value added impacts were Landscaping services (1,075,343 jobs, $50.3 billion), Nursery and greenhouse production (436,462 jobs, $27.1 billion), and Building materials and garden equipment and supplies stores (190,839 jobs, $9.7 billion). The top 10 individual states in terms of employment contributions were California (257,885 jobs), Florida (188,437 jobs), Texas (82,113 jobs), North Carolina (81,113 jobs), Ohio (79,707 jobs), Pennsylvania (75,604 jobs), New Jersey (67,993 jobs), Illinois (67,382 jobs), Georgia (66,042 jobs), and Virginia (58,677 jobs). The total value added of the U.S. green industry represented 0.76% of U.S. Gross Domestic Product (GDP) in 2007, and up to 1.60% of GDP in individual states. On the basis of a similar previous study for 2002 (Hall et al., 2006), total sales of horticultural products and services in 2007–08 increased by 3.5%, and total output impacts increased by 29.2%, or an average annual rate of 5.8% in inflation-adjusted terms.

A Spatially Detailed and Economically Complete Blue Water Footprint of the United States

10.5194/hess-2017-650 ◽

2017 ◽

Cited By ~ 1

Author(s):

Richard R. Rushforth ◽

Benjamin L. Ruddell

Keyword(s):

United States ◽

Water Use ◽

Water Footprint ◽

The United States ◽

Irrigated Agriculture ◽

Energy Information Administration ◽

United States Geological Survey ◽

United States Department ◽

Blue Water ◽

The U.S

Abstract. This paper quantifies and maps a spatially detailed and economically complete blue water footprint for the United States, utilizing the National Water Economy Database version 1.1 (NWED). NWED utilizes multiple mesoscale federal data resources from the United States Geological Survey (USGS), the United States Department of Agriculture (USDA), the U.S. Energy Information Administration (EIA), the U.S. Department of Transportation (USDOT), the U.S. Department of Energy (USDOE), and the U.S. Bureau of Labor Statistics (BLS) to quantify water use, economic trade, and commodity flows to construct this water footprint. Results corroborate previous studies in both the magnitude of the U.S. water footprint (F) and in the observed pattern of virtual water flows. The median water footprint (FCUMed) of the U.S. is 181 966 Mm³ (FWithdrawal: 400 844 Mm³; FCUMax: 222 144 Mm³; FCUMin: 61 117 Mm³) and the median per capita water footprint (F'CUMed) of the U.S. is 589 m³ capita⁻¹ (F'Withdrawal: 1298 m³ capita⁻¹; F'CUMax: 720 m³ capita⁻¹; F'CUMin: 198 m³ capita⁻¹). The U.S. hydro-economic network is centered on cities and is dominated by the local and regional scales. Approximately 58 % of U.S. water consumption is for the direct and indirect use by cities. Further, the water footprint of agriculture and livestock is 93 % of the total U.S. water footprint, and is dominated by irrigated agriculture in the Western U.S. The water footprint of the industrial, domestic, and power economic sectors is centered on population centers, while the water footprint of the mining sector is highly dependent on the location of mineral resources. Owing to uncertainty in consumptive use coefficients alone, the mesoscale blue water footprint uncertainty ranges from 63 % to over 99 % depending on location.
Harmonized region-specific, economic sector-specific consumption coefficients are necessary to reduce water footprint uncertainties and to better understand the human economy's water use impact on the hydrosphere.

Sectoral Impacts of Invasive Species in the United States and Approaches to Management

Invasive Species in Forests and Rangelands of the United States ◽

10.1007/978-3-030-45367-1_9 ◽

2021 ◽

pp. 203-229

Author(s):

Anne S. Marsh ◽

Deborah C. Hayes ◽

Patrice N. Klein ◽

Nicole Zimmerman ◽

Alison Dalsimer ◽

...

Keyword(s):

United States ◽

Invasive Species ◽

Cultural Practices ◽

Well Being ◽

The United States ◽

Major Effect ◽

Public And Private ◽

Species Establishment ◽

And Control ◽

The U.S

Abstract Invasive species have a major effect on many sectors of the U.S. economy and on the well-being of its citizens. Their presence impacts animal and human health, military readiness, urban vegetation and infrastructure, water, energy and transportation systems, and indigenous peoples in the United States (Table 9.1). They alter bio-physical systems and cultural practices and require significant public and private expenditure for control. This chapter provides examples of the impacts to human systems and explains mechanisms of invasive species’ establishment and spread within sectors of the U.S. economy. The chapter is not intended to be comprehensive but rather to provide insight into the range and severity of impacts. Examples provide context for ongoing Federal programs and initiatives and support State and private efforts to prevent the introduction and spread of invasive species and eradicate and control established invasive species.

Biodiversity Information Services: A (not-so-) little knowledge that acts

Biodiversity Information Science and Standards ◽

10.3897/biss.2.25738 ◽

2018 ◽

Vol 2 ◽

pp. e25738 ◽

Cited By ~ 1

Author(s):

Arturo Ariño ◽

Daniel Noesgaard ◽

Angel Hjarding ◽

Dmitry Schigel

Keyword(s):

Critical Mass ◽

The Body ◽

Global Biodiversity Information Facility ◽

Entire List ◽

Darwin Core ◽

Biogeographical Regions ◽

Continuous Presence ◽

Set Up ◽

Taxonomic Groups ◽

Biodiversity Information

Standards set up by Biodiversity Information Standards-Taxonomic Databases Working Group (TDWG), initially developed as a way to share taxonomical data, greatly facilitated the establishment of the Global Biodiversity Information Facility (GBIF) as the largest index to digitally-accessible primary biodiversity information records (PBR) held by many institutions around the world. The level of detail and coverage of the body of standards that later became the Darwin Core terms enabled increasingly precise retrieval of relevant records useful for increased digitally-accessible knowledge (DAK) which, in turn, may have helped to solve ecologically-relevant questions. After more than a decade of data accrual and release, an increasing number of papers and reports are citing GBIF either as a source of data or as a pointer to the original datasets. GBIF has curated a list of over 5,000 citations that were examined for contents, and to which tags were applied describing such contents as additional keywords. The list now provides a window on what users want to accomplish using such DAK. We performed a preliminary word frequency analysis of this literature, which refers to GBIF as a resource, starting with titles. Through a standardization and mapping of terms, we examined how the facility-enabled data seem to have been used by scientists and other practitioners through time: what concepts/issues are pervasive, which taxon groups are mostly addressed, and whether data concentrate around specific geographical or biogeographical regions. We hoped to cast light on which types of ecological problems the community believes are amenable to study through the judicious use of this data commons and found that, indeed, a few themes were distinctly more frequently mentioned than others. Among those, generally-perceived issues such as climate change and its effect on biodiversity at global and regional scales seemed prevalent.
The taxonomic groups were also unevenly mentioned, with birds and plants being the most frequently named. However, the entire list of potential subjects that might have used GBIF-enabled data is now quite wide, showing that the availability of well-structured data has spawned a widening spectrum of possible use cases. Among them, some enjoy early and continuous presence (e.g. species, biodiversity, climate) while others have started to show up only later, once a critical mass of data seemed to have been attained (e.g. ecosystems, suitability, endemism). Biodiversity information in the form of standards-compliant DAK may thus already have become a commodity enabling insight into an increasingly more complex and diverse body of science. Paraphrasing Tennyson, more things were wrought by data than TDWG dreamt of.
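
A word frequency analysis of titles with term standardization, as outlined in this abstract, might look like the sketch below. It is an assumption-laden illustration rather than the authors' method: the stop-word list and the synonym mapping are hypothetical, and the sample titles are borrowed from other records in this listing purely as placeholder input, not from the GBIF citation list.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to", "with", "for", "on"}


def title_word_frequencies(titles, synonyms=None):
    """Count words across titles after lowercasing and term mapping.

    `synonyms` maps a surface form to its standardized term, for example
    a plural to its singular, so that variants are counted together.
    """
    synonyms = synonyms or {}
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word in STOP_WORDS:
                continue
            counts[synonyms.get(word, word)] += 1
    return counts


# Placeholder titles taken from this listing, for illustration only.
titles = [
    "Vocabulary challenges with invasive species data sharing",
    "Sectoral Impacts of Invasive Species in the United States and Approaches to Management",
    "Impacts of Invasive Species in Terrestrial and Aquatic Systems in the United States",
]
freq = title_word_frequencies(titles, synonyms={"impacts": "impact"})
top_terms = freq.most_common(2)
```

Grouping the titles by publication year before counting would give the through-time view the abstract describes, with one counter per year.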

Impacts of Invasive Species in Terrestrial and Aquatic Systems in the United States

Invasive Species in Forests and Rangelands of the United States ◽

10.1007/978-3-030-45367-1_2 ◽

2021 ◽

pp. 5-39

Author(s):

Albert E. Mayfield ◽

Steven J. Seybold ◽

Wendell R. Haag ◽

M. Tracy Johnson ◽

Becky K. Kerns ◽

...

Keyword(s):

United States ◽

Invasive Species ◽

Native Species ◽

Animal Health ◽

Forest Products ◽

The United States ◽

Usda Forest Service ◽

Terrestrial Habitats ◽

Noxious Weed ◽

National Significance

Abstract The introduction, establishment, and spread of invasive species in terrestrial and aquatic environments is widely recognized as one of the most serious threats to the health, sustainability, and productivity of native ecosystems (Holmes et al. 2009; Mack et al. 2000; Pyšek et al. 2012; USDA Forest Service 2013). In the United States, invasive species are the second leading cause of native species endangerment and extinction, and their costs to society have been estimated at $120 billion annually (Crowl et al. 2008; Pimentel et al. 2000, 2005). These costs include lost production and revenue from agricultural and forest products, compromised use of waterways and terrestrial habitats, harm to human and animal health, reduced property values and recreational opportunities, and diverse costs associated with managing (e.g., monitoring, preventing, controlling, and regulating) invasive species (Aukema et al. 2011; Pimentel et al. 2005). The national significance of these economic, ecological, and social impacts in the United States has prompted various actions by both legislative and executive branches of the Federal Government (e.g., the Nonindigenous Aquatic Nuisance Prevention and Control Act of 1990; the Noxious Weed Control and Eradication Act of 2002; Executive Order 13112 of 1999, amended in 2016).
