Managing Taxon Data in FinBIF

Author(s):  
Esko Piirainen ◽  
Eija-Leena Laiho ◽  
Tea von Bonsdorff ◽  
Tapani Lahti

The Finnish Biodiversity Information Facility, FinBIF (https://species.fi), has developed its own taxon database. This allows FinBIF taxon specialists to maintain their own, expert-validated view of Finnish species. The database covers national needs and can be rapidly expanded by our own development team. Furthermore, each taxon in the database is given a globally unique, persistent URI (https://www.w3.org/TR/uri-clarification) that refers to the taxon concept, not just to the name. The identifier does not change as long as the taxon concept does not change. We aim to ensure compatibility with checklists from other countries by linking taxon concepts as Linked Data (https://www.w3.org/wiki/LinkedData), work that was started as part of the Nordic e-Infrastructure Collaboration (NeIC) DeepDive project (https://neic.no/deepdive). The database is used as a basis for observation and specimen searches, e-learning and identification tools, and it can be browsed by users of the FinBIF portal. The data is accessible to everyone under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0) in machine-readable formats.
The taxon specialists maintain the taxon data using a web application. Currently, there are 60 specialists. All changes made to the data go live every night. The nightly update interval gives the specialists a grace period in which to make their changes. Allowing the taxon specialists to modify the taxonomy database themselves leads to some challenges. To maintain the integrity of critical data, such as lists of protected species, we have had to limit what the specialists can do. Changes to critical data are carried out by an administrator. The database has special features for linking observations to the taxonomy. These include hidden species aggregates and tools to override how a certain name used in observations is linked to the taxonomy. Misapplied names remain an unresolved problem.
The most precise way to record an observation is to use a taxon concept. Most observations are still recorded using plain names, but the observer can pick a concept instead. Also, when data from other information systems is published in FinBIF, the data providers can link their observations to the concepts using the concept identifiers. The ability to use taxon concepts as the basis of observations means we have to maintain the concepts over time, a task that may become arduous in the future (Fig. 1).
As it stands now, the FinBIF taxon data model, including adjacent classes such as publication, person, image, and endangerment assessments, consists of 260 properties. If the data model were stored in a normalized relational database, there would be approximately 56 tables, which could be difficult to maintain. Keeping track of a complete history of data is also difficult in relational databases. Alternatively, we could use a document store for taxon data. However, document stores have some drawbacks: (1) much work is required to implement a system that does small atomic update operations; (2) batch updates modifying multiple documents usually require writing a script; and (3) they are not ideal for doing searches. We do use a document store for observation data, however, because document stores are well suited for storing large quantities of complex records. In FinBIF, we have decided to use a triplestore for all small datasets, such as taxon data. More specifically, the data is stored according to the RDF specification (https://www.w3.org/RDF). An RDF Schema defines the allowed properties for each class. Our triplestore implementation is an Oracle relational database with two tables (resource and statement), which gives us the ability to do SQL queries and updates. Small atomic updates are easy because only the affected triples need to be updated, not the entire data entity. Maintaining a complete record of history takes little effort, as it can be done at the level of individual triples. For performance-critical queries, the taxon data is loaded into an Elasticsearch (https://www.elastic.co) search engine.
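
The two-table triplestore idea described above can be illustrated with a minimal sketch. It is not FinBIF's actual schema: the abstract names only the two tables (resource and statement), so the column names, the `valid_from`/`valid_to` history scheme, the helper functions, the `ex:` property names and the example URI are all assumptions, and SQLite stands in for Oracle so that the example is self-contained. Objects are stored as plain text literals for simplicity.

```python
import sqlite3

# Hypothetical layout: only the table names come from the abstract;
# every column and the versioning scheme are assumptions for illustration.
SCHEMA = """
CREATE TABLE resource (
    id  INTEGER PRIMARY KEY,
    uri TEXT NOT NULL UNIQUE
);
CREATE TABLE statement (
    id         INTEGER PRIMARY KEY,
    subject    INTEGER NOT NULL REFERENCES resource(id),
    predicate  INTEGER NOT NULL REFERENCES resource(id),
    object     TEXT NOT NULL,
    valid_from INTEGER NOT NULL,
    valid_to   INTEGER            -- NULL while the triple is current
);
"""


def resource_id(conn, uri):
    """Return the id of a resource, creating the row if it is missing."""
    conn.execute("INSERT OR IGNORE INTO resource (uri) VALUES (?)", (uri,))
    return conn.execute(
        "SELECT id FROM resource WHERE uri = ?", (uri,)
    ).fetchone()[0]


def set_value(conn, subject, predicate, value, version):
    """Replace a single triple; the old triple is closed, not deleted."""
    with conn:  # one transaction: commit on success, roll back on error
        s = resource_id(conn, subject)
        p = resource_id(conn, predicate)
        conn.execute(
            "UPDATE statement SET valid_to = ? "
            "WHERE subject = ? AND predicate = ? AND valid_to IS NULL",
            (version, s, p),
        )
        conn.execute(
            "INSERT INTO statement (subject, predicate, object, valid_from) "
            "VALUES (?, ?, ?, ?)",
            (s, p, value, version),
        )


def history(conn, subject, predicate):
    """All values one property of one subject has had, oldest first."""
    return conn.execute(
        "SELECT st.object, st.valid_from, st.valid_to "
        "FROM statement st "
        "JOIN resource s ON s.id = st.subject "
        "JOIN resource p ON p.id = st.predicate "
        "WHERE s.uri = ? AND p.uri = ? "
        "ORDER BY st.valid_from",
        (subject, predicate),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

taxon = "http://example.org/taxon/1"  # placeholder, not a real FinBIF URI
set_value(conn, taxon, "ex:scientificName", "Parus major", version=1)
set_value(conn, taxon, "ex:vernacularName", "great tit", version=1)
# Updating one property touches one triple; the rest of the entity is untouched.
set_value(conn, taxon, "ex:vernacularName", "Great Tit", version=2)

current = conn.execute(
    "SELECT COUNT(*) FROM statement WHERE valid_to IS NULL"
).fetchone()[0]
```

In this sketch an update closes the old triple and inserts a new one inside a single transaction, so the history of a property falls out of the storage layout without a separate audit mechanism.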

2019 ◽  
pp. 453-460
Author(s):  
Vitalii I. Yesin ◽  
Mikolaj Karpinski ◽  
Maryna V. Yesina ◽  
Vladyslav V. Vilihura

The goal of the article is to develop a universal (standard) data model that removes the need for costly extra work when developing new relational databases (RDBs) or transforming existing ones in response to dynamic changes in the subject domain (SD). The requirements for the data model were formulated, and the model was synthesized in accordance with them. To simplify the creation of relational database schemas, an algorithm was proposed for transforming the description of the subject domain into the relations of the universal basis of the developed model. The scientific novelty of the results is a data model that, unlike known ones, simplifies the creation of RDB schemas at the logical design stage under dynamically changing subject domains, thanks to the introduced universal basis of relations, which serves as a means of describing structures and presenting data for various SDs.


Author(s):  
M. V. Smirnov ◽  
V. M. Polenok

The article highlights the need to develop software for modeling relational databases for use in teaching students of technical specialties in database-related disciplines. The problem is considered from the point of view of assessing the modern software used to teach students database design skills. Based on the shortcomings identified during the software review, a number of requirements for such software were determined. The key requirements are mobility, accessibility, versatility and openness of the development platform. The article describes how the key problems were solved that arose during a project to develop a web application for modeling relational databases in accordance with these requirements. The practical implementation of the following functions is considered in sequence: creation of a logical relational data model, creation of a physical data model, and forward engineering into relational database software. The main technological solutions used in developing the web application to ensure the required qualities are described. The result of the work is the successful testing of the web application in real use, both in laboratory and practical work in the disciplines “Design and administration of databases” and “Data management”, and at the stage of writing graduation theses in technical fields of study.


Author(s):  
Sapiahon Khaidarova ◽  

The article outlines methods for creating SQL queries in relational databases. The use of the structured query language SQL in relational databases is substantiated. The article provides information about the SQL standard and the three-tier database organization system. The author describes the choice of a data model at the conceptual level, using the Kokand Pedagogical Institute as an example of a relational database model. A relational conceptual diagram of the information model of a pedagogical institute is compiled. This conceptual diagram is depicted using a cluster. Objects of the subject area are depicted in the form of tables, which are distinguished from each other by geometric shapes or colors. The relationships between tables in Microsoft Access are presented. The basic rules for creating and filling tables in SQL using the CREATE TABLE and INSERT INTO statements are considered. The syntax of the SELECT statement is given. All clauses of the SELECT statement and their order are listed. Examples are given of simple queries and subqueries in SQL using the SELECT statement for the database of the Kokand Pedagogical Institute. Information about the order of execution of inner and outer queries is given. The article considers the ORDER BY clause of a SELECT statement for sorting query results.
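
The statements listed in this abstract (CREATE TABLE, INSERT INTO, and a SELECT with a subquery and an ORDER BY clause) can be sketched as follows. The `department` and `teacher` tables and their rows are hypothetical and are not taken from the article's Kokand Pedagogical Institute database; SQLite is used through Python's standard `sqlite3` module only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows, invented for illustration.
conn.executescript("""
CREATE TABLE department (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE teacher (
    id            INTEGER PRIMARY KEY,
    full_name     TEXT NOT NULL,
    department_id INTEGER NOT NULL REFERENCES department(id)
);
INSERT INTO department (id, name) VALUES (1, 'Mathematics'), (2, 'Physics');
INSERT INTO teacher (id, full_name, department_id) VALUES
    (1, 'Teacher C', 1),
    (2, 'Teacher A', 1),
    (3, 'Teacher B', 2);
""")

# Conceptually, the inner query (subquery) yields the department id first;
# the outer query then filters teachers by that id and sorts the result.
rows = conn.execute(
    """
    SELECT full_name
    FROM teacher
    WHERE department_id = (SELECT id FROM department WHERE name = ?)
    ORDER BY full_name
    """,
    ("Mathematics",),
).fetchall()

names = [row[0] for row in rows]
```

Here `names` holds the teachers of the hypothetical Mathematics department in alphabetical order; without the ORDER BY clause the row order would not be guaranteed.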


The chapter presents how relational databases respond to typical NoSQL features and, vice versa, how NoSQL databases respond to typical relational features. Open issues related to the integration of relational and NoSQL databases, as well as next-generation database features, are discussed. The big relational database vendors have continuously worked to incorporate NoSQL features into their databases, and NoSQL vendors are trying to make their products more like relational databases. The convergence of these two groups of databases has been a driving force in the evolution of the database market, in establishing a new level of focus on resolving big data requirements, and in enabling users to fully use the potential of their data, wherever it is stored, in relational or NoSQL databases. In turn, the database of choice in the future will likely be one that provides the best of both worlds: a flexible data model, high availability, and enterprise reliability.


2014 ◽  
Vol 2014 ◽  
pp. 1-9
Author(s):  
Julie Yu-Chih Liu

Functional dependency is the basis of database normalization. Various types of fuzzy functional dependencies have been proposed for fuzzy relational databases and applied to the process of database normalization. However, the problem of achieving lossless join decomposition arises when fuzzy functional dependencies are applied to database normalization in extended possibility-based fuzzy data models. To resolve the problem, this study defines fuzzy functional dependency based on a notion of approximate equality for extended possibility-based fuzzy relational databases. Examples show that the notion is more applicable than other similarity concepts to research related to the extended possibility-based data model. We provide a decomposition method that uses the proposed fuzzy functional dependency for database normalization and prove the lossless join property of the decomposition method.


2021 ◽  
Vol 9 (7) ◽  
pp. 71-78
Author(s):  
Ian Adamson

With the extensive use of relational databases in the business environment, there is a need to reduce database complexity in order to avoid data inconsistency and redundancy, which can leave a company with unreliable or meaningless data and information. The use of the REA Data Model in database design can significantly help with this problem. The model can eliminate the need for unnecessary data artifacts, which should only be generated by the system when needed. This paper also addresses the need for a Relational Database Complexity Metric. A simple and easy-to-understand metric is presented.


2011 ◽  
Vol 8 (1) ◽  
pp. 27-40
Author(s):  
Srdjan Skrbic ◽  
Milos Rackovic ◽  
Aleksandar Takaci

In this paper we examine the possibilities of extending the relational data model with mechanisms that can handle imprecise, uncertain and inconsistent attribute values using fuzzy logic and fuzzy sets. We present a fuzzy relational data model, which we use for fuzzy knowledge representation in relational databases and which guarantees that the model is in third normal form. We also describe a CASE tool for fuzzy database model development, which is apparently the first implementation of such a CASE tool. In this sense, this paper presents a leap forward towards the specification of a methodology for developing fuzzy relational database applications.


Author(s):  
Z. M. Ma

Database modeling of engineering information is crucial for constructing manufacturing systems because current manufacturing industries are typically information-based enterprises and information systems have become their nervous center. Engineering information can be modeled at two levels: the conceptual data model and the logical database model. Generally, a conceptual data model is designed first and is then transformed into the chosen logical database schema. Imprecise and uncertain information, however, is generally involved in many engineering activities, and such information is represented by fuzzy sets. Relational databases are still the most useful database product, and IDEF1X is the most useful method for the logical design of relational databases in engineering. In this paper, we therefore focus on fuzzy data modeling in IDEF1X and relational databases. Formal approaches to mapping fuzzy IDEF1X models to fuzzy relational database schemas are developed.


Author(s):  
Karthikeyan Ramasamy ◽  
Prasad M. Deshpande

About three decades ago, when Codd (1970) invented the relational database model, it took the database world by storm. The enterprises that adopted it early won a large competitive edge. The past two decades have witnessed tremendous growth of relational database systems, and today the relational model is by far the dominant data model and is the foundation for leading DBMS products, including IBM DB2, Informix, Oracle, Sybase, and Microsoft SQL Server. Relational databases have become a multibillion-dollar industry.


2006 ◽  
pp. 145-170
Author(s):  
Jose Galindo ◽  
Angelica Urrutia ◽  
Mario Piattini

The Relational Model was developed by E.F. Codd of IBM and published in 1970. It is currently the most widely used data model and has been a milestone in the history of databases, revolutionizing the market. In fact, relational databases have been the most widespread of all databases. On a theoretical level, many Fuzzy Relational Database models (Chapter II), which are based on the relational model, extend it so that vague and uncertain information can be stored and/or treated with or without fuzzy logic (see Chapter I). The FuzzyEER Model (see Chapter IV) is an extension of the EER Model for creating conceptual schemas with fuzzy semantics and notations. This extension is a good eclectic synthesis of different models (see Chapter III) and provides new and useful definitions: fuzzy attributes, fuzzy entities, fuzzy relationships, fuzzy specializations, and so forth.

