Constructing Crop Portraits Based on Graph Databases Is Essential to Agricultural Data Mining

Yue-Xin Shi; Bo-Kai Zhang; Yong-Xiang Wang; Han-Qian Luo; Xiang Li

doi:10.3390/info12060227

Constructing Crop Portraits Based on Graph Databases Is Essential to Agricultural Data Mining

Information ◽

10.3390/info12060227 ◽

2021 ◽

Vol 12 (6) ◽

pp. 227

Author(s):

Yue-Xin Shi ◽

Bo-Kai Zhang ◽

Yong-Xiang Wang ◽

Han-Qian Luo ◽

Xiang Li

Keyword(s):

Big Data Analytics ◽

Graph Model ◽

Original Data ◽

Popular Science ◽

Theoretical Research ◽

Graph Database ◽

Science Data ◽

Comprehensive Description ◽

Agricultural Knowledge ◽

Crop Variety

Neo4j is a graph database that can use not only data, but also the relationships between data. Crop portraits, a kind of property graph, model real-world crop entities from data to enable networked management of crop knowledge. Existing crop knowledge bases have shortcomings such as covering a single crop variety, incomplete descriptions, and a lack of agricultural knowledge. Constructing crop portraits can provide a comprehensive description of crops and make up for these shortcomings. This research used agricultural question-and-answer data and popular science data obtained by text crawling as the original data, selected labels to establish a crop portrait that includes three categories (crops, pesticides, and diseases and pests), and used the graph database (Neo4j) to store and display the portrait data. Information mining found that the crop portrait revealed the occurrence trend of diseases and pests, exhibited a nonintrinsic connection between different diseases and pests, and provided a variety of pesticides to choose from for the control of diseases and pests. The results showed that constructing crop portraits is beneficial to agricultural analysis and has practical application value and theoretical research prospects in the field of big data analytics.
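The three-category portrait described above can be pictured as a small labeled property graph. The sketch below is a minimal in-memory stand-in written in plain Python, not the authors' Neo4j schema: the labels (`Crop`, `DiseaseOrPest`, `Pesticide`), the relationship types (`AFFECTED_BY`, `CONTROLLED_BY`), the helper names, and all sample records are assumptions chosen for illustration.

```python
from collections import defaultdict


class PropertyGraph:
    """Tiny in-memory stand-in for a labeled property graph (not Neo4j itself)."""

    def __init__(self):
        self.nodes = {}                  # name -> {"label": ..., plus properties}
        self.edges = defaultdict(list)   # (source, rel_type) -> [target, ...]

    def add_node(self, name, label, **props):
        self.nodes[name] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        self.edges[(source, rel_type)].append(target)

    def neighbors(self, source, rel_type):
        return list(self.edges.get((source, rel_type), []))


def pesticides_for_crop(graph, crop):
    """Map each disease or pest affecting `crop` to the pesticides that control it."""
    return {
        threat: sorted(graph.neighbors(threat, "CONTROLLED_BY"))
        for threat in graph.neighbors(crop, "AFFECTED_BY")
    }


# Illustrative records only; none of them are taken from the paper's data.
g = PropertyGraph()
g.add_node("wheat", "Crop")
g.add_node("aphid", "DiseaseOrPest", kind="pest")
g.add_node("powdery mildew", "DiseaseOrPest", kind="disease")
g.add_node("imidacloprid", "Pesticide")
g.add_node("triadimefon", "Pesticide")
g.add_edge("wheat", "AFFECTED_BY", "aphid")
g.add_edge("wheat", "AFFECTED_BY", "powdery mildew")
g.add_edge("aphid", "CONTROLLED_BY", "imidacloprid")
g.add_edge("powdery mildew", "CONTROLLED_BY", "triadimefon")

print(pesticides_for_crop(g, "wheat"))
```

A two-hop traversal of this kind (crop to threat to pesticide) is the sort of question that a relationship-first store is meant to make straightforward; in Neo4j the same lookup would be written as a graph query rather than as Python loops.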


Big Data Analytics in E-commerce for the U.S. and China Through Literature Reviewing

Journal of Systems Science and Information ◽

10.21078/jssi-2021-016-29 ◽

2021 ◽

Vol 9 (1) ◽

pp. 16-44

Author(s):

Weiqing Zhuang ◽

Morgan C. Wang ◽

Ichiro Nakamoto ◽

Ming Jiang

Keyword(s):

Big Data ◽

Data Analytics ◽

Web Of Science ◽

Big Data Analytics ◽

Theoretical Research ◽

Analysis Method ◽

Theoretical Comparison ◽

Statistical Analysis Method ◽

And Performance ◽

The U.S

Big data analytics (BDA) in e-commerce, an emerging field that started in 2006, deeply affects the development of global e-commerce, especially its layout and performance in the U.S. and China. This paper examines the relative influence of theoretical research on BDA in e-commerce to explain the differences between the U.S. and China. It applies a statistical analysis method to samples collected from two main literature databases, Web of Science for the U.S. and CNKI for China. The results of this study help clarify doubts regarding the development of China’s e-commerce, which exceeds that of the U.S. today, in view of the theoretical comparison of BDA in e-commerce between the two countries.


The Influence of Big Data Analytics on E-Commerce: Case Study of the U.S. and China

Wireless Communications and Mobile Computing ◽

10.1155/2021/2888673 ◽

2021 ◽

Vol 2021 ◽

pp. 1-20

Author(s):

Weiqing Zhuang

Keyword(s):

Big Data ◽

Global Economy ◽

Data Analytics ◽

Big Data Analytics ◽

The United States ◽

Mediating Effect ◽

Theoretical Research ◽

Business Strategies ◽

Future Research ◽

The U.S

Big data analytics (BDA) is applied widely and deeply in e-commerce and has a positive impact on the global economy, especially in the U.S. and China, which have both done well. This paper examines the relative influence of theoretical research and practical activities of BDA in e-commerce to explain the differences between the U.S. and China. It draws on the two main literature databases, Web of Science and CNKI, respectively, and on additional samples covering retail e-commerce sales and the number of data companies founded in the U.S. and China each year. We further determine the reasons for the difference between the U.S. and China in BDA in e-commerce, which can help managers devise appropriate business strategies in e-commerce for each country, and we provide evidence of a significant relationship between theoretical research and practical activities in BDA in e-commerce. In addition, the variables related to big data companies show a moderation effect rather than a mediating effect relative to the practice of theoretical research in e-commerce in the United States, but they show both moderation and mediating effects in China. The results of this study help clarify doubts regarding the development of China’s e-commerce. Moreover, three orientations in e-commerce using BDA, and the use of quantum computing to solve existing e-commerce problems, are explored to provide better evidence for decision-making that could be valuable in future research.


Analisis dan Perancangan Software WordNet Bahasa Indonesia dengan Graph Database [Analysis and Design of Indonesian-Language WordNet Software with a Graph Database]

ILKOMNIKA: Journal of Computer Science and Applied Informatics ◽

10.28926/ilkomnika.v2i2.52 ◽

2020 ◽

Vol 2 (2) ◽

pp. 198-209

Author(s):

Seta Murdha Pamungkas ◽

Muhammad Ainul Yaqin ◽

Kurnia Z. Matondang ◽

Asfilia N. Anggraini ◽

Abd. Charis Fauzan

Keyword(s):

Graph Model ◽

Graph Database ◽

Online Data ◽

Bahasa Indonesia

This paper aims to design a WordNet using a graph-based database, working from an initial stage of word grouping through to a final stage of implementation as a ready-to-use application. The study designs a ready-to-use WordNet application that implements a graph-based database, using the Waterfall method as the basis for the design workflow. Testing in this study consisted of sorting words, entering words into the database, assigning relations between words, testing each word against the others, and implementing the database in a ready-to-use application. The study produces a design for an Indonesian-language WordNet that uses a graph-model database for data storage. The WordNet design is then implemented in a ready-to-use application. The data used in the study were obtained from KBBI and Tesaurus, both of which can be accessed online. The data were collected relationally to form Synonym (Sinonim), Antonym (Antonim), Affix (Imbuhan), and Root Word (Kata Dasar) relations. The study uses specific relations to connect nodes but does not use them to compute distances between nodes. The study combines the results of research on WordNet design and on graph-based database design to produce a ready-to-use application.
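As a rough illustration of typed word relations such as Sinonim, Antonim, and Kata Dasar, the sketch below stores words as nodes and relations as labeled edges in plain Python. The class name, the relation constants, the choice to treat synonym and antonym links as symmetric, and the sample words are all assumptions made for this example; the paper's actual schema and data are not shown in the abstract.

```python
from collections import defaultdict

# Assumption: synonym and antonym links are stored in both directions.
SYMMETRIC = {"SINONIM", "ANTONIM"}


class WordGraph:
    """Words as nodes, typed relations as labeled edges."""

    def __init__(self):
        self.relations = defaultdict(set)   # (word, relation) -> related words

    def relate(self, word, relation, other):
        self.relations[(word, relation)].add(other)
        if relation in SYMMETRIC:
            self.relations[(other, relation)].add(word)

    def related(self, word, relation):
        return sorted(self.relations.get((word, relation), set()))


# Illustrative entries only; they are not taken from the paper's dataset.
wn = WordGraph()
wn.relate("besar", "SINONIM", "agung")
wn.relate("besar", "ANTONIM", "kecil")
wn.relate("membesar", "KATA_DASAR", "besar")   # affixed form -> root word

print(wn.related("kecil", "ANTONIM"))
print(wn.related("membesar", "KATA_DASAR"))
```

The lookup follows a single typed edge from a word, which matches the abstract's statement that relations connect nodes but are not used for distance search between nodes.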


A Graph Database Model for Knowledge Extracted from Place Descriptions

10.20944/preprints201804.0202.v1 ◽

2018 ◽

Author(s):

Hao Chen ◽

Maria Vasardani ◽

Stephan Winter ◽

Martin Tomko

Keyword(s):

Management System ◽

Graph Model ◽

Rich Source ◽

Extended Model ◽

Graph Database ◽

Contextual Knowledge ◽

Database Model ◽

Place Descriptions

Everyday place descriptions provide a rich source of knowledge about places and their relative locations. This research proposes a place graph model for modeling this spatial, non-spatial, and contextual knowledge from place descriptions. The model extends a prior place graph, and overcomes a number of limitations. The model is implemented using the Neo4j graph database, and a management system has also been developed that allows operations including querying, mapping, and visualizing the stored knowledge in an extended place graph. Then three experimental tasks, namely georeferencing, reasoning, and querying, are selected to demonstrate the superiority of the extended model.


Big Data analytics for prediction: parallel processing of the big learning base with the possibility of improving the final result of the prediction

Information Discovery and Delivery ◽

10.1108/idd-02-2018-0002 ◽

2018 ◽

Vol 46 (3) ◽

pp. 147-160 ◽

Cited By ~ 2

Author(s):

Laouni Djafri ◽

Djamel Amar Bensaber ◽

Reda Adjoudj

Keyword(s):

Big Data ◽

Data Analytics ◽

Sampling Method ◽

New Technologies ◽

Predictive Analytics ◽

Big Data Analytics ◽

Sampling Strategy ◽

Original Data ◽

Data Set ◽

Content Type

Purpose: This paper aims to solve the problems of big data analytics for prediction, including volume, veracity and velocity, by improving the prediction result to an acceptable level in the shortest possible time. Design/methodology/approach: This paper is divided into two parts. The first part improves the result of the prediction. It proposes two ideas: the double pruning enhanced random forest algorithm, and extracting a shared learning base with the stratified random sampling method to obtain a learning base that is representative of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which works in a coherent and efficient way with the sampling strategy under the supervision of the Map-Reduce algorithm. Findings: The representative learning base obtained by integrating two learning bases, the partial base and the shared base, presents an excellent representation of the original data set and gives very good results for Big Data predictive analytics. Furthermore, these results were supported by the improved random forests supervised learning method, which played a key role in this context. Originality/value: The work concerns all companies, especially those that hold large amounts of information and want to screen it to improve their knowledge of customers and optimize their campaigns.
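The stratified random sampling step mentioned above can be sketched generically as follows. This shows only textbook proportional stratified sampling with the Python standard library; it is not the paper's shared-learning-base procedure, and it omits the double pruning enhanced random forest and the Map-Reduce architecture. The function name, the `fraction` and `seed` parameters, and the toy records are assumptions.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw roughly `fraction` of the records from every stratum defined by `key`."""
    rng = random.Random(seed)            # fixed seed keeps the draw reproducible
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for label in sorted(strata):         # stable order across runs
        members = strata[label]
        k = max(1, round(len(members) * fraction))   # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample


# Toy data: 80 records of class "A" and 20 of class "B".
data = [{"id": i, "label": "A" if i < 80 else "B"} for i in range(100)]
sample = stratified_sample(data, key=lambda r: r["label"], fraction=0.1)

counts = defaultdict(int)
for record in sample:
    counts[record["label"]] += 1
print(dict(counts))
```

Sampling within each stratum keeps the class proportions of the original data, which is the property that makes a reduced learning base representative of the full set.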


Open Source Platform for Federated Spatiotemporal Analysis

10.5194/egusphere-egu2020-4203 ◽

2020 ◽

Author(s):

Thomas Huang

Keyword(s):

Open Source ◽

Programming Languages ◽

Data Analytics ◽

Earth Science ◽

Big Data Analytics ◽

Spatiotemporal Analysis ◽

Science Data ◽

Scientific Investigations ◽

Data Infrastructures ◽

Global Data

In recent years, NASA has invested significantly in developing an Analytics Center Framework (ACF) to encapsulate scalable computational and data infrastructures and to harmonize data, tools and computation resources to enable scientific investigations. Since 2017, the Apache Science Data Analytics Platform (SDAP) (https://sdap.apache.org) has been adopted by various NASA-funded projects, including the NASA Sea Level Change Portal, the GRACE and GRACE-FO missions, the CEOS Ocean Variables Enabling Research and Applications for GEO (COVERAGE) Initiative, etc. While many existing approaches to Earth Science analysis focus on collocating all the relevant data under one system running on the cloud, this open source platform empowers global data centers to take a federated analytics approach. With the growing community of SDAP centers, it is now possible for researchers to interactively analyze observational and model data hosted at different centers without having to collocate or download data to their own computing environment. This talk discusses the application of this professional open source big data analytics platform to establish a growing community of SDAP-based ACFs that enable distributed spatiotemporal analysis from any platform, using any programming language.


COMBAT-TB-NeoDB: fostering tuberculosis research through integrative analysis using graph database technologies

Bioinformatics ◽

10.1093/bioinformatics/btz658 ◽

2019 ◽

Author(s):

Thoba Lose ◽

Peter van Heusden ◽

Alan Christoffels

Keyword(s):

Graph Model ◽

Cost Effective ◽

Supplementary Information ◽

Omics Data ◽

Graph Database ◽

Biological Databases ◽

Supplementary Data ◽

Genomic Technologies ◽

Effective Generation ◽

Federated Queries

Motivation: Recent advancements in genomic technologies have enabled high throughput cost-effective generation of ‘omics’ data from M.tuberculosis (M.tb) isolates, which then gets shared via a number of heterogeneous publicly available biological databases. Albeit useful, fragmented curation negatively impacts the researcher’s ability to leverage the data via federated queries. Results: We present Combat-TB-NeoDB, an integrated M.tb ‘omics’ knowledge-base. Combat-TB-NeoDB is based on Neo4j and was created by binding the labeled property graph model to a suitable ontology, namely Chado. Combat-TB-NeoDB enables researchers to execute complex federated queries by linking prominent biological databases and supplementary M.tb variants data from published literature. Availability and implementation: The Combat-TB-NeoDB (https://neodb.sanbi.ac.za) repository and all tools mentioned in this manuscript are freely available at https://github.com/COMBAT-TB. Supplementary information: Supplementary data are available at Bioinformatics online.


DESIGN AND EVALUATION OF A BIM-GIS INTEGRATED INFORMATION MODEL USING RDF GRAPH DATABASE

ISPRS Annals of Photogrammetry Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-annals-viii-4-w2-2021-175-2021 ◽

2021 ◽

Vol VIII-4/W2-2021 ◽

pp. 175-182

Author(s):

A.-H. Hor ◽

G. Sohn

Keyword(s):

Performance Metrics ◽

Graph Model ◽

Information Model ◽

Query Languages ◽

Semantic Integration ◽

Graph Database ◽

Graph Databases ◽

Rdf Graph ◽

Functional Components ◽

Integrated Information

The semantic integration modeling of BIM Industry Foundation Classes and GIS City Geography Markup Language is a milestone for many applications that involve both domains of knowledge. In this paper, we propose a system design architecture and an implementation of Extraction, Transformation and Loading (ETL) workflows that bring BIM and GIS models into an RDF graph database model. These workflows were created from functional components and ontological frameworks supporting the RDF SPARQL and graph database Cypher query languages. The paper aims to establish whether an RDF graph database is suitable for a BIM-GIS integrated information model, and it looks more deeply into assessing the translation workflows and evaluating performance metrics of a BIM-GIS integrated data model managed in an RDF graph database. The process requires designing and developing various workflow pipelines with semantic tools in order to get the data and its structure into an appropriate format, and to demonstrate the potential of using RDF graph databases to integrate, manage and analyze information and relationships from both GIS and BIM models. The study also introduces the concept of Graph-Model occupancy indexes of nodes, attributes and relationships to measure query outputs and to give insight into the data richness and performance of the resulting BIM-GIS semantically integrated model.


The Art of Data Science and Big Data Analytics: Inspecting and Transforming Data

Asian Journal of Computer Science and Technology ◽

10.51983/ajcst-2020.9.1.2151 ◽

2020 ◽

Vol 9 (1) ◽

pp. 45-56

Author(s):

Akella Subhadra

Keyword(s):

Big Data ◽

Data Analysis ◽

Data Storage ◽

Data Analytics ◽

Data Science ◽

New Technologies ◽

Big Data Analytics ◽

Business Strategies ◽

Science Data ◽

Using Data

Data Science is associated with new discoveries, namely the discovery of value from data. It is the practice of deriving insights and developing business strategies by transforming data into useful information. It has been evaluated as a scientific field and as a research evolution in disciplines like statistics, computing science, and intelligence science, and as a practical transformation in domains like science, engineering, the public sector, business and lifestyle. The field encompasses the larger areas of artificial intelligence, data analytics, machine learning, pattern recognition, natural language understanding, and big data manipulation. It also tackles related new scientific challenges, ranging from data capture, creation, storage, retrieval, sharing, analysis, optimization, and visualization, to integrative analysis across heterogeneous and interdependent complex resources for better decision-making, collaboration, and, ultimately, value creation. This paper covers epicycles of analysis, formal modeling, the path from data analysis to data science, and data analytics as a keystone of data science. Big data is not a single technology but an amalgamation of old and new technologies that helps companies gain actionable insight. Big data is vital because it manages, stores and manipulates large amounts of data at the desired speed and time. Big data addresses distinct requirements: the combination of multiple unrelated datasets, the processing of large amounts of unstructured data, and the harvesting of hidden information in a time-sensitive manner. As businesses struggle to keep up with changing market requirements, some companies are finding creative ways to apply Big Data to their growing business needs and increasingly complex problems. As organizations evolve their processes and see the opportunities that Big Data can provide, they struggle to move beyond traditional Business Intelligence activities, like using data to populate reports and dashboards, and toward Data Science-driven projects that aim to answer more open-ended and sophisticated questions. Although some organizations are fortunate to have data scientists, most are not, because a growing talent gap makes it difficult to find and hire data scientists in a timely manner. This paper aims to give a close view of data science and big data, including big data concepts like data storage, data processing, and data analysis. It also provides a brief description of big data analytics and its characteristics, data structures, and the data analytics life cycle, and emphasizes critical points on these issues.


Research on the integrated development of marginal characteristic villages from the perspective of popular science tourism

E3S Web of Conferences ◽

10.1051/e3sconf/202125103045 ◽

2021 ◽

Vol 251 ◽

pp. 03045

Author(s):

Qiang Liu ◽

Hanfang Liu

Keyword(s):

National Level ◽

Popular Science ◽

Theoretical Research ◽

Initial Slope ◽

Planning Team ◽

Double Edge ◽

Rural Revitalization ◽

The Government ◽

Integration Planning ◽

The Village

Since the government elevated the rural revitalization strategy to the national level in 2018, the rural revitalization work has been effectively promoted and developed across the country, but there is a “double-edge” rural type that deserves attention. They are on the edge of social attention and investment because they are not listed in the List of Traditional Villages or Beautiful Village, and they often have great potential for tourism development by virtue of geographical advantages, rich regional resources and unique cultural resources. This paper focuses on the tourism integration planning and development of this kind of “double-edge” villages. From the micro-scale of the village, combined with the rural revitalization, this paper takes the ancient village of Zhengjiawopo in Jinan as the specific foothold to carry out the planning pilot study. After nearly two years of theoretical research and practical exploration by a multidisciplinary planning team, with the beep nest village planning and development results of the initial slope.
