Constructing Crop Portraits Based on Graph Databases Is Essential to Agricultural Data Mining

Information ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 227
Author(s):  
Yue-Xin Shi ◽  
Bo-Kai Zhang ◽  
Yong-Xiang Wang ◽  
Han-Qian Luo ◽  
Xiang Li

Neo4j is a graph database that can exploit not only data but also the relationships among data. Crop portraits, a kind of property graph, model real-world crop entities from data to realize the networked management of crop knowledge. Existing crop knowledge bases have shortcomings such as coverage of a single crop variety, incomplete descriptions, and a lack of agricultural knowledge. Constructing crop portraits can provide a comprehensive description of crops and make up for these shortcomings. This research used agricultural question-and-answer data and popular science data obtained by text crawling as the original data, selected labels to establish a crop portrait that includes three categories (crops, pesticides, and diseases and pests), and used the graph database (Neo4j) to store and display the portrait data. Information mining found that the crop portrait revealed the occurrence trend of diseases and pests, exhibited a nonintrinsic connection between different diseases and pests, and provided a variety of pesticides to choose from for the control of diseases and pests. The results showed that constructing crop portraits is beneficial to agricultural analysis and has practical application value and theoretical research prospects in the field of big data analytics.
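
The three-category portrait described in this abstract (crops, pesticides, and diseases and pests) can be pictured as a small property graph. The following is a minimal sketch in plain Python, not the authors' Neo4j implementation: the class `PortraitGraph`, the relation names `AFFECTED_BY` and `CONTROLLED_BY`, and all node names are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict


class PortraitGraph:
    """Tiny in-memory property graph (a stand-in for a graph database)."""

    def __init__(self):
        self.nodes = {}                # node name -> {"label": ..., properties}
        self.edges = defaultdict(set)  # (source, relation) -> target names

    def add_node(self, name, label, **props):
        self.nodes[name] = {"label": label, **props}

    def add_edge(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def neighbors(self, source, relation):
        # Sorted for deterministic output; unknown keys yield an empty list.
        return sorted(self.edges.get((source, relation), set()))


def build_example():
    """Build a toy portrait with placeholder names (not data from the paper)."""
    g = PortraitGraph()
    g.add_node("crop_a", "Crop")
    for disease in ("disease_1", "disease_2"):
        g.add_node(disease, "DiseaseOrPest")
        g.add_edge("crop_a", "AFFECTED_BY", disease)
    for pesticide in ("pesticide_x", "pesticide_y"):
        g.add_node(pesticide, "Pesticide")
    g.add_edge("disease_1", "CONTROLLED_BY", "pesticide_x")
    g.add_edge("disease_1", "CONTROLLED_BY", "pesticide_y")
    g.add_edge("disease_2", "CONTROLLED_BY", "pesticide_y")
    return g


def pesticides_for_crop(g, crop):
    """Two-hop traversal: crop -> diseases/pests -> candidate pesticides."""
    return {
        disease: g.neighbors(disease, "CONTROLLED_BY")
        for disease in g.neighbors(crop, "AFFECTED_BY")
    }
```

Calling `pesticides_for_crop(build_example(), "crop_a")` walks two relationship hops, which is the kind of query (several pesticide options per disease or pest) that a graph store makes straightforward.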

2021 ◽  
Vol 9 (1) ◽  
pp. 16-44
Author(s):  
Weiqing Zhuang ◽  
Morgan C. Wang ◽  
Ichiro Nakamoto ◽  
Ming Jiang

Abstract Big data analytics (BDA) in e-commerce, which is an emerging field that started in 2006, deeply affects the development of global e-commerce, especially its layout and performance in the U.S. and China. This paper seeks to examine the relative influence of theoretical research of BDA in e-commerce to explain the differences between the U.S. and China by adopting a statistical analysis method on the basis of samples collected from two main literature databases, Web of Science and CNKI, aimed at the U.S. and China. The results of this study help clarify doubts regarding the development of China’s e-commerce, which exceeds that of the U.S. today, in view of the theoretical comparison of BDA in e-commerce between them.


2021 ◽  
Vol 2021 ◽  
pp. 1-20
Author(s):  
Weiqing Zhuang

Big data analytics (BDA) is widely and deeply applied in e-commerce, where it has a positive impact on the global economy, especially in the U.S. and China, which have both performed well. This paper seeks to examine the relative influence of theoretical research and practical activities of BDA in e-commerce to explain the differences between the U.S. and China, drawing on the two main literature databases, Web of Science and CNKI, respectively, and on other samples that present retail e-commerce sales and the number of data companies founded in the U.S. and China each year. We further determine the reasons for the difference between the U.S. and China in BDA in e-commerce, which can help managers devise appropriate e-commerce business strategies for each country, and we provide evidence of a significant relationship between theoretical research and practical activities in BDA in e-commerce. In addition, the variables related to big data companies show a moderation effect rather than a mediating effect relative to the practice of theoretical research in e-commerce in the United States, but they show both moderation and mediating effects in China. The results of this study help clarify doubts regarding the development of China’s e-commerce. Moreover, three orientations in e-commerce using BDA, and the use of quantum computing to solve existing e-commerce problems, are explored to provide better evidence for decision-making that could be valuable in future research.


Author(s):  
Seta Murdha Pamungkas ◽  
Muhammad Ainul Yaqin ◽  
Kurnia Z. Matondang ◽  
Asfilia N. Anggraini ◽  
Abd. Charis Fauzan

This paper aims to design a WordNet using a graph database, from the initial stage of grouping words through to the final stage of implementation as a ready-to-use application. The study designs a ready-to-use WordNet application that implements a graph-based database, using the Waterfall method as the basis for the design workflow. The trials in this study consist of sorting words, entering words into the database, assigning relations between words, testing each word against other words, and implementing the database in a ready-to-use application. The study produces a WordNet design for the Indonesian language that uses a graph-model database for data storage. The design is then implemented in a ready-to-use application. The data used in the study were obtained from the KBBI and the Tesaurus, both of which can be accessed online. The data were collected relationally to form Synonym, Antonym, Affix, and Root Word relations. The study uses specific relations to connect nodes, but does not use them to search for distances between nodes. It combines the results of research on WordNet design and on graph database design into a ready-to-use application.
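
The typed word relations named in this abstract (synonym, antonym, affix, root word) can be sketched as labeled edges between word nodes. The following is a rough illustration in plain Python rather than the authors' graph database: the class `WordGraph`, the relation labels, and the sample Indonesian words are assumptions made for the example, not data from the paper.

```python
class WordGraph:
    """Word nodes connected by typed relations (a toy WordNet-style graph)."""

    # Relations that hold in both directions; ROOT and AFFIX stay directed.
    SYMMETRIC = {"SYNONYM", "ANTONYM"}

    def __init__(self):
        self.relations = set()  # (source, relation, target) triples

    def relate(self, source, relation, target):
        self.relations.add((source, relation, target))
        if relation in self.SYMMETRIC:
            self.relations.add((target, relation, source))

    def related(self, word, relation):
        """Return the words directly linked to `word` by `relation`."""
        return sorted(
            target
            for source, rel, target in self.relations
            if source == word and rel == relation
        )


def build_example():
    """Illustrative entries only: besar (big), kecil (small), membesar (to enlarge)."""
    wg = WordGraph()
    wg.relate("besar", "ANTONYM", "kecil")
    wg.relate("membesar", "ROOT", "besar")    # derived word -> its root word
    wg.relate("membesar", "AFFIX", "me-")     # derived word -> its prefix
    return wg
```

The sketch only answers direct-neighbor lookups such as `build_example().related("kecil", "ANTONYM")`; like the study, it does not compute distances between nodes.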


Author(s):  
Hao Chen ◽  
Maria Vasardani ◽  
Stephan Winter ◽  
Martin Tomko

Everyday place descriptions provide a rich source of knowledge about places and their relative locations. This research proposes a place graph model for modeling this spatial, non-spatial, and contextual knowledge from place descriptions. The model extends a prior place graph, and overcomes a number of limitations. The model is implemented using the Neo4j graph database, and a management system has also been developed that allows operations including querying, mapping, and visualizing the stored knowledge in an extended place graph. Then three experimental tasks, namely georeferencing, reasoning, and querying, are selected to demonstrate the superiority of the extended model.


2018 ◽  
Vol 46 (3) ◽  
pp. 147-160
Author(s):  
Laouni Djafri ◽  
Djamel Amar Bensaber ◽  
Reda Adjoudj

Purpose This paper aims to solve the problems of big data analytics for prediction, including volume, veracity and velocity, by improving the prediction result to an acceptable level in the shortest possible time. Design/methodology/approach This paper is divided into two parts. The first part improves the result of the prediction. Two ideas are proposed here: the double pruning enhanced random forest algorithm, and extracting a shared learning base with the stratified random sampling method to obtain a learning base that is representative of all the original data. The second part proposes a distributed architecture supported by new technology solutions, which works coherently and efficiently with the sampling strategy under the supervision of the Map-Reduce algorithm. Findings The representative learning base, obtained by integrating two learning bases (the partial base and the shared base), represents the original data set very well and gives very good results in Big Data predictive analytics. These results were further supported by the improved random forests supervised learning method, which played a key role in this context. Originality/value The work concerns all companies, especially those that hold large amounts of information and want to screen it to improve their customer knowledge and optimize their campaigns.
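
Stratified random sampling, which this abstract uses to extract a representative learning base, can be sketched in a few lines. This is a generic illustration using only the Python standard library, not the paper's method: it omits the double pruning random forest and the Map-Reduce architecture, and the function name `stratified_sample`, its parameters, and the fixed seed are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction from every stratum so class proportions are kept.

    records  -- list of items to sample from
    key      -- function mapping a record to its stratum label
    fraction -- share of each stratum to keep (0 < fraction <= 1)
    seed     -- fixed seed so the sketch is reproducible (an assumption)
    """
    rng = random.Random(seed)

    # Group the records by stratum label.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)

    sample = []
    for label in sorted(strata):  # sorted for a deterministic stratum order
        members = strata[label]
        # Keep at least one record so small strata are never dropped.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample
```

For a data set with 80 records of one class and 20 of another, a fraction of 0.1 keeps 8 and 2 records respectively, so the sample preserves the 4:1 class ratio of the original data.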


2020 ◽  
Author(s):  
Thomas Huang

In recent years, NASA has invested significantly in developing an Analytics Center Framework (ACF) to encapsulate scalable computational and data infrastructures and to harmonize data, tools and computation resources to enable scientific investigations. Since 2017, Apache's Science Data Analytics Platform (SDAP) (https://sdap.apache.org) has been adopted by various NASA-funded projects, including the NASA Sea Level Change Portal, the GRACE and GRACE-FO missions, and the CEOS Ocean Variables Enabling Research and Applications for GEO (COVERAGE) Initiative. While many existing approaches to Earth Science analysis focus on collocating all the relevant data under one system running on the cloud, this open source platform empowers global data centers to take a federated analytics approach. With the growing community of SDAP centers, it is now possible for researchers to interactively analyze observational and model data hosted at different centers without having to collocate or download data to their own computing environment. This talk discusses the application of this professional open source big data analytics platform to establish a growing community of SDAP-based ACFs that enable distributed spatiotemporal analysis from any platform, using any programming language.


2019 ◽  
Author(s):  
Thoba Lose ◽  
Peter van Heusden ◽  
Alan Christoffels

Abstract Motivation Recent advancements in genomic technologies have enabled high-throughput, cost-effective generation of ‘omics’ data from M. tuberculosis (M.tb) isolates, which are then shared via a number of heterogeneous, publicly available biological databases. Although useful, this fragmented curation negatively impacts the researcher’s ability to leverage the data via federated queries. Results We present Combat-TB-NeoDB, an integrated M.tb ‘omics’ knowledge base. Combat-TB-NeoDB is based on Neo4j and was created by binding the labeled property graph model to a suitable ontology, namely Chado. Combat-TB-NeoDB enables researchers to execute complex federated queries by linking prominent biological databases and supplementary M.tb variants data from published literature. Availability and implementation The Combat-TB-NeoDB (https://neodb.sanbi.ac.za) repository and all tools mentioned in this manuscript are freely available at https://github.com/COMBAT-TB. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
A.-H. Hor ◽  
G. Sohn

Abstract. The semantic integration modeling of BIM Industry Foundation Classes (IFC) and GIS City Geography Markup Language (CityGML) is a milestone for many applications that involve both domains of knowledge. In this paper, we propose a system design architecture and an implementation of Extraction, Transformation and Loading (ETL) workflows that bring BIM and GIS models into an RDF graph database model. These workflows were created from functional components and ontological frameworks supporting the RDF SPARQL and graph database Cypher query languages. The paper investigates whether an RDF graph database is suitable for a BIM-GIS integrated information model, assessing the translation workflows and evaluating performance metrics of a BIM-GIS integrated data model managed in an RDF graph database. The process requires designing and developing various workflow pipelines with semantic tools in order to get the data and its structure into an appropriate format, and to demonstrate the potential of RDF graph databases to integrate, manage and analyze information and relationships from both GIS and BIM models. The study also introduces the concept of Graph-Model occupancy indexes of nodes, attributes and relationships to measure query outputs and to give insight into the data richness and performance of the resulting BIM-GIS semantically integrated model.


2020 ◽  
Vol 9 (1) ◽  
pp. 45-56
Author(s):  
Akella Subhadra

Data Science is associated with new discoveries, namely the discovery of value from data. It is the practice of deriving insights and developing business strategies by transforming data into useful information. It has been evaluated as a scientific field and a research evolution in disciplines such as statistics, computing science, and intelligence science, and as a practical transformation in domains such as science, engineering, the public sector, business and lifestyle. The field encompasses the larger areas of artificial intelligence, data analytics, machine learning, pattern recognition, natural language understanding, and big data manipulation. It also tackles related new scientific challenges, ranging from data capture, creation, storage, retrieval, sharing, analysis, optimization, and visualization, to integrative analysis across heterogeneous and interdependent complex resources for better decision-making, collaboration, and, ultimately, value creation. This paper covers epicycles of analysis, formal modeling, the move from data analysis to data science, and data analytics as a keystone of data science. Big data is not a single technology but an amalgamation of old and new technologies that help companies gain actionable insight. Big data is vital because it manages, stores and manipulates large amounts of data at the desired speed and time. Big data addresses distinct requirements: combining multiple unrelated datasets, processing large amounts of unstructured data, and harvesting hidden information in a time-sensitive manner. As businesses struggle to keep up with changing market requirements, some companies are finding creative ways to apply Big Data to their growing business needs and increasingly complex problems.
As organizations evolve their processes and see the opportunities that Big Data can provide, they struggle to move beyond traditional Business Intelligence activities, such as using data to populate reports and dashboards, and toward Data Science-driven projects that aim to answer more open-ended and sophisticated questions. Although some organizations are fortunate to have data scientists, most are not, because a growing talent gap makes it difficult to find and hire data scientists in a timely manner. This paper aims to give a close view of data science and big data, including big data concepts such as data storage, data processing, and data analysis. It also provides a brief description of big data analytics and its characteristics, data structures, and the data analytics life cycle, and emphasizes critical points on these issues.


2021 ◽  
Vol 251 ◽  
pp. 03045
Author(s):  
Qiang Liu ◽  
Hanfang Liu

Since the government elevated the rural revitalization strategy to the national level in 2018, rural revitalization work has been effectively promoted and developed across the country, but there is a “double-edge” type of village that deserves attention. These villages are on the edge of social attention and investment because they are not listed in the List of Traditional Villages or Beautiful Village, yet they often have great potential for tourism development by virtue of geographical advantages, rich regional resources and unique cultural resources. This paper focuses on the tourism integration planning and development of this kind of “double-edge” village. Working at the micro-scale of the village and in the context of rural revitalization, the paper takes the ancient village of Zhengjiawopo in Jinan as the specific case for a pilot planning study. After nearly two years of theoretical research and practical exploration by a multidisciplinary planning team, the planning and development of the village have shown initial results.

