Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps

Geoderma ◽  
2015 ◽  
Vol 237-238 ◽  
pp. 237-245 ◽  
Author(s):  
Romina Lorenzetti ◽  
Roberto Barbetti ◽  
Maria Fantappiè ◽  
Giovanni L'Abate ◽  
Edoardo A.C. Costantini

2016 ◽  
Vol 31 (2) ◽  
pp. 97-123 ◽  
Author(s):  
Alfred Krzywicki ◽  
Wayne Wobcke ◽  
Michael Bain ◽  
John Calvo Martinez ◽  
Paul Compton

Abstract Data mining techniques for extracting knowledge from text have been applied extensively to applications including question answering, document summarisation, event extraction and trend monitoring. However, current methods have mainly been tested on small-scale customised data sets for specific purposes. The availability of large volumes of data and high-velocity data streams (such as social media feeds) motivates the need to automatically extract knowledge from such data sources and to generalise existing approaches to more practical applications. Recently, several architectures have been proposed for what we call knowledge mining: integrating data mining for knowledge extraction from unstructured text (possibly making use of a knowledge base), and at the same time, consistently incorporating this new information into the knowledge base. After describing a number of existing knowledge mining systems, we review the state-of-the-art literature on both current text mining methods (emphasising stream mining) and techniques for the construction and maintenance of knowledge bases. In particular, we focus on mining entities and relations from unstructured text data sources, entity disambiguation, entity linking and question answering. We conclude by highlighting general trends in knowledge mining research and identifying problems that require further research to enable more extensive use of knowledge bases.
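As a toy illustration of the entity disambiguation and linking tasks discussed, the following sketch links an ambiguous mention to a hypothetical mini knowledge base by context-word overlap. The KB entries, IDs and scoring rule are invented for illustration; real systems use far richer features and larger knowledge bases:

```python
def link_entity(mention, context_words, kb):
    """Toy entity linking: among KB entries whose aliases match the
    mention, choose the one whose description shares the most words
    with the mention's surrounding context."""
    candidates = [e for e in kb if mention.lower() in e["aliases"]]
    if not candidates:
        return None
    def overlap(e):
        return len(set(e["description"].split()) & set(context_words))
    return max(candidates, key=overlap)["id"]

# Hypothetical mini knowledge base with an ambiguous surface form
kb = [
    {"id": "Q1", "aliases": ["jaguar"],
     "description": "large cat native to the americas"},
    {"id": "Q2", "aliases": ["jaguar"],
     "description": "british company that makes the luxury car"},
]
ctx = "the car sped down the road".split()
print(link_entity("Jaguar", ctx, kb))   # context favours the car maker
```

A stream-mining system would additionally have to decide when a mention refers to no existing entry and a new entity should be added to the knowledge base.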


2010 ◽  
Vol 174 (4) ◽  
pp. 536-544 ◽  
Author(s):  
Ulrich Schuler ◽  
Petra Erbe ◽  
Mehdi Zarei ◽  
Wanida Rangubpit ◽  
Adichat Surinkum ◽  
...  

Geoderma ◽  
2016 ◽  
Vol 263 ◽  
pp. 226-233 ◽  
Author(s):  
Vince Láng ◽  
Márta Fuchs ◽  
Tamás Szegi ◽  
Ádám Csorba ◽  
Erika Michéli

2014 ◽  
Vol 32 (4) ◽  
pp. 723-739
Author(s):  
Ina Fourie ◽  
Constance Bitso ◽  
Theo J.D. Bothma

Purpose – The purpose of this paper is to raise awareness of the importance for library and information services (LIS) to take responsibility for finding a manageable way to regularly monitor internet censorship in their countries, to suggest a framework for such monitoring, and to encourage manageable on-going small-scale research projects.

Design/methodology/approach – The paper follows on contract research for the IFLA Committee on Freedom of Access to Information and Freedom of Expression on country-specific trends in internet censorship. Based on an extensive literature survey (not fully reflected here) and data mining, a framework is suggested for regular monitoring of country-specific negative and positive trends in internet censorship. The framework addresses search strategies and information resources; setting up alerting services; noting resources for data mining; a detailed break-down and systematic monitoring of negative and positive trends; the need for reflection on implications, assessment of need(s) for concern (or not) and generation of suggestions for actions; sharing findings with the LIS community and wider society; and raising sensitivity for internet censorship as well as advocacy and lobbying against it. Apart from monitoring internet censorship, the framework is intended to encourage manageable on-going small-scale research.

Findings – A framework for monitoring internet censorship can support the regular, systematic and comprehensive monitoring of known as well as emerging negative and positive trends in a country, and can promote timely expressions of concern and appropriate actions by LIS. It can support sensitivity to the dangers of internet censorship and raise LIS' levels of self-efficacy in dealing with internet censorship and doing manageable, small-scale research in this regard.

Originality/value – Although a number of publications have appeared on internet censorship, these do not offer a framework for monitoring it and encouraging manageable on-going small-scale research in this regard.


Author(s):  
Phanish Puranam

I review developments in theory and methodology that may allow us to begin creating innovative forms of organizing, rather than rest content with studying them after they have emerged. We now have the conceptual and technical apparatus to prototype organization designs at small scale, cheaply and fast. The process of organization re-design can be seen in terms of multiple stages. It begins with careful observation of phenomena. Qualitative or indeed quantitative induction (i.e. data mining) can play a critical role here. Once we have some understanding or at least conjectures about underlying mechanisms, we can use the behavioral lab or an agent-based model to run cheap experiments to adjust the design. Once we have formulated a new design, we may want to run a field experiment with randomization. If the results look satisfactory, we can scale up and implement.
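The agent-based-model stage described above can be prototyped cheaply; the following is a deliberately toy sketch, not from the paper. The agents, the unknown "best practice" target, the learning rule, and the two designs (flat peer learning versus a hierarchy where everyone learns only from agent 0) are all illustrative assumptions:

```python
import random

def simulate_design(n_agents, hierarchy, n_steps=200, seed=0):
    """Toy agent-based model for prototyping an organization design:
    each agent holds a guess about an unknown 'best practice' and,
    on each step, moves partway toward a better-performing peer.
    Under 'hierarchy' agents learn only from agent 0; otherwise
    from a randomly chosen peer."""
    rng = random.Random(seed)
    target = 0.8                          # unknown optimal practice
    beliefs = [rng.random() for _ in range(n_agents)]
    perf = lambda b: -abs(b - target)     # closer to target = better
    for _ in range(n_steps):
        i = rng.randrange(n_agents)
        j = 0 if hierarchy else rng.randrange(n_agents)
        if perf(beliefs[j]) > perf(beliefs[i]):
            beliefs[i] += 0.5 * (beliefs[j] - beliefs[i])
    # average performance of the organization under this design
    return sum(perf(b) for b in beliefs) / n_agents

flat = simulate_design(20, hierarchy=False)
tall = simulate_design(20, hierarchy=True)
```

Running both variants with the same seed is the "cheap experiment": the design whose simulated average performance looks better becomes a candidate for a randomized field experiment.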


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Francisco Villarreal-Valderrama ◽  
Pedro Juárez-Pérez ◽  
Ulises García-Pérez ◽  
Luis Amezquita-Brooks

Abstract Turbojet applications benefit from accurate performance models. The aim of this study is to explore the applicability of data-mining algorithms to determine relationships between the generated thrust, the environmental conditions (free stream air-speed, inlet temperature and pressure) and the operating conditions (input fuel flow and shaft angular speed). For this purpose, experimental tests were carried out within wind-tunnel facilities using an experimental single-spool turbojet test bench. It is well-known that a large set of data-mining approaches relies on establishing linear correlations among input and output variables. The scope of this article is to assess the applicability of correlational data-mining approaches by i) an exploratory data analysis to find underlying data patterns and ii) principal component regressions to obtain a suitable predictive model for the generated thrust. Validation experiments demonstrated that the data-based model allows capturing the effects of the environmental and operating conditions with good accuracy (Root Mean Squared Error RMSE = 3.5100%), while maintaining a low complexity in the resulting structure. These results show that it is possible to generate turbojet experimental performance maps through data-mining algorithms with a correlational approach.
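The principal-component-regression approach can be sketched as follows; the synthetic data, variable names and coefficients below are illustrative assumptions, not the paper's wind-tunnel measurements:

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression: standardize the inputs,
    project them onto their leading principal components (via SVD),
    then fit ordinary least squares in that reduced space."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Z = (X - mu) / sigma                          # standardized inputs
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    W = Vt[:n_components].T                       # loadings of leading PCs
    T = Z @ W                                     # component scores
    A = np.column_stack([np.ones(len(T)), T])     # intercept + scores
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    def predict(Xnew):
        Tnew = ((Xnew - mu) / sigma) @ W
        return np.column_stack([np.ones(len(Tnew)), Tnew]) @ beta
    return predict

# Synthetic demo: thrust as a linear blend of correlated operating variables
rng = np.random.default_rng(0)
n = 200
fuel = rng.uniform(0.5, 2.0, n)               # fuel flow (hypothetical units)
rpm = 30000 * fuel + rng.normal(0, 500, n)    # shaft speed tracks fuel flow
airspeed = rng.uniform(10, 60, n)             # free-stream air speed
X = np.column_stack([fuel, rpm, airspeed])
thrust = 40 * fuel + 0.001 * rpm - 0.1 * airspeed + rng.normal(0, 0.5, n)

predict = pcr_fit(X, thrust, n_components=2)
rmse = np.sqrt(np.mean((predict(X) - thrust) ** 2))
```

Because fuel flow and shaft speed are highly collinear here, two components suffice: the first captures their shared variation and the second captures airspeed, which is the kind of dimensionality reduction that keeps the resulting model structure simple.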


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Kpade O. L. Hounkpatin ◽  
Karsten Schmidt ◽  
Felix Stumpf ◽  
Gerald Forkuor ◽  
Thorsten Behrens ◽  
...  

2016 ◽  
Vol 10 (1) ◽  
pp. 1-12
Author(s):  
Dániel Balla ◽  
Orsolya Varga ◽  
Marianna Zichar

As a result of international cooperation, the conditions of data access and data usage have improved significantly during the last two decades. The establishment of web-based geoinformatic infrastructure has also allowed researchers to share their results with the international scientific community more efficiently. The aim of this study is to investigate the accuracy of databases with different spatial resolutions, using the reference profiles of the LUCAS topsoil database. In our study, we investigated the accuracy of the World Reference Base for Soil Resources (WRB) Reference Soil Groups (RSG) stored in freely accessible soil databases (the European Soil Database (ESDB) and the International Soil Reference and Information Centre (ISRIC) database) for Hungary. We used the Kappa Index of Agreement (KIA) to evaluate accuracy: the European and the international databases showed values of 0.9643 and 0.3968, respectively, so the continental-scale database proved more accurate. Considering these results, we conclude that spatial resolution has a relevant impact on the accuracy of databases; however, the study should be extended to the national level and the indices assessed together.
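The Kappa Index of Agreement used above is commonly computed as Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch follows; the profile locations and WRB group assignments are invented for illustration, not LUCAS data:

```python
from collections import Counter

def cohens_kappa(reference, predicted):
    """Cohen's kappa: (observed - expected) / (1 - expected), where
    'expected' is the chance agreement implied by each rater's
    marginal class frequencies."""
    assert len(reference) == len(predicted)
    n = len(reference)
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    ref_freq = Counter(reference)
    pred_freq = Counter(predicted)
    labels = set(ref_freq) | set(pred_freq)
    expected = sum(ref_freq[c] * pred_freq[c] for c in labels) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical WRB groups at ten profile locations:
# reference profiles vs. the groups a soil database reports there
ref = ["Luvisols", "Chernozems", "Luvisols", "Arenosols", "Chernozems",
       "Luvisols", "Cambisols", "Arenosols", "Luvisols", "Chernozems"]
db  = ["Luvisols", "Chernozems", "Luvisols", "Arenosols", "Chernozems",
       "Cambisols", "Cambisols", "Arenosols", "Luvisols", "Luvisols"]
print(round(cohens_kappa(ref, db), 3))   # 0.722
```

Here raw agreement is 0.8 but chance agreement is 0.28, giving kappa ≈ 0.722, which is why kappa is a stricter measure than simple percent agreement for comparing soil maps.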


2021 ◽  
Vol 74 ◽  
pp. 102337
Author(s):  
Reza ShakorShahabi ◽  
Ali Nouri Qarahasanlou ◽  
Seyed Reza Azimi ◽  
Adel Mottahedi

Clustering is a principal data mining technique for extracting useful patterns and knowledge from data. Data mining is the process of pulling knowledge, information, useful patterns and reliable data out of a huge amount of raw data, according to the needs of the targeted sector. In technical terms, data mining finds useful patterns in raw data by applying suitable techniques from statistics, machine learning and database systems. It targets pattern extraction at two major scales: large-scale analysis, aimed at understanding patterns in data with global impact, and small-scale analysis, which deals with data of lesser global impact. This paper gives a brief overview of clustering techniques within the data mining process, their features and functionality. It concentrates on clustering algorithms with their pros and cons, and on the need for clustering and its importance in the data mining process. The principles of data mining are also explained briefly to build a base for understanding the techniques discussed.
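As a concrete instance of the clustering techniques surveyed, here is a minimal k-means (Lloyd's algorithm) sketch on synthetic data; the blob locations and parameters are illustrative:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means (Lloyd's algorithm): alternate between
    assigning every point to its nearest centroid and recomputing
    each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # distance from every point to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):   # converged
            break
        centroids = new
    return labels, centroids

# Two well-separated synthetic blobs of 50 points each
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),
               rng.normal(5, 0.3, (50, 2))])
labels, centroids = kmeans(X, k=2)
```

The classic pro/con trade-off shows up even in this sketch: the algorithm is simple and fast, but k must be chosen in advance and the result depends on the random initial centroids.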

