Data mining process for modeling hydrological time series

2012 ◽  
Vol 44 (1) ◽  
pp. 78-88 ◽  
Author(s):  
M. Erol Keskin ◽  
Dilek Taylan ◽  
Ecir Ugur Kucuksille

The main purpose of this study was to develop an optimum flow prediction model based on a data mining process. The data mining process was applied to predict the river flow of Seyhan Stream in the southern part of Turkey. Hydrological time series modeling was applied using monthly historical flow records. Seyhan Stream flows were first modeled with Markov models, and the series was found to fit an AR(2) model. Hence, the flows Ft–2 and Ft–1 in months (t–2) and (t–1) were taken as inputs. The monthly streamflow data were taken from the General Directorate of Electrical Power Resources Survey and Development Administration and covered the 35 years between 1969 and 2003. Furthermore, to account for the monthly periodicity of the hydrological time series, cos(2πi/12) and sin(2πi/12) (i = 1, 2, …, 12) were included as inputs. The flows Ft in month (t) were then modeled by the data mining process. It was concluded that using the data mining process for streamflow prediction makes it possible to estimate missing or unmeasured data.
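The input layout described in this abstract (two lagged flows plus the monthly harmonic terms) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the learning algorithm, so ordinary least squares is used as a stand-in; the series is synthetic, not the Seyhan Stream records; and the names `build_rows`, `fit_least_squares`, and `start_month` are hypothetical.

```python
import math
import random


def build_rows(flows, start_month=1):
    """Return (X, y); each X row is [1, F(t-2), F(t-1), cos(2*pi*i/12), sin(2*pi*i/12)].

    Assumption: flows[0] belongs to calendar month `start_month` (1..12).
    """
    X, y = [], []
    for t in range(2, len(flows)):
        i = (start_month - 1 + t) % 12 + 1  # calendar month of the target, 1..12
        angle = 2 * math.pi * i / 12
        X.append([1.0, flows[t - 2], flows[t - 1], math.cos(angle), math.sin(angle)])
        y.append(flows[t])
    return X, y


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_least_squares(X, y):
    """Stand-in model: ordinary least squares via the normal equations."""
    p = len(X[0])
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(p)] for a in range(p)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(p)]
    return solve(XtX, Xty)


def predict(w, row):
    return sum(wi * xi for wi, xi in zip(w, row))


# Synthetic 35-year monthly series (420 values) with a yearly cycle plus noise.
random.seed(0)
flows = [100 + 40 * math.sin(2 * math.pi * m / 12) + random.gauss(0, 5)
         for m in range(35 * 12)]

X, y = build_rows(flows)
w = fit_least_squares(X, y)
rmse = math.sqrt(sum((predict(w, r) - t) ** 2 for r, t in zip(X, y)) / len(y))
print(f"in-sample RMSE: {rmse:.2f}")
```

Any regression or data mining learner could replace `fit_least_squares`; the part taken from the abstract is the five-column input row built in `build_rows`.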

Author(s):  
Goran Klepac

A business case describes a problem present in all insurance companies: portfolio risk evaluation. Such analysis deals with determining the risk level as well as the main risk factors. In the specific case, an insurance company is faced with market share growth and profit decline. The discovered knowledge about the level of risk and the main risk factors was not used to increase premiums for the riskiest portfolio segments because, in the specific market situation, doing so could lead to a loss of clients in the long run. Instead, additional analysis was conducted using data mining methods, resulting in a solution that stopped further profit decline and lowered the risk level for the riskiest portfolio segments. In the chapter, the REFII model plays the central role in revealing this unexpected knowledge. The REFII model is the author's own mathematical model for time series data mining. The main purpose of that model is to automate time series analysis through a unique transformation model of time series.


2019 ◽  
Vol 41 (7) ◽  
pp. 2457-2476
Author(s):  
J. P. S. Werner ◽  
S. R. De M. Oliveira ◽  
J. C. D. M. Esquerdo

Author(s):  
Arnis Kirshners

The article examines the problem of processing short time series for bioinformatics tasks using data mining methods in the field of pharmacology. The experiments were conducted using heart contraction (contraction and relaxation) power data obtained in experiments with laboratory animals, with the goal of registering the power changes of heart contractions in different stages of the experiment over a given period of time. The selected data were treated using data preprocessing technologies. The short time series were compared using various time-point similarity search methods with agglomerative hierarchical clustering, k-means clustering, modified k-means clustering, and expectation-maximization clustering algorithms. Based on the evaluation of the clustering results, the most suitable algorithm was chosen and the optimal number of clusters was determined for the least clustering error. The acquired clusters were used to create cluster prototypes that aggregate the groups of similar heart contraction power objects. The article offers an examination of the errors produced by the algorithms and methods, as well as a discussion of the clustering results obtained using different evaluation methodologies. It also gives conclusions about the application of data mining methods in solving bioinformatics tasks and outlines further research directions.
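As a rough illustration of the k-means step and the cluster prototypes mentioned in this abstract, the sketch below clusters short series point by point. It rests on several assumptions: the distance is plain squared Euclidean distance over time points, the clustering error is the within-cluster sum of squared errors, and the data are synthetic rising and falling series rather than heart contraction measurements. The article's actual similarity measures and its modified k-means are not reproduced here, and the names `kmeans`, `sq_dist`, and `mean_series` are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_series(group):
    """Point-wise mean of a group of series; serves as the cluster prototype."""
    return [sum(vals) / len(group) for vals in zip(*group)]


def assign(series, centroids):
    """Label each series with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: sq_dist(s, centroids[c]))
            for s in series]


def kmeans(series, k, iters=50, seed=0):
    rng = random.Random(seed)
    centroids = [list(s) for s in rng.sample(series, k)]
    for _ in range(iters):
        labels = assign(series, centroids)
        updated = []
        for c in range(k):
            members = [s for s, lab in zip(series, labels) if lab == c]
            # Keep the old prototype if a cluster ends up empty.
            updated.append(mean_series(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    labels = assign(series, centroids)
    sse = sum(sq_dist(s, centroids[lab]) for s, lab in zip(series, labels))
    return labels, centroids, sse


# Synthetic short series with six time points each: five rising, five falling.
rng = random.Random(1)
rising = [[t + rng.gauss(0, 0.1) for t in range(6)] for _ in range(5)]
falling = [[5 - t + rng.gauss(0, 0.1) for t in range(6)] for _ in range(5)]
data = rising + falling

labels, prototypes, sse = kmeans(data, k=2)
_, _, sse_one_cluster = kmeans(data, k=1)
print(labels, round(sse, 3), round(sse_one_cluster, 3))
```

Comparing the error across several values of `k`, as done here for 1 and 2, is one simple way to look at the number of clusters. Since this error generally shrinks as `k` grows, it is usually read together with another evaluation measure.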


2018 ◽  
Vol 4 (5) ◽  
pp. 1135 ◽  
Author(s):  
Hamed Zamanisabzi ◽  
James Phillip King ◽  
Naci Dilekli ◽  
Bahareh Shoghli ◽  
Shalamu Abudu

This study illustrates the benefits of data pre-processing through supervised data-mining techniques and of utilizing the processed data in artificial neural networks (ANNs) for streamflow prediction. Two major categories of physical parameters, such as snowpack data and time-dependent trend indices, were utilized as predictors of streamflow values. Correlation analysis of different models indicates that, for the period of January to June, using fewer predictors led to simpler modeling with equivalent accuracy in daily prediction models. This did not hold in all periods. For monthly prediction models, accuracy was improved compared to earlier work on predicting monthly streamflow for the same case of Elephant Butte Reservoir (EB), NM. Overall, superior prediction performance was achieved by utilizing data-mining techniques for pre-processing historical data, extracting the most effective predictors, performing correlation analysis, extracting and utilizing combined climate variability indices and physical indices, and employing several ANNs developed for different prediction periods of the year.
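One simple way to picture the predictor-screening step is to rank candidate predictors by the absolute value of their Pearson correlation with streamflow. The sketch below does that under loud assumptions: the abstract does not state the selection rule, the numbers are invented for illustration, and the predictor names other than snowpack (`trend_index`, `unrelated`) as well as the helper `rank_predictors` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rank_predictors(predictors, target):
    """Sort predictors by |correlation with target|, strongest first."""
    scores = {name: pearson(vals, target) for name, vals in predictors.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Invented illustrative values, not data from the study.
streamflow = [10, 12, 15, 20, 26, 30]
predictors = {
    "snowpack": [20, 24, 31, 39, 52, 61],
    "trend_index": [0.1, 0.3, 0.2, 0.5, 0.4, 0.6],
    "unrelated": [5, 1, 4, 2, 5, 1],
}

ranking = rank_predictors(predictors, streamflow)
for name, r in ranking:
    print(f"{name}: r = {r:.3f}")
```

The top-ranked predictors would then be passed to the ANN as inputs. How many to keep is a modeling choice that this sketch leaves open.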


Because time series data sets tend to grow large, discovering knowledge from such massive data is a challenging issue. The similarity measure plays a primary role in time series data mining and improves the accuracy of the data mining task. Time series data mining is used to mine all useful knowledge from the profile of the data. Although it is possible to perform these tasks, doing so raises problems that remain poorly defined. This paper surveys time series techniques and related issues such as challenges, preprocessing methods, pattern mining, and rule discovery using data mining. Data streaming is one of the difficult tasks that must be managed over time. Thus, this paper provides basic background knowledge about time series in the data mining research field.


Author(s):  
Shivani K. Purohit ◽  
Ashish K. Sharma

Quality Function Deployment (QFD) is a widely used customer-driven process for product development. Customer Requirements (CRs) therefore play a key role in the QFD process. However, diversification in the marketplace makes these CRs more dynamic and changeable, giving rise to the need to forecast CRs in order to improve competitiveness and increase customer satisfaction. This purpose can be served by using data mining techniques for forecasting. Given the pool of forecasting techniques available, it is important to identify a suitable one for more effective results. To this end, the paper presents a novel software tool to efficiently forecast CRs in QFD. The tool allows forecasting with various data mining based time series analysis techniques, which strongly assists in comparative analysis and in identifying the most apt technique for forecasting CRs. The tool is developed using VB.Net and MS-Access. Finally, an example is presented to demonstrate the practicability of the proposed software tool.

