J48SS: A Novel Decision Tree Approach for the Handling of Sequential and Time Series Data

Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 21 ◽  
Author(s):  
Andrea Brunello ◽  
Enrico Marzano ◽  
Angelo Montanari ◽  
Guido Sciavicco

Temporal information plays a very important role in many analysis tasks, and can be encoded in at least two different ways. It can be modeled by discrete sequences of events as, for example, in the business intelligence domain, with the aim of tracking the evolution of customer behaviors over time. Alternatively, it can be represented by time series, as in the stock market to characterize price histories. In some analysis tasks, temporal information is complemented by other kinds of data, which may be represented by static attributes, e.g., categorical or numerical ones. This paper presents J48SS, a novel decision tree inducer capable of natively mixing static (i.e., numerical and categorical), sequential, and time series data for classification purposes. The novel algorithm is based on the popular C4.5 decision tree learner, and it relies on the concepts of frequent pattern extraction and time series shapelet generation. The algorithm is evaluated on a text classification task in a real business setting, as well as on a selection of public UCR time series datasets. Results show that it is capable of providing competitive classification performances, while generating highly interpretable models and effectively reducing the data preparation effort.
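The shapelet idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the J48SS implementation, which the abstract does not detail. It only shows the commonly used notion of shapelet distance: the smallest Euclidean distance between a short pattern and any same-length window of a series, which a tree node can then compare against a threshold. The function names `shapelet_distance` and `split_by_shapelet`, and the threshold-based split, are assumptions made for illustration.

```python
import math


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    same-length sliding window of the series."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best


def split_by_shapelet(series_list, shapelet, threshold):
    """Hypothetical node test: route each series by whether its
    shapelet distance is at most the threshold."""
    near, far = [], []
    for series in series_list:
        if shapelet_distance(series, shapelet) <= threshold:
            near.append(series)
        else:
            far.append(series)
    return near, far
```

A series that contains the pattern exactly has distance 0 and falls into the `near` branch; a flat series that never resembles it falls into `far`.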

Author(s):  
Anne Denton

Time series data is of interest to most science and engineering disciplines, and analysis techniques have been developed for hundreds of years. In recent years, however, there have been new developments in data mining techniques, such as frequent pattern mining, that take a different perspective on data. Traditional techniques were not meant for such pattern-oriented approaches. As a result, there is a significant need for research that extends traditional time-series analysis, in particular clustering, to the requirements of the new data mining algorithms.


Infotekmesin ◽  
2021 ◽  
Vol 12 (2) ◽  
pp. 150-154
Author(s):  
Yunita Ardilla ◽  
Wilda Imama Sabilla ◽  
Nurissaidah Ulinnuha

Classification is a field of data mining with many methods, one of which is the decision tree. Decision trees have been shown to classify many kinds of data, such as image data and time series data. However, the decision tree method often runs into several obstacles. The running time required to execute the algorithm is quite long, so this study proposes using the ant tree miner algorithm, which is a development of the C4.5 decision tree. Ant tree miner uses ant colony optimization to build its tree structure. Using ant colony optimization is expected to optimize the resulting tree. In the tests carried out, an accuracy of about 95% was obtained when classifying the Zoo dataset with between 60 and 90 ants.


2020 ◽  
Author(s):  
Mark Amo-Boateng

The novel coronavirus disease (COVID-19) pandemic has taken the world by surprise and simultaneously challenged the health infrastructure of every country. Governments have resorted to draconian measures to contain the spread of the disease despite its devastating effect on their economies and education. Tracking the novel coronavirus 2019 disease remains vital, as it influences the executive decisions needed to tighten or ease restrictions meant to curb the pandemic. One-Dimensional (1D) Convolutional Neural Networks (CNNs) have been used to classify and predict several kinds of time-series and sequence data. Here a 1D-CNN is applied to the time-series data of confirmed COVID-19 cases for all reporting countries and territories. The model achieved 90.5% accuracy. The model was used to develop an automated AI tracker web app (AI Country Monitor), hosted at https://aicountrymonitor.org. This article also presents a novel concept of pandemic response curves based on cumulative confirmed cases that can be used to classify the stage of a country or reporting territory. It is our firm belief that this Artificial Intelligence COVID-19 tracker can be extended to other domains, such as the monitoring and tracking of Sustainable Development Goals (SDGs), in addition to monitoring and tracking pandemics.
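As a rough illustration of the core operation in a 1D convolutional layer, the sketch below slides a small kernel along a series and takes a weighted sum at each position, followed by a ReLU activation. It is not the article's model: the function names, the kernel, and the case counts are made up for illustration, and a real 1D-CNN would learn many kernels and stack further layers.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1) and
    return the weighted sum at each position, as CNN layers do."""
    k = len(kernel)
    if k == 0 or k > len(signal):
        raise ValueError("kernel must be non-empty and no longer than the signal")
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


# Made-up cumulative case counts; the difference kernel [-1, 1]
# turns them into day-over-day increases.
cumulative_cases = [1, 2, 4, 8]
daily_increase = relu(conv1d_valid(cumulative_cases, [-1, 1]))
```

With the hand-picked kernel `[-1, 1]`, each output value is the difference between consecutive days, which shows how a convolution can pick out local trends in a cumulative curve.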


2021 ◽  
Vol 1 (3) ◽  
pp. 166-181
Author(s):  
Muhammad Adib Uz Zaman ◽  
Dongping Du

Electronic health records (EHRs) can be very difficult to analyze since they usually contain many missing values. To build an efficient predictive model, a complete dataset is necessary. An EHR usually contains high-dimensional longitudinal time series data. Most commonly used imputation methods do not consider the importance of the temporal information embedded in EHR data. Moreover, most time-dependent neural networks, such as recurrent neural networks (RNNs), inherently treat the time steps as equal, which in many cases is not appropriate. This study presents a method using the gated recurrent unit (GRU), neural ordinary differential equations (ODEs), and Bayesian estimation to incorporate the temporal information and impute sporadically observed time series measurements in high-dimensional EHR data.


Author(s):  
Nicholas Hoernle ◽  
Kobi Gal ◽  
Barbara Grosz ◽  
Leilah Lyons ◽  
Ada Ren ◽  
...  

This paper describes methods for comparative evaluation of the interpretability of models of high-dimensional time series data inferred by unsupervised machine learning algorithms. The time series data used in this investigation were logs from an immersive simulation like those commonly used in education and healthcare training. The structures learnt by the models provide representations of participants' activities in the simulation which are intended to be meaningful to people's interpretation. To choose the model that induces the best representation, we designed two interpretability tests, each of which evaluates the extent to which a model’s output aligns with people’s expectations or intuitions of what has occurred in the simulation. We compared the performance of the models on these interpretability tests to their performance on statistical information criteria. We show that the models that optimize interpretability quality differ from those that optimize (statistical) information-theoretic criteria. Furthermore, we found that a model using a fully Bayesian approach performed well on both the statistical and human-interpretability measures. The Bayesian approach is a good candidate for fully automated model selection, i.e., when direct empirical investigations of interpretability are costly or infeasible.


2017 ◽  
Vol 23 (4) ◽  
pp. 1563-1585 ◽  
Author(s):  
Markus Eberhardt

I revisit the popular concern over a nonlinearity or threshold in the relationship between public debt and growth employing long time series data from up to 27 countries. My empirical approach recognizes that standard time series arguments for long-run equilibrium relations between integrated variables (cointegration) break down in nonlinear specifications such as those predominantly applied in the existing debt–growth literature. Adopting the novel cosummability approach, my analysis overcomes these difficulties to find no evidence for a systematic long-run relationship between debt and growth in the bivariate and economic theory-based multivariate specifications popular in this literature.



