Forecasting of Categorical Time Series Using a Regression Model

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach in financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, CNN is used to classify the candlesticks time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived based on historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the CNN classifier adjusts the probability of winning the match by using the logistic regression model and then makes a final judgment regarding the match outcome. We empirically test this approach using 18,944 National Football League game data spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of betting market prediction.
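The encoding step described above (1D series to 2D image) can be sketched as follows. This is a minimal illustration, assuming the summation variant of the Gramian angular field and min-max rescaling to [-1, 1]; the abstract does not say which variant or rescaling the authors used, and the function name `gramian_angular_field` and the sample series are hypothetical.

```python
import math

def gramian_angular_field(series):
    """Encode a 1D series as a Gramian angular summation field (assumed variant)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that every value is a valid cosine.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Polar encoding: the angle of each point is arccos of its rescaled value.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Entry (i, j) of the image is cos(phi_i + phi_j).
    return [[math.cos(a + b) for b in phi] for a in phi]

# Toy input, not data from the study.
image = gramian_angular_field([1.0, 2.0, 4.0, 3.0])
```

The result is a symmetric n-by-n matrix whose diagonal preserves the rescaled values (as cos 2φ), which is what allows an image classifier such as a CNN to be applied to the series.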

Analyzing categorical time series in the presence of missing observations

Statistics in Medicine ◽

10.1002/sim.9089 ◽

2021 ◽

Author(s):

Christian H. Weiß

Keyword(s):

Time Series ◽

Missing Observations ◽

Categorical Time Series

The predictive performance of the time-series model and the regression model of the income velocity of money

Journal of Banking & Finance ◽

10.1016/s0378-4266(84)80060-6 ◽

1984 ◽

Vol 8 (3) ◽

pp. 389-415 ◽

Cited By ~ 1

Author(s):

Francis W. Ahking

Keyword(s):

Time Series ◽

Regression Model ◽

Predictive Performance ◽

Time Series Model ◽

Velocity Of Money

Using the AR–SVR–CPSO hybrid model to forecast vibration signals in a high-speed train transmission system

Proceedings of the Institution of Mechanical Engineers Part F Journal of Rail and Rapid Transit ◽

10.1177/0954409718804908 ◽

2018 ◽

Vol 233 (7) ◽

pp. 701-714

Author(s):

Yumei Liu ◽

Ningguo Qiao ◽

Congcong Zhao ◽

Jiaojiao Zhuang ◽

Guangdong Tian

Keyword(s):

Time Series ◽

Regression Model ◽

Support Vector Regression ◽

Hybrid Model ◽

High Speed ◽

Support Vector ◽

High Speed Train ◽

Support Vector Regression Model ◽

Vibration Time ◽

Auto Regression

Accurate vibration time series modeling can mine the internal law of data and provide valuable references for reliability assessment. To improve the prediction accuracy, this study proposes a hybrid model – called the AR–SVR–CPSO hybrid model – that combines the auto regression (AR) and support vector regression (SVR) models, with the weights optimized by the chaotic particle swarm optimization (CPSO) algorithm. First, the auto regression model with the difference method is employed to model the vibration time series. Second, the support vector regression model with the phase space reconstruction is constructed for predicting the vibration time series once more. Finally, the predictions of the AR and SVR models are weighted and summed together, with the weights being optimized by the CPSO. In addition, the data collected from the reliability test platform of high-speed train transmission systems and the “NASA prognostics data repository” are used to validate the hybrid model. The experimental results demonstrate that the hybrid model proposed in this study outperforms the traditional AR and SVR models.
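The final weighting step can be illustrated with a short sketch. It assumes the two weights sum to one and replaces the CPSO optimizer with a plain grid search, so it shows only the idea of a weighted combination, not the authors' optimizer. The function names and the toy numbers are hypothetical.

```python
def combine(ar_pred, svr_pred, w):
    """Weighted sum of two forecasts; w goes to AR and (1 - w) to SVR (assumed convention)."""
    return [w * a + (1.0 - w) * s for a, s in zip(ar_pred, svr_pred)]

def mse(pred, actual):
    """Mean squared error between a forecast and the observed values."""
    return sum((p - y) ** 2 for p, y in zip(pred, actual)) / len(actual)

def best_weight(ar_pred, svr_pred, actual, steps=100):
    """Grid search over w in [0, 1]; a simple stand-in for CPSO."""
    candidates = [k / steps for k in range(steps + 1)]
    return min(candidates, key=lambda w: mse(combine(ar_pred, svr_pred, w), actual))

# Toy values only: one model overshoots, the other undershoots.
actual = [1.0, 2.0, 3.0]
ar_pred = [1.2, 2.2, 3.2]
svr_pred = [0.8, 1.8, 2.8]
w = best_weight(ar_pred, svr_pred, actual)
hybrid = combine(ar_pred, svr_pred, w)
```

When the two models err in opposite directions, as in this toy case, the combined forecast can have a lower error than either model alone.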

Forecasting stock market price by using fuzzified Choquet integral based fuzzy measures with genetic algorithm for parameter optimization

RAIRO - Operations Research ◽

10.1051/ro/2019117 ◽

2020 ◽

Vol 54 (2) ◽

pp. 597-614

Author(s):

Shanoli Samui Pal ◽

Samarjit Kar

Keyword(s):

Genetic Algorithm ◽

Time Series ◽

Linear Regression ◽

Regression Model ◽

Choquet Integral ◽

Time Series Forecasting ◽

Fuzzy Measure ◽

Model Parameters ◽

Forecasting Models ◽

Non Linear

In this paper, the fuzzified Choquet integral and a fuzzy-valued integrand with respect to separate measures, namely the fuzzy measure, signed fuzzy measure, and intuitionistic fuzzy measure, are used to develop a regression model for forecasting. The fuzzified Choquet integral is used to build a regression model for forecasting time series with multiple attributes as predictor attributes. Linear regression based forecasting models suffer from low accuracy and are unable to approximate the non-linearity in time series, whereas the Choquet integral can be used as a general non-linear regression model with respect to non-classical measures. In the Choquet integral based regression model, the parameters are optimized using a real-coded genetic algorithm (GA). In these forecasting models, fuzzified integrands denote the participation of an individual attribute or a group of attributes in predicting the current situation. Here, a more generalized Choquet integral, i.e., the fuzzified Choquet integral, is used for non-linear time series forecasting models. Three real stock exchange data sets are used to evaluate the forecasting models. It is observed that the accuracy of the prediction models depends strongly on the non-linearity of the time series.
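For orientation, the ordinary discrete Choquet integral (not the fuzzified version the paper develops) can be computed as in the sketch below. It assumes non-negative attribute values and a measure defined on every coalition that appears after sorting; the attribute names and measure values are made up for illustration.

```python
def choquet_integral(values, measure):
    """Discrete Choquet integral of non-negative values.

    values:  dict mapping attribute name -> value
    measure: dict mapping frozenset of attribute names -> measure of that set
    """
    # Sort attributes by value, smallest first.
    order = sorted(values, key=values.get)
    total, previous = 0.0, 0.0
    for i, name in enumerate(order):
        # Coalition of all attributes whose value is at least the current one.
        coalition = frozenset(order[i:])
        total += (values[name] - previous) * measure[coalition]
        previous = values[name]
    return total

# Hypothetical two-attribute example with a non-additive measure.
values = {"a": 0.2, "b": 0.5}
measure = {
    frozenset({"a", "b"}): 1.0,
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.6,
}
result = choquet_integral(values, measure)
```

With an additive measure the same function reduces to an ordinary weighted sum, which is why the Choquet integral can be read as a non-linear generalization of linear regression weights.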

A time series analysis of bulk tank somatic cell counts of dairy herds located in Brazil and the United States

Ciência Rural ◽

10.1590/0103-8478cr20160618 ◽

2017 ◽

Vol 47 (4) ◽

Cited By ~ 1

Author(s):

Liz Gonçalves Rodrigues ◽

Maria Helena Cosendey de Aquino ◽

Márcio Roberto Silva ◽

Letícia Caldas Mendonça ◽

Juliana França Monteiro de Mendonça ◽

...

Keyword(s):

Time Series ◽

Linear Regression ◽

Regression Model ◽

Linear Regression Model ◽

Southeastern Brazil ◽

Dairy Herds ◽

Somatic Cell Counts ◽

Cell Counts ◽

The USA ◽

Bulk Tank

ABSTRACT: Bulk tank somatic cell count (BTSCC) is widely used to monitor mammary gland health at the herd and regional levels. The BTSCC time series from specific regions or countries can be used to compare mammary gland health and estimate the trend of subclinical mastitis at the regional level. Three time series of BTSCC from dairy herds located in the USA and Southeastern Brazil were evaluated from 1995 to 2014. Descriptive statistics and a linear regression model were used to evaluate the data of the BTSCC time series. The mean of the annual geometric mean of BTSCC (AGM) and the percentage of dairy herds with a BTSCC greater than 400,000 cells mL⁻¹ (%>400) were significantly different (P<0.05) according to the countries and the time series. The linear regression model used for the USA time series was statistically significant for AGM and the %>400 (P<0.05). The first and second USA time series presented an increasing and a decreasing trend for AGM and the %>400, respectively. The linear regression model for the Brazil time series was not significant (P>0.05) for both dependent variables (AGM and %>400). The Brazil time series showed no increasing or decreasing trend for the AGM and %>400. Consequently, approximately 40 to 50% of the dairy herds from southeastern Brazil will not achieve the regulatory limits for BTSCC over the next years.
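The two quantities behind this analysis, an annual geometric mean and a linear trend over years, can be sketched as below. The function names and all numbers are hypothetical and are not taken from the study; significance testing is omitted.

```python
import math

def geometric_mean(counts):
    """Geometric mean of positive counts, computed on the log scale."""
    return math.exp(sum(math.log(c) for c in counts) / len(counts))

def linear_trend(years, values):
    """Ordinary least squares fit of values on years; returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up values in thousands of cells per mL.
agm_example = geometric_mean([100.0, 400.0])
slope, intercept = linear_trend([2000, 2001, 2002], [300.0, 310.0, 320.0])
```

A positive slope indicates a rising annual geometric mean and a negative slope a falling one.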

Real-Time Prediction of the COVID-19 Epidemic in Thailand using Simple Model-Free Method and Time Series Regression Model

Walailak Journal of Science and Technology (WJST) ◽

10.48048/wjst.2021.10028 ◽

2021 ◽

Vol 18 (14) ◽

Author(s):

Rati WONGSATHAN

Keyword(s):

Genetic Algorithm ◽

Time Series ◽

Regression Model ◽

Regression Models ◽

Gaussian Function ◽

Logistic Function ◽

Hyperbolic Tangent ◽

Time Series Regression ◽

Model Free ◽

Tangent Function

The novel coronavirus 2019 (COVID-19) pandemic was declared a global health crisis. An accurate real-time predictive model of the number of infected cases could help inform government decisions on medical assistance and public health. This work models the ongoing COVID-19 spread in Thailand during the 1st and 2nd phases of the pandemic using a simple but powerful approach based on model-free and time series regression models. By employing curve fitting, the model-free method using the logistic function, hyperbolic tangent function, and Gaussian function was applied to predict the number of newly infected patients and the accumulated total number of cases, including the peak and viral cessation (ending) dates. Alternatively, with a significant time lag of historical data as input, the regression model predicts those parameters from 1-day-ahead to 1-month-ahead. To obtain optimal prediction models, the parameters of the model-free method are fine-tuned through the genetic algorithm, whereas generalized least squares updates the parameters of the regression model. Assuming the future trend continues to follow the past pattern, the expected total number of patients is approximately 2,689 - 3,000 cases. The estimated viral cessation dates are May 2, 2020 (using the Gaussian function), May 4, 2020 (using the hyperbolic tangent function), and June 5, 2020 (using the logistic function), whereas the peak occurred on April 5, 2020. Moreover, the model-free method performs well for long-term prediction, whereas the regression model is suitable for short-term prediction. Furthermore, the regression models yield highly accurate forecasts, with lower RMSE and higher R², up to 1-week-ahead.
HIGHLIGHTS
COVID-19 model for Thailand during the first and second phases of the epidemic
The model-free method using the logistic function, hyperbolic tangent function, and Gaussian function applied to predict the basic measures of the outbreak
Regression model predicts those measures from one-day-ahead to one-month-ahead
The parameters of the model-free method are fine-tuned through the genetic algorithm
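As an illustration of the curve-fitting idea, the sketch below fits a three-parameter logistic curve to cumulative case counts. It assumes a standard logistic parametrization (final total, growth rate, midpoint day) and uses an exhaustive grid search in place of the genetic algorithm; the exact functional form used in the paper is not given in the abstract. All names are hypothetical and the data are synthetic.

```python
import math

def logistic(t, total, rate, midpoint):
    """Cumulative cases at day t under an assumed logistic growth curve."""
    return total / (1.0 + math.exp(-rate * (t - midpoint)))

def fit_logistic(days, cases, totals, rates, midpoints):
    """Pick the parameter triple with the smallest squared error (grid search, not a GA)."""
    best, best_err = None, float("inf")
    for total in totals:
        for rate in rates:
            for midpoint in midpoints:
                err = sum((logistic(t, total, rate, midpoint) - c) ** 2
                          for t, c in zip(days, cases))
                if err < best_err:
                    best, best_err = (total, rate, midpoint), err
    return best

# Synthetic cumulative counts generated from known parameters, not real data.
days = list(range(0, 60, 5))
cases = [logistic(t, 3000.0, 0.2, 30.0) for t in days]
params = fit_logistic(days, cases,
                      totals=[2000.0, 2500.0, 3000.0, 3500.0],
                      rates=[0.1, 0.2, 0.3],
                      midpoints=[20.0, 30.0, 40.0])
```

Under this parametrization the midpoint is the day of peak daily new cases, and the fitted total is the projected final size of the outbreak.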

COMPARATIVE ANALYSIS OF DIFFERENT CRITERIA OF SITE STRUCTURAL STABILITY IN THE TIME SERIES

VESTNIK OF ASTRAKHAN STATE TECHNICAL UNIVERSITY SERIES MANAGEMENT COMPUTER SCIENCE AND INFORMATICS ◽

10.24143/2072-9502-2019-1-119-128 ◽

2019 ◽

pp. 119-128

Author(s):

Liliya Andreevna Landman ◽

Andrei Vladimirovich Faddeenkov

Keyword(s):

Time Series ◽

Regression Model ◽

Structural Changes ◽

Least Squares Method ◽

Complex Structure ◽

Standard Normal Distribution ◽

Regression Equations ◽

Unknown Parameters ◽

Chow Test ◽

Wide Range

The concept of structure is used to describe a set of stable relations between the main parts of an object, which describe its integrity and identity, i.e., the preservation of its basic properties over a wide range of internal and external changes. This concept usually relates to the concepts of system and organization. The structure expresses a stable part of the system that changes only slightly during different reforms. Over the years, structural changes take place because of active economic policy or as a result of spontaneous, uncontrollable processes. Therefore, it seems quite natural to find out whether there have been structural changes in the observation period and to reflect them in the specification of the model. The basic ideas of methods for determining structural changes in time series dynamics have been considered, such as the Chow test, the Gujarati test, and the Poirier method. The power study was conducted for three possible cases of change in time series trends. The random error was modeled according to the standard normal distribution. A linear multiple regression model with three independent variables was used as the time series model. The vector of unknown model parameters was estimated using the least squares method. For each of the three criteria, the null hypothesis about time series instability was tested using the F-criterion, which involves finding the residual sum of squares of a regression model and analyzing the relation between its decline and the loss of degrees of freedom. It can be noted that the Gujarati and Poirier equations have a more complex structure than the equation of the Chow test; however, using the Chow test requires estimating the parameters of three regression equations.
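The three regressions behind the Chow test (pooled sample plus the two subsamples) can be sketched as follows. This simplified version assumes a single regressor with an intercept (k = 2 parameters) rather than the three independent variables used in the study, and a known break position. The function names and data are hypothetical.

```python
def rss_linear(x, y):
    """Residual sum of squares of a least squares line fitted to (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))

def chow_f(x, y, split, k=2):
    """Chow F statistic for a break at index `split`, with k parameters per regression."""
    pooled = rss_linear(x, y)
    parts = rss_linear(x[:split], y[:split]) + rss_linear(x[split:], y[split:])
    n = len(x)
    # Drop in RSS from splitting, per extra parameter, over the split-model variance.
    return ((pooled - parts) / k) / (parts / (n - 2 * k))

# Toy series: the trend reverses halfway in y_break but not in y_stable.
x = [float(i) for i in range(10)]
y_break = [0.1, 0.9, 2.1, 2.9, 4.1, 3.9, 3.1, 1.9, 1.1, -0.1]
y_stable = [0.1, 0.9, 2.1, 2.9, 4.1, 4.9, 6.1, 6.9, 8.1, 8.9]
f_break = chow_f(x, y_break, split=5)
f_stable = chow_f(x, y_stable, split=5)
```

The statistic is compared with an F distribution with k and n - 2k degrees of freedom; a large value, as for `y_break` here, points to a structural change at the split.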

LIMITATIONS OF THE PANEL REGRESSION MODEL APPLICATION: THE EXAMPLE OF THE WESTERN BALKAN COUNTRIES

ACTA ECONOMICA ◽

10.7251/ace2032151b ◽

2021 ◽

Vol 18 (32) ◽

Author(s):

Stanko Stanić ◽

Bojan Baškot

Keyword(s):

Time Series ◽

Regression Model ◽

Time Frame ◽

Panel Regression ◽

Time Dimension ◽

Data Set ◽

Short Time Series ◽

Western Balkan Region ◽

Balkan Region ◽

Short Time

A panel regression model may seem like an appealing solution when time series are short. It is often used as a shortcut to obtain a deeper data set by placing several individual cases on the same time dimension, where the cross-sectional units appear to multiply the time frame but do not actually do so. Macroeconometric analysis of the Western Balkan region has to deal with short time series. Additionally, structural breaks are numerous. Panel regression may seem like a solution, but there are some limitations that should be considered.

Application of artificial neural network to control the coagulant dosing in water treatment plant

Water Science & Technology ◽

10.2166/wst.2000.0410 ◽

2000 ◽

Vol 42 (3-4) ◽

pp. 403-408 ◽

Cited By ~ 8

Author(s):

R.-F. Yu ◽

S.-F. Kang ◽

S.-L. Liaw ◽

M.-c. Chen

Keyword(s):

Neural Network ◽

Time Series ◽

Artificial Neural Network ◽

Water Treatment ◽

Regression Model ◽

Treatment Plant ◽

Water Treatment Plant ◽

ANN Model ◽

Coagulant Dosage ◽

ANN Models

Coagulant dosing is one of the major operating costs in water treatment plants, and in most plants the dose is conventionally determined by the jar test. However, this method can only provide periodic information and is difficult to apply to automatic control. This paper examines the feasibility of applying an artificial neural network (ANN) to automatically control coagulant dosing in a water treatment plant. Five on-line monitoring variables, including turbidity (NTUin), pH (pHin), and conductivity (Conin) in raw water, effluent turbidity (NTUout) of the settling tank, and alum dosage (Dos), were used to build the coagulant dosing prediction model. Three methods, namely a regression model, a time series model, and ANN models, were used to predict alum dosage. According to the results of this study, the regression model predicted coagulant dosage poorly. Both the time-series and ANN models predicted dosage precisely. The ANN model with ahead coagulant dosage gave the best prediction of alum dosage, with an R² of 0.97 (RMS=0.016); a very low average prediction error of 0.75 mg/L of alum was also found for the ANN model. Consequently, applying an ANN model to control coagulant dosing is feasible in water treatment.
