Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm

Emanuele Cavenaghi; Gabriele Sottocornola; Fabio Stella; Markus Zanker

doi:10.3390/e23030380

Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm

Entropy ◽

10.3390/e23030380 ◽

2021 ◽

Vol 23 (3) ◽

pp. 380

Author(s):

Emanuele Cavenaghi ◽

Gabriele Sottocornola ◽

Fabio Stella ◽

Markus Zanker

Keyword(s):

Real World ◽

Concept Drift ◽

Empirical Evaluation ◽

Sliding Window ◽

Discount Factor ◽

Data Streaming ◽

Sources Of Information ◽

Sequential Decision ◽

Time Step ◽

Thompson Sampling

The Multi-Armed Bandit (MAB) problem has been extensively studied in order to address real-world challenges related to sequential decision making. In this setting, an agent selects the best action to be performed at time-step t, based on the past rewards received from the environment. This formulation implicitly assumes that the expected payoff for each action is kept stationary by the environment through time. Nevertheless, in many real-world applications this assumption does not hold and the agent has to face a non-stationary environment, that is, one with a changing reward distribution. Thus, we present a new MAB algorithm, named f-Discounted-Sliding-Window Thompson Sampling (f-dsw TS), for non-stationary environments, that is, when the data stream is affected by concept drift. The f-dsw TS algorithm is based on Thompson Sampling (TS) and exploits a discount factor on the reward history and an arm-related sliding window to counteract concept drift in non-stationary environments. We investigate how to combine these two sources of information, namely the discount factor and the sliding window, by means of an aggregation function f(.). In particular, we propose a pessimistic (f=min), an optimistic (f=max), as well as an averaged (f=mean) version of the f-dsw TS algorithm. A rich set of numerical experiments is performed to evaluate the f-dsw TS algorithm against both stationary and non-stationary state-of-the-art TS baselines. We use synthetic environments (both randomly generated and controlled) to test the MAB algorithms under different types of drift, that is, sudden/abrupt, incremental, gradual and increasing/decreasing drift. Furthermore, we adapt four real-world active learning tasks to our framework: a prediction task on crimes in the city of Baltimore, a classification task on insect species, a recommendation task on local web news, and a time-series analysis on microbial organisms in the tropical air ecosystem. The f-dsw TS approach emerges as the best performing MAB algorithm. At least one of the versions of f-dsw TS performs better than the baselines in synthetic environments, demonstrating the robustness of f-dsw TS under different concept drift types. Moreover, the pessimistic version (f=min) proves the most effective in all real-world tasks.
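The mechanism described in this abstract (a discounted reward history and a per-arm sliding window, combined by an aggregation function f) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: Bernoulli rewards, Beta(1, 1) priors, discounting every arm at each step, and the names `FDswTS` and `run_abrupt_drift`.

```python
import random
from collections import deque
from statistics import mean  # imported so that f=mean can be passed in


class FDswTS:
    """Sketch of an f-dsw style Thompson Sampling agent (Bernoulli rewards assumed)."""

    def __init__(self, n_arms, gamma=0.9, window=20, f=min, seed=0):
        self.n_arms = n_arms
        self.gamma = gamma            # discount factor on the reward history
        self.f = f                    # aggregation function: min, max or mean
        self.rng = random.Random(seed)
        self.alpha = [0.0] * n_arms   # discounted success counts
        self.beta = [0.0] * n_arms    # discounted failure counts
        # One window per arm, holding that arm's most recent rewards.
        self.windows = [deque(maxlen=window) for _ in range(n_arms)]

    def select_arm(self):
        scores = []
        for k in range(self.n_arms):
            # Sample from the discounted-history posterior (Beta(1, 1) prior assumed).
            hist = self.rng.betavariate(self.alpha[k] + 1.0, self.beta[k] + 1.0)
            # Sample from the posterior built only from the arm's window.
            wins = sum(self.windows[k])
            losses = len(self.windows[k]) - wins
            recent = self.rng.betavariate(wins + 1.0, losses + 1.0)
            # Combine the two sources of information with f.
            scores.append(self.f([hist, recent]))
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, reward):
        # Assumption: all arms are discounted at every step.
        for k in range(self.n_arms):
            self.alpha[k] *= self.gamma
            self.beta[k] *= self.gamma
        self.alpha[arm] += reward
        self.beta[arm] += 1 - reward
        self.windows[arm].append(reward)


def run_abrupt_drift(agent, steps=3000, drift_at=1000, seed=1):
    """Two-armed toy environment whose best arm swaps at `drift_at`."""
    env = random.Random(seed)
    chosen = []
    for t in range(steps):
        probs = [0.8, 0.2] if t < drift_at else [0.2, 0.8]
        arm = agent.select_arm()
        reward = 1 if env.random() < probs[arm] else 0
        agent.update(arm, reward)
        chosen.append(arm)
    return chosen
```

Calling `run_abrupt_drift(FDswTS(n_arms=2, f=min))` returns the sequence of pulled arms; passing `f=max` or `f=mean` gives the optimistic and averaged variants of this sketch.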

Sliding-Window Thompson Sampling for Non-Stationary Settings

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.11407 ◽

2020 ◽

Vol 68 ◽

pp. 311-364

Author(s):

Francesco Trovo ◽

Stefano Paladino ◽

Marcello Restelli ◽

Nicola Gatti

Keyword(s):

Real World ◽

State Of The Art ◽

Sliding Window ◽

Upper Bounds ◽

Decision Problems ◽

Sequential Decision ◽

Thompson Sampling ◽

The Past ◽

Real World Applications ◽

Window Approach

Multi-Armed Bandit (MAB) techniques have been successfully applied to many classes of sequential decision problems in the past decades. However, non-stationary settings, which are very common in real-world applications, have received little attention so far, and theoretical guarantees on the regret are known only for some frequentist algorithms. In this paper, we propose an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), for non-stationary stochastic MAB settings. Our algorithm is based on Thompson Sampling and exploits a sliding-window approach to tackle, in a unified fashion, two different forms of non-stationarity studied separately so far: abruptly changing and smoothly changing. In the former, the reward distributions are constant during sequences of rounds, and their change may be arbitrary and happen at unknown rounds, while, in the latter, the reward distributions smoothly evolve over rounds according to unknown dynamics. Under mild assumptions, we provide upper bounds on the dynamic pseudo-regret of SW-TS for the abruptly changing environment, for the smoothly changing one, and for the setting in which both forms of non-stationarity are present. Furthermore, we empirically show that SW-TS dramatically outperforms state-of-the-art algorithms even when the forms of non-stationarity are taken separately, as previously studied in the literature.
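The sliding-window idea in this abstract can be sketched in a few lines of Python. This is an illustrative sketch under assumptions that the abstract does not state: Bernoulli rewards, Beta(1, 1) priors, and a single window over the last `tau` rounds shared by all arms. The class name `SlidingWindowTS` and its methods are hypothetical.

```python
import random
from collections import deque


class SlidingWindowTS:
    """Thompson Sampling that only remembers the last `tau` rounds (sketch)."""

    def __init__(self, n_arms, tau, seed=0):
        self.n_arms = n_arms
        # (arm, reward) pairs; the deque silently drops rounds older than tau.
        self.history = deque(maxlen=tau)
        self.rng = random.Random(seed)

    def posterior_params(self):
        """Beta parameters per arm, computed from the window only."""
        wins = [0] * self.n_arms
        losses = [0] * self.n_arms
        for arm, reward in self.history:
            if reward:
                wins[arm] += 1
            else:
                losses[arm] += 1
        return [(1 + w, 1 + l) for w, l in zip(wins, losses)]

    def select_arm(self):
        samples = [self.rng.betavariate(a, b) for a, b in self.posterior_params()]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        self.history.append((arm, reward))
```

Because rounds older than `tau` are forgotten, an arm that has not been pulled recently falls back to its prior, which is what lets the agent re-explore after a change.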

Random Tree Data Stream Classifier With Sliding Window Estimator And Concept Drift

Bioscience Biotechnology Research Communications ◽

10.21786/bbrc/12.1/25 ◽

2019 ◽

Vol 12 (1) ◽

pp. 219-228

Author(s):

Ebtesam Almalki ◽

Manal Abdullah

Keyword(s):

Data Stream ◽

Concept Drift ◽

Sliding Window ◽

Random Tree ◽

Tree Data

Concept Drift Adaptation Techniques in Distributed Environment for Real-World Data Streams

Smart Cities ◽

10.3390/smartcities4010021 ◽

2021 ◽

Vol 4 (1) ◽

pp. 349-371

Author(s):

Hassan Mehmood ◽

Panos Kostakos ◽

Marta Cortes ◽

Theodoros Anagnostopoulos ◽

Susanna Pirttikangas ◽

...

Keyword(s):

Real World ◽

Data Streams ◽

Smart City ◽

Smart Cities ◽

Concept Drift ◽

Distributed Environment ◽

Real World Data ◽

Unique Challenge ◽

World Data ◽

Concept Drift Detection

Real-world data streams pose a unique challenge to the implementation of machine learning (ML) models and data analysis. A notable problem that has been introduced by the growth of Internet of Things (IoT) deployments across the smart city ecosystem is that the statistical properties of data streams can change over time, resulting in poor prediction performance and ineffective decisions. While concept drift detection methods aim to patch this problem, emerging communication and sensing technologies are generating a massive amount of data, requiring distributed environments to perform computation tasks across smart city administrative domains. In this article, we implement and test a number of state-of-the-art active concept drift detection algorithms for time series analysis within a distributed environment. We use real-world data streams and provide critical analysis of results retrieved. The challenges of implementing concept drift adaptation algorithms, along with their applications in smart cities, are also discussed.

Empirical evaluation of feature subset selection based on a real-world data set

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2004.03.005 ◽

2004 ◽

Vol 17 (3) ◽

pp. 285-288 ◽

Cited By ~ 5

Author(s):

Petra Perner ◽

Chid Apte

Keyword(s):

Real World ◽

Empirical Evaluation ◽

Subset Selection ◽

Feature Subset Selection ◽

Feature Subset ◽

Real World Data ◽

Data Set ◽

World Data

Deep learning framework for handling concept drift and class imbalanced complex decision-making on streaming data

Complex & Intelligent Systems ◽

10.1007/s40747-021-00456-0 ◽

2021 ◽

Author(s):

S. Priya ◽

R. Annie Uthra

Keyword(s):

Decision Making ◽

Deep Learning ◽

Concept Drift ◽

Class Imbalance ◽

Streaming Data ◽

Superior Performance ◽

Data Streaming ◽

Minority Class ◽

Concept Drift Detection

In present times, data science has become popular as a means to support and improve decision-making processes. Owing to the wide range of data streaming applications, class imbalance and concept drift have become crucial learning problems. Deep learning (DL) models have proved useful for classification under concept drift in data streaming applications. This paper presents an effective class imbalance with concept drift detection (CIDD) approach using Adadelta optimizer-based deep neural networks (ADODNN), named the CIDD-ADODNN model, for the classification of highly imbalanced streaming data. The presented model involves four processes, namely preprocessing, class imbalance handling, concept drift detection, and classification. The proposed model uses the adaptive synthetic (ADASYN) technique for handling class-imbalanced data, which uses a weighted distribution over minority class examples based on their level of difficulty in learning. Next, a drift detection technique called adaptive sliding window (ADWIN) is employed to detect the existence of concept drift. The ADODNN model is then used for classification. To increase the classification performance of the DNN model, ADO-based hyperparameter tuning is performed to determine the optimal parameters of the DNN model. The performance of the presented model is evaluated using three streaming datasets, namely an intrusion detection (NSL KDDCup) dataset, a Spam dataset, and a Chess dataset. A detailed comparative analysis is carried out, and the simulation results confirm the superior performance of the presented model, which obtains a maximum accuracy of 0.9592, 0.9320, and 0.7646 on the KDDCup, Spam, and Chess datasets, respectively.
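The drift detection step mentioned in this abstract can be illustrated with a simplified, ADWIN-style detector in Python. This sketch is not the CIDD-ADODNN pipeline and not the full ADWIN algorithm: it keeps the raw window and scans every split point rather than using compressed buckets. The cut threshold follows the Hoeffding-style bound commonly given for ADWIN, inputs are assumed to lie in [0, 1], and the name `SimpleAdwin` and the `min_sub` parameter are hypothetical.

```python
import math


class SimpleAdwin:
    """Simplified adaptive-window drift detector (illustrative sketch)."""

    def __init__(self, delta=0.002, min_sub=5):
        self.delta = delta      # confidence parameter of the cut test
        self.min_sub = min_sub  # minimum size of each sub-window (assumption)
        self.window = []

    def update(self, x):
        """Add one value in [0, 1]; return True if a drift was detected."""
        self.window.append(x)
        drift = False
        shrunk = True
        while shrunk:
            shrunk = False
            n = len(self.window)
            total = sum(self.window)
            left_sum = 0.0
            for i in range(1, n):
                left_sum += self.window[i - 1]
                n0, n1 = i, n - i
                if n0 < self.min_sub or n1 < self.min_sub:
                    continue
                # Harmonic-mean style sample size of the two sub-windows.
                m = 1.0 / (1.0 / n0 + 1.0 / n1)
                eps = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
                if abs(left_sum / n0 - (total - left_sum) / n1) > eps:
                    # Means differ significantly: drop the older part.
                    self.window = self.window[i:]
                    drift = True
                    shrunk = True
                    break
        return drift
```

Feeding a stream value by value and collecting the indices where `update` returns `True` gives the detected change points; a smaller `delta` makes the detector more conservative.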

Emergent Tangled Program Graphs in Partially Observable Recursive Forecasting and ViZDoom Navigation Tasks

ACM Transactions on Evolutionary Learning and Optimization ◽

10.1145/3468857 ◽

2021 ◽

Vol 1 (3) ◽

pp. 1-41

Author(s):

Stephen Kelly ◽

Robert J. Smith ◽

Malcolm I. Heywood ◽

Wolfgang Banzhaf

Keyword(s):

Empirical Evaluation ◽

Complete Information ◽

Visual Navigation ◽

Sources Of Information ◽

Variation Operators ◽

Incremental Construction ◽

Partially Observable ◽

First Person Shooter ◽

With Memory ◽

Program Graph

Modularity represents a recurring theme in the attempt to scale evolution to the design of complex systems. However, modularity rarely forms the central theme of an artificial approach to evolution. In this work, we report on progress with the recently proposed Tangled Program Graph (TPG) framework in which programs are modules. The combination of the TPG representation and its variation operators enables both teams of programs and graphs of teams of programs to appear in an emergent process. The original development of TPG was limited to tasks with, for the most part, complete information. This work details two recent approaches for scaling TPG to tasks that are dominated by partially observable sources of information using different formulations of indexed memory. One formulation emphasizes the incremental construction of memory, again as an emergent process, resulting in a distributed view of state. The second formulation assumes a single global instance of memory and develops it as a communication medium, thus a single global view of state. The resulting empirical evaluation demonstrates that TPG equipped with memory is able to solve multi-task recursive time-series forecasting problems and visual navigation tasks expressed in two levels of a commercial first-person shooter environment.

Data–Driven Indirect Adaptive Model Predictive Control

Jurnal Teknologi ◽

10.11113/jt.v54.807 ◽

2012 ◽

Cited By ~ 2

Author(s):

Norhaliza Wahab ◽

Mohamed Reza Katebi ◽

Mohd Fua’ad Rahmat ◽

Salinda Bunyamin

Keyword(s):

Model Predictive Control ◽

Activated Sludge ◽

State Space ◽

Predictive Control ◽

Sliding Window ◽

Subspace Identification ◽

Activated Sludge Process ◽

Adaptive Model ◽

Time Step ◽

Adaptive Model Predictive Control

This paper explores the design of Adaptive Model Predictive Control (AMPC) using Subspace State-space Model Identification (SMI) techniques for an activated sludge process. The implementation of SMI techniques in adaptive sliding-window control methods is discussed, where online subspace identification using the Numerical State-space Subspace System Identification (N4SID) algorithm is proposed along with a Model Predictive Control (MPC) design method. The online N4SID algorithm developed in this study makes use of QR updating, where the combination of update and downdate techniques enables sliding-window adaptation: at each time step, new experimental data are added into the R factor and the oldest data are removed. In addition, a Singular Value Decomposition (SVD)-based strategy is introduced into Indirect AMPC (IAMPC) for a nonlinear system with constrained control increment inputs. Several simulation studies for different control parameters in the control/identification algorithm are performed. For the IAMPC control design, the computational time of the SVD approach is lower than that of the Quadratic Programming (QP) method, and this result is considered one of the main contributions of this paper. Key words: Adaptive control; activated sludge process; model predictive control; subspace identification

Nonlinear Gated Experts for Time Series: Discovering Regimes and Avoiding Overfitting

International Journal of Neural Systems ◽

10.1142/s0129065795000251 ◽

1995 ◽

Vol 06 (04) ◽

pp. 373-399 ◽

Cited By ~ 161

Author(s):

Andreas S. Weigend ◽

Morgan Mangeas ◽

Ashok N. Srivastava

Keyword(s):

Time Series ◽

Real World ◽

Markov Models ◽

Multilayer Perceptrons ◽

Time Step ◽

Local Complexity ◽

Previous State ◽

Segmentation Task ◽

Gating Network ◽

Update Rules

In the analysis and prediction of real-world systems, two of the key problems are nonstationarity (often in the form of switching between regimes), and overfitting (particularly serious for noisy processes). This article addresses these problems using gated experts, consisting of a (nonlinear) gating network, and several (also nonlinear) competing experts. Each expert learns to predict the conditional mean, and each expert adapts its width to match the noise level in its regime. The gating network learns to predict the probability of each expert, given the input. This article focuses on the case where the gating network bases its decision on information from the inputs. This can be contrasted to hidden Markov models where the decision is based on the previous state(s) (i.e. on the output of the gating network at the previous time step), as well as to averaging over several predictors. In contrast, gated experts soft-partition the input space, only learning to model their region. This article discusses the underlying statistical assumptions, derives the weight update rules, and compares the performance of gated experts to standard methods on three time series: (1) a computer-generated series, obtained by randomly switching between two nonlinear processes; (2) a time series from the Santa Fe Time Series Competition (the light intensity of a laser in chaotic state); and (3) the daily electricity demand of France, a real-world multivariate problem with structure on several time scales. The main results are: (1) the gating network correctly discovers the different regimes of the process; (2) the widths associated with each expert are important for the segmentation task (and they can be used to characterize the sub-processes); and (3) there is less overfitting compared to single networks (homogeneous multilayer perceptrons), since the experts learn to match their variances to the (local) noise levels. This can be viewed as matching the local complexity of the model to the local complexity of the data.
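How the gating probabilities and the per-expert widths interact can be shown with a small Python sketch. It assumes Gaussian noise around each expert's prediction and uses the standard mixture-of-experts posterior together with a responsibility-weighted variance estimate. These formulas and the function names are assumptions for illustration, not the article's derived update rules.

```python
import math


def gaussian_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def responsibilities(gate_probs, means, widths, target):
    """Posterior probability of each expert for one observed target.

    gate_probs: gating-network outputs for the current input (sum to 1)
    means:      each expert's predicted conditional mean
    widths:     each expert's noise width (standard deviation)
    """
    weighted = [
        g * gaussian_pdf(target, m, s)
        for g, m, s in zip(gate_probs, means, widths)
    ]
    total = sum(weighted)
    return [w / total for w in weighted]


def reestimate_width(resp, targets, preds):
    """Responsibility-weighted width of one expert over a set of patterns."""
    num = sum(h * (y - p) ** 2 for h, y, p in zip(resp, targets, preds))
    return math.sqrt(num / sum(resp))
```

An expert whose prediction is far from the target, relative to its own width, receives almost no responsibility for that pattern, so its width is fitted mainly on the regime it actually models.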

Exploring Cinema Databases using multi-dimensional Image Measures

10.31219/osf.io/4xks7 ◽

2020 ◽

Author(s):

Robin G. C. Maack ◽

David H. Rogers ◽

Hans Hagen ◽

Christina Gillmann

Keyword(s):

Real World ◽

Time Step ◽

Dimensional Image ◽

World Cinema ◽

Simulation Time

Exa-scale simulations can be hard to analyze because it is nearly impossible to store all computed time-steps and other parameters. The Cinema Database provides a storage-saving solution that captures images of each simulation time-step from a variety of camera angles. Still, the resulting number of images can be overwhelming, and it is hard to find interesting images and features for further analysis. We present a zoom-based approach where users can utilize arbitrary image measures to explore interesting images and further analyze their behaviour in detail. We show the effectiveness of our approach on two real-world Cinema datasets.

Pervasive Computing in the Supermarket

Social and Organizational Impacts of Emerging Mobile Devices ◽

10.4018/978-1-4666-0194-9.ch010 ◽

2012 ◽

pp. 172-185 ◽

Cited By ~ 4

Author(s):

Darren Black ◽

Nils Jakob Clemmensen ◽

Mikael B. Skov

Keyword(s):

Empirical Study ◽

Real World ◽

Context Awareness ◽

Empirical Evaluation ◽

Context Aware ◽

Uniform Pattern ◽

User Attention ◽

Context Aware System ◽

Mobile Context ◽

Interactive Experience

Shopping in the real world is becoming an increasingly interactive experience as stores integrate various technologies to support shoppers. Based on an empirical study of supermarket shoppers, the authors designed a mobile context-aware system called the Context-Aware Shopping Trolley (CAST). The purpose of CAST is to support shopping in supermarkets through context-awareness and by acquiring user attention; the authors’ interactive trolley thus guides and directs shoppers in the handling and finding of groceries. An empirical evaluation showed that shoppers using CAST behaved differently than shoppers using a traditional trolley. Specifically, shoppers using CAST exhibited a more uniform pattern of product collection and found products more easily while travelling a shorter distance. As such, the study finds that CAST supported the supermarket shopping activity.
