An Empirical Study of the Impact of Data Splitting Decisions on the Performance of AIOps Solutions

2021 ◽  
Vol 30 (4) ◽  
pp. 1-38
Author(s):  
Yingzhe Lyu ◽  
Heng Li ◽  
Mohammed Sayagh ◽  
Zhen Ming (Jack) Jiang ◽  
Ahmed E. Hassan

AIOps (Artificial Intelligence for IT Operations) leverages machine learning models to help practitioners handle the massive data produced during the operations of large-scale systems. However, due to the nature of the operation data, AIOps modeling faces several data splitting-related challenges, such as imbalanced data, data leakage, and concept drift. In this work, we study the data leakage and concept drift challenges in the context of AIOps and evaluate the impact of different modeling decisions on such challenges. Specifically, we perform a case study on two commonly studied AIOps applications: (1) predicting job failures based on trace data from a large-scale cluster environment and (2) predicting disk failures based on disk monitoring data from a large-scale cloud storage environment. First, we observe that the data leakage issue exists in AIOps solutions. Splitting the training and validation datasets by time can significantly reduce such data leakage, making it more appropriate than a random split in the AIOps context. Second, we show that AIOps solutions suffer from concept drift. Periodically updating AIOps models can help mitigate the impact of such concept drift, while the performance benefit and the modeling cost of increasing the update frequency depend largely on the application data and the models used. Our findings encourage future studies and practices on developing AIOps solutions to pay attention to their data-splitting decisions to handle the data leakage and concept drift challenges.
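
The contrast between random and time-based splitting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (a `timestamp` field and a `failed` label), the 80/20 split, and all function names are assumptions made for illustration.

```python
import random

def random_split(records, train_fraction=0.8, seed=0):
    """Shuffle records without regard to time, then cut into train/validation."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def time_based_split(records, train_fraction=0.8):
    """Sort by timestamp so no validation record is earlier than a training record."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def leaks_future(train, validation):
    """True if some training record is newer than the oldest validation record."""
    if not train or not validation:
        return False
    newest_train = max(r["timestamp"] for r in train)
    oldest_validation = min(r["timestamp"] for r in validation)
    return newest_train > oldest_validation

# Hypothetical monitoring records: one per time step, with a made-up failure label.
records = [{"timestamp": t, "failed": t % 7 == 0} for t in range(100)]

train_t, valid_t = time_based_split(records)
train_r, valid_r = random_split(records)

print("time-based split leaks future data:", leaks_future(train_t, valid_t))
print("random split leaks future data:", leaks_future(train_r, valid_r))
```

With a random split, the model is trained on observations that postdate some of the validation data, which is the kind of leakage the abstract warns about; the time-based split rules this out by construction.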

Author(s):  
Yangyang Zhao ◽  
Zhenliang Ma ◽  
Xinguo Jiang ◽  
Haris N. Koutsopoulos

Unplanned events present significant challenges for operations and management in metro systems. Short-term ridership prediction can help agencies to better design contingency strategies under unplanned events. Though many short-term prediction methods have been proposed in the literature, most studies focused on typical situations or planned events. The study develops methods for the short-term metro ridership prediction under unplanned events. It explores event impact representation mechanisms and deals with the imbalanced data training problem in building the prediction model under unplanned events. Typical machine learning and deep learning methods are developed for exploration. A large-scale automatic fare collection (AFC) dataset and event record data for a heavily used metro system are used for empirical studies. The analysis found that the same type of unplanned event shares a similar and consistent demand change pattern (with respect to the demand under typical situations) at the station level. The synthetic minority oversampling technique (SMOTE) can enrich the ridership observations under unplanned events and generate a balanced dataset for model training. Given the occurrence of unplanned events, the results show that a combination of demand change ratio and the SMOTE oversampling technique enables the prediction models to learn the impact of unplanned events and improve the prediction accuracy under unplanned events. However, the oversampling methods (i.e., SMOTE and replication) slightly deteriorate the prediction accuracy for ridership under normal conditions. The findings provide insights into mechanisms for disruption impact representation and oversampling imbalanced data in model training, and guide the development of models for short-term prediction under unplanned events.
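
The core idea of SMOTE mentioned in this abstract, creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration, not the study's code: the two-dimensional feature vectors, the parameter names, and the choice of `k` are assumptions.

```python
import math
import random

def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base sample (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Hypothetical minority-class observations (e.g. ridership under unplanned
# events), each described by two made-up, normalized features.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(minority, n_synthetic=6)
print(new_samples)
```

Because every synthetic point is a convex combination of two existing minority samples, the oversampled data stays inside the region already covered by the minority class.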


Entropy ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. 849
Author(s):  
Weronika Wegier ◽  
Pawel Ksieniewicz

In an era in which a large number of tools and applications constantly produce massive amounts of data, the processing and proper classification of that data are becoming both increasingly hard and increasingly important. This task is hindered by changes in the data distribution over time, known as concept drift, and by the emergence of disproportion between classes, such as in the detection of network attacks or in fraud detection problems. In the following work, we propose methods to modify existing stream processing solutions, Accuracy Weighted Ensemble (AWE) and Accuracy Updated Ensemble (AUE), which have demonstrated their effectiveness in adapting to time-varying class distribution. The introduced changes are aimed at increasing their quality on binary classification of imbalanced data. The proposed modifications include the use of aggregate metrics, such as F1-score, G-mean and balanced accuracy score, in the calculation of the member classifiers' weights, which affects the ensemble's composition and final prediction. Moreover, the impact of data sampling on the algorithm's effectiveness was also checked. Complex experiments were conducted to define the most promising modification type, as well as to compare the proposed methods with existing solutions. Experimental evaluation shows an improvement in the quality of classification compared to the underlying algorithms and other solutions for processing imbalanced data streams.
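
The aggregate metrics named in this abstract can be computed from a binary confusion matrix, and then used to weight ensemble members. The sketch below is an illustration under stated assumptions: the normalization of scores into weights is a hypothetical scheme, not the weighting formula used by AWE, AUE, or the proposed modifications, and all function names are invented for the example.

```python
import math

def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def _rates(tp, tn, fp, fn):
    """Sensitivity (recall on class 1) and specificity (recall on class 0)."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

def f1_score(tp, tn, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def g_mean(tp, tn, fp, fn):
    sensitivity, specificity = _rates(tp, tn, fp, fn)
    return math.sqrt(sensitivity * specificity)

def balanced_accuracy(tp, tn, fp, fn):
    sensitivity, specificity = _rates(tp, tn, fp, fn)
    return (sensitivity + specificity) / 2

def member_weights(y_true, member_predictions, metric):
    """Hypothetical scheme: weight each member by its normalized metric score."""
    scores = [metric(*confusion(y_true, pred)) for pred in member_predictions]
    total = sum(scores)
    if total == 0:
        return [1 / len(scores)] * len(scores)
    return [s / total for s in scores]

# Imbalanced chunk: 2 positives out of 10 samples.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]
always_negative = [0] * 10                    # 80% accuracy, finds no positives
detector = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]     # finds both positives, one false alarm

print(member_weights(y_true, [always_negative, detector], f1_score))
print(member_weights(y_true, [always_negative, detector], balanced_accuracy))
```

The example shows why such metrics matter on imbalanced data: a member that always predicts the majority class reaches 80% plain accuracy here, yet receives zero weight under F1-score or G-mean.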


2006 ◽  
Vol 6 (3) ◽  
pp. 4375-4414 ◽  
Author(s):  
B. Bregman ◽  
E. Meijer ◽  
R. Scheele

Abstract. This study describes key aspects of global chemistry-transport models and their impact on stratospheric tracer transport. We concentrate on global models that use assimilated winds from numerical weather predictions, but the results also apply to tracer transport in general circulation models. We examined grid resolution, numerical diffusion and dispersion of the wind fields, the meteorology update time intervals, update frequency, and time interpolation. For this study we applied the three-dimensional chemistry-transport Tracer Model version 5 (TM5) and a trajectory model and performed several diagnoses focusing on different transport regimes. Covering different time and spatial scales, we examined (1) polar vortex dynamics during the Arctic winter, (2) the large-scale stratospheric meridional circulation, and (3) air parcel dispersion in the tropical lower stratosphere. Tracer distributions inside the Arctic polar vortex show considerably worse agreement with observations when the model grid resolution in the polar region is reduced to avoid numerical instability. Using time-interpolated winds improves the tracer distributions only marginally. Considerable improvement is found when the update interval of the assimilated winds is shortened from 6 to 3 h, both in the large-scale tracer distribution and in the polar regions. In particular, it further reduces the vertical dispersion of air parcels in the tropical lower stratosphere. The results of this study demonstrate significant progress in the use of assimilated meteorology in chemistry-transport models, which is important for both short- and long-term integrations.


2020 ◽  
Vol 59 (04) ◽  
pp. 294-299 ◽  
Author(s):  
Lutz S. Freudenberg ◽  
Ulf Dittmer ◽  
Ken Herrmann

Abstract. Introduction: Preparations of health systems to accommodate large numbers of severely ill COVID-19 patients in March/April 2020 have a significant impact on nuclear medicine departments. Materials and Methods: A web-based questionnaire was designed to differentiate the impact of the pandemic on inpatient and outpatient nuclear medicine operations and on public versus private health systems, respectively. Questions addressed the following issues: impact on nuclear medicine diagnostics and therapy, use of recommendations, personal protective equipment, and organizational adaptations. The survey was available for 6 days and closed on April 20, 2020. Results: 113 complete responses were recorded. Nearly all participants (97 %) report a decline in nuclear medicine diagnostic procedures. The mean reductions in the last three weeks for PET/CT and for scintigraphies of bone, myocardium, lung, thyroid, and sentinel lymph node are –14.4 %, –47.2 %, –47.5 %, –40.7 %, –58.4 %, and –25.2 %, respectively. Furthermore, 76 % of the participants report a reduction in therapies, especially for benign thyroid disease (–41.8 %) and radiosynoviorthesis (–53.8 %), while tumor therapies remained mainly stable. 48 % of the participants report a shortage of personal protective equipment. Conclusions: Nuclear medicine services are notably reduced 3 weeks after the SARS-CoV-2 pandemic reached Germany, Austria and Switzerland on a large scale. We must be aware that the current crisis will also have a significant economic impact on the healthcare system. As the survey cannot adapt to daily dynamic changes in priorities, it serves as a first snapshot requiring follow-up studies and comparisons with other countries and regions.


2020 ◽  
Vol 6 (5) ◽  
pp. 1183-1189
Author(s):  
Dr. Tridibesh Tripathy ◽  
Dr. Umakant Prusty ◽  
Dr. Chintamani Nayak ◽  
Dr. Rakesh Dwivedi ◽  
Dr. Mohini Gautam

This article on Uttar Pradesh (UP) concerns the ASHAs, who are daughters-in-law of families residing in the same community that they have served as grassroots health workers since 2005, when the NRHM was introduced in the Empowered Action Group (EAG) states. UP is one such EAG state. The current study explores the actual responses of Recently Delivered Women (RDWs) on the visits they received during the first month after their recent delivery. From the catchment area of each of the 250 ASHAs, two RDWs were selected who had a child in the age group of 3 to 6 months during the survey. The response profiles of the RDWs on the visits in the first month after delivery are examined to build a picture representing the entire state of UP. The study is relevant because detailed data on the modalities of postnatal visits are available, but not exclusively for the first month after delivery. Details of visits in the first month after delivery are not available even in large-scale surveys such as the National Family Health Survey 4, done in 2015-16. The current study gives an insight into these visits with a five-point approach, i.e. the type of personnel doing the visit, the frequency of the visits, and, separately for each of the three visits, the week among the four in which the visit was made. The current study summarizes this five-point approach for the one-month period after delivery.

The first month after each delivery covers 70% of the postnatal period and the entire neonatal period. Unsafe maternal and neonatal practices in this month therefore affect the Maternal Mortality Rate and Ratio (MMR) and the Neonatal Mortality Rate (NMR) in India, and especially in UP. The current MM Rate of UP is 20.1 and the MM Ratio is 216, whereas the MM Ratio is 122 in India (SRS, 2019). The Sample Registration System (SRS) report also mentions that the Life Time Risk (LTR) of a woman in pregnancy is 0.7%, which is the highest in the nation (SRS, 2019). This means it is very risky to give birth in UP in comparison to other regions in the country (SRS, 2019). This risk peaks in the first month after each delivery. Similarly, the current NMR in India is 23 per 1000 live births (UNIGME, 2018). As NMR data is not available separately for states, the national-level data is taken to hold for the states, including UP. These mortalities are impact indicators, and such indicators can be reduced through long-drawn processes that include effective and timely visits to RDWs, especially in the first month after delivery. This would help make their postnatal and neonatal stages safe. This profile of visits in the first month after delivery is what the current article brings out in relation to the recent delivery of the respondents.

A total of four districts of Uttar Pradesh were selected purposively for the study, and the data collection was conducted in the villages of the respective districts with the help of a pre-tested structured interview schedule with both close-ended and open-ended questions. The current article deals with five close-ended questions with options: two for the type of personnel and the frequency, and the other three for each of the three visits in the first month after the recent delivery of respondents. In addition, in-depth interviews were conducted amongst the RDWs, and a total of 500 respondents participated in the study.

Among the districts related to this article, the results showed that ASHAs made the majority of visits in all four districts. On the other hand, 25-40% of RDWs in all four districts replied that they did not receive any visit within the first month of their recent delivery. Regarding frequency, most of the RDWs in all four districts received 1-2 visits from ASHAs.

Regarding the first visit, it was found that the ASHAs of Barabanki and Gonda visited a smaller percentage of RDWs in the first week after delivery. Similarly, for the second visit, about 1.2% of RDWs in Banda district could not recall the visit. Further on the second visit, most RDWs in three districts, all except Gonda, responded that they received the second postnatal visit 7-15 days after their recent delivery. Less than half of RDWs in Barabanki district and just more than half of RDWs in Gonda district received the third visit in the 15-21 day period after delivery. For the same period, the majority of RDWs in the remaining two districts responded that they had received a home visit.


e-Finanse ◽  
2018 ◽  
Vol 14 (4) ◽  
pp. 67-76
Author(s):  
Piotr Bartkiewicz

Abstract. The article presents the results of the review of the empirical literature regarding the impact of quantitative easing (QE) on emerging markets (EMs). The subject is of interest to policymakers and researchers due to the increasingly large role of EMs in the world economy and the large-scale capital flows occurring after 2009. The review is conducted in a systematic manner and takes into consideration different methodological choices, samples and measurement issues. The paper puts the summarized results in the context of transmission channels identified in the literature. There are few distinct methodological approaches present in the literature. While there is a consensus regarding the direction of the impact of QE on EMs, its size and durability have not yet been assessed with sufficient precision. In addition, there are clear gaps in the empirical findings, not least related to relative underrepresentation of the CEE region (in particular, Poland).


2019 ◽  
Vol 20 (2) ◽  
pp. 123-129 ◽  
Author(s):  
Mariana Jesus ◽  
Tânia Silva ◽  
César Cagigal ◽  
Vera Martins ◽  
Carla Silva

Introduction: The field of nutritional psychiatry is a fast-growing one. Although it initially focused on the effects of vitamins and micronutrients on mental health, in the last decade its focus has extended to dietary patterns. The possibility of a cost-effective dietary intervention in the most common mental disorder, depression, cannot be overlooked due to its potential large-scale impact. Method: A classic review of the literature was conducted, and studies published between 2010 and 2018 focusing on the impact of dietary patterns on depression and depressive symptoms were included. Results: We found 10 studies that matched our criteria. Most studies showed an inverse association between healthy dietary patterns (rich in fruits, vegetables, lean meats, nuts and whole grains, with low intake of processed and sugary foods) and depression and depressive symptoms across an array of age groups, although some authors reported statistical significance only in women. While most studies were of cross-sectional design, making it difficult to infer causality, a randomized controlled trial presented similar results. Discussion: The association between dietary patterns and depression is now well-established, although the exact etiological pathways are still unknown. Dietary intervention, with the implementation of healthier dietary patterns, closer to the traditional ones, can play an important role in the prevention and adjunctive therapy of depression and depressive symptoms. Conclusion: More large-scale randomized clinical trials need to be conducted in order to confirm the association between high-quality dietary patterns and lower risk of depression and depressive symptoms.


Coronaviruses ◽  
2020 ◽  
Vol 01 ◽  
Author(s):  
Yam Nath Paudel ◽  
Efthalia Angelopoulou ◽  
Bhupendra Raj Giri ◽  
Christina Piperi ◽  
Iekhsan Othman ◽  
...  

COVID-19 has emerged as the most devastating pandemic of the century that the current generations have experienced. The COVID-19 pandemic has infected more than 12 million people around the globe, and 0.5 million people have died. Due to the lack of effective vaccines against COVID-19, several nations throughout the globe have imposed lockdowns as a preventive measure to slow the spread of COVID-19 infection. As a result of the lockdowns, most universities and research institutes have witnessed their longest-ever pause in basic science research. Much has been said about the long-term impact of COVID-19 on the economy, tourism, public health, and small and large-scale businesses of several kinds. However, the long-term implications of these research lab shutdowns and their impact on basic science research have received little attention. Herein, we provide a perspective that portrays a problem common to basic science researchers throughout the globe and its long-term consequences.

