$ \ell_{1} $-norm based safe semi-supervised learning

2021 ◽  
Vol 18 (6) ◽  
pp. 7727-7742
Author(s):  
Haitao Gan ◽  
Zhi Yang ◽  
Ji Wang ◽  
Bing Li ◽  
...  

In the past few years, Safe Semi-Supervised Learning (S3L) has received considerable attention in the machine learning field. Researchers have proposed many S3L methods for the safe exploitation of risky unlabeled samples, which can degrade the performance of Semi-Supervised Learning (SSL). Nevertheless, these methods have some shortcomings: (1) the risk degrees of the unlabeled samples are defined in advance by analyzing prediction differences between Supervised Learning (SL) and SSL; (2) the negative impacts of labeled samples on learning performance are not investigated. Therefore, it is essential to design a novel method that adaptively estimates the importance and risk of both unlabeled and labeled samples. For this purpose, we present $ \ell_{1} $-norm based S3L, which can simultaneously achieve the safe exploitation of the labeled and unlabeled samples. To solve the proposed optimization problem, we utilize an effective iterative approach. In each iteration, one can adaptively estimate the weights of both labeled and unlabeled samples. The weights reflect the importance or risk of the labeled and unlabeled samples. Hence, the negative effects of the labeled and unlabeled samples are expected to be reduced. Experimental results on different datasets verify that the proposed S3L method can obtain performance comparable to that of the existing SL, SSL and S3L methods and achieve the expected goal.
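The abstract describes an iterative solver that re-estimates sample weights in every pass. The sketch below is not the authors' algorithm. It only illustrates the general idea with iteratively reweighted least squares (IRLS), a standard way to minimise an $ \ell_{1} $ objective, on a toy one-dimensional problem. The function name `irls_l1_location` and the sample values are assumptions made for illustration.

```python
def irls_l1_location(values, iterations=50, eps=1e-8):
    """Minimise sum_i |v_i - m| over m by iteratively reweighted least squares.

    Illustrative only: a generic l1 solver, not the S3L method of the paper.
    """
    m = sum(values) / len(values)  # start from the ordinary mean
    weights = [1.0] * len(values)
    for _ in range(iterations):
        # A sample far from the current estimate receives a small weight,
        # so its influence on the next estimate shrinks.
        weights = [1.0 / max(abs(v - m), eps) for v in values]
        m = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return m, weights


# Hypothetical data: five consistent samples and one "risky" outlier.
samples = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0]
estimate, weights = irls_l1_location(samples)
print(estimate, weights[-1])
```

In this toy run the outlier ends up with the smallest weight, which mirrors the abstract's statement that the weights reflect the importance or risk of each sample.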

2021 ◽  
Vol 2 (1) ◽  
pp. 69-79
Author(s):  
Hurmat Ejaz ◽  
Esther Somanader ◽  
Uday Dave ◽  
Hermann Ehrlich ◽  
M. Azizur Rahman

Didymosphenia geminata diatoms, or Didymo, were first found to be an invasive species that could have negative impacts on the environment due to the aggressive growth of their polysaccharide-based stalks. The stalks' adhesive properties have prompted park officials to alert the general public in order to limit further spread and contamination of this alga to other bodies of water. Although the negative effects of Didymo have been studied in the past, recent studies have demonstrated a potential positive side to this alga. One of the potential benefits lies in the structural component of the polysaccharide stalks. The origin of the polysaccharides within the stalks remains unknown; however, they can be useful in waste management and agricultural settings. The primary purpose of this study was to describe both the harmful and beneficial nature of Didymo. Important outcomes include findings related to its application in various fields such as medicine and technology. These polysaccharides can be isolated and studied closely to produce efficient solar power cells and batteries. Though they may be harmful while uncontained in nature, they appear to be very useful in the technological and medical advancement of our society.


Author(s):  
Noor Asyikin Sulaiman ◽  
Md Pauzi Abdullah ◽  
Hayati Abdullah ◽  
Muhammad Noorazlan Shah Zainudin ◽  
Azdiana Md Yusop

An air conditioning system is a complex system and consumes the most energy in a building. Any fault in the system operation, such as a faulty cooling tower fan, compressor failure or a stuck damper, could lead to energy wastage and a reduction in the system's coefficient of performance (COP). Due to the complexity of the air conditioning system, detecting those faults is hard, as it requires exhaustive inspections. This paper consists of two parts: i) to investigate the impact of different faults related to the air conditioning system on COP and ii) to analyse the performance of machine learning algorithms in classifying those faults. Three supervised learning classifier models were developed: deep learning, support vector machine (SVM) and multi-layer perceptron (MLP). The performance of each classifier was investigated in terms of six different classes of faults. Results showed that different faults have different negative impacts on the COP. Also, the three supervised learning classifier models were able to classify all faults with more than 94% accuracy, and MLP produced the highest accuracy and precision of the three.
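For readers unfamiliar with the metric, the COP of a cooling system is the ratio of the cooling it delivers to the work it consumes. The short sketch below shows how a fault that raises power draw for the same cooling output lowers the COP. All numbers and the function name `coefficient_of_performance` are hypothetical and do not come from the paper.

```python
def coefficient_of_performance(cooling_kw, power_kw):
    """COP of a cooling system: cooling delivered divided by work consumed."""
    if power_kw <= 0:
        raise ValueError("power input must be positive")
    return cooling_kw / power_kw


# Hypothetical readings: the same cooling load, but the faulty system draws more power.
cop_healthy = coefficient_of_performance(cooling_kw=12.0, power_kw=4.0)
cop_faulty = coefficient_of_performance(cooling_kw=12.0, power_kw=5.0)
cop_drop_percent = 100.0 * (cop_healthy - cop_faulty) / cop_healthy
print(cop_healthy, cop_faulty, cop_drop_percent)
```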


2019 ◽  
Vol 30 (1) ◽  
pp. 61-79 ◽  
Author(s):  
Weiyu Wang ◽  
Keng Siau

The exponential advancement in artificial intelligence (AI), machine learning, robotics, and automation is rapidly transforming industries and societies across the world. The way we work, the way we live, and the way we interact with others are expected to be transformed at a speed and scale beyond anything we have observed in human history. This new industrial revolution is expected, on one hand, to enhance and improve our lives and societies. On the other hand, it has the potential to cause major upheavals in our way of life and our societal norms. The window of opportunity to understand the impact of these technologies and to preempt their negative effects is closing rapidly. Humanity needs to be proactive, rather than reactive, in managing this new industrial revolution. This article looks at the promises, challenges, and future research directions of these transformative technologies. Not only are the technological aspects investigated, but behavioral, societal, policy, and governance issues are reviewed as well. This research contributes to the ongoing discussions and debates about AI, automation, machine learning, and robotics. It is hoped that this article will heighten awareness of the importance of understanding these disruptive technologies as a basis for formulating policies and regulations that can maximize the benefits of these advancements for humanity and, at the same time, curtail potential dangers and negative impacts.


2015 ◽  
Vol 7 (1) ◽  
pp. 18-30
Author(s):  
Zalán Bodó ◽  
Lehel Csató

Semi-supervised learning has become an important and thoroughly studied subdomain of machine learning in the past few years, because gathering large amounts of unlabeled data is almost costless, and the costly human labeling process can be minimized by semi-supervision. Label propagation is a transductive semi-supervised learning method that operates on the (usually undirected) data graph. It was introduced in [8], and many variants have been proposed since. However, the base algorithm itself has two variants: the first, presented in [8], and a slightly modified version used afterwards, e.g. in [7]. This paper presents and compares the two algorithms, both theoretically and experimentally, and also tries to recommend which variant to use.
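As a rough illustration of what label propagation on a data graph looks like, the sketch below implements one commonly described form: each unlabeled node repeatedly takes the degree-normalised average of its neighbours' class scores, while labeled nodes are clamped to their known labels. Whether this corresponds to either of the two variants compared in the paper is not stated in the abstract, so treat it as an assumption. The function name `propagate_labels` and the six-node chain graph are made up for the example, and the graph is assumed to be connected.

```python
def propagate_labels(adjacency, labels, iterations=100):
    """Clamped label propagation on an undirected weighted graph.

    adjacency: symmetric list of lists of non-negative edge weights.
    labels: dict mapping labeled node index -> class.
    Assumes every node has at least one neighbour.
    """
    n = len(adjacency)
    classes = sorted(set(labels.values()))
    # One score per (node, class); labeled nodes start as one-hot rows.
    scores = [[0.0] * len(classes) for _ in range(n)]
    for node, cls in labels.items():
        scores[node][classes.index(cls)] = 1.0
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            if i in labels:
                new_scores.append(scores[i][:])  # clamp known labels
                continue
            degree = sum(adjacency[i])
            new_scores.append([
                sum(adjacency[i][j] * scores[j][c] for j in range(n)) / degree
                for c in range(len(classes))
            ])
        scores = new_scores
    # Each node is assigned the class with the highest score.
    return [classes[max(range(len(classes)), key=lambda c: row[c])] for row in scores]


# Hypothetical chain graph 0-1-2-3-4-5 with only the two end nodes labeled.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(6)] for i in range(6)]
predicted = propagate_labels(chain, {0: 0, 5: 1})
print(predicted)
```

On this chain, each unlabeled node ends up with the label of the nearer labeled end, which is the behaviour one would expect from a method that spreads labels along graph edges.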


Text classification and clustering approaches are essential for big data environments. Many classification algorithms have been proposed for supervised learning applications. In the era of big data, a large volume of training data is available in many machine learning tasks. However, some of this data may be unlabeled or mislabeled. Incorrect labels result in label noise, which in turn degrades the learning performance of a classifier. A general approach to addressing label noise is to apply noise filtering techniques that identify and remove noise before learning. A range of noise filtering approaches have been developed to improve classifier performance. This paper proposes a noise filtering approach for text data during the training phase. Many supervised learning algorithms produce high error rates due to noise in the training dataset; our work eliminates such noise and provides an accurate classification system.
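The abstract does not say which filter the authors use. As a minimal sketch of the general "identify and remove noise before learning" idea, the example below applies a simple neighbour-agreement filter: a sample is kept only if its label matches the majority label of its k nearest neighbours. The function name `filter_label_noise`, the use of numeric feature vectors in place of text, and the toy data are all assumptions.

```python
from collections import Counter


def filter_label_noise(points, labels, k=3):
    """Return indices of samples whose label agrees with their k nearest neighbours.

    Illustrative neighbour-agreement filter, not the method proposed in the paper.
    """
    kept = []
    for i, (point, label) in enumerate(zip(points, labels)):
        others = [j for j in range(len(points)) if j != i]
        # Sort the other samples by squared Euclidean distance to this one.
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(point, points[j])))
        votes = Counter(labels[j] for j in others[:k])
        if votes.most_common(1)[0][0] == label:
            kept.append(i)
    return kept


# Hypothetical data: two well-separated clusters, with sample 3 deliberately mislabeled.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "b", "b", "b", "b", "b"]
kept = filter_label_noise(points, labels)
print(kept)
```

A classifier would then be trained only on the kept indices. Note that a filter of this kind discards data, which is the drawback that some later work on label noise tries to avoid.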


2021 ◽  
Author(s):  
Pouyan Hosseinizadeh

While many modelling methods have been developed and introduced to predict the actual state of a system at the next point in time, the purpose of this research is to present and discuss two approaches that do NOT predict the exact future states, but instead identify the potential for the final collapse of a system. The first approach is based on kernel methods, a subcategory of supervised learning, and attempts to provide a visualization method to classify active and dead companies and predict the potential collapse of a system. The second method aims to analyze the inclination of a system by looking at the local changes that have been observed over a certain period of time in the past. The application of these modelling approaches to predicting collapse in different companies belonging to two industrial sectors, based on the behaviour of their closing stock prices, is discussed in this research. Advantages and limitations of each approach are also discussed.


2021 ◽  
pp. 1-13
Author(s):  
Zhi Yang ◽  
Haitao Gan ◽  
Xuan Li ◽  
Cong Wu

Since label noise can hurt the performance of supervised learning (SL), how to train a good classifier in the presence of label noise is an emerging and meaningful topic in the machine learning field. Although many related methods have been proposed and have achieved promising performance, they have the following drawbacks: (1) they can lead to data waste and even performance degradation if the mislabeled instances are removed; and (2) the negative effect of the extremely mislabeled instances cannot be completely eliminated. To address these problems, we propose a novel method based on the capped ℓ1 norm and a graph-based regularizer to deal with label noise. In the proposed algorithm, we utilize the capped ℓ1 norm instead of the ℓ1 norm. The capped norm inherits the advantage of the ℓ1 norm, which is robust to label noise to some extent. Moreover, the capped ℓ1 norm can adaptively find extremely mislabeled instances and eliminate the corresponding negative influence. Additionally, the proposed algorithm makes full use of the mislabeled instances under the graph-based framework, which avoids wasting collected instance information. The solution of our algorithm can be obtained through an iterative optimization approach. We report experimental results on several UCI datasets that include both binary and multi-class problems. The results verify the effectiveness of the proposed algorithm in comparison to existing state-of-the-art classification methods.
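To make the difference between the ℓ1 norm and its capped version concrete, the sketch below evaluates a scalar capped ℓ1 loss, min(|r|, cap), together with the weight such a loss typically induces in an iteratively reweighted solver: residuals at or above the cap receive weight zero, so extremely mislabeled instances stop influencing the fit. The abstract gives no formulas, so the function names, the cap value and the specific weight rule are assumptions based on how capped ℓ1 losses are commonly handled.

```python
def capped_l1(residual, cap):
    """Capped l1 loss: grows like |r| up to the cap and is constant beyond it."""
    return min(abs(residual), cap)


def capped_l1_weight(residual, cap, eps=1e-8):
    """Reweighting factor commonly paired with the capped l1 loss (an assumption here).

    Residuals at or above the cap get weight 0; the rest get 1 / (2 |r|).
    """
    r = abs(residual)
    if r >= cap:
        return 0.0
    return 1.0 / (2.0 * max(r, eps))


# Hypothetical residuals: two ordinary ones and one extreme outlier.
residuals = [0.5, -1.0, 7.0]
losses = [capped_l1(r, cap=2.0) for r in residuals]
weights = [capped_l1_weight(r, cap=2.0) for r in residuals]
print(losses, weights)
```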


2019 ◽  
Vol 109 (2) ◽  
pp. 373-440 ◽  
Author(s):  
Jesper E. van Engelen ◽  
Holger H. Hoos

Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.


2021 ◽  
Vol 27 (12) ◽  
pp. 1390-1407
Author(s):  
Ani Vanyan ◽  
Hrant Khachatrian

Semi-supervised learning is a branch of machine learning focused on improving the performance of models when labeled data is scarce but a large number of unlabeled examples is available. Over the past five years there has been remarkable progress in designing algorithms that achieve reasonable image classification accuracy with access to labels for only 0.1% of the samples. In this survey, we describe most of the recently proposed deep semi-supervised learning algorithms for image classification and identify the main trends of research in the field. Next, we compare several components of the algorithms, discuss the challenges of reproducing the results in this area, and highlight recently proposed applications of methods originally developed for semi-supervised learning.

