$ \ell_{1} $-norm based safe semi-supervised learning

Haitao Gan; Zhi Yang; Ji Wang; Bing Li

doi:10.3934/mbe.2021383

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2021383 ◽

2021 ◽

Vol 18 (6) ◽

pp. 7727-7742

Author(s):

Haitao Gan ◽

Zhi Yang ◽

Ji Wang ◽

Bing Li ◽

...

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Learning Performance ◽

Iterative Approach ◽

Negative Effects ◽

Experimental Performance ◽

The Past ◽

Comparable Performance ◽

Novel Method ◽

Negative Impacts

In the past few years, Safe Semi-Supervised Learning (S3L) has received considerable attention in the machine learning field. Researchers have proposed many S3L methods for the safe exploitation of risky unlabeled samples, which can degrade the performance of Semi-Supervised Learning (SSL). Nevertheless, these methods have some shortcomings: (1) risk degrees of the unlabeled samples are defined in advance by analyzing prediction differences between Supervised Learning (SL) and SSL; (2) negative impacts of labeled samples on learning performance are not investigated. Therefore, it is essential to design a novel method to adaptively estimate the importance and risk of both unlabeled and labeled samples. For this purpose, we present $ \ell_{1} $-norm based S3L, which can simultaneously achieve the safe exploitation of the labeled and unlabeled samples. To solve the proposed optimization problem, we utilize an effective iterative approach. In each iteration, one can adaptively estimate the weights of both labeled and unlabeled samples. The weights reflect the importance or risk of the labeled and unlabeled samples. Hence, the negative effects of the labeled and unlabeled samples are expected to be reduced. Experimental performance on different datasets verifies that the proposed S3L method can obtain comparable performance with the existing SL, SSL and S3L methods and achieve the expected goal.
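The abstract does not state the objective function or the update rules, so the following is only a generic illustration of how an iterative scheme for an ℓ1-type loss can produce per-sample weights. It is not the authors' algorithm. It uses iteratively reweighted least squares on a one-dimensional toy problem; the function name `irls_l1_location` and the data are assumptions made up for this sketch.

```python
def irls_l1_location(values, n_iter=50, eps=1e-8):
    """Minimise sum_i |y_i - m| over m by iteratively reweighted least squares.

    Generic illustration only; not the method from the paper.
    """
    m = sum(values) / len(values)  # start from the least-squares solution (the mean)
    weights = [1.0] * len(values)
    for _ in range(n_iter):
        # Each weight is the inverse of the current absolute residual, floored at eps.
        weights = [1.0 / max(abs(y - m), eps) for y in values]
        # Weighted least-squares update of the location parameter.
        m = sum(w * y for w, y in zip(weights, values)) / sum(weights)
    return m, weights


# Made-up data: the last value plays the role of a risky sample.
samples = [1.0, 1.1, 0.9, 1.05, 10.0]
estimate, sample_weights = irls_l1_location(samples)
```

In this toy setting the sample that disagrees most with the others ends up with the smallest weight, which is the kind of adaptive down-weighting the abstract describes in general terms.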

Didymo and Its Polysaccharide Stalks: Beneficial to the Environment or Not?

Polysaccharides ◽

10.3390/polysaccharides2010005 ◽

2021 ◽

Vol 2 (1) ◽

pp. 69-79

Author(s):

Hurmat Ejaz ◽

Esther Somanader ◽

Uday Dave ◽

Hermann Ehrlich ◽

M. Azizur Rahman

Keyword(s):

Invasive Species ◽

Structural Component ◽

Negative Effects ◽

Adhesive Properties ◽

The Past ◽

Positive Side ◽

Potential Benefits ◽

Negative Impacts ◽

Bodies Of Water ◽

Agricultural Setting

Didymosphenia geminata, a diatom commonly known as Didymo, was first identified as an invasive species that could have negative impacts on the environment due to the aggressive growth of its polysaccharide-based stalks. The stalks’ adhesive properties have prompted park officials to alert the general public in order to limit further spread of this alga and contamination of other bodies of water. Although the negative effects of Didymo have been studied in the past, recent studies have demonstrated a potential positive side to this alga. One potential benefit lies in the structural component of the polysaccharide stalks. The origin of the polysaccharides within the stalks remains unknown; however, they can be useful in waste management and agricultural settings. The primary purpose of this study was to describe both the harmful and beneficial nature of Didymo. Important outcomes include findings related to its application in various fields such as medicine and technology. These polysaccharides can be isolated and studied closely to produce efficient solar power cells and batteries. Though they may be harmful while uncontained in nature, they appear to be very useful for the technological and medical advancement of our society.

Fault detection for air conditioning system using machine learning

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v9.i1.pp109-116 ◽

2020 ◽

Vol 9 (1) ◽

pp. 109-116

Author(s):

Noor Asyikin Sulaiman ◽

Md Pauzi Abdullah ◽

Hayati Abdullah ◽

Muhammad Noorazlan Shah Zainudin ◽

Azdiana Md Yusop

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Air Conditioning ◽

Machine Learning Algorithms ◽

Coefficient Of Performance ◽

Support Vector ◽

Air Conditioning System ◽

Learning Classifier ◽

Negative Impacts ◽

The Impact

An air conditioning system is a complex system and consumes the most energy in a building. Any fault in the system operation, such as a faulty cooling tower fan, compressor failure, or a stuck damper, could lead to energy wastage and a reduction in the system’s coefficient of performance (COP). Due to the complexity of the air conditioning system, detecting those faults is hard because it requires exhaustive inspections. This paper has two parts: (i) to investigate the impact of different faults related to the air conditioning system on COP, and (ii) to analyse the performance of machine learning algorithms in classifying those faults. Three supervised learning classifier models were developed: deep learning, support vector machine (SVM) and multi-layer perceptron (MLP). The performance of each classifier was investigated in terms of six different classes of faults. Results showed that different faults have different negative impacts on the COP. Also, the three supervised learning classifier models were able to classify all faults with more than 94% accuracy, and MLP produced the highest accuracy and precision among them.

Artificial Intelligence, Machine Learning, Automation, Robotics, Future of Work and Future of Humanity

Journal of Database Management ◽

10.4018/jdm.2019010104 ◽

2019 ◽

Vol 30 (1) ◽

pp. 61-79 ◽

Cited By ~ 21

Author(s):

Weiyu Wang ◽

Keng Siau

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Industrial Revolution ◽

Future Research ◽

Way Of Life ◽

Negative Effects ◽

Future Research Directions ◽

Negative Impacts ◽

The Impact ◽

The Way

The exponential advancement in artificial intelligence (AI), machine learning, robotics, and automation is rapidly transforming industries and societies across the world. The way we work, the way we live, and the way we interact with others are expected to be transformed at a speed and scale beyond anything we have observed in human history. This new industrial revolution is expected, on one hand, to enhance and improve our lives and societies. On the other hand, it has the potential to cause major upheavals in our way of life and our societal norms. The window of opportunity to understand the impact of these technologies and to preempt their negative effects is closing rapidly. Humanity needs to be proactive, rather than reactive, in managing this new industrial revolution. This article looks at the promises, challenges, and future research directions of these transformative technologies. Not only are the technological aspects investigated, but behavioral, societal, policy, and governance issues are reviewed as well. This research contributes to the ongoing discussions and debates about AI, automation, machine learning, and robotics. It is hoped that this article will heighten awareness of the importance of understanding these disruptive technologies as a basis for formulating policies and regulations that can maximize the benefits of these advancements for humanity and, at the same time, curtail potential dangers and negative impacts.

A note on label propagation for semi-supervised learning

Acta Universitatis Sapientiae Informatica ◽

10.1515/ausi-2015-0010 ◽

2015 ◽

Vol 7 (1) ◽

pp. 18-30

Author(s):

Zalán Bodó ◽

Lehel Csató

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Unlabeled Data ◽

Label Propagation ◽

Learning Method ◽

The Past ◽

Data Graph

Semi-supervised learning has become an important and thoroughly studied subdomain of machine learning in the past few years, because gathering large amounts of unlabeled data is almost costless, and the costly human labeling process can be minimized by semi-supervision. Label propagation is a transductive semi-supervised learning method that operates on the data graph, which is undirected most of the time. It was introduced in [8], and many variants have been proposed since. However, the base algorithm has two variants: the first variant presented in [8] and its slightly modified version used afterwards, e.g. in [7]. This paper presents and compares the two algorithms, both theoretically and experimentally, and also tries to make a recommendation on which variant to use.
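The abstract does not spell out either variant, so the following is only a minimal sketch of one commonly described formulation of label propagation: each unlabeled node repeatedly takes the weighted average of its neighbours' label distributions while labeled nodes stay clamped. Whether this corresponds to the variant in [8] or the one in [7] is not claimed here. The function name `label_propagation`, the chain graph, and the iteration count are assumptions for illustration.

```python
def label_propagation(W, labels, n_classes, n_iter=200):
    """Propagate labels over a graph given as a symmetric weight matrix W.

    labels[i] is a class index for labeled nodes and None for unlabeled ones.
    Assumes every unlabeled node has at least one neighbour (non-zero degree).
    """
    n = len(W)
    # Labeled nodes start one-hot; unlabeled nodes start uniform.
    F = []
    for y in labels:
        if y is None:
            F.append([1.0 / n_classes] * n_classes)
        else:
            F.append([1.0 if c == y else 0.0 for c in range(n_classes)])
    for _ in range(n_iter):
        new_F = []
        for i in range(n):
            if labels[i] is not None:
                new_F.append(F[i][:])  # clamp labeled nodes
                continue
            degree = sum(W[i])
            # Weighted average of the neighbours' current label distributions.
            new_F.append([
                sum(W[i][j] * F[j][c] for j in range(n)) / degree
                for c in range(n_classes)
            ])
        F = new_F
    return F


# Toy chain graph 0 - 1 - 2 - 3 - 4 - 5 with unit edge weights (made-up example).
n_nodes = 6
W = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n_nodes)] for i in range(n_nodes)]
labels = [0, None, None, None, None, 1]  # only the two end nodes are labeled
F = label_propagation(W, labels, n_classes=2)
predictions = [max(range(2), key=lambda c: row[c]) for row in F]
```

On this chain each unlabeled node ends up assigned to the class of the nearer labeled end, which illustrates how the graph structure alone carries the labels to unlabeled points.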

Noise Removal Process from Label Classification using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c3920.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 172-175

Keyword(s):

Machine Learning ◽

Big Data ◽

Supervised Learning ◽

Noise Removal ◽

Error Rates ◽

Training Data ◽

Learning Performance ◽

Training Dataset ◽

Noise Filtering ◽

Label Noise

Text classification and clustering approaches are essential for big data environments. Many classification algorithms have been proposed for supervised learning applications. In the era of big data, a large volume of training data is available for many machine learning tasks. However, some of the data may be unlabeled or mislabeled. Incorrect labels result in label noise, which in turn degrades the learning performance of a classifier. A general approach to addressing label noise is to apply noise filtering techniques to identify and remove noise before learning. A range of noise filtering approaches have been developed to improve classifier performance. This paper proposes a noise filtering approach for text data during the training phase. Many supervised learning algorithms generate high error rates due to noise in the training dataset; our work eliminates such noise and provides an accurate classification system.

Predicting system collapse : application of kernel-based machine learning and inclination analysis

10.32920/ryerson.14663292 ◽

2021 ◽

Author(s):

Pouyan Hosseinizadeh

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Kernel Methods ◽

Stock Prices ◽

Actual State ◽

Visualization Method ◽

The Past ◽

Industrial Sectors ◽

System Collapse ◽

Modelling Methods

While many modelling methods have been developed and introduced to predict the actual state of a system at the next point in time, the purpose of this research is to present and discuss two approaches that do not predict the exact future states but instead identify the potential for final collapse of a system. The first approach is based on kernel methods, a subcategory of supervised learning, and attempts to provide a visualization method to classify active and dead companies and predict the potential collapse of a system. The second method aims to analyze the inclination of a system by looking at the local changes that have been observed over a certain period of time in the past. The application of these modelling approaches to predicting collapse in different companies belonging to two industrial sectors, based on the behaviour of their closing stock prices, is discussed in this research. Advantages and limitations of each approach are also discussed.

Capped ℓ1-norm regularized least squares classification with label noise

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200432 ◽

2021 ◽

pp. 1-13

Author(s):

Zhi Yang ◽

Haitao Gan ◽

Xuan Li ◽

Cong Wu

Keyword(s):

Machine Learning ◽

Least Squares ◽

Supervised Learning ◽

State Of The Art ◽

Negative Influence ◽

Optimization Approach ◽

Classification Methods ◽

Label Noise ◽

Negative Effect ◽

Novel Method

Since label noise can hurt the performance of supervised learning (SL), how to train a good classifier in the presence of label noise is an emerging and meaningful topic in the machine learning field. Although many related methods have been proposed and achieved promising performance, they have the following drawbacks: (1) they can lead to data waste and even performance degradation if the mislabeled instances are removed; and (2) the negative effect of the extremely mislabeled instances cannot be completely eliminated. To address these problems, we propose a novel method based on the capped ℓ1 norm and a graph-based regularizer to deal with label noise. In the proposed algorithm, we utilize the capped ℓ1 norm instead of the ℓ1 norm. The capped norm inherits the advantage of the ℓ1 norm, which is robust to label noise to some extent. Moreover, the capped ℓ1 norm can adaptively find extremely mislabeled instances and eliminate the corresponding negative influence. Additionally, the proposed algorithm makes full use of the mislabeled instances under the graph-based framework, which avoids wasting collected instance information. The solution of our algorithm can be obtained through an iterative optimization approach. We report experimental results on several UCI datasets that include both binary and multi-class problems. The results verify the effectiveness of the proposed algorithm in comparison to existing state-of-the-art classification methods.
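The abstract does not define the capped ℓ1 norm, so the sketch below assumes a commonly used form, min(|r|, ε) applied to each residual r, to show why extreme residuals stop influencing the objective. The names `capped_l1` and `reweight`, the threshold `cap`, and the reweighting rule are assumptions for illustration and are not taken from the paper.

```python
def capped_l1(residual, cap):
    """Capped l1 loss for one residual: grows like |r| up to `cap`, then stays flat."""
    return min(abs(residual), cap)


def reweight(residual, cap, eps=1e-8):
    """Weight for a reweighted least-squares surrogate of the capped l1 loss.

    Residuals beyond the cap get weight zero, so those samples no longer
    pull on the fit; smaller residuals get the usual 1 / (2 |r|) weight.
    """
    r = abs(residual)
    if r > cap:
        return 0.0
    return 1.0 / (2.0 * max(r, eps))


# Made-up residuals: the last one plays the role of an extremely mislabeled instance.
residuals = [0.3, -0.5, 0.1, -5.0]
losses = [capped_l1(r, cap=1.0) for r in residuals]
weights = [reweight(r, cap=1.0) for r in residuals]
```

Under this assumed form, the loss of the extreme residual is bounded by the cap and its weight is zero, while moderately noisy instances still contribute.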

Predicting system collapse : application of kernel-based machine learning and inclination analysis

10.32920/ryerson.14663292.v1 ◽

2021 ◽

Author(s):

Pouyan Hosseinizadeh

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Kernel Methods ◽

Stock Prices ◽

Actual State ◽

Visualization Method ◽

The Past ◽

Industrial Sectors ◽

System Collapse ◽

Modelling Methods

A survey on semi-supervised learning

Machine Learning ◽

10.1007/s10994-019-05855-6 ◽

2019 ◽

Vol 109 (2) ◽

pp. 373-440 ◽

Cited By ~ 34

Author(s):

Jesper E. van Engelen ◽

Holger H. Hoos

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Supervised Classification ◽

The Past ◽

Learning Tasks ◽

Supervised Learning Algorithms ◽

Unlabelled Data ◽

Methodological Approaches ◽

Advanced Readers ◽

Relevant Work

Semi-supervised learning is the branch of machine learning concerned with using labelled as well as unlabelled data to perform certain learning tasks. Conceptually situated between supervised and unsupervised learning, it permits harnessing the large amounts of unlabelled data available in many use cases in combination with typically smaller sets of labelled data. In recent years, research in this area has followed the general trends observed in machine learning, with much attention directed at neural network-based models and generative learning. The literature on the topic has also expanded in volume and scope, now encompassing a broad spectrum of theory, algorithms and applications. However, no recent surveys exist to collect and organize this knowledge, impeding the ability of researchers and engineers alike to utilize it. Filling this void, we present an up-to-date overview of semi-supervised learning methods, covering earlier work as well as more recent advances. We focus primarily on semi-supervised classification, where the large majority of semi-supervised learning research takes place. Our survey aims to provide researchers and practitioners new to the field as well as more advanced readers with a solid understanding of the main approaches and algorithms developed over the past two decades, with an emphasis on the most prominent and currently relevant work. Furthermore, we propose a new taxonomy of semi-supervised classification algorithms, which sheds light on the different conceptual and methodological approaches for incorporating unlabelled data into the training process. Lastly, we show how the fundamental assumptions underlying most semi-supervised learning algorithms are closely connected to each other, and how they relate to the well-known semi-supervised clustering assumption.

Deep Semi-Supervised Image Classification Algorithms: a Survey

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.77029 ◽

2021 ◽

Vol 27 (12) ◽

pp. 1390-1407

Author(s):

Ani Vanyan ◽

Hrant Khachatrian

Keyword(s):

Machine Learning ◽

Image Classification ◽

Supervised Learning ◽

Classification Accuracy ◽

Classification Algorithms ◽

The Past ◽

Supervised Learning Algorithms ◽

Supervised Image Classification ◽

Learning Focused ◽

Remarkable Progress

Semi-supervised learning is a branch of machine learning focused on improving the performance of models when labeled data is scarce but a large number of unlabeled examples is available. Over the past five years there has been remarkable progress in designing algorithms that achieve reasonable image classification accuracy with access to the labels for only 0.1% of the samples. In this survey, we describe most of the recently proposed deep semi-supervised learning algorithms for image classification and identify the main trends of research in the field. Next, we compare several components of the algorithms, discuss the challenges of reproducing the results in this area, and highlight recently proposed applications of the methods originally developed for semi-supervised learning.
