On the sample complexity of PAC learning half-spaces against the uniform distribution

1995 ◽  
Vol 6 (6) ◽  
pp. 1556-1559 ◽  
Author(s):  
P.M. Long


1993 ◽  
Vol 5 (5) ◽  
pp. 767-782 ◽  
Author(s):  
Mostefa Golea ◽  
Mario Marchand

We present an algorithm that PAC learns any perceptron with binary weights and arbitrary threshold under the family of product distributions. The sample complexity of this algorithm is O[(n/ε)^4 ln(n/δ)], and its running time increases only linearly with the number of training examples. The algorithm does not try to find a hypothesis that agrees with all of the training examples; rather, it constructs a binary perceptron from various probabilistic estimates obtained from the training examples. We show that, in the restricted case of the uniform distribution and zero threshold, the algorithm reduces to the well-known clipped Hebb rule. We calculate exactly the average generalization rate (i.e., the learning curve) of the algorithm, under the uniform distribution, in the limit of an infinite number of dimensions. We find that the error rate decreases exponentially as a function of the number of training examples. Hence, the average-case analysis gives a sample complexity of O[n ln(1/ε)], a large improvement over the PAC learning analysis. The analytical expression of the learning curve is in excellent agreement with extensive numerical simulations. In addition, the algorithm is very robust with respect to classification noise.
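The paper's full product-distribution algorithm is not reproduced here; as a minimal sketch of the special case the abstract singles out (uniform distribution, zero threshold), the Python snippet below implements the clipped Hebb rule: each binary weight is estimated as the sign of the Hebbian correlation sum between the label and the corresponding input coordinate. The dimension, sample size, and hidden target weights are illustrative values, not taken from the paper.

```python
import numpy as np

def clipped_hebb(X, y):
    """Clipped Hebb rule: each estimated binary weight is the sign of the
    Hebbian sum of label * input over the training examples."""
    # X: (m, n) array of +/-1 inputs, y: (m,) array of +/-1 labels
    correlations = X.T @ y                        # one Hebbian sum per coordinate
    return np.where(correlations >= 0, 1, -1)     # clip to binary weights

# Illustrative run under the uniform distribution with zero threshold.
rng = np.random.default_rng(0)
n, m = 50, 2000                                   # assumed dimension and sample size
w_star = rng.choice([-1, 1], size=n)              # hidden binary-weight perceptron
X = rng.choice([-1, 1], size=(m, n))              # uniform +/-1 training inputs
y = np.sign(X @ w_star).astype(int)
y[y == 0] = 1                                     # break (rare) ties

w_hat = clipped_hebb(X, y)
X_test = rng.choice([-1, 1], size=(10000, n))
err = np.mean(np.sign(X_test @ w_hat) != np.sign(X_test @ w_star))
print(f"test error of the clipped Hebb hypothesis: {err:.4f}")
```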


2004 ◽  
Vol 68 (1) ◽  
pp. 205-234 ◽  
Author(s):  
Nader H. Bshouty ◽  
Jeffrey C. Jackson ◽  
Christino Tamon

Author(s):  
Aleksandr Makarov

In the course of research on organizing the production process for the roof construction of residential multi-storey buildings, an artificial neural network (ANN) was designed whose purpose is to predict labor productivity from organizational factors. One of the main tasks toward this purpose is training the ANN on examples drawn from the research object. Given the scarcity of training data, the main problem is to determine the conditions under which the predictions of a model trained on a limited sample are statistically significant. This article is devoted to solving this problem within the research on production organization. The paper uses the framework of statistical learning theory, the notion of the Vapnik-Chervonenkis dimension for describing sample complexity, and the approach of probably approximately correct (PAC) learning. The techniques of statistical bootstrapping and bagging, which allow the training sample to be expanded, are described. ANN training is conducted as a computer experiment in the Python programming language. Bounds on the theoretical sample complexity necessary for obtaining ANN results within a given confidence interval at a confidence level of 0.95 were estimated. The sample was expanded to an order comparable to the theoretical lower bound. The ANN was trained and the mean square error (MSE) on the test sample was determined, which amounted to . The theoretical bounds on the sample complexity required to ensure a given statistical significance are determined in the article. After training the ANN on a sample whose order corresponds to the theoretical lower bound, a prediction error within the given confidence interval was obtained on the test sample.
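The article's dataset, network architecture, and bound constants are not reproduced here; the sketch below only illustrates, under stated assumptions, the two ingredients the abstract names: a textbook PAC-style sufficient sample size expressed through the VC dimension (one common form due to Blumer et al., not necessarily the article's), and bootstrap resampling with bagging to expand a small training sample before measuring MSE on a test set. The data are synthetic stand-ins for the productivity sample, and all numeric values (factor count, VC dimension, ε, δ) are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error
from sklearn.utils import resample

def pac_sample_bound(vc_dim, eps, delta):
    """One textbook sufficient sample size for PAC learning a class of
    VC dimension vc_dim; the article's own bounds may use different constants."""
    return int(np.ceil((4 / eps) * np.log2(2 / delta)
                       + (8 * vc_dim / eps) * np.log2(13 / eps)))

print("illustrative PAC sample-size bound:", pac_sample_bound(vc_dim=20, eps=0.1, delta=0.05))

rng = np.random.default_rng(1)
true_w = np.array([0.5, -0.3, 0.8, 0.2])          # hypothetical factor weights

# Synthetic stand-in: a small sample of organizational factors -> productivity.
X_small = rng.uniform(0, 1, size=(30, 4))
y_small = X_small @ true_w + rng.normal(0, 0.05, 30)
X_test = rng.uniform(0, 1, size=(200, 4))
y_test = X_test @ true_w + rng.normal(0, 0.05, 200)

# Bagging: train one small ANN per bootstrap replicate of the limited sample,
# then average the ensemble's predictions on the test sample.
preds = []
for b in range(25):
    Xb, yb = resample(X_small, y_small, replace=True, random_state=b)
    net = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=b)
    net.fit(Xb, yb)
    preds.append(net.predict(X_test))

mse = mean_squared_error(y_test, np.mean(preds, axis=0))
print(f"test-sample MSE of the bagged ensemble: {mse:.4f}")
```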


2002 ◽  
Vol 11 (04) ◽  
pp. 499-511 ◽  
Author(s):  
ARTURO HERNÁNDEZ-AGUIRRE ◽  
CRIS KOUTSOUGERAS ◽  
BILL BUCKLES

We find new sample complexity bounds for real function learning tasks under the uniform distribution by means of linear neural networks. These bounds, which are tighter than the distribution-free bounds reported elsewhere in the literature, are applicable to simple functional link networks and radial basis function neural networks.
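The paper's uniform-distribution bounds are not reproduced here; purely as a point of reference, the sketch below evaluates a generic distribution-free sample-size expression of order (1/ε²)(d ln(1/ε) + ln(1/δ)) in terms of the pseudo-dimension d, assuming, as is common for networks that are linear in their output weights (functional link and RBF networks with fixed basis functions), that d is roughly the number of adjustable weights. The constants, the basis count k, ε, and δ are illustrative assumptions, not the paper's values.

```python
import math

def distribution_free_sample_size(pdim, eps, delta):
    """Generic distribution-free sample size of order
    (1/eps^2) * (pdim * ln(1/eps) + ln(1/delta)); constants are illustrative."""
    return math.ceil((pdim * math.log(1 / eps) + math.log(1 / delta)) / eps**2)

# A functional-link or RBF network that is linear in its k output weights is
# assumed here to have pseudo-dimension of order k (+1 for a bias term).
k = 16
print(distribution_free_sample_size(pdim=k + 1, eps=0.05, delta=0.05))
```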

