ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Electronic Resource
    Springer
    Computational complexity 5 (1995), pp. 1-23 
    ISSN: 1420-8954
    Keywords: Machine learning ; computational learning theory ; on-line learning ; linear functions ; worst-case loss bounds ; adaptive filter theory ; 68T05
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: We present an algorithm for the on-line learning of linear functions which is optimal to within a constant factor with respect to bounds on the sum of squared errors for a worst-case sequence of trials. The bounds are logarithmic in the number of variables. Furthermore, the algorithm is shown to be optimally robust with respect to noise in the data (again to within a constant factor). (An illustrative sketch of a multiplicative-update online linear learner appears after this result list.)
    Type of Medium: Electronic Resource
  • 2
    Electronic Resource
    Springer
    Machine learning 30 (1998), pp. 7-21 
    ISSN: 0885-6125
    Keywords: PAC learning ; multiple-instance examples ; axis-aligned hyperrectangles
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: We describe a polynomial-time algorithm for learning axis-aligned rectangles in $\mathbb{Q}^d$ with respect to product distributions from multiple-instance examples in the PAC model. Here, each example consists of n elements of $\mathbb{Q}^d$ together with a label indicating whether any of the n points is in the rectangle to be learned. We assume that there is an unknown product distribution D over $\mathbb{Q}^d$ such that all instances are independently drawn according to D. The accuracy of a hypothesis is measured by the probability that it would incorrectly predict whether one of n more points drawn from D was in the rectangle to be learned. Our algorithm achieves accuracy ε with probability 1-δ in $O\left(\frac{d^5 n^{12}}{\epsilon^{20}} \log^2 \frac{nd}{\epsilon\delta}\right)$ time.
    Type of Medium: Electronic Resource
  • 3
    Electronic Resource
    Springer
    Machine learning 18 (1995), pp. 187-230 
    ISSN: 0885-6125
    Keywords: computational learning theory ; on-line learning ; mistake-bounded learning ; function learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as ℕ or ℝ. In particular, relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models. We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any “learning” model). Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes. Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from ℝ^d into ℝ. Previously, the class of linear functions from ℝ^d into ℝ was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning. Finally, we give a sufficient condition for an arbitrary class $\mathcal{F}$ of functions from ℝ into ℝ that allows us to learn the class of all functions that can be written as the pointwise maximum of k functions from $\mathcal{F}$. This allows us to exhibit a number of further nontrivial classes of functions from ℝ into ℝ for which there exist efficient learning algorithms.
    Type of Medium: Electronic Resource
  • 4
    Electronic Resource
    Springer
    Machine learning 14 (1994), pp. 27-45 
    ISSN: 0885-6125
    Keywords: Computational learning theory ; concept drift ; concept learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: In this paper we consider the problem of tracking a subset of a domain (called the target) which changes gradually over time. A single (unknown) probability distribution over the domain is used to generate random examples for the learning algorithm and measure the speed at which the target changes. Clearly, the more rapidly the target moves, the harder it is for the algorithm to maintain a good approximation of the target. Therefore we evaluate algorithms based on how much movement of the target can be tolerated between examples while predicting with accuracy ε. Furthermore, the complexity of the class $\mathcal{H}$ of possible targets, as measured by d, its VC-dimension, also affects the difficulty of tracking the target concept. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class $\mathcal{H}$ can be approximated to within a factor k, then there is a simple tracking algorithm for $\mathcal{H}$ which can achieve a probability ε of making a mistake if the target movement rate is at most a constant times $\epsilon^2/(k(d + k)\ln\frac{1}{\epsilon})$, where d is the Vapnik-Chervonenkis dimension of $\mathcal{H}$. Also, we show that if $\mathcal{H}$ is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of 7d + 1, yielding an efficient tracking algorithm for $\mathcal{H}$ which tolerates drift rates up to a constant times $\epsilon^2/(d^2\ln\frac{1}{\epsilon})$. In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times $\epsilon^2/d$. (A minimal sketch of a windowed disagreement-minimization tracker appears after this result list.)
    Type of Medium: Electronic Resource
  • 5
    Electronic Resource
    Springer
    Machine learning 27 (1997), pp. 5-5 
    ISSN: 0885-6125
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Type of Medium: Electronic Resource
  • 6
    Electronic Resource
    Springer
    Machine learning 36 (1999), pp. 147-181 
    ISSN: 0885-6125
    Keywords: computational learning theory ; learning with queries ; mistake bounds ; function learning ; learning with noise
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: We solve an open problem of Maass and Turán, showing that the optimal mistake-bound when learning a given concept class without membership queries is within a constant factor of the optimal number of mistakes plus membership queries required by an algorithm that can ask membership queries. Previously known results imply that the constant factor in our bound is best possible. We then show that, in a natural generalization of the mistake-bound model, the usefulness to the learner of arbitrary “yes-no” questions between trials is very limited. We show that several natural structural questions about relatives of the mistake-bound model can be answered through the application of this general result. Most of these results can be interpreted as saying that learning in apparently less powerful (and more realistic) models is not much more difficult than learning in more powerful models.
    Type of Medium: Electronic Resource
  • 7
    Electronic Resource
    Springer
    Machine learning 37 (1999), pp. 337-354 
    ISSN: 0885-6125
    Keywords: computational learning theory ; concept drift ; context-sensitive learning ; prediction ; PAC learning ; agnostic learning ; uniform convergence ; VC theory
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: We show that a $\frac{c\epsilon^3}{\mathrm{VCdim}(\mathcal{F})}$ bound on the rate of drift of the distribution generating the examples is sufficient for agnostic learning to relative accuracy ε, where c > 0 is a constant; this matches a known necessary condition to within a constant factor. We establish a $\frac{c\epsilon^2}{\mathrm{VCdim}(\mathcal{F})}$ sufficient condition for the realizable case, also matching a known necessary condition to within a constant factor. We provide a relatively simple proof of a bound of $O\left(\frac{1}{\epsilon^2}\left(\mathrm{VCdim}(\mathcal{F}) + \log\frac{1}{\delta}\right)\right)$ on the sample complexity of agnostic learning in a fixed environment.
    Type of Medium: Electronic Resource
  • 8
    Electronic Resource
    Springer
    Machine learning 14 (1994), pp. 27-45 
    ISSN: 0885-6125
    Keywords: Computational learning theory ; concept drift ; concept learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: In this paper we consider the problem of tracking a subset of a domain (called the target) which changes gradually over time. A single (unknown) probability distribution over the domain is used to generate random examples for the learning algorithm and measure the speed at which the target changes. Clearly, the more rapidly the target moves, the harder it is for the algorithm to maintain a good approximation of the target. Therefore we evaluate algorithms based on how much movement of the target can be tolerated between examples while predicting with accuracy ε. Furthermore, the complexity of the class H of possible targets, as measured by d, its VC-dimension, also affects the difficulty of tracking the target concept. We show that if the problem of minimizing the number of disagreements with a sample from among concepts in a class H can be approximated to within a factor k, then there is a simple tracking algorithm for H which can achieve a probability ε of making a mistake if the target movement rate is at most a constant times ε²/(k(d + k) ln 1/ε), where d is the Vapnik-Chervonenkis dimension of H. Also, we show that if H is properly PAC-learnable, then there is an efficient (randomized) algorithm that with high probability approximately minimizes disagreements to within a factor of 7d + 1, yielding an efficient tracking algorithm for H which tolerates drift rates up to a constant times ε²/(d² ln 1/ε). In addition, we prove complementary results for the classes of halfspaces and axis-aligned hyperrectangles showing that the maximum rate of drift that any algorithm (even with unlimited computational power) can tolerate is a constant times ε²/d.
    Type of Medium: Electronic Resource
  • 9
    Electronic Resource
    Springer
    Machine learning 18 (1995), pp. 187-230 
    ISSN: 0885-6125
    Keywords: computational learning theory ; on-line learning ; mistake-bounded learning ; function learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: The majority of results in computational learning theory are concerned with concept learning, i.e. with the special case of function learning for classes of functions with range {0, 1}. Much less is known about the theory of learning functions with a larger range such as $\mathbb{N}$ or $\mathbb{R}$. In particular, relatively few results exist about the general structure of common models for function learning, and there are only very few nontrivial function classes for which positive learning results have been exhibited in any of these models. We introduce in this paper the notion of a binary branching adversary tree for function learning, which allows us to give a somewhat surprising equivalent characterization of the optimal learning cost for learning a class of real-valued functions (in terms of a max-min definition which does not involve any “learning” model). Another general structural result of this paper relates the cost for learning a union of function classes to the learning costs for the individual function classes. Furthermore, we exhibit an efficient learning algorithm for learning convex piecewise linear functions from $\mathbb{R}^d$ into $\mathbb{R}$. Previously, the class of linear functions from $\mathbb{R}^d$ into $\mathbb{R}$ was the only class of functions with multidimensional domain that was known to be learnable within the rigorous framework of a formal model for online learning. Finally, we give a sufficient condition for an arbitrary class $\mathcal{F}$ of functions from $\mathbb{R}$ into $\mathbb{R}$ that allows us to learn the class of all functions that can be written as the pointwise maximum of k functions from $\mathcal{F}$. This allows us to exhibit a number of further nontrivial classes of functions from $\mathbb{R}$ into $\mathbb{R}$ for which there exist efficient learning algorithms.
    Type of Medium: Electronic Resource
  • 10
    Publication Date: 2020-04-24
    Description: The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization is in terms of two notions of the effective rank of the data covariance. It shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size. By studying examples of data covariance properties that this characterization shows are required for benign overfitting, we find an important role for finite-dimensional data: the accuracy of the minimum norm interpolating prediction rule approaches the best possible accuracy for a much narrower range of properties of the data distribution when the data lie in an infinite-dimensional space than when the data lie in a finite-dimensional space whose dimension grows faster than the sample size. (A minimal numerical sketch of minimum-norm interpolation appears after this result list.)
    Print ISSN: 0027-8424
    Electronic ISSN: 1091-6490
    Topics: Biology, Medicine, Natural Sciences in General
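Sketch for item 1. The record describes worst-case squared-loss bounds that are logarithmic in the number of variables, but does not reproduce the algorithm itself. The following Python sketch shows the multiplicative-update (exponentiated-gradient style) family of online linear learners, which is the standard route to such logarithmic bounds; the learning rate eta, the scale U, and the simplex-normalized weights are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def eg_online(trials, eta=0.1, U=1.0):
    """Multiplicative-update online learner for linear functions (sketch).

    trials: sequence of (x, y) pairs, x an array in R^n, y a real target.
    The relative-entropy geometry behind this update family is what
    yields worst-case loss bounds logarithmic in n.
    """
    trials = list(trials)
    n = len(trials[0][0])
    w = np.full(n, 1.0 / n)           # start uniform on the simplex
    total_loss = 0.0
    for x, y in trials:
        x = np.asarray(x, dtype=float)
        y_hat = U * w.dot(x)          # predict before the label arrives
        total_loss += (y_hat - y) ** 2
        # multiplicative step along the squared-loss gradient
        w = w * np.exp(-2.0 * eta * (y_hat - y) * U * x)
        w = w / w.sum()               # renormalize onto the simplex
    return w, total_loss
```

The bounds cited in the record concern the excess squared loss of such a learner on an adversarially chosen trial sequence, relative to the best fixed linear predictor in hindsight.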
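Sketch for items 4 and 8 (duplicate records of the same paper). The tracking algorithm described there works by (approximately) minimizing disagreements with a recent sample. Below is a minimal sketch of that idea, assuming a hypothetical one-dimensional threshold class so that disagreement minimization can be brute-forced; the window size and the threshold class are illustrative assumptions, not the paper's setting.

```python
def track_drifting_threshold(stream, window=200):
    """Windowed disagreement-minimization tracker (illustrative sketch).

    stream yields (x, label) pairs with x a float and label in {0, 1};
    the drifting target is assumed to be a threshold c_t with
    label = 1 iff x >= c_t.  After predicting on each example, refit
    the threshold with the fewest disagreements on the last `window`
    examples, mirroring the paper's recipe of (approximately)
    minimizing disagreements on a recent sample.
    """
    history, c_hat, mistakes = [], 0.0, 0
    for x, label in stream:
        mistakes += int((x >= c_hat) != bool(label))   # predict, then learn
        history.append((x, label))
        history = history[-window:]
        xs = sorted(pt for pt, _ in history)
        candidates = [xs[0] - 1.0] + xs                # thresholds to try
        c_hat = min(candidates,
                    key=lambda c: sum((x2 >= c) != bool(l2)
                                      for x2, l2 in history))
    return c_hat, mistakes
```

The record's drift-rate condition, a constant times ε²/(k(d + k) ln 1/ε), quantifies how fast the target may move per example while a tracker of this kind can keep its mistake probability near ε.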
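Sketch for item 10. Its description centers on the minimum norm interpolating prediction rule in overparameterized linear regression. The sketch below, with illustrative dimensions and noise level not taken from the paper, uses the standard fact that the pseudoinverse solution is the minimum-Euclidean-norm interpolant, and checks that it fits noisy training data exactly while still being evaluated out of sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized regime: far more directions p than samples n, with a
# decaying covariance so most directions matter little for prediction.
n, p = 50, 2000                                   # illustrative sizes
scales = np.linspace(1.0, 0.01, p)
X = rng.normal(size=(n, p)) * scales
theta_star = np.zeros(p)
theta_star[:5] = 1.0                              # low-dimensional signal
y = X @ theta_star + 0.5 * rng.normal(size=n)     # noisy labels

# Minimum-norm interpolating rule: among all theta with X @ theta = y,
# the pseudoinverse picks the one of smallest Euclidean norm.
theta_hat = np.linalg.pinv(X) @ y
assert np.allclose(X @ theta_hat, y, atol=1e-6)   # perfect fit to noisy data

# Despite interpolating the noise, the rule is judged out of sample.
X_test = rng.normal(size=(1000, p)) * scales
mse = np.mean((X_test @ theta_hat - X_test @ theta_star) ** 2)
print(f"test MSE of the min-norm interpolant: {mse:.3f}")
```

The paper's characterization, in terms of two effective ranks of the data covariance, delineates exactly when such interpolation remains benign.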