ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Electronic Resource
    Springer
    Machine learning 14 (1994), pp. 295-301
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal differences ; Q-learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions converge to their correct values, as larger samples are taken, and Dayan (1992) extended his proof to the general case. This article proves the stronger result that the predictions of a slightly modified form of temporal difference learning converge with probability one, and shows how to quantify the rate of convergence.
    Type of Medium: Electronic Resource
  • 2
    Electronic Resource
    Springer
    Machine learning 14 (1994), pp. 295-301
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal differences ; Q-learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions converge to their correct values, as larger samples are taken, and Dayan (1992) extended his proof to the general case. This article proves the stronger result that the predictions of a slightly modified form of temporal difference learning converge with probability one, and shows how to quantify the rate of convergence.
    Type of Medium: Electronic Resource
  • 3
    Electronic Resource
    Springer
    Machine learning 25 (1996), pp. 5-22
    ISSN: 0885-6125
    Keywords: Reinforcement learning ; dynamic programming ; exploration bonuses ; certainty equivalence ; non-stationary environment
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Finding the Bayesian balance between exploration and exploitation in adaptive optimal control is in general intractable. This paper shows how to compute suboptimal estimates based on a certainty equivalence approximation (Cozzolino, Gonzalez-Zubieta & Miller, 1965) arising from a form of dual control. This systematizes and extends existing uses of exploration bonuses in reinforcement learning (Sutton, 1990). The approach has two components: a statistical model of uncertainty in the world and a way of turning this into exploratory behavior. This general approach is applied to two-dimensional mazes with moveable barriers and its performance is compared with Sutton's DYNA system.
    Type of Medium: Electronic Resource
  • 4
    Electronic Resource
    Springer
    Machine learning 25 (1996), pp. 5-22
    ISSN: 0885-6125
    Keywords: Reinforcement learning ; dynamic programming ; exploration bonuses ; certainty equivalence ; non-stationary environment
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Finding the Bayesian balance between exploration and exploitation in adaptive optimal control is in general intractable. This paper shows how to compute suboptimal estimates based on a certainty equivalence approximation (Cozzolino, Gonzalez-Zubieta & Miller, 1965) arising from a form of dual control. This systematizes and extends existing uses of exploration bonuses in reinforcement learning (Sutton, 1990). The approach has two components: a statistical model of uncertainty in the world and a way of turning this into exploratory behavior. This general approach is applied to two-dimensional mazes with moveable barriers and its performance is compared with Sutton's DYNA system.
    Type of Medium: Electronic Resource
  • 5
    Electronic Resource
    [s.l.] : Macmillan Magazines Ltd.
    Nature 428 (2004), pp. 854-856
    ISSN: 1476-4687
    Source: Nature Archives 1869 - 2009
    Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics
    Notes: [Excerpt] Brightness, the perception of an object's luminance, arises from complex and poorly understood interactions at several levels of processing. It is well known that the brightness of an object depends on its spatial context, which can include perceptual organization, scene ...
    Type of Medium: Electronic Resource
  • 6
    Electronic Resource
    [s.l.] : Macmillan Magazines Ltd.
    Nature 394 (1998), pp. 725-726
    ISSN: 1476-4687
    Source: Nature Archives 1869 - 2009
    Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics
    Notes: [Excerpt] Whether reaching, throwing, running or dancing, our natural tendency is to make smooth and precise movements. Out of the infinite number of ways that we could have made a particular movement, we generally pick the one that is the smoothest. The current thinking in the field of motor control is that ...
    Type of Medium: Electronic Resource
  • 7
    Electronic Resource
    [s.l.] : Nature Publishing Group
    Nature 352 (1991), pp. 669-670
    ISSN: 1476-4687
    Source: Nature Archives 1869 - 2009
    Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics
    Notes: [Excerpt] In the 1960s, recordings from single neurons opened a new view of the visual system. The surprise was that single neurons carried so much information about features in the image; the eventual disappointment was that recordings from dozens of visual areas of cerebral cortex did not lead to a deeper ...
    Type of Medium: Electronic Resource
  • 8
    ISSN: 1573-6873
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Medicine , Physics
    Notes: Abstract Markov kinetic models were used to synthesize a complete description of synaptic transmission, including opening of voltage-dependent channels in the presynaptic terminal, release of neurotransmitter, gating of postsynaptic receptors, and activation of second-messenger systems. These kinetic schemes provide a more general framework for modeling ion channels than the Hodgkin-Huxley formalism, supporting a continuous spectrum of descriptions ranging from the very simple and computationally efficient to the highly complex and biophysically precise. Examples are given of simple kinetic schemes based on fits to experimental data that capture the essential properties of voltage-gated, synaptic and neuromodulatory currents. The Markov formalism allows the dynamics of ionic currents to be considered naturally in the larger context of biochemical signal transduction. This framework can facilitate the integration of a wide range of experimental data and promote consistent theoretical analysis of neural mechanisms from molecular interactions to network computations.
    Type of Medium: Electronic Resource
  • 9
    Electronic Resource
    [s.l.] : Nature Publishing Group
    Nature 306 (1983), pp. 21-26
    ISSN: 1476-4687
    Source: Nature Archives 1869 - 2009
    Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics
    Notes: [Excerpt] The functional abilities and parallel architecture of the human visual system are a rich source of ideas about visual processing. Any visual task that we can perform quickly and effortlessly is likely to have a computational solution using a parallel algorithm. Recently, several such parallel ...
    Type of Medium: Electronic Resource
  • 10
    Electronic Resource
    [s.l.] : Nature Publishing Group
    Nature 333 (1988), pp. 452-454
    ISSN: 1476-4687
    Source: Nature Archives 1869 - 2009
    Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics
    Notes: [Excerpt] Fig. 1 Organization of neural network that extracts surface curvatures from images of shaded surfaces. a, Diagram of three-layer network. Each unit projects to all units in the subsequent layer. The responses of the units in the input layer are determined by the environment. The responses of each ...
    Type of Medium: Electronic Resource