ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Electronic Resource

TD(λ) Converges with Probability 1 (1994)

Dayan, Peter ; Sejnowski, Terrence J.

Springer

Machine learning 14 (1994), pp. 295-301

Details

ISSN: 0885-6125

Keywords: reinforcement learning ; temporal differences ; Q-learning

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions converge to their correct values, as larger samples are taken, and Dayan (1992) extended his proof to the general case. This article proves the stronger result that the predictions of a slightly modified form of temporal difference learning converge with probability one, and shows how to quantify the rate of convergence.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1022657612745

2

Electronic Resource

TD(λ) converges with probability 1 (1994)

Dayan, Peter ; Sejnowski, Terrence J.

Springer

Machine learning 14 (1994), pp. 295-301

Details

ISSN: 0885-6125

Keywords: reinforcement learning ; temporal differences ; Q-learning

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract The methods of temporal differences (Samuel, 1959; Sutton, 1984, 1988) allow an agent to learn accurate predictions of stationary stochastic future outcomes. The learning is effectively stochastic approximation based on samples extracted from the process generating the agent's future. Sutton (1988) proved that for a special case of temporal differences, the expected values of the predictions converge to their correct values, as larger samples are taken, and Dayan (1992) extended his proof to the general case. This article proves the stronger result that the predictions of a slightly modified form of temporal difference learning converge with probability one, and shows how to quantify the rate of convergence.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00993978

3

Electronic Resource

Exploration bonuses and dual control (1996)

Dayan, Peter ; Sejnowski, Terrence J.

Springer

Machine learning 25 (1996), pp. 5-22

Details

ISSN: 0885-6125

Keywords: Reinforcement learning ; dynamic programming ; exploration bonuses ; certainty equivalence ; non-stationary environment

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Finding the Bayesian balance between exploration and exploitation in adaptive optimal control is in general intractable. This paper shows how to compute suboptimal estimates based on a certainty equivalence approximation (Cozzolino, Gonzalez-Zubieta & Miller, 1965) arising from a form of dual control. This systematizes and extends existing uses of exploration bonuses in reinforcement learning (Sutton, 1990). The approach has two components: a statistical model of uncertainty in the world and a way of turning this into exploratory behavior. This general approach is applied to two-dimensional mazes with moveable barriers and its performance is compared with Sutton's DYNA system.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00115298

4

Electronic Resource

Exploration Bonuses and Dual Control (1996)

Dayan, Peter ; Sejnowski, Terrence J.

Springer

Machine learning 25 (1996), pp. 5-22

Details

ISSN: 0885-6125

Keywords: Reinforcement learning ; dynamic programming ; exploration bonuses ; certainty equivalence ; non-stationary environment

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Finding the Bayesian balance between exploration and exploitation in adaptive optimal control is in general intractable. This paper shows how to compute suboptimal estimates based on a certainty equivalence approximation (Cozzolino, Gonzalez-Zubieta & Miller, 1965) arising from a form of dual control. This systematizes and extends existing uses of exploration bonuses in reinforcement learning (Sutton, 1990). The approach has two components: a statistical model of uncertainty in the world and a way of turning this into exploratory behavior. This general approach is applied to two-dimensional mazes with moveable barriers and its performance is compared with Sutton's DYNA system.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1018357105171

5

Electronic Resource

Perceived luminance depends on temporal context (2004)

Sejnowski, Terrence J. ; Eagleman, David M. ; Jacobson, John E.

[s.l.] : Macmillan Magazines Ltd.

Nature 428 (2004), pp. 854-856

Details

ISSN: 1476-4687

Source: Nature Archives 1869 - 2009

Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics

Notes: [Excerpt] Brightness—the perception of an object's luminance—arises from complex and poorly understood interactions at several levels of processing. It is well known that the brightness of an object depends on its spatial context, which can include perceptual organization, scene ...

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1038/nature02467

6

Electronic Resource

Neurobiology: Making smooth moves (1998)

Sejnowski, Terrence J.

[s.l.] : Macmillan Magazines Ltd.

Nature 394 (1998), pp. 725-726

Details

ISSN: 1476-4687

Source: Nature Archives 1869 - 2009

Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics

Notes: [Excerpt] Whether reaching, throwing, running or dancing, our natural tendency is to make smooth and precise movements. Out of the infinite number of ways that we could have made a particular movement, we generally pick the one that is the smoothest. The current thinking in the field of motor control is that ...

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1038/29406

7

Electronic Resource

Back together again (1991)

Sejnowski, Terrence J.

[s.l.] : Nature Publishing Group

Nature 352 (1991), pp. 669-670

Details

ISSN: 1476-4687

Source: Nature Archives 1869 - 2009

Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics

Notes: [Excerpt] In the 1960s, recordings from single neurons opened a new view of the visual system [1]. The surprise was that single neurons carried so much information about features in the image; the eventual disappointment was that recordings from dozens of visual areas of cerebral cortex did not lead to a deeper ...

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1038/352669a0

8

Electronic Resource

Synthesis of models for excitable membranes, synaptic transmission and neuromodulation using a common kinetic formalism (1994)

Destexhe, Alain ; Mainen, Zachary F. ; Sejnowski, Terrence J.

Springer

Journal of computational neuroscience 1 (1994), pp. 195-230

Details

ISSN: 1573-6873

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science , Medicine , Physics

Notes: Abstract Markov kinetic models were used to synthesize a complete description of synaptic transmission, including opening of voltage-dependent channels in the presynaptic terminal, release of neurotransmitter, gating of postsynaptic receptors, and activation of second-messenger systems. These kinetic schemes provide a more general framework for modeling ion channels than the Hodgkin-Huxley formalism, supporting a continuous spectrum of descriptions ranging from the very simple and computationally efficient to the highly complex and biophysically precise. Examples are given of simple kinetic schemes based on fits to experimental data that capture the essential properties of voltage-gated, synaptic and neuromodulatory currents. The Markov formalism allows the dynamics of ionic currents to be considered naturally in the larger context of biochemical signal transduction. This framework can facilitate the integration of a wide range of experimental data and promote consistent theoretical analysis of neural mechanisms from molecular interactions to network computations.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00961734

9

Electronic Resource

Parallel visual computation (1983)

Ballard, Dana H. ; Hinton, Geoffrey E. ; Sejnowski, Terrence J.

[s.l.] : Nature Publishing Group

Nature 306 (1983), pp. 21-26

Details

ISSN: 1476-4687

Source: Nature Archives 1869 - 2009

Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics

Notes: [Excerpt] The functional abilities and parallel architecture of the human visual system are a rich source of ideas about visual processing. Any visual task that we can perform quickly and effortlessly is likely to have a computational solution using a parallel algorithm. Recently, several such parallel ...

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1038/306021a0

10

Electronic Resource

Network model of shape-from-shading: neural function arises from both receptive and projective fields (1988)

Lehky, Sidney R. ; Sejnowski, Terrence J.

[s.l.] : Nature Publishing Group

Nature 333 (1988), pp. 452-454

Details

ISSN: 1476-4687

Source: Nature Archives 1869 - 2009

Topics: Biology , Chemistry and Pharmacology , Medicine , Natural Sciences in General , Physics

Notes: [Excerpt] Fig. 1: Organization of neural network that extracts surface curvatures from images of shaded surfaces. a, Diagram of three-layer network. Each unit projects to all units in the subsequent layer. The responses of the units in the input layer are determined by the environment. The responses of each ...

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1038/333452a0
