ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Monograph available for loan
    Cambridge, Mass. [et al.] : MIT Press
    Call number: PIK M 031-05-0020
    Type of Medium: Monograph available for loan
    Pages: XVIII, 322 p., graphs, 24 cm
    Edition: 5th printing
    ISBN: 0262193981
    Series Statement: Adaptive computation and machine learning
    Location: A 18 - must be ordered
    Branch Library: PIK Library
  • 2
    Electronic Resource
    Springer
    Biological cybernetics 40 (1981), pp. 201-211
    ISSN: 1432-0770
    Source: Springer Online Journal Archives 1860-2000
    Topics: Biology , Computer Science , Physics
    Notes: Abstract: An associative memory system is presented which does not require a “teacher” to provide the desired associations. For each input key it conducts a search for the output pattern which optimizes an external payoff or reinforcement signal. The associative search network (ASN) combines pattern recognition and function optimization capabilities in a simple and effective way. We define the associative search problem, discuss conditions under which the associative search network is capable of solving it, and present results from computer simulations. The synthesis of sensory-motor control surfaces is discussed as an example of the associative search problem.
    Type of Medium: Electronic Resource
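The core idea of the entry above — searching for the output pattern that maximizes an external payoff, with no teacher supplying targets — fits in a few lines. The following is a minimal toy sketch in Python, not the authors' exact ASN: the one-hot keys, the payoff function, the target patterns, and the reinforcement-comparison update are all simplifications invented here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_keys, n_out = 4, 3
KEYS = np.eye(n_keys)                          # one-hot input keys
TARGETS = rng.integers(0, 2, (n_keys, n_out))  # hypothetical desired patterns

def payoff(key, output):
    """Toy reinforcement signal: higher when the emitted pattern
    matches the target for this key (unknown to the learner)."""
    return -np.sum((output - TARGETS[key]) ** 2)

W = np.zeros((n_out, n_keys))   # associative weights
baseline = np.zeros(n_keys)     # running payoff estimate per key
alpha, noise_std = 0.5, 0.3

for step in range(5000):
    k = rng.integers(n_keys)
    x = KEYS[k]
    # Noisy thresholded output: the noise drives the search for
    # patterns that increase the payoff.
    y = (W @ x + rng.normal(0.0, noise_std, n_out) > 0.5).astype(float)
    r = payoff(k, y)
    # Reinforce the emitted pattern in proportion to how much better
    # this trial was than the recent average payoff for this key.
    W += alpha * (r - baseline[k]) * np.outer(y - 0.5, x)
    baseline[k] += 0.1 * (r - baseline[k])
```

The (r - baseline) comparison stands in for the external reinforcement signal; no teacher ever supplies the correct output pattern, matching the problem statement in the abstract.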
  • 3
    Electronic Resource
    Springer
    Machine learning 22 (1996), pp. 33-57
    ISSN: 0885-6125
    Keywords: Reinforcement learning ; Markov Decision Problems ; Temporal Difference Methods ; Least-Squares
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(λ) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement in learning rate achieved by RLS TD in an example Markov prediction problem. To quantify this improvement, we introduce the TD error variance of a Markov chain, σTD, and experimentally conclude that the convergence rate of a TD algorithm depends linearly on σTD. In addition to converging more rapidly, LS TD and RLS TD do not have control parameters, such as a learning rate parameter, thus eliminating the possibility of achieving poor performance by an unlucky choice of parameters.
    Type of Medium: Electronic Resource
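Both algorithms in the entry above have compact linear-algebra cores. Below is a minimal Python/NumPy sketch under the usual linear-function-approximation setup; the discount factor gamma, the ridge term, and the transition data format are assumptions of this illustration, not details taken from the paper.

```python
import numpy as np

def lstd(transitions, gamma=0.9, eps=1e-6):
    """Least-Squares TD: solve A theta = b, the TD(0) fixed point,
    from a batch of transitions (phi, r, phi_next), where phi is a
    1-D feature vector for the current state."""
    n = len(transitions[0][0])
    A = eps * np.eye(n)          # small ridge term keeps A invertible
    b = np.zeros(n)
    for phi, r, phi_next in transitions:
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)

def rls_td_step(P, theta, phi, r, phi_next, gamma=0.9):
    """One Recursive Least-Squares TD update: maintain P ~ A^{-1}
    via the Sherman-Morrison identity, so each step costs O(n^2)
    instead of re-solving the linear system."""
    d = phi - gamma * phi_next
    K = P @ phi / (1.0 + d @ (P @ phi))   # gain vector
    theta = theta + K * (r - d @ theta)   # correct the TD residual
    P = P - np.outer(K, d @ P)            # rank-one update of A^{-1}
    return P, theta
```

As the abstract notes, neither routine has a step-size parameter to tune; the price is O(n^2) work per step for RLS TD versus O(n) for conventional TD(λ).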
  • 4
    Electronic Resource
    Springer
    Machine learning 33 (1998), pp. 235-262
    ISSN: 0885-6125
    Keywords: Reinforcement learning ; multiple agents ; teams ; elevator group control ; discrete event dynamic systems
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract: Recent algorithmic and theoretical advances in reinforcement learning (RL) have attracted widespread interest. RL algorithms have appeared that approximate dynamic programming on an incremental basis. They can be trained on the basis of real or simulated experiences, focusing their computation on areas of state space that are actually visited during control, making them computationally tractable on very large problems. If each member of a team of agents employs one of these algorithms, a new collective learning algorithm emerges for the team as a whole. In this paper we demonstrate that such collective RL algorithms can be powerful heuristic methods for addressing large-scale control problems. Elevator group control serves as our testbed. It is a difficult domain posing a combination of challenges not seen in most multi-agent learning research to date. We use a team of RL agents, each of which is responsible for controlling one elevator car. The team receives a global reward signal which appears noisy to each agent due to the effects of the actions of the other agents, the random nature of the arrivals and the incomplete observation of the state. In spite of these complications, we show results that in simulation surpass the best of the heuristic elevator control algorithms of which we are aware. These results demonstrate the power of multi-agent RL on a very large scale stochastic dynamic optimization problem of practical utility.
    Type of Medium: Electronic Resource
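The collective scheme this entry describes — independent learners sharing one global reward — reduces to a short loop. The sketch below uses simple tabular Q-learning agents for clarity; the paper's elevator agents used neural-network function approximation, and the environment interface (env.observe, env.step) is hypothetical.

```python
import random
from collections import defaultdict

class QAgent:
    """Tabular Q-learner. Each agent sees only a local observation
    but is trained on the team's single global reward."""
    def __init__(self, actions, alpha=0.1, gamma=0.99, eps=0.1):
        self.Q = defaultdict(float)
        self.actions, self.alpha, self.gamma, self.eps = actions, alpha, gamma, eps

    def act(self, s):
        if random.random() < self.eps:
            return random.choice(self.actions)      # explore
        return max(self.actions, key=lambda a: self.Q[(s, a)])

    def update(self, s, a, r, s2):
        best = max(self.Q[(s2, a2)] for a2 in self.actions)
        self.Q[(s, a)] += self.alpha * (r + self.gamma * best - self.Q[(s, a)])

def team_step(agents, env):
    """One joint step: all agents act on local observations, the
    environment returns one shared reward, and every agent updates
    with that same signal -- which therefore looks noisy to each
    agent, since it also reflects the other agents' actions."""
    obs = [env.observe(i) for i in range(len(agents))]   # hypothetical API
    acts = [ag.act(o) for ag, o in zip(agents, obs)]
    r, obs2 = env.step(acts)                             # hypothetical API
    for ag, o, a, o2 in zip(agents, obs, acts, obs2):
        ag.update(o, a, r, o2)
```

Because every agent trains on the same scalar reward, credit assignment across agents happens implicitly, which is exactly the source of the apparent noise the abstract mentions.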
  • 5
    Electronic Resource
    Springer
    Biological cybernetics 42 (1981), pp. 1-8
    ISSN: 1432-0770
    Source: Springer Online Journal Archives 1860-2000
    Topics: Biology , Computer Science , Physics
    Notes: Abstract: In a previous paper we defined the associative search problem and presented a system capable of solving it under certain conditions. In this paper we interpret a spatial learning problem as an associative search task and describe the behavior of an adaptive network capable of solving it. This example shows how naturally the associative search problem can arise and permits the search, association, and generalization properties of the adaptive network to be clearly illustrated.
    Type of Medium: Electronic Resource
  • 6
    Electronic Resource
    Springer
    Biological cybernetics 43 (1982), pp. 175-185
    ISSN: 1432-0770
    Source: Springer Online Journal Archives 1860-2000
    Topics: Biology , Computer Science , Physics
    Notes: Abstract: An approach to solving nonlinear control problems is illustrated by means of a layered associative network composed of adaptive elements capable of reinforcement learning. The first layer adaptively develops a representation in terms of which the second layer can solve the problem linearly. The adaptive elements comprising the network employ a novel type of learning rule whose properties, we argue, are essential to the adaptive behavior of the layered network. The behavior of the network is illustrated by means of a spatial learning problem that requires the formation of nonlinear associations. We argue that this approach to nonlinearity can be extended to a large class of nonlinear control problems.
    Type of Medium: Electronic Resource
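The division of labor described in the entry above — a first layer that builds a representation in which the second layer's problem becomes linear — can be shown on a small nonlinear task. The sketch below simplifies by fixing the first layer to random threshold units rather than adapting it as the paper's network does; only the second layer learns, via a reinforcement-style update, on XOR, a classic nonlinearly separable association.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: not solvable by a single linear unit on the raw inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 1, 1, 0], dtype=float)

# Layer 1 (fixed here; adaptive in the paper): random threshold
# units that re-code the input nonlinearly.
n_hidden = 20
W1 = rng.normal(0, 1, (n_hidden, 2))
b1 = rng.normal(0, 1, n_hidden)
H = (X @ W1.T + b1 > 0).astype(float)   # hidden representation

# Layer 2: a single unit trained only from a scalar payoff
# (+1 for a correct output, -1 otherwise), never from the target.
w2, b2 = np.zeros(n_hidden), 0.0
alpha, noise_std = 0.1, 0.5
for step in range(20000):
    i = rng.integers(4)
    y = float(H[i] @ w2 + b2 + rng.normal(0, noise_std) > 0)
    r = 1.0 if y == T[i] else -1.0
    w2 += alpha * r * (y - 0.5) * H[i]   # reinforce/punish emitted action
    b2 += alpha * r * (y - 0.5)

print([float(h @ w2 + b2 > 0) for h in H])   # typically [0, 1, 1, 0]
```

With enough random hidden units the recoded XOR problem is almost always linearly separable, which is the property the adaptive first layer in the paper achieves deliberately rather than by chance.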
  • 7
    Publication Date: 1981-11-01
    Print ISSN: 0340-1200
    Electronic ISSN: 1432-0770
    Topics: Biology , Computer Science , Physics
    Published by Springer
  • 8
    Publication Date: 1982-04-01
    Print ISSN: 0340-1200
    Electronic ISSN: 1432-0770
    Topics: Biology , Computer Science , Physics
    Published by Springer
  • 9
    Publication Date: 1993-01-01
    Print ISSN: 0166-2236
    Electronic ISSN: 1878-108X
    Topics: Biology , Medicine
    Published by Cell Press