ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Electronic Resource

Prioritized sweeping: Reinforcement learning with less data and less time (1993)

Moore, Andrew W. ; Atkeson, Christopher G.

Springer

Machine learning 13 (1993), pp. 103-130

Details

ISSN: 0885-6125

Keywords: Memory-based learning ; learning control ; reinforcement learning ; temporal differencing ; asynchronous dynamic programming ; heuristic search ; prioritized sweeping

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract We present a new algorithm, prioritized sweeping, for efficient prediction and control of stochastic Markov systems. Incremental learning methods such as temporal differencing and Q-learning have real-time performance. Classical methods are slower, but more accurate, because they make full use of the observations. Prioritized sweeping aims for the best of both worlds. It uses all previous experiences both to prioritize important dynamic programming sweeps and to guide the exploration of state-space. We compare prioritized sweeping with other reinforcement learning schemes for a number of different stochastic optimal control problems. It successfully solves large state-space real-time problems with which other methods have difficulty.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00993104

Paper (German National Licenses)

2

Electronic Resource

Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time (1993)

Moore, Andrew W. ; Atkeson, Christopher G.

Springer

Machine learning 13 (1993), pp. 103-130

Details

ISSN: 0885-6125

Keywords: Memory-based learning ; learning control ; reinforcement learning ; temporal differencing ; asynchronous dynamic programming ; heuristic search ; prioritized sweeping

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract We present a new algorithm, prioritized sweeping, for efficient prediction and control of stochastic Markov systems. Incremental learning methods such as temporal differencing and Q-learning have real-time performance. Classical methods are slower, but more accurate, because they make full use of the observations. Prioritized sweeping aims for the best of both worlds. It uses all previous experiences both to prioritize important dynamic programming sweeps and to guide the exploration of state-space. We compare prioritized sweeping with other reinforcement learning schemes for a number of different stochastic optimal control problems. It successfully solves large state-space real-time problems with which other methods have difficulty.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1022635613229

Paper (German National Licenses)

3

Electronic Resource

The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces (1995)

Moore, Andrew W. ; Atkeson, Christopher G.

Springer

Machine learning 21 (1995), pp. 199-233

Details

ISSN: 0885-6125

Keywords: Reinforcement Learning ; Curse of Dimensionality ; Learning Control ; Robotics ; kd-trees

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Parti-game is a new algorithm for learning feasible trajectories to goal regions in high dimensional continuous state-spaces. In high dimensions it is essential that neither planning nor exploration occurs uniformly over a state-space. Parti-game maintains a decision-tree partitioning of state-space and applies techniques from game-theory and computational geometry to efficiently and adaptively concentrate high resolution only on critical areas. The current version of the algorithm is designed to find feasible paths or trajectories to goal regions in high dimensional spaces. Future versions will be designed to find a solution that optimizes a real-valued criterion. Many simulated problems have been tested, ranging from two-dimensional to nine-dimensional state-spaces, including mazes, path planning, non-linear dynamics, and planar snake robots in restricted spaces. In all cases, a good solution is found in less than ten trials and a few minutes.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1022656217772

Paper (German National Licenses)

4

Electronic Resource

The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces (1995)

Moore, Andrew W. ; Atkeson, Christopher G.

Springer

Machine learning 21 (1995), pp. 199-233

Details

ISSN: 0885-6125

Keywords: Reinforcement Learning ; Curse of Dimensionality ; Learning Control ; Robotics ; kd-trees

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Parti-game is a new algorithm for learning feasible trajectories to goal regions in high dimensional continuous state-spaces. In high dimensions it is essential that neither planning nor exploration occurs uniformly over a state-space. Parti-game maintains a decision-tree partitioning of state-space and applies techniques from game-theory and computational geometry to efficiently and adaptively concentrate high resolution only on critical areas. The current version of the algorithm is designed to find feasible paths or trajectories to goal regions in high dimensional spaces. Future versions will be designed to find a solution that optimizes a real-valued criterion. Many simulated problems have been tested, ranging from two-dimensional to nine-dimensional state-spaces, including mazes, path planning, non-linear dynamics, and planar snake robots in restricted spaces. In all cases, a good solution is found in less than ten trials and a few minutes.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00993591

Paper (German National Licenses)

5

Electronic Resource

Locally Weighted Learning (1997)

Atkeson, Christopher G. ; Moore, Andrew W. ; Schaal, Stefan

Springer

Artificial intelligence review 11 (1997), pp. 11-73

Details

ISSN: 1573-7462

Keywords: locally weighted regression ; LOESS ; LWR ; lazy learning ; memory-based learning ; least commitment learning ; distance functions ; smoothing parameters ; weighting functions ; global tuning ; local tuning ; interference

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract This paper surveys locally weighted learning, a form of lazy learning and memory-based learning, and focuses on locally weighted linear regression. The survey discusses distance functions, smoothing parameters, weighting functions, local model structures, regularization of the estimates and bias, assessing predictions, handling noisy data and outliers, improving the quality of predictions by tuning fit parameters, interference between old and new data, implementing locally weighted learning efficiently, and applications of locally weighted learning. A companion paper surveys how locally weighted learning can be used in robot learning and control.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1006559212014

Paper (German National Licenses)

6

Electronic Resource

Locally Weighted Learning for Control (1997)

Atkeson, Christopher G. ; Moore, Andrew W. ; Schaal, Stefan

Springer

Artificial intelligence review 11 (1997), pp. 75-113

Details

ISSN: 1573-7462

Keywords: locally weighted regression ; LOESS ; LWR ; lazy learning ; memory-based learning ; least commitment learning ; forward models ; inverse models ; linear quadratic regulation (LQR) ; shifting setpoint algorithm ; dynamic programming

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Lazy learning methods provide useful representations and training algorithms for learning about complex phenomena during autonomous adaptive control of complex systems. This paper surveys ways in which locally weighted learning, a type of lazy learning, has been applied by us to control tasks. We explain various forms that control tasks can take, and how this affects the choice of learning paradigm. The discussion section explores the interesting impact that explicitly remembering all previous experiences has on the problem of learning to control.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1006511328852

Paper (German National Licenses)

7

Electronic Resource

The Racing Algorithm: Model Selection for Lazy Learners (1997)

Maron, Oden ; Moore, Andrew W.

Springer

Artificial intelligence review 11 (1997), pp. 193-225

Details

ISSN: 1573-7462

Keywords: lazy learning ; model selection ; cross validation ; optimization ; attribute selection

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract Given a set of models and some training data, we would like to find the model that best describes the data. Finding the model with the lowest generalization error is a computationally expensive process, especially if the number of testing points is high or if the number of models is large. Optimization techniques such as hill climbing or genetic algorithms are helpful but can end up with a model that is arbitrarily worse than the best one or cannot be used because there is no distance metric on the space of discrete models. In this paper we develop a technique called “racing” that tests the set of models in parallel, quickly discards those models that are clearly inferior and concentrates the computational effort on differentiating among the better models. Racing is especially suitable for selecting among lazy learners since training requires negligible expense, and incremental testing using leave-one-out cross validation is efficient. We use racing to select among various lazy learning algorithms and to find relevant features in applications ranging from robot juggling to lesion detection in MRI scans.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1006556606079

Paper (German National Licenses)

8

Electronic Resource

An investigation of climate drift in a coupled atmosphere-ocean-sea ice model (1994)

Moore, Andrew M ; Gordon, Hal B

Springer

Climate dynamics 10 (1994), pp. 81-95

Details

ISSN: 1432-0894

Source: Springer Online Journal Archives 1860-2000

Topics: Geosciences , Physics

Notes: Abstract Climate drift is a common and serious problem in most state-of-the-art coupled atmosphere-ocean-sea ice models. We consider the nature of climate drift in such a model, and in particular address the question of whether or not climate drift is inherent to the model, or whether the drift can be averted by a suitable choice of initial conditions or coupling procedure. The “synchronous” approach to coupling was adopted in which the ocean, atmosphere and sea ice models were spun-up independently to equilibrium using climatological forcing fields. The models were then coupled and integrated forward in time. Several experiments were performed which were designed to assess the impact of different coupling methodologies and changes in the initial conditions of the component models on the climate drift of the system. The results of our experiments indicate that climate drift is a problem inherent to the coupled model in that systematic errors in the components lead to incompatibilities in the surface fluxes required by the component models to maintain realistic climatologies. We conclude that climate drift can be averted only if the parameterizations of certain important physical processes are improved which should have the effect of reducing or eliminating these incompatibilities.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00210338

Paper (German National Licenses)

9

Electronic Resource

An investigation of climate drift in a coupled atmosphere-ocean-sea ice model (1994)

Moore, Andrew M ; Gordon, Hal B

Springer

Climate dynamics 10 (1994), pp. 81-95

Details

ISSN: 1432-0894

Source: Springer Online Journal Archives 1860-2000

Topics: Geosciences , Physics

Notes: Abstract Climate drift is a common and serious problem in most state-of-the-art coupled atmosphere-ocean-sea ice models. We consider the nature of climate drift in such a model, and in particular address the question of whether or not climate drift is inherent to the model, or whether the drift can be averted by a suitable choice of initial conditions or coupling procedure. The “synchronous” approach to coupling was adopted in which the ocean, atmosphere and sea ice models were spun-up independently to equilibrium using climatological forcing fields. The models were then coupled and integrated forward in time. Several experiments were performed which were designed to assess the impact of different coupling methodologies and changes in the initial conditions of the component models on the climate drift of the system. The results of our experiments indicate that climate drift is a problem inherent to the coupled model in that systematic errors in the components lead to incompatibilities in the surface fluxes required by the component models to maintain realistic climatologies. We conclude that climate drift can be averted only if the parameterizations of certain important physical processes are improved which should have the effect of reducing or eliminating these incompatibilities.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/s003820050037

Paper (German National Licenses)

10

Electronic Resource

Computing motion using analog VLSI vision chips: An experimental comparison among different approaches (1992)

Horiuchi, Timothy ; Bair, Wyeth ; Bishofberger, Brooks ; [et al.]

Springer

International journal of computer vision 8 (1992), pp. 203-216

Details

ISSN: 1573-1405

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract We have designed, built and tested a number of analog CMOS VLSI circuits for computing 1-D motion from the time-varying intensity values provided by an array of on-chip phototransistors. We present experimental data for two such circuits and discuss their relative performance. One circuit approximates the correlation model while a second chip uses resistive grids to compute zero-crossings to be tracked over time by a separate digital processor. Both circuits integrate image acquisition with image processing functions and compute velocity in real time. For comparison, we also describe the performance of a simple motion algorithm using off-the-shelf digital components. We conclude that analog circuits implementing various correlation-like motion algorithms are more robust than our previous analog circuits implementing gradient-like motion algorithms.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00055152

Paper (German National Licenses)
