ALBERT — All Library Books, journals and Electronic Records Telegrafenberg

1

Electronic Resource

The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms (1996)

Koenig, Sven ; Simmons, Reid G.

Springer

Machine learning 22 (1996), S. 227-250

ISSN: 0885-6125

Keywords: action models ; admissible and consistent heuristics ; action-penalty representation ; complexity ; goal-directed exploration ; goal-reward representation ; on-line reinforcement learning ; prior knowledge ; reward structure ; Q-hat-learning ; Q-learning

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract We analyze the complexity of on-line reinforcement-learning algorithms applied to goal-directed exploration tasks. Previous work had concluded that, even in deterministic state spaces, initially uninformed reinforcement learning was at least exponential for such problems, or that it was of polynomial worst-case time-complexity only if the learning methods were augmented. We prove that, to the contrary, the algorithms are tractable with only a simple change in the reward structure ("penalizing the agent for action executions") or in the initialization of the values that they maintain. In particular, we provide tight complexity bounds for both Watkins' Q-learning and Heger's Q-hat-learning and show how their complexity depends on properties of the state spaces. We also demonstrate how one can decrease the complexity even further by either learning action models or utilizing prior knowledge of the topology of the state spaces. Our results provide guidance for empirical reinforcement-learning researchers on how to distinguish hard reinforcement-learning problems from easy ones and how to represent them in a way that allows them to be solved efficiently.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00114729

Paper (German National Licenses)

2

Electronic Resource

The Effect of Representation and Knowledge on Goal-Directed Exploration with Reinforcement-Learning Algorithms (1996)

Koenig, Sven ; Simmons, Reid G.

Springer

Machine learning 22 (1996), S. 227-250

ISSN: 0885-6125

Keywords: action models ; admissible and consistent heuristics ; action-penalty representation ; complexity ; goal-directed exploration ; goal-reward representation ; on-line reinforcement learning ; prior knowledge ; reward structure ; Q-hat-learning ; Q-learning

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract We analyze the complexity of on-line reinforcement-learning algorithms applied to goal-directed exploration tasks. Previous work had concluded that, even in deterministic state spaces, initially uninformed reinforcement learning was at least exponential for such problems, or that it was of polynomial worst-case time-complexity only if the learning methods were augmented. We prove that, to the contrary, the algorithms are tractable with only a simple change in the reward structure ("penalizing the agent for action executions") or in the initialization of the values that they maintain. In particular, we provide tight complexity bounds for both Watkins' Q-learning and Heger's Q-hat-learning and show how their complexity depends on properties of the state spaces. We also demonstrate how one can decrease the complexity even further by either learning action models or utilizing prior knowledge of the topology of the state spaces. Our results provide guidance for empirical reinforcement-learning researchers on how to distinguish hard reinforcement-learning problems from easy ones and how to represent them in a way that allows them to be solved efficiently.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1018068507504

Paper (German National Licenses)

3

Electronic Resource

Evolution of a Prototype Lunar Rover: Addition of Laser-Based Hazard Detection, and Results from Field Trials in Lunar Analog Terrain (1999)

Krotkov, Eric ; Hebert, Martial ; Henriksen, Lars ; [et al.]

Springer

Autonomous robots 7 (1999), S. 119-130

ISSN: 1573-7527

Keywords: safeguard teleoperation ; hazard detection ; laser range finder ; stereo range finder ; lunar rover

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics

Notes: Abstract This paper presents the results of field trials of a prototype lunar rover traveling over natural terrain under safeguarded teleoperation control. Both the rover and the safeguarding approach have been used in previous work. The original contributions of this paper are the development and integration of a laser hazard detection system, and extensive field testing of the overall system. The laser system, which complements an existing stereo vision system, is based on a line-scanning laser ranger viewing the area 1 meter in front of the rover. The laser system has demonstrated excellent performance: zero misses and few false alarms operating at 4 Hz. The overall safeguarding system guided the rover 43 km over lunar analogue terrain with 0.8 failures per kilometer.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1023/A:1008926000060

Paper (German National Licenses)

4

Electronic Resource

Progress towards robotic exploration of extreme terrain (1992)

Simmons, Reid ; Krotkov, Eric ; Whittaker, William ; [et al.]

Springer

Applied intelligence 2 (1992), S. 163-180

ISSN: 1573-7497

Keywords: Autonomous robots ; robot system design ; planetary exploration

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science

Notes: Abstract A high degree of mobility, reliability, and efficiency are needed for autonomous exploration of extreme terrain. These requirements have guided the development of the Ambler, a six-legged robot designed for planetary exploration. To address issues of efficiency and mobility, the Ambler is configured with a stacked arrangement of orthogonal legs and exhibits a unique circulating gait, where trailing legs recover directly from rear to front. The Ambler is designed to stably traverse a 30 degree slope while crossing meter sized features. The same three principles have provided many constraints on the design of a software system that autonomously navigates the Ambler through natural terrain using 3-D perception and a combined deliberative/reactive architecture. The software system has required research advances in real-time control, perception of rugged terrain, motion planning, task-level control, and system integration. This paper presents many of the factors that influenced the design of the Ambler and its software system. In particular, important assumptions regarding the mechanism, perception, planning, and control are presented and evaluated in light of experimental and theoretical research of this project.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00058761

Paper (German National Licenses)

5

Electronic Resource

Stereo perception and dead reckoning for a prototype lunar rover (1995)

Krotkov, Eric ; Hebert, Martial ; Simmons, Reid

Springer

Autonomous robots 2 (1995), S. 313-331

ISSN: 1573-7527

Keywords: mobile robot ; robot navigation ; terrain mapping ; obstacle avoidance

Source: Springer Online Journal Archives 1860-2000

Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics

Notes: Abstract This paper describes practical, effective approaches to stereo perception and dead reckoning, and presents results from systems implemented for a prototype lunar rover operating in natural, outdoor environments. The stereo perception hardware includes a binocular head mounted on a motion-averaging mast. This head provides images to a normalized correlation matcher, that intelligently selects what part of the image to process (saving time), and subsamples the images (again saving time) without subsampling disparities (which would reduce accuracy). The implementation has operated successfully during long-duration field exercises, processing streams of thousands of images. The dead reckoning approach employs encoders, inclinometers, a compass, and a turn-rate sensor to maintain the position and orientation of the rover as it traverses. The approach integrates classical odometry with inertial guidance. The implementation succeeds in the face of significant sensor noise by virtue of sensor modelling, plus extensive filtering. The stereo and dead reckoning components are used by an obstacle avoidance planner that projects a finite number of arcs through the terrain map, and evaluates the traversability of each arc to choose a travel direction that is safe and effective. With these components integrated into a complete navigation system, a prototype rover has traversed over 1 km in lunar-like environments.

Type of Medium: Electronic Resource

URL: http://dx.doi.org/10.1007/BF00710797

Paper (German National Licenses)

6

Unknown

The effect of representation and knowledge on goal-directed exploration with reinforcement-learning algorithms (1996)

Koenig, Sven ; Simmons, Reid G.

Springer

In: Machine Learning. 1996; 22(1-3): 227-250. Published 1996 Jan 01. doi: 10.1007/bf00114729.

Publication Date: 1996-01-01

Print ISSN: 0885-6125

Electronic ISSN: 1573-0565

Topics: Computer Science

Published by Springer

PAPER CURRENT

7

Unknown

The roles of associational and causal reasoning in problem solving (1992)

Simmons, Reid G.

Elsevier

In: Artificial Intelligence. 1992; 53(2-3): 159-207. Published 1992 Feb 01. doi: 10.1016/0004-3702(92)90070-e.

Publication Date: 1992-02-01

Print ISSN: 0004-3702

Electronic ISSN: 1872-7921

Topics: Computer Science

Published by Elsevier

PAPER CURRENT

8

Unknown

Lunar rover technology demonstrations with Dante and Ratler (1994)

Whittaker, Red ; Katragadda, Lalitesh ; Krotkov, Eric ; [et al.]

In: CASI

Publication Date: 2013-08-31

Description: Carnegie Mellon University has undertaken a research, development, and demonstration program to enable a robotic lunar mission. The two-year mission scenario is to traverse 1,000 kilometers, revisiting the historic sites of Apollo 11, Surveyor 5, Ranger 8, Apollo 17, and Lunokhod 2, and to return continuous live video amounting to more than 11 terabytes of data. Our vision blends autonomously safeguarded user driving with autonomous operation augmented with rich visual feedback, in order to enable facile interaction and exploration. The resulting experience is intended to attract mass participation and evoke strong public interest in lunar exploration. The encompassing program that forwards this work is the Lunar Rover Initiative (LRI). Two concrete technology demonstration projects currently advancing the Lunar Rover Initiative are: (1) The Dante/Mt. Spurr project, which, at the time of this writing, is sending the walking robot Dante to explore the Mt. Spurr volcano, in rough terrain that is a realistic planetary analogue. This project will generate insights into robot system robustness in harsh environments, and into remote operation by novices; and (2) The Lunar Rover Demonstration project, which is developing and evaluating key technologies for navigation, teleoperation, and user interfaces in terrestrial demonstrations. The project timetable calls for a number of terrestrial traverses incorporating teleoperation and autonomy including natural terrain this year, 10 km in 1995, and 100 km in 1996. This paper will discuss the goals of the Lunar Rover Initiative and then focus on the present state of the Dante/Mt. Spurr and Lunar Rover Demonstration projects.

Keywords: MECHANICAL ENGINEERING

Type: JPL, Third International Symposium on Artificial Intelligence, Robotics, and Automation for Space 1994; p 113-116

Format: application/pdf

NASA TECHNICAL REPORTS

9

Unknown

From Livingstone to SMV: Formal Verification for Autonomous Spacecrafts (2000)

Simmons, Reid ; Pecheur, Charles

In: CASI

Publication Date: 2013-08-29

Description: To fulfill the needs of its deep space exploration program, NASA is actively supporting research and development in autonomy software. However, the reliable and cost-effective development and validation of autonomy systems poses a tough challenge. Traditional scenario-based testing methods fall short because of the combinatorial explosion of possible situations to be analyzed, and formal verification techniques typically require a tedious, manual modelling by formal method experts. This paper presents the application of formal verification techniques in the development of autonomous controllers based on Livingstone, a model-based health-monitoring system that can detect and diagnose anomalies and suggest possible recovery actions. We present a translator that converts the models used by Livingstone into specifications that can be verified with the SMV model checker. The translation frees the Livingstone developer from the tedious conversion of his design to SMV, and isolates him from the technical details of the SMV program. We describe different aspects of the translation and briefly discuss its application to several NASA domains.

Keywords: Spacecraft Design, Testing and Performance

Format: application/pdf

NASA TECHNICAL REPORTS

10

Unknown

Task-level control for autonomous robots (1994)

Simmons, Reid

In: CASI

Publication Date: 2019-06-28

Description: Task-level control refers to the integration and coordination of planning, perception, and real-time control to achieve given high-level goals. Autonomous mobile robots need task-level control to effectively achieve complex tasks in uncertain, dynamic environments. This paper describes the Task Control Architecture (TCA), an implemented system that provides commonly needed constructs for task-level control. Facilities provided by TCA include distributed communication, task decomposition and sequencing, resource management, monitoring and exception handling. TCA supports a design methodology in which robot systems are developed incrementally, starting first with deliberative plans that work in nominal situations, and then layering them with reactive behaviors that monitor plan execution and handle exceptions. To further support this approach, design and analysis tools are under development to provide ways of graphically viewing the system and validating its behavior.

Keywords: CYBERNETICS

Type: AIAA PAPER 94-1210-CP, NASA. Johnson Space Center, Conference on Intelligent Robotics in Field, Factory, Service, and Space (CIRFFSS 1994), Volume 1; p 275-281

Format: application/pdf

NASA TECHNICAL REPORTS
