ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Collection: Articles (77)
  • Keywords: stability (39); reinforcement learning (38)
  • Publisher: Springer (77); Blackwell Publishing Ltd
  • Years: 1995-1999 (73); 1980-1984 (4); 1925-1929
  • Topics: Computer Science (77)
  • 1
    Electronic Resource
    Springer
    Machine learning 28 (1997), S. 169-210 
    ISSN: 0885-6125
    Keywords: Explanation-based learning ; reinforcement learning ; dynamic programming ; goal regression ; speedup learning ; incomplete theory problem ; intractable theory problem
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In speedup-learning problems, where full descriptions of operators are known, both explanation-based learning (EBL) and reinforcement learning (RL) methods can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. Most RL methods perform this propagation on a state-by-state basis, while EBL methods compute the weakest preconditions of operators, and hence, perform this propagation on a region-by-region basis. Barto, Bradtke, and Singh (1995) have observed that many algorithms for reinforcement learning can be viewed as asynchronous dynamic programming. Based on this observation, this paper shows how to develop dynamic programming versions of EBL, which we call region-based dynamic programming or Explanation-Based Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of point-based dynamic programming and to standard EBL. The results show that region-based dynamic programming combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of reinforcement learning algorithms (learning of optimal policies). Results are shown in chess endgames and in synthetic maze tasks.
    Type of Medium: Electronic Resource
  • 2
    ISSN: 0885-6125
    Keywords: reinforcement learning ; vision ; learning from easy mission ; state-action deviation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with a vision sensor, by which the robot can obtain information about the changes in an environment. First, we construct a state space in terms of size, position, and orientation of a ball and a goal in an image, and an action space is designed in terms of the action commands to be sent to the left and right motors of a mobile robot. This causes a state-action deviation problem in constructing the state and action spaces that reflect the outputs from physical sensors and actuators, respectively. To deal with this issue, an action set is constructed in such a way that one action consists of a series of the same action primitive which is successively executed until the current state changes. Next, to speed up the learning time, a mechanism of Learning from Easy Missions (or LEM) is implemented. LEM reduces the learning time from exponential to almost linear order in the size of the state space. The results of computer simulations and real robot experiments are given.
    Type of Medium: Electronic Resource
  • 3
    Electronic Resource
    Springer
    Machine learning 32 (1998), S. 5-40 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal difference ; Monte Carlo ; MSE ; bias ; variance ; eligibility trace ; Markov reward process
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We provide analytical expressions governing changes to the bias and variance of the lookup table estimators provided by various Monte Carlo and temporal difference value estimation algorithms with offline updates over trials in absorbing Markov reward processes. We have used these expressions to develop software that serves as an analysis tool: given a complete description of a Markov reward process, it rapidly yields an exact mean-square-error curve, the curve one would get from averaging together sample mean-square-error curves from an infinite number of learning trials on the given problem. We use our analysis tool to illustrate classes of mean-square-error curve behavior in a variety of example reward processes, and we show that although the various temporal difference algorithms are quite sensitive to the choice of step-size and eligibility-trace parameters, there are values of these parameters that make them similarly competent, and generally good.
    Type of Medium: Electronic Resource
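    The comparison in the record above is analytical, but the two estimator families it studies are easy to exercise empirically. A minimal sketch on a toy two-state absorbing Markov reward process (the chain and all names here are illustrative, not from the paper):

    ```python
    import random

    def run_episode(rng):
        """One episode of a toy absorbing Markov reward process:
        state 0 yields reward 1 and moves to state 1 with prob 0.5
        (else terminates); state 1 yields reward 2 and terminates."""
        episode, state = [], 0
        while state is not None:
            if state == 0:
                nxt = 1 if rng.random() < 0.5 else None
                episode.append((0, 1.0))
            else:
                nxt = None
                episode.append((1, 2.0))
            state = nxt
        return episode

    def mc_estimate(episodes):
        """First-visit Monte Carlo estimate of V(0) (undiscounted)."""
        return sum(sum(r for _, r in ep) for ep in episodes) / len(episodes)

    def td0_estimates(episodes, alpha=0.1, sweeps=200):
        """TD(0) with offline updates: replay the fixed batch repeatedly."""
        v = {0: 0.0, 1: 0.0}
        for _ in range(sweeps):
            for ep in episodes:
                for i, (s, r) in enumerate(ep):
                    v_next = v[ep[i + 1][0]] if i + 1 < len(ep) else 0.0
                    v[s] += alpha * (r + v_next - v[s])
        return v

    rng = random.Random(0)
    episodes = [run_episode(rng) for _ in range(1000)]
    print(mc_estimate(episodes), td0_estimates(episodes)[0])  # both near V(0) = 2
    ```

    Averaging such runs over many sampled batches is what produces the mean-square-error curves the paper derives in closed form.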
  • 4
    ISSN: 0885-6125
    Keywords: inductive bias ; reinforcement learning ; reward acceleration ; Levin search ; success-story algorithm ; incremental self-improvement
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We study task sequences that allow for speeding up the learner's average reward intake through appropriate shifts of inductive bias (changes of the learner's policy). To evaluate long-term effects of bias shifts setting the stage for later bias shifts, we use the “success-story algorithm” (SSA). SSA is occasionally called at times that may depend on the policy itself. It uses backtracking to undo those bias shifts that have not been empirically observed to trigger long-term reward accelerations (measured up until the current SSA call). Bias shifts that survive SSA represent a lifelong success history. Until the next SSA call, they are considered useful and build the basis for additional bias shifts. SSA allows for plugging in a wide variety of learning algorithms. We plug in (1) a novel, adaptive extension of Levin search and (2) a method for embedding the learner's policy modification strategy within the policy itself (incremental self-improvement). Our inductive transfer case studies involve complex, partially observable environments where traditional reinforcement learning fails.
    Type of Medium: Electronic Resource
  • 5
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 123-158 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal difference learning ; eligibility trace ; Monte Carlo method ; Markov chain ; CMAC
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The eligibility trace is one of the basic mechanisms used in reinforcement learning to handle delayed reward. In this paper we introduce a new kind of eligibility trace, the replacing trace, analyze it theoretically, and show that it results in faster, more reliable learning than the conventional trace. Both kinds of trace assign credit to prior events according to how recently they occurred, but only the conventional trace gives greater credit to repeated events. Our analysis is for conventional and replace-trace versions of the offline TD(1) algorithm applied to undiscounted absorbing Markov chains. First, we show that these methods converge under repeated presentations of the training set to the same predictions as two well known Monte Carlo methods. We then analyze the relative efficiency of the two Monte Carlo methods. We show that the method corresponding to conventional TD is biased, whereas the method corresponding to replace-trace TD is unbiased. In addition, we show that the method corresponding to replacing traces is closely related to the maximum likelihood solution for these tasks, and that its mean squared error is always lower in the long run. Computational results confirm these analyses and show that they are applicable more generally. In particular, we show that replacing traces significantly improve performance and reduce parameter sensitivity on the "Mountain-Car" task, a full reinforcement-learning problem with a continuous state space, when using a feature-based function approximator.
    Type of Medium: Electronic Resource
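    The replacing/accumulating distinction described above comes down to a one-line difference in code. A minimal sketch under illustrative names:

    ```python
    def update_traces(trace, visited_state, decay, kind="replacing"):
        """Decay all eligibility traces, then credit the visited state.

        With accumulating (conventional) traces repeated visits add up;
        with replacing traces the visited state's trace is reset to 1.
        """
        for s in trace:
            trace[s] *= decay  # decay = gamma * lambda
        if kind == "accumulating":
            trace[visited_state] += 1.0
        else:  # replacing
            trace[visited_state] = 1.0
        return trace

    # Visiting the same state twice in a row:
    acc = {0: 0.0}
    rep = {0: 0.0}
    for _ in range(2):
        update_traces(acc, 0, decay=0.9, kind="accumulating")
        update_traces(rep, 0, decay=0.9, kind="replacing")
    print(acc[0], rep[0])  # only the accumulating trace exceeds 1.0
    ```

    This is exactly the extra credit to repeated events that the analysis above identifies as the source of the conventional trace's bias.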
  • 6
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 283-290 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal difference learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper presents a novel incremental algorithm that combines Q-learning, a well-known dynamic-programming based reinforcement learning method, with the TD(λ) return estimation process, which is typically used in actor-critic learning, another well-known dynamic-programming based reinforcement learning method. The parameter λ is used to distribute credit throughout sequences of actions, leading to faster learning and also helping to alleviate the non-Markovian effect of coarse state-space quantization. The resulting algorithm, Q(λ)-learning, thus combines some of the best features of the Q-learning and actor-critic learning paradigms. The behavior of this algorithm has been demonstrated through computer simulations.
    Type of Medium: Electronic Resource
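    The combination described above can be sketched as a tabular update rule. This is a common Watkins-style Q(λ) variant, not necessarily the exact rule of the paper; the two actions and all names are illustrative:

    ```python
    def q_lambda_step(Q, E, s, a, r, s_next, alpha=0.1, gamma=0.95, lam=0.8):
        """One tabular Q(lambda) update (Watkins-style sketch).

        Q and E map (state, action) pairs to values and eligibilities;
        the action set here is just the two illustrative actions 0 and 1.
        """
        best_next = max(Q.get((s_next, b), 0.0) for b in (0, 1))
        delta = r + gamma * best_next - Q.get((s, a), 0.0)
        E[(s, a)] = E.get((s, a), 0.0) + 1.0   # mark the visited pair eligible
        for key in list(E):
            # lambda distributes the TD error backward along the trace
            Q[key] = Q.get(key, 0.0) + alpha * delta * E[key]
            E[key] *= gamma * lam
        return Q, E

    Q, E = {}, {}
    Q, E = q_lambda_step(Q, E, s=0, a=1, r=1.0, s_next=2)
    print(Q[(0, 1)])  # 0.1 = alpha * delta, with delta = r = 1.0
    ```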
  • 7
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 123-158 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal difference learning ; eligibility trace ; Monte Carlo method ; Markov chain ; CMAC
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The eligibility trace is one of the basic mechanisms used in reinforcement learning to handle delayed reward. In this paper we introduce a new kind of eligibility trace, the replacing trace, analyze it theoretically, and show that it results in faster, more reliable learning than the conventional trace. Both kinds of trace assign credit to prior events according to how recently they occurred, but only the conventional trace gives greater credit to repeated events. Our analysis is for conventional and replace-trace versions of the offline TD(1) algorithm applied to undiscounted absorbing Markov chains. First, we show that these methods converge under repeated presentations of the training set to the same predictions as two well known Monte Carlo methods. We then analyze the relative efficiency of the two Monte Carlo methods. We show that the method corresponding to conventional TD is biased, whereas the method corresponding to replace-trace TD is unbiased. In addition, we show that the method corresponding to replacing traces is closely related to the maximum likelihood solution for these tasks, and that its mean squared error is always lower in the long run. Computational results confirm these analyses and show that they are applicable more generally. In particular, we show that replacing traces significantly improve performance and reduce parameter sensitivity on the "Mountain-Car" task, a full reinforcement-learning problem with a continuous state space, when using a feature-based function approximator.
    Type of Medium: Electronic Resource
  • 8
    ISSN: 0885-6125
    Keywords: reinforcement learning ; vision ; learning from easy mission ; state-action deviation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper presents a method of vision-based reinforcement learning by which a robot learns to shoot a ball into a goal. We discuss several issues in applying the reinforcement learning method to a real robot with a vision sensor, by which the robot can obtain information about the changes in an environment. First, we construct a state space in terms of size, position, and orientation of a ball and a goal in an image, and an action space is designed in terms of the action commands to be sent to the left and right motors of a mobile robot. This causes a “state-action deviation” problem in constructing the state and action spaces that reflect the outputs from physical sensors and actuators, respectively. To deal with this issue, an action set is constructed in such a way that one action consists of a series of the same action primitive which is successively executed until the current state changes. Next, to speed up the learning time, a mechanism of Learning from Easy Missions (or LEM) is implemented. LEM reduces the learning time from exponential to almost linear order in the size of the state space. The results of computer simulations and real robot experiments are given.
    Type of Medium: Electronic Resource
  • 9
    Electronic Resource
    Springer
    Machine learning 19 (1995), S. 209-240 
    ISSN: 0885-6125
    Keywords: learning classifier systems ; reinforcement learning ; genetic algorithms ; animat problem
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this article we investigate the feasibility of using learning classifier systems as a tool for building adaptive control systems for real robots. Their use on real robots imposes efficiency constraints which are addressed by three main tools: parallelism, distributed architecture, and training. Parallelism is useful to speed up computation and to increase the flexibility of the learning system design. Distributed architecture helps in making it possible to decompose the overall task into a set of simpler learning tasks. Finally, training provides guidance to the system while learning, shortening the number of cycles required to learn. These tools and the issues they raise are first studied in simulation, and then the experience gained with simulations is used to implement the learning system on the real robot. Results have shown that with this approach it is possible to let the AutonoMouse, a small real robot, learn to approach a light source under a number of different noise and lesion conditions.
    Type of Medium: Electronic Resource
  • 10
    Electronic Resource
    Springer
    Machine learning 35 (1999), S. 155-185 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; multi-agent systems ; planning ; evolutionary economics ; tragedy of the commons ; classifier systems ; agoric systems ; autonomous programming ; cognition ; artificial intelligence ; Hayek ; complex adaptive systems ; temporal difference learning ; evolutionary computation ; economic models of mind ; economic models of computation ; Blocks World ; reasoning ; learning ; computational learning theory ; learning to reason ; meta-reasoning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A market-based algorithm is presented which autonomously apportions complex tasks to multiple cooperating agents, giving each agent the motivation of improving performance of the whole system. A specific model, called “The Hayek Machine”, is proposed and tested on a simulated Blocks World (BW) planning problem. Hayek learns to solve more complex BW problems than any previous learning algorithm. Given intermediate reward and simple features, it has learned to efficiently solve arbitrary BW problems. The Hayek Machine can also be seen as a model of evolutionary economics.
    Type of Medium: Electronic Resource
  • 11
    Electronic Resource
    Springer
    Machine learning 20 (1995), S. 23-33 
    ISSN: 0885-6125
    Keywords: stability ; bias ; accuracy ; repeatability ; agreement ; similarity
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Research on bias in machine learning algorithms has generally been concerned with the impact of bias on predictive accuracy. We believe that there are other factors that should also play a role in the evaluation of bias. One such factor is the stability of the algorithm; in other words, the repeatability of the results. If we obtain two sets of data from the same phenomenon, with the same underlying probability distribution, then we would like our learning algorithm to induce approximately the same concepts from both sets of data. This paper introduces a method for quantifying stability, based on a measure of the agreement between concepts. We also discuss the relationships among stability, predictive accuracy, and bias.
    Type of Medium: Electronic Resource
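    The agreement-based notion of stability above can be illustrated with any simple learner: induce a concept from each of two samples drawn from the same distribution, then measure how often the two concepts agree. A hedged sketch with a toy threshold learner (the learner and all names are illustrative, not the paper's measure):

    ```python
    import random

    def learn_threshold(sample):
        """Toy learner: classify x as positive iff x >= threshold, where
        the threshold is the midpoint between the two class means."""
        pos = [x for x, y in sample if y == 1]
        neg = [x for x, y in sample if y == 0]
        return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

    def agreement(t1, t2, points):
        """Fraction of points on which the two induced concepts agree."""
        same = sum((x >= t1) == (x >= t2) for x in points)
        return same / len(points)

    rng = random.Random(1)

    def draw(n):
        # Two samples from the same underlying distribution:
        # class y has inputs distributed around y with noise.
        return [(rng.gauss(y, 0.5), y) for y in [0, 1] * n]

    t1, t2 = learn_threshold(draw(50)), learn_threshold(draw(50))
    points = [rng.uniform(-1, 2) for _ in range(1000)]
    print(agreement(t1, t2, points))  # close to 1.0 for a stable learner
    ```

    An unstable learner would induce very different thresholds from the two samples, driving the agreement score down even if both thresholds predict equally accurately.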
  • 12
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 59-94 
    ISSN: 0885-6125
    Keywords: Compact representation ; curse of dimensionality ; dynamic programming ; features ; function approximation ; neuro-dynamic programming ; reinforcement learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We develop a methodological framework and present a few different ways in which dynamic programming and compact representations can be combined to solve large scale stochastic control problems. In particular, we develop algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture. We prove the convergence of these algorithms and provide bounds on the approximation error. As an example, one of these algorithms is used to generate a strategy for the game of Tetris. Furthermore, we provide a counter-example illustrating the difficulties of integrating compact representations with dynamic programming, which exemplifies the shortcomings of certain simple approaches.
    Type of Medium: Electronic Resource
  • 13
    Electronic Resource
    Springer
    Machine learning 19 (1995), S. 209-240 
    ISSN: 0885-6125
    Keywords: learning classifier systems ; reinforcement learning ; genetic algorithms ; animat problem
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this article we investigate the feasibility of using learning classifier systems as a tool for building adaptive control systems for real robots. Their use on real robots imposes efficiency constraints which are addressed by three main tools: parallelism, distributed architecture, and training. Parallelism is useful to speed up computation and to increase the flexibility of the learning system design. Distributed architecture helps in making it possible to decompose the overall task into a set of simpler learning tasks. Finally, training provides guidance to the system while learning, shortening the number of cycles required to learn. These tools and the issues they raise are first studied in simulation, and then the experience gained with simulations is used to implement the learning system on the real robot. Results have shown that with this approach it is possible to let the AutonoMouse, a small real robot, learn to approach a light source under a number of different noise and lesion conditions.
    Type of Medium: Electronic Resource
  • 14
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 59-94 
    ISSN: 0885-6125
    Keywords: Compact representation ; curse of dimensionality ; dynamic programming ; features ; function approximation ; neuro-dynamic programming ; reinforcement learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We develop a methodological framework and present a few different ways in which dynamic programming and compact representations can be combined to solve large scale stochastic control problems. In particular, we develop algorithms that employ two types of feature-based compact representations; that is, representations that involve feature extraction and a relatively simple approximation architecture. We prove the convergence of these algorithms and provide bounds on the approximation error. As an example, one of these algorithms is used to generate a strategy for the game of Tetris. Furthermore, we provide a counter-example illustrating the difficulties of integrating compact representations with dynamic programming, which exemplifies the shortcomings of certain simple approaches.
    Type of Medium: Electronic Resource
  • 15
    Electronic Resource
    Springer
    Machine learning 22 (1996), S. 283-290 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; temporal difference learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper presents a novel incremental algorithm that combines Q-learning, a well-known dynamic-programming based reinforcement learning method, with the TD(λ) return estimation process, which is typically used in actor-critic learning, another well-known dynamic-programming based reinforcement learning method. The parameter λ is used to distribute credit throughout sequences of actions, leading to faster learning and also helping to alleviate the non-Markovian effect of coarse state-space quantization. The resulting algorithm, Q(λ)-learning, thus combines some of the best features of the Q-learning and actor-critic learning paradigms. The behavior of this algorithm has been demonstrated through computer simulations.
    Type of Medium: Electronic Resource
  • 16
    Electronic Resource
    Springer
    Machine learning 20 (1995), S. 23-33 
    ISSN: 0885-6125
    Keywords: stability ; bias ; accuracy ; repeatability ; agreement ; similarity
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Research on bias in machine learning algorithms has generally been concerned with the impact of bias on predictive accuracy. We believe that there are other factors that should also play a role in the evaluation of bias. One such factor is the stability of the algorithm; in other words, the repeatability of the results. If we obtain two sets of data from the same phenomenon, with the same underlying probability distribution, then we would like our learning algorithm to induce approximately the same concepts from both sets of data. This paper introduces a method for quantifying stability, based on a measure of the agreement between concepts. We also discuss the relationships among stability, predictive accuracy, and bias.
    Type of Medium: Electronic Resource
  • 17
    Electronic Resource
    Springer
    Machine learning 28 (1997), S. 77-104 
    ISSN: 0885-6125
    Keywords: Continual learning ; transfer ; reinforcement learning ; sequence learning ; hierarchical neural networks
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Continual learning is the constant development of increasingly complex behaviors; the process of building more complicated skills on top of those already developed. A continual-learning agent should therefore learn incrementally and hierarchically. This paper describes CHILD, an agent capable of Continual, Hierarchical, Incremental Learning and Development. CHILD can quickly solve complicated non-Markovian reinforcement-learning tasks and can then transfer its skills to similar but even more complicated tasks, learning these faster still.
    Type of Medium: Electronic Resource
  • 18
    Electronic Resource
    Springer
    Machine learning 35 (1999), S. 117-154 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; exploration vs. exploitation dilemma ; Markov decision processes ; bandit problems
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper presents an action selection technique for reinforcement learning in stationary Markovian environments. This technique may be used in direct algorithms such as Q-learning, or in indirect algorithms such as adaptive dynamic programming. It is based on two principles. The first is to define a local measure of the uncertainty using the theory of bandit problems. We show that such a measure suffers from several drawbacks. In particular, a direct application of it leads to algorithms of low quality that can be easily misled by particular configurations of the environment. The second basic principle was introduced to eliminate this drawback. It consists of assimilating the local measures of uncertainty to rewards, and back-propagating them with the dynamic programming or temporal difference mechanisms. This allows reproducing global-scale reasoning about the uncertainty, using only local measures of it. Numerical simulations clearly show the efficiency of these propositions.
    Type of Medium: Electronic Resource
  • 19
    Electronic Resource
    Springer
    Machine learning 31 (1998), S. 55-85 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; module-based RL ; robot learning ; problem decomposition ; Markovian Decision Problems ; feature space ; subgoals ; local control ; switching control
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous both in time and space, and moreover are partially observable. Here we suggest a systematic approach to solve such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: i) decompose the task into subtasks using the qualitative knowledge at hand; ii) design local controllers to solve the subtasks using the available quantitative knowledge; and iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high-quality robot control, as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment which were not foreseen in advance, thus supporting the view that adaptive algorithms are advantageous over non-adaptive ones in complex environments.
    Type of Medium: Electronic Resource
  • 20
    Electronic Resource
    Springer
    Machine learning 33 (1998), S. 105-115 
    ISSN: 0885-6125
    Keywords: reinforcement learning ; Q-learning ; TD(λ) ; online Q(λ) ; lazy learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Q(λ)-learning uses TD(λ)-methods to accelerate Q-learning. The update complexity of previous online Q(λ) implementations based on lookup tables is bounded by the size of the state/action space. Our faster algorithm's update complexity is bounded by the number of actions. The method is based on the observation that Q-value updates may be postponed until they are needed.
    Type of Medium: Electronic Resource
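    The "postpone updates until needed" observation above can be sketched with a lazily decayed table: instead of decaying every eligibility entry each step, keep one global decay product and rescale an entry only when it is touched. This is a much-simplified illustration of the principle, not the paper's algorithm; all names are illustrative:

    ```python
    class LazyTraceTable:
        """Sketch of lazy eligibility-trace decay.

        step_decay() is O(1) regardless of how many entries exist; each
        entry remembers the global decay product at its last touch and
        catches up on accumulated decay only when read or written.
        """
        def __init__(self):
            self.global_decay = 1.0   # product of all per-step decays so far
            self.trace = {}           # entry -> (value, decay_at_last_touch)

        def step_decay(self, decay):
            # Conceptually decays every trace at once, in O(1).
            self.global_decay *= decay

        def read(self, key):
            value, seen = self.trace.get(key, (0.0, self.global_decay))
            # Apply the decay accumulated since this entry was last touched.
            value *= self.global_decay / seen
            self.trace[key] = (value, self.global_decay)
            return value

        def bump(self, key, amount=1.0):
            self.trace[key] = (self.read(key) + amount, self.global_decay)

    t = LazyTraceTable()
    t.bump("s0,a0")        # trace becomes 1.0
    t.step_decay(0.9)      # O(1) instead of touching every entry
    t.step_decay(0.9)
    print(t.read("s0,a0"))  # ~0.81 after two decays
    ```

    The same bookkeeping applied to Q-value updates is what lets the per-step cost depend on the number of actions rather than on the size of the state/action space.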
  • 21
    Electronic Resource
    Springer
    Machine learning 33 (1998), S. 201-233 
    ISSN: 0885-6125
    Keywords: Markov games ; differential games ; pursuit games ; multiagent learning ; reinforcement learning ; Q-learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Game playing has been a popular problem area for research in artificial intelligence and machine learning for many years. In almost every study of game playing and machine learning, the focus has been on games with a finite set of states and a finite set of actions. Further, most of this research has focused on a single player or team learning how to play against another player or team that is applying a fixed strategy for playing the game. In this paper, we explore multiagent learning in the context of game playing and develop algorithms for “co-learning” in which all players attempt to learn their optimal strategies simultaneously. Specifically, we address two approaches to co-learning, demonstrating strong performance by a memory-based reinforcement learner and comparable but faster performance with a tree-based reinforcement learner.
    Type of Medium: Electronic Resource
  • 22
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 19 (1997), S. 411-436 
    ISSN: 1573-0409
    Keywords: assembly planning ; stability ; robot ; forward ; operations
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract The paper presents an approach to sequence planning that determines assembly sequences, defined in terms of mating and non-mating operations, based on a dynamic expansion of the assembly tree obtained using a knowledge base management system. The planner considers the case of a single-robot assembly workcell. The use of stability and the detailed definition of sequences, also by means of several non-mating operations, are shown to be powerful instruments in the control of the tree expansion. Forward assembly planning has been chosen in order to minimize the number of stability checks. Backtracking is avoided by combining precedence relations and stability analysis. Hard and soft constraints are introduced to drive the tree expansion. Hard constraints are precedence relations and stability analysis. All operations are associated with costs, which are used as soft constraints. The operation-based approach enables one to manage even non-mating operations and to easily overcome the linearity constraint. Costs enable the planner to manage the association among tools and components. The first section of the paper concerns Stability Analysis, which is subdivided into Static and Dynamic Stability Analysis. The former is mainly concerned with analyzing gravity effects; the latter with evaluating inertia effects due to manipulation. Stability Analysis is implemented in a simplified form. Fundamental assumptions are: no rotational equilibrium condition is considered; for each reaction force only direction and sense, but not magnitude, are considered; and friction is neglected. The second section discusses the structure of the planner and its implementation. The planner is a rule-based system. Forward chaining and hypothetical reasoning are the inference strategies used. The knowledge base and the data base of the system are presented, and the advantages obtained using a rule-based system are discussed. The third section shows two planning examples, demonstrating the performance of the system in a simple case and in an industrial test case: the assembly of a microwave branching filter composed of 26 components.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 23
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 23 (1998), S. 165-182 
    ISSN: 1573-0409
    Keywords: compliance tasks ; reinforcement learning ; robust control
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract The complexity in planning and control of robot compliance tasks mainly results from simultaneous control of both position and force and inevitable contact with environments. It is quite difficult to achieve accurate modeling of the interaction between the robot and the environment during contact. In addition, the interaction with the environment varies even for compliance tasks of the same kind. To deal with these phenomena, in this paper, we propose a reinforcement learning and robust control scheme for robot compliance tasks. A reinforcement learning mechanism is used to tackle variations among compliance tasks of the same kind. A robust compliance controller that guarantees system stability in the presence of modeling uncertainties and external disturbances is used to execute control commands sent from the reinforcement learning mechanism. Simulations based on deburring compliance tasks demonstrate the effectiveness of the proposed scheme.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 24
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 22 (1998), S. 23-38 
    ISSN: 1573-0409
    Keywords: robot dynamic model ; stiffness matrix ; constant disturbance ; integrator backstepping ; Liapunov functions ; Barbalat lemma ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract A robust regulator for flexible-joint robots is proposed, which rejects constant torque disturbances acting on the links. The design uses the integrator backstepping technique [4,5] to cancel nonlinearities and disturbances not in the range space of the control. Stability of the closed loop system is shown using iterative Liapunov functions.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 25
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 20 (1997), S. 131-155 
    ISSN: 1573-0409
    Keywords: robot adaptive control ; basis function-like networks ; stability ; discrete variable structure
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract Stable neural network-based sampled-data indirect and direct adaptive control approaches, which are the integration of a neural network (NN) approach and the adaptive implementation of the discrete variable structure control, are developed in this paper for the trajectory tracking control of a robot arm with unknown nonlinear dynamics. The robot arm is assumed to have an upper and lower bound of its inertia matrix norm, and its states are available for measurement. The discrete variable structure control serves two purposes, i.e., one is to force the system states to be within the state region in which neural networks are used when the system goes out of neural control; and the other is to improve the tracking performance within the NN approximation region. Main theoretical results for designing stable neural network-based sampled-data indirect and direct adaptive controllers are given, and the extension of the proposed control approaches to the composite adaptive control of a flexible-link robot is discussed. Finally, the effectiveness of the proposed control approaches is illustrated through simulation studies.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 26
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 26 (1999), S. 91-100 
    ISSN: 1573-0409
    Keywords: robots ; neural networks ; adaptiveness ; stability ; approximation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract An indirect adaptive control approach is developed in this paper for robots with unknown nonlinear dynamics using neural networks (NNs). A key property of the proposed approach is that the actual joint angle values in the control law are replaced by the desired joint angles, angular velocities and accelerations, and the bound on the NN reconstruction errors is assumed to be unknown. Main theoretical results for designing such a neuro-controller are given, and the control performance of the proposed controller is verified with simulation studies.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 27
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 21 (1998), S. 51-71 
    ISSN: 1573-0409
    Keywords: Q-learning algorithm ; reinforcement learning ; experience generalisation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract In recent years, temporal-difference methods have been put forward as convenient tools for reinforcement learning. Techniques based on temporal differences, however, suffer from a serious drawback: as stochastic adaptive algorithms, they may need extensive exploration of the state-action space before convergence is achieved. Although the basic methods are now reasonably well understood, it is precisely the structural simplicity of the reinforcement learning principle – learning through experimentation – that causes these excessive demands on the learning agent. Additionally, one must consider that the agent is very rarely a tabula rasa: some rough knowledge about the characteristics of the surrounding environment is often available. In this paper, I present methods for embedding a priori knowledge in a reinforcement learning technique in such a way that both the mathematical structure of the basic learning algorithm and the capacity to generalise experience across the state-action space are kept. Extensive experimental results show that the resulting variants may lead to good performance, provided a sensible balance between risky use of prior imprecise knowledge and cautious use of learning experience is adopted.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 28
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 21 (1998), S. 221-238 
    ISSN: 1573-0409
    Keywords: artificial neural network ; dynamic control ; reinforcement learning ; robot control
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract Conventional robot control schemes are basically model-based methods. However, exact modeling of robot dynamics poses considerable problems and faces various uncertainties in task execution. This paper proposes a reinforcement learning control approach for overcoming such drawbacks. An artificial neural network (ANN) serves as the learning structure, and a stochastic real-valued (SRV) unit as the learning method. Initially, force tracking control of a two-link robot arm is simulated to verify the control design. The simulation results confirm that even without information related to the robot dynamic model and environment states, operation rules for simultaneously controlling force and velocity are achievable by repetitive exploration. Hitherto, however, acceptable performance has demanded many learning iterations, and the learning speed proved too slow for practical applications. The approach herein, therefore, improves the tracking performance by combining a conventional controller with a reinforcement learning strategy. Experimental results demonstrate improved trajectory tracking performance of a two-link direct-drive robot manipulator using the proposed method.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
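A stochastic real-valued (SRV) unit of the kind mentioned above can be sketched roughly as follows (a minimal single-unit illustration with an adaptive reward baseline; the class name, learning rates and reward shaping are assumptions, not the authors' implementation):

```python
import random

class SRVUnit:
    """Sketch of a stochastic real-valued unit: Gaussian exploration around a
    learned mean output; exploration width shrinks as reinforcement improves."""
    def __init__(self, lr=0.1, seed=3):
        self.mean, self.lr = 0.0, lr
        self.baseline = 0.0          # running estimate of expected reward
        self.rng = random.Random(seed)

    def act(self, sigma_max=1.0):
        # explore less once the baseline reward approaches its maximum (1.0)
        self.sigma = sigma_max * max(0.0, 1.0 - self.baseline)
        self.noise = self.rng.gauss(0.0, self.sigma) if self.sigma > 0 else 0.0
        return self.mean + self.noise

    def learn(self, reward):
        # move the mean toward actions that beat the reward baseline
        if self.sigma > 0:
            self.mean += self.lr * (reward - self.baseline) * (self.noise / self.sigma)
        self.baseline += 0.1 * (reward - self.baseline)

# Toy task: learn to output 0.5, with reward peaking at the target value.
unit = SRVUnit()
for _ in range(2000):
    a = unit.act()
    unit.learn(max(0.0, 1.0 - abs(a - 0.5)))
```

The repetitive exploration described in the abstract corresponds to the act/learn loop; combining such a unit with a conventional controller means adding its output as a corrective term to the controller's command.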
  • 29
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 23 (1998), S. 27-43 
    ISSN: 1573-0409
    Keywords: autonomous control ; actuator delays ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract In this paper, we consider the control design problem of vehicle following systems with actuator delays. An upper bound for the time delays is first constructed to guarantee the vehicle stability. Second, sufficient conditions are presented to avoid slinky-effects in the vehicle following. Next, it is proven that the proposed controller achieves zero steady-state error. Finally, simulations are given to examine our claims.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 30
    Electronic Resource
    Electronic Resource
    Springer
    Journal of intelligent and robotic systems 12 (1995), S. 103-125 
    ISSN: 1573-0409
    Keywords: Machine learning ; reinforcement learning ; intelligent control ; machine tool ; tool monitoring ; metal cutting ; manufacturing
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract This paper deals with the issue of automatic learning and recognition of various conditions of a machine tool. The ultimate goal of the research discussed in this paper is to develop a comprehensive monitor and control (M&C) system that can substitute for the expert machinist and perform certain critical in-process tasks to assure quality production. The M&C system must reliably recognize and respond to qualitatively different behaviors of the machine tool, learn new behaviors, respond faster than its human counterpart to quality threatening circumstances, and interface with an existing controller. The research considers a series of face-milling anomalies that were subsequently simulated and used as a first step towards establishing the feasibility of employing machine learning as an integral component of the intelligent controller. We address the question of feasibility in two steps. First, it is important to know if the process models (dull tool, broken tool, etc.) can be learned (model learning). And second, if the models are learned, can an algorithm reliably select an appropriate model (distinguish between dull and broken tools) based on input from the model learner and from the sensors (model selection). The results of the simulation-based tests demonstrate that the milling-process anomalies can be learned, and the appropriate model can be reliably selected. Such a model can be subsequently utilized to make compensating in-process machine-tool adjustments. In addition, we observed that the learning curve need not approach the 100% level to be functional.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 31
    Electronic Resource
    Electronic Resource
    Springer
    Computing 31 (1983), S. 261-267 
    ISSN: 1436-5057
    Keywords: 65M10 ; Dispersive equation ; finite difference ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Description / Table of Contents: Abstract This article presents a compilation of difference schemes for the dispersive equation u_t = a u_xxx. Criteria for deriving stability conditions for difference schemes are stated and applied to the schemes listed.
    Notes: Abstract In this paper a table of difference schemes for the dispersive equation u_t = a u_xxx is presented. A collection of criteria for deriving stability conditions of difference schemes is given and applied to these difference schemes.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
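As one concrete illustration of the kind of scheme such a table contains (this particular discretization and its analysis route are generic, not necessarily taken from the paper):

```latex
% Central-difference approximation of the third derivative in u_t = a\,u_{xxx}:
\frac{\partial^{3} u}{\partial x^{3}}\bigg|_{x_j}
  \approx \frac{u_{j+2} - 2u_{j+1} + 2u_{j-1} - u_{j-2}}{2\,\Delta x^{3}} .
% Von Neumann stability analysis: substitute u_j^n = g^n e^{\mathrm{i} j\theta}
% into the full scheme and require |g(\theta)| \le 1 for all \theta,
% which yields a restriction on a\,\Delta t / \Delta x^{3}.
```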
  • 32
    Electronic Resource
    Electronic Resource
    Springer
    Computing 32 (1984), S. 229-237 
    ISSN: 1436-5057
    Keywords: 65L05 ; 65L07 ; Stiff system ; Rosenbrock method ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Description / Table of Contents: Abstract This paper analyzes the stability of the Kaps-Rentrop method in the presence of nonlinear stiffness. To this end, two quantities are introduced by means of a simple model. The values of these quantities reflect, to some extent, the behavior of a Kaps-Rentrop method in the presence of a particular coupling between the two components of the stiff system of ordinary differential equations. Several numerical examples illustrate the analysis.
    Notes: Abstract In this paper we give an analysis of the effect of stiff nonlinearities on the behavior of a Kaps-Rentrop method. To that end we introduce two quantities related to a simple model. The values of these quantities determine to some extent the behavior of a Kaps-Rentrop method in case of a strong coupling between the smooth component and the transient one. Numerical examples illustrate the theoretical results.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 33
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 26 (1997), S. 343-363 
    ISSN: 1572-9443
    Keywords: retrial queues ; stability ; ergodicity ; renovation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We consider the following type of problem. Calls arrive at a queue of capacity K (which is called the primary queue) and attempt to get served by a single server. If, upon arrival, the queue is full and the server is busy, the new arriving call moves into an infinite-capacity orbit, from which it makes new attempts to reach the primary queue until it finds it non-full (or finds the server idle). If the queue is not full upon arrival, then the call (customer) waits in line and will be served in FIFO order. If λ is the arrival rate (average number per time unit) of calls and μ is one over the expected service time in the facility, it is well known that μ > λ is not always sufficient for stability. The aim of this paper is to provide general conditions under which it is a sufficient condition. In particular, (i) we derive conditions for Harris ergodicity and obtain bounds for the rate of convergence to the steady state and large deviations results, in the case that the inter-arrival times, retrial times and service times are mutually independent i.i.d. sequences and the retrial times are exponentially distributed; (ii) we establish conditions for strong coupling convergence to a stationary regime when either the service times are general stationary ergodic (no independence assumption) and the inter-arrival and retrial times are i.i.d. exponentially distributed, or the inter-arrival times are general stationary ergodic and the service and retrial times are i.i.d. exponentially distributed; (iii) we obtain conditions for the existence of uniform exponential bounds on the queue length process under some rather broad conditions on the retrial process. We finally present conditions for boundedness in distribution for the case of non-patient (or non-persistent) customers.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
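The model can be made concrete with a small Markovian simulation (exponential inter-arrival, service and retrial times; the function and parameter names are illustrative, and this sketch only estimates the mean orbit size rather than proving stability):

```python
import random

def simulate_retrial(lam, mu, theta, K, T, seed=1):
    """Primary queue of capacity K with one server; blocked arrivals join an
    infinite orbit and retry at total rate n*theta (n = orbit size)."""
    rng = random.Random(seed)
    t, q, n, orbit_area = 0.0, 0, 0, 0.0
    while t < T:
        service = mu if q > 0 else 0.0
        rate = lam + service + n * theta        # total event rate
        dt = rng.expovariate(rate)
        orbit_area += n * dt
        t += dt
        u = rng.random() * rate                 # pick which event fired
        if u < lam:                             # fresh arrival
            if q < K:
                q += 1
            else:
                n += 1                          # queue full: join the orbit
        elif u < lam + service:                 # service completion
            q -= 1
        elif q < K:                             # retrial finds room in the queue
            q, n = q + 1, n - 1
    return orbit_area / t                       # time-average orbit size

avg_orbit = simulate_retrial(lam=0.5, mu=1.0, theta=1.0, K=3, T=20000.0)
```

With μ > λ and exponential retrials, as in case (i) of the abstract, the orbit stays small; the paper's point is that without such conditions μ > λ alone does not guarantee this.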
  • 34
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 29 (1998), S. 129-159 
    ISSN: 1572-9443
    Keywords: rate-based feedback control ; ATM networks ; stability ; optimal algorithms
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Motivated by the ABR class of service in ATM networks, we study a continuous time queueing system with a feedback control of the arrival rate of some of the sources. The feedback about the queue length or the total workload is provided at regular intervals (variations on it, especially the traffic management specification TM 4.0, are also considered). The propagation delays can be nonnegligible. For a general class of feedback algorithms, we obtain the stability of the system in the presence of one or more bottleneck nodes in the virtual circuit. Our system is general enough that it can be useful to study feedback control in other network protocols. We also obtain rates of convergence to the stationary distributions and finiteness of moments. For the single bottleneck case, we provide algorithms to compute the stationary distributions and the moments of the sojourn times in different sets of states. We also show analytically (by showing continuity of stationary distributions and moments) that for small propagation delays, we can provide feedback algorithms which have higher mean throughput, lower probability of overflow and lower delay jitter than any open loop policy. Finally these results are supplemented by some computational results.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 35
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 32 (1999), S. 99-130 
    ISSN: 1572-9443
    Keywords: neural network ; inhibition ; stability ; Markov process ; fluid limit ; Harris-recurrence ; transience
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The subject of the paper is the stability analysis of some neural networks consisting of a finite number of interacting neurons. Following the approach of Dai [5] we use the fluid limit model of the network to derive a sufficient condition for positive Harris-recurrence of the associated Markov process. This improves the main result in Karpelevich et al. [11] and, at the same time, sheds some new light on it. We further derive two different conditions that are sufficient for transience of the state process and illustrate our results by classifying some examples according to positive recurrence or transience.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 36
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 33 (1999), S. 293-325 
    ISSN: 1572-9443
    Keywords: stability ; fluid models ; multiclass queueing networks ; piecewise linear Lyapunov functions ; linear Lyapunov functions ; monotone global stability ; static buffer priority disciplines
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper studies the stability of a three‐station fluid network. We show that, unlike the two‐station networks in Dai and Vande Vate [18], the global stability region of our three‐station network is not the intersection of its stability regions under static buffer priority disciplines. Thus, the “worst” or extremal disciplines are not static buffer priority disciplines. We also prove that the global stability region of our three‐station network is not monotone in the service times and so, we may move a service time vector out of the global stability region by reducing the service time for a class. We introduce the monotone global stability region and show that a linear program (LP) related to a piecewise linear Lyapunov function characterizes this largest monotone subset of the global stability region for our three‐station network. We also show that the LP proposed by Bertsimas et al. [1] does not characterize either the global stability region or even the monotone global stability region of our three‐station network. Further, we demonstrate that the LP related to the linear Lyapunov function proposed by Chen and Zhang [11] does not characterize the stability region of our three‐station network under a static buffer priority discipline.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 37
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 21 (1995), S. 67-95 
    ISSN: 1572-9443
    Keywords: Polling systems ; stability ; stationary regime
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A stationary regime for polling systems with general ergodic (G/G) arrival processes at each station is constructed. Mutual independence of the arrival processes is not required. It is shown that the stationary workload so constructed is minimal in the stochastic ordering sense. In the model considered the server switches from station to station in a Markovian fashion, and a specific service policy is applied to each queue. Our hypotheses cover the purely gated, the a-limited, the binomial-gated and other policies. As a by-product we obtain sufficient conditions for the stationary regime of a G/G/1/∞ queue with multiple server vacations (see Doshi [11]) to be ergodic.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 38
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 22 (1996), S. 47-63 
    ISSN: 1572-9443
    Keywords: Sample-path analysis ; stability ; rate stability ; ω-rate stability ; input-output process ; queueing ; infinite-server queues
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract An input-output process Z = {Z(t), t ⩾ 0} is said to be ω-rate stable if Z(t) = o(ω(t)) for some non-negative function ω(t). We prove that the process Z is ω-rate stable under weak conditions that include the assumption that the input satisfies a linear burstiness condition and Z is asymptotically average stable. In many cases of interest, the conditions for ω-rate stability can be verified from input data. For example, using input information, we establish ω-rate stability of the workload for multiserver queues and an ATM multiplexer, and ω-rate stability of queue-length processes for infinite-server queues.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 39
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 22 (1996), S. 345-366 
    ISSN: 1572-9443
    Keywords: State-dependent service and interarrival times ; Lindley equation ; recursive stochastic equations ; stability ; normal approximations
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We consider a modification of the standard G/G/1 queueing system with infinite waiting space and the first-in-first-out discipline in which the service times and interarrival times depend linearly and randomly on the waiting times. In this model the waiting times satisfy a modified version of the classical Lindley recursion. When the waiting-time distributions converge to a proper limit, Whitt [10] proposed a normal approximation for this steady-state limit. In this paper we prove a limit theorem for the steady-state limit of the system. Thus, our result provides a solid foundation for Whitt's normal approximation of the steady-state distribution of the system.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
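The classical Lindley recursion underlying this model is easy to simulate (a simplified rendering: the optional parameter c adds a linear dependence of the service time on the waiting time, loosely mimicking the state-dependent model; the distributions below are arbitrary choices, not the paper's assumptions):

```python
import random

def waiting_times(n, service, interarrival, c=0.0, seed=1):
    """Lindley recursion W_{k+1} = max(0, W_k + S_k - T_k), with an optional
    linear dependence of the service time on the current waiting time:
    S_k = service() + c * W_k."""
    rng = random.Random(seed)
    W = [0.0]
    for _ in range(n):
        S = service(rng) + c * W[-1]
        T = interarrival(rng)
        W.append(max(0.0, W[-1] + S - T))
    return W

# Example: exponential service (rate 1.2) and inter-arrival (rate 1.0) times,
# i.e. the classical state-independent case c = 0.
W = waiting_times(10000, lambda r: r.expovariate(1.2),
                  lambda r: r.expovariate(1.0))
```

For nonzero c the recursion is no longer the standard one, which is exactly why the paper needs new conditions for the waiting-time distributions to converge.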
  • 40
    ISSN: 1572-9443
    Keywords: dam ; storage process ; saturation rule ; intermittent production ; state dependent rates ; state dependent jumps ; stability ; positive Harris recurrence
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We consider a dam process with a general (state dependent) release rule and a pure jump input process, where the jump sizes are state dependent. We give sufficient conditions under which the process has a stationary version in the case where the jump times and sizes are governed by a marked point process which is point (Palm) stationary and ergodic. We give special attention to the Markov and Markov regenerative cases for which the main stability condition is weakened. We then study an intermittent production process with state dependent rates. We provide sufficient conditions for stability for this process and show that if these conditions are satisfied, then an interesting new relationship exists between the stationary distribution of this process and a dam process of the type we explore here.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 41
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 27 (1997), S. 205-226 
    ISSN: 1572-9443
    Keywords: multiclass queueing networks ; ergodicity ; stability ; performance analysis
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We develop the use of piecewise linear test functions for the analysis of stability of multiclass queueing networks and their associated fluid limit models. It is found that if an associated LP admits a positive solution, then a Lyapunov function exists. This implies that the fluid limit model is stable and hence that the network model is positive Harris recurrent with a finite polynomial moment. Also, it is found that if a particular LP admits a solution, then the network model is transient.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
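The role of the piecewise linear test function can be summarized schematically (a generic drift condition of this type, not the paper's exact LP formulation):

```latex
% Piecewise linear Lyapunov function built from vectors c_1,\dots,c_m \ge 0:
L(x) = \max_{1 \le i \le m} \langle c_i, x \rangle ,
% Drift condition along fluid limit trajectories:
\frac{d}{dt} L\bigl(x(t)\bigr) \le -\epsilon < 0
  \quad \text{whenever } L\bigl(x(t)\bigr) > 0 .
% If an LP produces such c_i, then L(x(t)) reaches zero in finite time,
% the fluid model is stable, and positive Harris recurrence follows.
```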
  • 42
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 28 (1998), S. 33-54 
    ISSN: 1572-9443
    Keywords: queueing networks ; throughput ; closed networks ; efficiency ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A closed network is said to be “guaranteed efficient” if the throughput converges under all non-idling policies to the capacity of the bottlenecks in the network, as the number of trapped customers increases to infinity. We obtain a necessary condition for guaranteed efficiency of closed re-entrant lines. For balanced two-station systems, this necessary condition is almost sufficient, differing from it only by the strictness of an inequality. This near characterization is obtained by studying a special type of virtual station called “alternating visit virtual station”. These special virtual stations allow us to relate the necessary condition to certain indices arising in heavy traffic studies using a Brownian network approximation, as well as to certain policies proposed as being extremal with respect to the asymptotic loss in the throughput. Using the near characterization of guaranteed efficiency we also answer the often pondered question of whether an open network or its closed counterpart has greater throughput - the answer is that neither can assure a greater guaranteed throughput.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 43
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 29 (1998), S. 55-73 
    ISSN: 1572-9443
    Keywords: multi‐server queue ; customer class ; state‐dependent routing ; stability ; Markov chain ; fluid limit
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We consider a multi‐station queue with a multi‐class input process when any station is available for the service of only some (not all) customer classes. Upon arrival, any customer may choose one of its accessible stations according to some state‐dependent policy. We obtain simple stability criteria for this model in two particular cases when service rates are either station‐ or class‐independent. Then, we study a two‐station queue under general assumptions on service rates. Our proofs are based on the fluid approximation approach.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 44
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 32 (1999), S. 131-168 
    ISSN: 1572-9443
    Keywords: stability ; positive recurrence ; fluid limit ; polling system ; exhaustive service policy
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We introduce a generalized criterion for the stability of Markovian queueing systems in terms of stochastic fluid limits. We consider an example in which this criterion may be applied: a polling system with two stations and two heterogeneous servers.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 45
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 31 (1999), S. 171-206 
    ISSN: 1572-9443
    Keywords: scheduling ; open multiclass queueing networks ; discrete-review policies ; fluid models ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract This paper describes a family of discrete-review policies for scheduling open multiclass queueing networks. Each of the policies in the family is derived from what we call a dynamic reward function: such a function associates with each queue length vector q and each job class k a positive value r k (q), which is treated as a reward rate for time devoted to processing class k jobs. Assuming that each station has a traffic intensity parameter less than one, all policies in the family considered are shown to be stable. In such a policy, system status is reviewed at discrete points in time, and at each such point the controller formulates a processing plan for the next review period, based on the queue length vector observed. Stability is proved by combining elementary large deviations theory with an analysis of an associated fluid control problem. These results are extended to systems with class dependent setup times as well as systems with alternate routing and admission control capabilities.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
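The review-period planning described in this abstract can be sketched as follows. This is an illustrative sketch only, not the paper's policy: the reward function `reward_rate` below is an invented choice (the paper only requires r_k(q) > 0), and time is split among classes in proportion to reward rates.

```python
# Illustrative sketch of one review period of a discrete-review scheduler.
# The reward function is hypothetical; the paper only requires r_k(q) > 0.

def reward_rate(q, k):
    """Hypothetical dynamic reward rate: favor longer queues."""
    return 1.0 + q[k]

def plan_review_period(q, period):
    """Split the review period's processing time among job classes
    in proportion to their reward rates, given queue lengths q."""
    rates = [reward_rate(q, k) for k in range(len(q))]
    total = sum(rates)
    return [period * r / total for r in rates]

# Observe queue lengths at a review point and form the processing plan.
alloc = plan_review_period([4, 1, 0], period=10.0)
```

The whole review period is always allocated, so the station never idles while work is present, which is one ingredient of the stability argument.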
  • 46
    Electronic Resource
    Electronic Resource
    Springer
    Queueing systems 32 (1999), S. 195-231 
    ISSN: 1572-9443
    Keywords: window flow control ; TCP ; stability ; multiclass networks ; stationary ergodic point processes ; (max,+)-linear system
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract We focus on window flow control as used in packet-switched communication networks. The approach consists in studying the stability of a system where each node on the path followed by the packets of the controlled connection is modeled by a FIFO (First-In-First-Out) queue of infinite capacity which receives in addition some cross traffic represented by an exogenous flow. Under general stochastic assumptions, namely for stationary and ergodic input processes, we show the existence of a maximum throughput allowed by the flow control. Then we establish bounds on the value of this maximum throughput. These bounds, which do not coincide in general, are reached by time-space scalings of the exogenous flows. Therefore, the performance of the window flow control depends not only on the traffic intensity of the cross flows, but also on fine statistical characteristics such as the burstiness of these flows. These results are illustrated by several examples, including the case of a nonmonotone, nonconvex and fractal stability region.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 47
    ISSN: 1432-0770
Keywords: Hebbian learning rule ; attractor dynamics ; symmetric connections ; multiplicative normalization ; self-organization ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Biology , Computer Science , Physics
    Notes: Abstract. While learning and development are well characterized in feedforward networks, these features are more difficult to analyze in recurrent networks due to the increased complexity of dual dynamics – the rapid dynamics arising from activation states and the slow dynamics arising from learning or developmental plasticity. We present analytical and numerical results that consider dual dynamics in a recurrent network undergoing Hebbian learning with either constant weight decay or weight normalization. Starting from initially random connections, the recurrent network develops symmetric or near-symmetric connections through Hebbian learning. Reciprocity and modularity arise naturally through correlations in the activation states. Additionally, weight normalization may be better than constant weight decay for the development of multiple attractor states that allow a diverse representation of the inputs. These results suggest a natural mechanism by which synaptic plasticity in recurrent networks such as cortical and brainstem premotor circuits could enhance neural computation and the generation of motor programs.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 48
    Electronic Resource
    Electronic Resource
    Springer
    Artificial intelligence review 11 (1997), S. 343-370 
    ISSN: 1573-7462
    Keywords: lazy learning ; nearest neighbor ; genetic algorithms ; differential games ; pursuit games ; teaching ; reinforcement learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Combining different machine learning algorithms in the same system can produce benefits above and beyond what either method could achieve alone. This paper demonstrates that genetic algorithms can be used in conjunction with lazy learning to solve examples of a difficult class of delayed reinforcement learning problems better than either method alone. This class, the class of differential games, includes numerous important control problems that arise in robotics, planning, game playing, and other areas, and solutions for differential games suggest solution strategies for the general class of planning and control problems. We conducted a series of experiments applying three learning approaches – lazy Q-learning, k-nearest neighbor (k-NN), and a genetic algorithm – to a particular differential game called a pursuit game. Our experiments demonstrate that k-NN had great difficulty solving the problem, while a lazy version of Q-learning performed moderately well and the genetic algorithm performed even better. These results motivated the next step in the experiments, where we hypothesized k-NN was having difficulty because it did not have good examples – a common source of difficulty for lazy learning. Therefore, we used the genetic algorithm as a bootstrapping method for k-NN to create a system to provide these examples. Our experiments demonstrate that the resulting joint system learned to solve the pursuit games with a high degree of accuracy – outperforming either method alone – and with relatively small memory requirements.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 49
    Electronic Resource
    Electronic Resource
    Springer
    Applied intelligence 11 (1999), S. 241-258 
    ISSN: 1573-7497
    Keywords: analysis strategies ; limited resources ; reinforcement learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Intelligent data analysis implies the reasoned application of autonomous or semi-autonomous tools to data sets drawn from problem domains. Automation of this process of reasoning about analysis (based on factors such as available computational resources, cost of analysis, risk of failure, lessons learned from past errors, and tentative structural models of problem domains) is highly non-trivial. By casting the problem of reasoning about analysis (MetaReasoning) as yet another data analysis problem domain, we have previously [R. Levinson and J. Wilkinson, in Advances in Intelligent Data Analysis, edited by X. Liu, P. Cohen, and M. Berthold, volume LNCS 1280, Springer-Verlag, Berlin, pp. 89–100, 1997] presented a design framework, MetaReasoning for Data Analysis Tool Allocation (MRDATA). Crucial to this framework is the ability of a Tool Allocator to track resource consumption (i.e. processor time and memory usage) by the Tools it employs, as well as the ability to allocate measured quantities of resources to these Tools. In order to test implementations of the MRDATA design, we now implement a Runtime Environment for Data Analysis Tool Allocation, RE:DATA. Tool Allocators run as processes under RE:DATA, are allotted system resources, and may use these resources to run their Tools as spawned sub-processes. We also present designs of native RE:DATA implementations of analysis tools used by MRDATA: K-Nearest Neighbor Tables, Regression Trees, Interruptible (“Any-Time”) Regression Trees, and “Hierarchy Diffusion” Temporal Difference Learners. Preliminary results are discussed and techniques for integration with non-native tools are explored.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 50
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 6 (1999), S. 187-201 
    ISSN: 1573-7527
    Keywords: landmark recognition ; learning in computer vision ; local learning ; recognition feedback ; reinforcement learning ; traffic sign recognition
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
Notes: Abstract Current machine perception techniques that typically use segmentation followed by object recognition lack the required robustness to cope with the large variety of situations encountered in real-world navigation. Many existing techniques are brittle in the sense that even minor changes in the expected task environment (e.g., different lighting conditions, geometrical distortion, etc.) can severely degrade the performance of the system or even make it fail completely. In this paper we present a system that achieves robust performance by using local reinforcement learning to induce a highly adaptive mapping from input images to segmentation strategies for successful recognition. This is accomplished by using the confidence level of model matching as reinforcement to drive learning. Local reinforcement learning gives rise to greater improvement in recognition performance. The system is verified through experiments on a large set of real images of traffic signs.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 51
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 7 (1999), S. 77-88 
    ISSN: 1573-7527
    Keywords: reinforcement learning ; CMAC ; world models ; simulated soccer ; Q(λ) ; evolutionary computation ; PIPE
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may profit significantly from world models (WMs) estimating state transition probabilities and rewards. In high-dimensional, continuous input spaces, however, learning accurate WMs is intractable. Here we show that incomplete WMs can help to quickly find good action selection policies. Our approach is based on a novel combination of CMACs and prioritized sweeping-like algorithms. Variants thereof outperform both Q(λ)-learning with CMACs and the evolutionary method Probabilistic Incremental Program Evolution (PIPE) which performed best in previous comparisons.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
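The CMAC function approximators mentioned in this abstract can be sketched as follows. This is a minimal tile-coding illustration, not the authors' implementation; the number of tilings, tile width, learning rate, and the one-dimensional input are all arbitrary choices made here.

```python
# A minimal CMAC (tile-coding) function approximator: several offset
# tilings coarse-code the input, and a value is the sum of the weights
# of the tiles the input falls into.

class CMAC:
    def __init__(self, n_tilings=4, tile_width=1.0, lr=0.1):
        self.n_tilings = n_tilings
        self.tile_width = tile_width
        self.lr = lr
        self.weights = {}          # sparse weight table

    def _tiles(self, x):
        # Each tiling is shifted by a fraction of the tile width.
        for t in range(self.n_tilings):
            offset = t * self.tile_width / self.n_tilings
            yield (t, int((x + offset) // self.tile_width))

    def predict(self, x):
        return sum(self.weights.get(tile, 0.0) for tile in self._tiles(x))

    def update(self, x, target):
        # Distribute the prediction error across the active tiles.
        err = target - self.predict(x)
        for tile in self._tiles(x):
            self.weights[tile] = (self.weights.get(tile, 0.0)
                                  + self.lr * err / self.n_tilings)

cmac = CMAC()
for _ in range(200):
    cmac.update(2.5, 1.0)   # train a single point toward value 1.0
```

Nearby inputs share tiles and therefore generalize, while distant inputs leave the trained weights untouched; that locality is what makes CMACs attractive for continuous RL state spaces.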
  • 52
    Electronic Resource
    Electronic Resource
    Springer
    Journal of scientific computing 12 (1997), S. 361-369 
    ISSN: 1573-7691
    Keywords: Alternating-direction implicit ; difference scheme ; stability ; convergence
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract A new alternating-direction implicit (ADI) scheme for solving three-dimensional parabolic differential equations has been developed based on the idea of the regularized difference scheme. It is unconditionally stable and second-order accurate. Further, it overcomes the drawback of the Douglas scheme and is well suited to simulating fast transient phenomena and to efficiently capturing steady-state solutions of parabolic differential equations. A numerical example is presented.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 53
    ISSN: 1573-773X
    Keywords: constrained learning ; factorization ; feedforward networks ; IIR filters ; polynomials ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract Adaptive artificial neural network techniques are introduced and applied to the factorization of 2-D second order polynomials. The proposed neural network is trained using a constrained learning algorithm that achieves minimization of the usual mean square error criterion along with simultaneous satisfaction of multiple equality and inequality constraints between the polynomial coefficients. Using this method, we are able to obtain good approximate solutions for non-factorable polynomials. By incorporating stability constraints into the formalism, our method can be successfully used for the realization of stable 2-D second order IIR filters in cascade form.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 54
    Electronic Resource
    Electronic Resource
    Springer
    Artificial intelligence review 12 (1998), S. 445-468 
    ISSN: 1573-7462
    Keywords: architectures for autonomous robots ; artificial neural networks ; behaviour-based robots ; emergent properties ; reinforcement learning ; supervised learning ; unsupervised learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract This paper is a survey of some recent connectionist approaches to the design and development of behaviour-based mobile robots. The research is analysed in terms of principal connectionist learning methods and neurological modeling trends. Possible advantages over conventionally programmed methods are considered and the connectionist achievements to date are assessed. A realistic view is taken of the prospects for medium term progress and some observations are made concerning the direction this might profitably take.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 55
    Electronic Resource
    Electronic Resource
    Springer
    Applied intelligence 11 (1999), S. 109-127 
    ISSN: 1573-7497
    Keywords: hybrid models ; sequential decision making ; neural networks ; reinforcement learning ; cognitive modeling
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In developing autonomous agents, one usually emphasizes only (situated) procedural knowledge, ignoring more explicit declarative knowledge. On the other hand, in developing symbolic reasoning models, one usually emphasizes only declarative knowledge, ignoring procedural knowledge. In contrast, we have developed a learning model CLARION, which is a hybrid connectionist model consisting of both localist and distributed representations, based on the two-level approach proposed in [40]. CLARION learns and utilizes both procedural and declarative knowledge, tapping into the synergy of the two types of processes, and enables an agent to learn in situated contexts and generalize resulting knowledge to different scenarios. It unifies connectionist, reinforcement, and symbolic learning in a synergistic way, to perform on-line, bottom-up learning. This summary paper presents one version of the architecture and some results of the experiments.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 56
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 4 (1997), S. 73-83 
    ISSN: 1573-7527
    Keywords: robotics ; robot learning ; group behavior ; multi-agent systems ; reinforcement learning
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract This paper describes a formulation of reinforcement learning that enables learning in noisy, dynamic environments such as in the complex concurrent multi-robot learning domain. The methodology involves minimizing the learning space through the use of behaviors and conditions, and dealing with the credit assignment problem through shaped reinforcement in the form of heterogeneous reinforcement functions and progress estimators. We experimentally validate the approach on a group of four mobile robots learning a foraging task.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 57
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 5 (1998), S. 273-295 
    ISSN: 1573-7527
    Keywords: reinforcement learning ; module-based RL ; robot learning ; problem decomposition ; Markovian decision problems ; feature space ; subgoals ; local control ; switching control
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
Notes: Abstract The behavior of reinforcement learning (RL) algorithms is best understood in completely observable, discrete-time controlled Markov chains with finite state and action spaces. In contrast, robot-learning domains are inherently continuous both in time and space, and moreover are partially observable. Here we suggest a systematic approach to solve such problems in which the available qualitative and quantitative knowledge is used to reduce the complexity of the learning task. The steps of the design process are to: (i) decompose the task into subtasks using the qualitative knowledge at hand; (ii) design local controllers to solve the subtasks using the available quantitative knowledge, and (iii) learn a coordination of these controllers by means of reinforcement learning. It is argued that the approach enables fast, semi-automatic, but still high quality robot-control as no fine-tuning of the local controllers is needed. The approach was verified on a non-trivial real-life robot task. Several RL algorithms were compared by ANOVA and it was found that the model-based approach worked significantly better than the model-free approach. The learnt switching strategy performed comparably to a handcrafted version. Moreover, the learnt strategy seemed to exploit certain properties of the environment which were not foreseen in advance, thus supporting the view that adaptive algorithms are advantageous over nonadaptive ones in complex environments.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 58
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 7 (1999), S. 41-56 
    ISSN: 1573-7527
    Keywords: classical conditioning ; reinforcement learning ; biological models
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract Classical conditioning is a basic learning mechanism in animals and can be found in almost all organisms. If we want to construct robots with abilities matching those of their biological counterparts, this is one of the learning mechanisms that needs to be implemented first. This article describes a computational model of classical conditioning where the goal of learning is assumed to be the prediction of a temporally discounted reward or punishment based on the current stimulus situation. The model is well suited for robotic implementation as it models a number of classical conditioning paradigms and learning in the model is guaranteed to converge with arbitrarily complex stimulus sequences. This is an essential feature once the step is taken beyond the simple laboratory experiment with two or three stimuli to the real world where no such limitations exist. It is also demonstrated how the model can be included in a more complex system that includes various forms of sensory pre-processing and how it can handle reinforcement learning, timing of responses and function as an adaptive world model.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
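The prediction of a temporally discounted reward described in this abstract is the core computation of temporal-difference models of conditioning, and can be sketched as follows. This is a generic TD(0) illustration, not the authors' model; the two-stimulus "tone then food" episode, discount factor, and learning rate are invented here.

```python
# TD(0) prediction of a discounted reward: each state's value is nudged
# toward the immediate reward plus the discounted value of the next state.

GAMMA = 0.9     # temporal discount factor
ALPHA = 0.1     # learning rate
V = {"tone": 0.0, "food": 0.0, "end": 0.0}

def td_episode():
    # One conditioning trial: tone -> food (reward 1.0) -> end.
    transitions = [("tone", "food", 0.0), ("food", "end", 1.0)]
    for s, s_next, r in transitions:
        V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])

for _ in range(1000):
    td_episode()
```

After training, the value of the tone approaches GAMMA times the value of the food state, i.e. the earlier stimulus comes to predict the discounted upcoming reward, mirroring second-order conditioning.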
  • 59
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 6 (1999), S. 281-292 
    ISSN: 1573-7527
    Keywords: mobile robotics ; reinforcement learning ; artificial neural networks ; simulation ; real world
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract We present a case study of reinforcement learning on a real robot that learns how to back up a trailer and discuss the lessons learned about the importance of proper experimental procedure and design. We identify areas of particular concern to the experimental robotics community at large. In particular, we address concerns pertinent to robotics simulation research, implementing learning algorithms on real robotic hardware, and the difficulties involved with transferring research between the two.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 60
    Electronic Resource
    Electronic Resource
    Springer
    Autonomous robots 7 (1999), S. 57-75 
    ISSN: 1573-7527
    Keywords: sensor-based manipulators ; multi-goal reaching tasks ; reinforcement learning ; neural networks ; differential inverse kinematics
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mechanical Engineering, Materials Science, Production Engineering, Mining and Metallurgy, Traffic Engineering, Precision Mechanics
    Notes: Abstract Our work focuses on making an autonomous robot manipulator learn suitable collision-free motions from local sensory data while executing high-level descriptions of tasks. The robot arm must reach a sequence of targets where it undertakes some manipulation. The robot manipulator has a sonar sensing skin covering its links to perceive the obstacles in its surroundings. We use reinforcement learning for that purpose, and the neural controller acquires appropriate reaction strategies in acceptable time provided it has some a priori knowledge. This knowledge is specified in two main ways: an appropriate codification of the signals of the neural controller—inputs, outputs and reinforcement—and decomposition of the learning task. The codification facilitates the generalization capabilities of the network as it takes advantage of inherent symmetries and is quite goal-independent. On the other hand, the task of reaching a certain goal position is decomposed into two sequential subtasks: negotiate obstacles and move to goal. Experimental results show that the controller achieves a good performance incrementally in a reasonable time and exhibits high tolerance to failing sensors.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 61
    Electronic Resource
    Electronic Resource
    Springer
    Neural processing letters 3 (1996), S. 11-15 
    ISSN: 1573-773X
    Keywords: hardware realisation ; RAM-based nodes ; reinforcement learning ; reward-penalty
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract RAM-based neural networks are designed to be efficiently implemented in hardware. The desire to retain this property influences the training algorithms used, and has led to the use of reinforcement (reward-penalty) learning. An analysis of the reinforcement algorithm applied to RAM-based nodes has shown the ease with which unlearning can occur. An amended algorithm is proposed which demonstrates improved learning performance compared to previously published reinforcement regimes.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 62
    Electronic Resource
    Electronic Resource
    Springer
    Neural processing letters 9 (1999), S. 119-127 
    ISSN: 1573-773X
    Keywords: reinforcement learning ; neurocontrol ; optimization ; polytope algorithm ; pole balancing ; genetic reinforcement
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A new training algorithm is presented for delayed reinforcement learning problems that does not assume the existence of a critic model and employs the polytope optimization algorithm to adjust the weights of the action network so that a simple direct measure of the training performance is maximized. Experimental results from the application of the method to the pole balancing problem indicate improved training performance compared with critic-based and genetic reinforcement approaches.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
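The critic-free polytope search described in this abstract can be sketched as follows. This is a stripped-down Nelder–Mead-style illustration (reflection and shrink steps only, no expansion or contraction), not the authors' algorithm; the two-parameter quadratic `score` stands in for the direct measure of training performance and is an invented example.

```python
# A simplified polytope (Nelder-Mead-style) search that directly
# maximizes a performance measure over the weights of an action network.

def score(w):
    # Hypothetical performance measure, maximized at w = (1, -2).
    return -((w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2)

def polytope_maximize(f, simplex, iters=200):
    for _ in range(iters):
        simplex.sort(key=f, reverse=True)      # best vertex first
        best, worst = simplex[0], simplex[-1]
        # Centroid of all vertices except the worst.
        cen = [sum(p[i] for p in simplex[:-1]) / (len(simplex) - 1)
               for i in range(len(best))]
        refl = [c + (c - w) for c, w in zip(cen, worst)]   # reflect worst
        if f(refl) > f(worst):
            simplex[-1] = refl
        else:                                   # shrink toward the best vertex
            simplex = [best] + [[(b + p) / 2 for b, p in zip(best, q)]
                                for q in simplex[1:]]
    return max(simplex, key=f)

w_star = polytope_maximize(score, [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

No gradient and no critic model is needed: only evaluations of the performance measure drive the weight updates, which is the point of the approach.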
  • 63
    Electronic Resource
    Electronic Resource
    Springer
    Neural processing letters 4 (1996), S. 167-172 
    ISSN: 1573-773X
    Keywords: fuzzy min-max neural network ; reinforcement learning ; autonomous vehicle navigation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract The fuzzy min-max neural network constitutes a neural architecture that is based on hyperbox fuzzy sets and can be incrementally trained by appropriately adjusting the number of hyperboxes and their corresponding volumes. Two versions have been proposed: for supervised and unsupervised learning. In this paper a modified approach is presented that is appropriate for reinforcement learning problems with discrete action space and is applied to the difficult task of autonomous vehicle navigation when no a priori knowledge of the environment is available. Experimental results indicate that the proposed reinforcement learning network exhibits superior learning behavior compared to conventional reinforcement schemes.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 64
    Electronic Resource
    Electronic Resource
    Springer
    Journal of scientific computing 12 (1997), S. 215-231 
    ISSN: 1573-7691
    Keywords: Transport models ; shallow water ; splitting methods ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract We investigate the use of splitting methods for the numerical integration of three-dimensional transport-chemistry models. In particular, we investigate various possibilities for the time discretization that can take advantage of the parallelization and vectorization facilities offered by multi-processor vector computers. To suppress wiggles in the numerical solution, we use third-order, upwind-biased discretization of the advection terms, resulting in a five-point coupling in each direction. As an alternative to the usual splitting functions, such as co-ordinate splitting or operator splitting, we consider a splitting function that is based on a three-coloured hopscotch-type splitting in the horizontal direction, whereas full coupling is retained in the vertical direction. Advantages of this splitting function are the easy application of domain decomposition techniques and unconditional stability in the vertical, which is an important property for transport in shallow water. The splitting method is obtained by combining the hopscotch-type splitting function with various second-order splitting formulae from the literature. Although some of the resulting methods are highly accurate, their stability behaviour (due to horizontal advection) is quite poor. Therefore we also discuss several new splitting formulae with the aim of improving the stability characteristics. It turns out that this is indeed possible, but the price to pay is a reduction in accuracy. Therefore, such methods are to be preferred if accuracy is less crucial than stability; such a situation is frequently encountered in solving transport problems. As part of the project TRUST (Transport and Reactions Unified by Splitting Techniques), preliminary versions of the schemes are implemented on the Cray C98 4256 computer and are available for benchmarking.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 65
    Electronic Resource
    Electronic Resource
    Springer
    Journal of scientific computing 12 (1997), S. 353-360 
    ISSN: 1573-7691
    Keywords: Alternating-direction implicit ; difference scheme ; stability ; convergence
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract A generalized Peaceman–Rachford alternating-direction implicit (ADI) scheme for solving two-dimensional parabolic differential equations has been developed based on the idea of the regularized difference scheme. It is well suited to simulating fast transient phenomena and to efficiently capturing steady-state solutions of parabolic differential equations. A numerical example is presented.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 66
    Electronic Resource
    Electronic Resource
    Springer
    Journal of scientific computing 13 (1998), S. 173-183 
    ISSN: 1573-7691
    Keywords: Modified conjugate gradient method ; conjugate gradient method ; Krylov space ; convergence rate ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
Notes: Abstract In this note, we examine a modified conjugate gradient procedure for solving Ax = b in which the approximation space is based upon the Krylov space K^k(√A, b) associated with √A and b. We show that, given initial vectors b and √A b (possibly computed at some expense), the best fit solution in K^k(√A, b) can be computed using a finite-term recurrence requiring only one multiplication by A per iteration. The initial convergence rate appears, as expected, to be twice as fast as that of the standard conjugate gradient method, but stability problems cause the convergence to be degraded.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
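For contrast with the modified procedure in this note, the standard conjugate gradient method (whose Krylov space is built from A and b, one matrix-vector product per iteration) can be sketched as follows. The 2x2 symmetric positive definite system is an invented example.

```python
# Standard conjugate gradient for A x = b with A symmetric positive
# definite, in plain Python; one multiplication by A per iteration.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def conjugate_gradient(A, b, iters=25):
    x = [0.0] * len(b)
    r = b[:]                 # residual b - A x for x = 0
    p = r[:]                 # initial search direction
    for _ in range(iters):
        if dot(r, r) < 1e-12:            # converged
            break
        Ap = matvec(A, p)
        alpha = dot(r, r) / dot(p, Ap)   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = [ri - alpha * api for ri, api in zip(r, Ap)]
        beta = dot(r_new, r_new) / dot(r, r)
        p = [ri + beta * pi for ri, pi in zip(r_new, p)]
        r = r_new
    return x

x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In exact arithmetic CG terminates in at most n iterations for an n-by-n system; the note's interest is in how finite-precision effects degrade such short recurrences.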
  • 67
    Electronic Resource
    Electronic Resource
    Springer
    Journal of network and systems management 3 (1995), S. 371-380 
    ISSN: 1573-7705
    Keywords: Telephone traffic ; network management ; control theory ; dynamic flows ; stability ; routing algorithms ; broadband networks ; simulation
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract The control of telephony traffic is the task of network management and routing algorithms. In this paper, a study of two trunk groups carrying telephony traffic is used to show that instabilities can arise if there is a delay in getting feedback information for a network controller. The network controller seeks to balance the traffic in the two trunk groups, which may represent two paths from a source to a destination. An analysis shows how factors such as holding time, controller gain and feedback delay influence stability. Simulation of a two service case is also carried out to show that the same instabilities can arise.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
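The destabilizing effect of feedback delay described in this abstract can be illustrated with a toy simulation. This linear model is invented here, not the paper's: `x` stands for the traffic imbalance between the two trunk groups, calls decay with the holding time, and the controller corrects the imbalance it measured `delay` steps ago.

```python
# Toy discrete-time model of a controller balancing two trunk groups
# from a delayed measurement of the imbalance x (all parameters are
# illustrative choices, not taken from the paper).

def simulate(delay, steps=100, decay=0.9, gain=0.5):
    x = [1.0] + [0.0] * steps        # x[0] is the initial imbalance
    for t in range(steps):
        measured = x[t - delay] if t >= delay else 0.0
        x[t + 1] = decay * x[t] - gain * measured
    return x

no_delay = simulate(delay=0)     # imbalance dies out quickly
with_delay = simulate(delay=5)   # delayed feedback causes oscillation
```

With immediate feedback the imbalance decays monotonically; with the same gain and a feedback delay the controller overcorrects against stale measurements and the imbalance oscillates, which is the qualitative mechanism analyzed in the paper.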
  • 68
    Electronic Resource
    Electronic Resource
    Springer
    Neural processing letters 10 (1999), S. 267-271 
    ISSN: 1573-773X
    Keywords: recurrent neural networks ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract In this paper, we point out that the conditions given in [1] are sufficient but not necessary for the globally asymptotically stable equilibrium of a class of delay differential equations. Instead, we prove that under weaker conditions, the equilibrium is still globally asymptotically stable.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 69
    Electronic Resource
    Electronic Resource
    Springer
    Numerical algorithms 10 (1995), S. 225-244 
    ISSN: 1572-9265
    Keywords: Cholesky factorization error analysis ; Hankel matrix ; least squares ; normal equations ; orthogonal factorization ; QR factorization ; semi-normal equations ; stability ; Toeplitz matrix ; weak stability ; Primary 65F25 ; Secondary 47B35 ; 65F05 ; 65F30 ; 65Y05 ; 65Y10
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract We show that a fast algorithm for the QR factorization of a Toeplitz or Hankel matrix A is weakly stable in the sense that R^T R is close to A^T A. Thus, when the algorithm is used to solve the semi-normal equations R^T Rx = A^T b, we obtain a weakly stable method for the solution of a nonsingular Toeplitz or Hankel linear system Ax = b. The algorithm also applies to the solution of the full-rank Toeplitz or Hankel least squares problem min ||Ax - b||_2.
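    The semi-normal-equations solve that the fast factorization feeds can be sketched generically (using a dense QR in place of the paper's fast Toeplitz algorithm; only R is needed, and Q is never formed):

    ```python
    import numpy as np

    def solve_semi_normal(A, b):
        """Solve A x = b via the semi-normal equations R^T R x = A^T b,
        where R is the triangular factor of A = QR; Q is never formed."""
        R = np.linalg.qr(A, mode='r')              # triangular factor only
        y = np.linalg.solve(R.T, A.T @ b)          # forward substitution
        return np.linalg.solve(R, y)               # back substitution

    # A small nonsingular Toeplitz matrix (constant along each diagonal).
    A = np.array([[4.0, 2.0, 1.0],
                  [1.0, 4.0, 2.0],
                  [0.5, 1.0, 4.0]])
    b = np.array([1.0, 2.0, 3.0])
    x = solve_semi_normal(A, b)
    print(np.allclose(A @ x, b))                   # True
    ```

    Since A^T A = R^T R, the two triangular solves recover the solution of the normal equations without ever forming A^T A explicitly; here plain `np.linalg.solve` stands in for the triangular substitutions.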
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 70
    Electronic Resource
    Electronic Resource
    Springer
    Numerical algorithms 14 (1997), S. 343-359 
    ISSN: 1572-9265
    Keywords: progressive interpolation ; stability ; spline ; shape parameters ; geometric continuity ; 41A05 ; 41A15 ; 65D05 ; 65D07
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract In this paper, we study several interpolating and smoothing methods for data which are known “progressively”. The algorithms proposed are governed by recurrence relations, and our principal goal is to study their stability. A recurrence relation is said to be stable if the spectral radius of the associated matrix is less than one. The iteration matrices depend on shape parameters which come either from the connection at the knots or from the nature of the interpolant between two knots. We obtain various stability domains. Moving the parameters inside these domains leads to interesting shape effects.
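    The stability criterion used here, spectral radius of the iteration matrix below one, is straightforward to check numerically; a small sketch with hypothetical 2×2 iteration matrices (the paper's matrices depend on the shape parameters):

    ```python
    import numpy as np

    def is_stable(M):
        """A linear recurrence x_{k+1} = M x_k is stable iff the spectral
        radius of M (the largest eigenvalue modulus) is less than one."""
        return bool(np.max(np.abs(np.linalg.eigvals(M))) < 1.0)

    # Hypothetical iteration matrices for two shape-parameter choices.
    print(is_stable(np.array([[0.5, 0.2], [0.1, 0.4]])))    # True
    print(is_stable(np.array([[1.1, 0.0], [0.0, 0.3]])))    # False
    ```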
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 71
    Electronic Resource
    Electronic Resource
    Springer
    Numerical algorithms 10 (1995), S. 245-260 
    ISSN: 1572-9265
    Keywords: Multistep methods ; differential-algebraic equations ; stability ; existence and uniqueness ; convergence of iterative method ; 65L06 ; 65L20 ; 65N22
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Mathematics
    Notes: Abstract Multistep methods for differential/algebraic equations (DAEs) of the form $$F_1 (x) = 0, F_2 (x,x',z) = 0$$ are presented, where F_1 maps from ℝ^n to ℝ^r, F_2 from ℝ^n × ℝ^n × ℝ^m to ℝ^s, and r < n ≤ r + s = n + m. Building on the available existence theories, a new form of the multistep method for solutions of (1) is developed. Furthermore, it is shown that this method exhibits none of the typical instabilities that may occur when multistep methods are applied to DAEs in the traditional manner. A proof of the solvability of the multistep system is provided, and an iterative method is developed for solving the resulting nonlinear algebraic equations. Moreover, a proof of the convergence of this iterative method is presented.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 72
    Electronic Resource
    Electronic Resource
    Springer
    Journal of logic, language and information 7 (1998), S. 143-163 
    ISSN: 1572-9583
    Keywords: Belief revision ; consolidation ; coherence ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Linguistics and Literary Studies , Computer Science
    Notes: Abstract The notion of epistemic coherence is interpreted as involving not only consistency but also stability. The problem how to consolidate a belief system, i.e., revise it so that it becomes coherent, is studied axiomatically as well as in terms of set-theoretical constructions. Representation theorems are given for subtractive consolidation (where coherence is obtained by deleting beliefs) and additive consolidation (where coherence is obtained by adding beliefs).
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 73
    Electronic Resource
    Electronic Resource
    Springer
    Journal of computational neuroscience 6 (1999), S. 191-214 
    ISSN: 1573-6873
    Keywords: Neural network ; prefrontal ; reinforcement learning ; striatum ; timing
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science , Medicine , Physics
    Notes: Abstract A neural network model of how dopamine and prefrontal cortex activity guides short- and long-term information processing within the cortico-striatal circuits during reward-related learning of approach behavior is proposed. The model predicts two types of reward-related neuronal responses generated during learning: (1) cell activity signaling errors in the prediction of the expected time of reward delivery and (2) neural activations coding for errors in the prediction of the amount and type of reward or stimulus expectancies. The former type of signal is consistent with the responses of dopaminergic neurons, while the latter signal is consistent with reward expectancy responses reported in the prefrontal cortex. It is shown that a neural network architecture that satisfies the design principles of the adaptive resonance theory of Carpenter and Grossberg (1987) can account for the dopamine responses to novelty, generalization, and discrimination of appetitive and aversive stimuli. These hypotheses are scrutinized via simulations of the model in relation to the delivery of free food outside a task, the timed contingent delivery of appetitive and aversive stimuli, and an asymmetric, instructed delay response task.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 74
    Electronic Resource
    Electronic Resource
    Springer
    International journal of parallel programming 12 (1983), S. 193-209 
    ISSN: 1573-7640
    Keywords: Database ; characteristic frequency ; aggregated model ; decomposition ; stability ; dynamic distribution
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Notes: Abstract A time decomposition technique is suggested for large-database (DB) models. The problem of network aggregation is studied and the results used to create a meaningful decomposed model. Decomposition conditions and assumptions are discussed and illustrated by examples. A practical operating schedule is presented for the time-separated DB model. The schedule uses a sequence of decomposed models, which are to be constructed recursively. The application of the time separation technique for large-DB models is presented in the form of a closed-loop algorithm. The problem of decomposition stability with respect to variations in time constants is considered as well. Two alternative approaches to the problem are suggested. For a probabilistic approach, practical approximate formulas are obtained for subsystem time constants and recommendations are made with respect to the decomposition structure. An approximate performance analysis is done for both standard and time-decomposed models. A comparison of the results is given.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 75
    ISSN: 1436-5057
    Keywords: 65 L 05 ; Rosenbrock-type methods ; quasilinear-implicit differential equations ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Description / Table of Contents: Summary When solving quasilinear-implicit ODEs using Rosenbrock-type methods, stability problems can occur despite good stability properties (A- or L-stability) of the underlying method. These difficulties are due to inaccuracies in the computation of artificially introduced components (transformation into DAEs). The paper investigates the causes of these effects and shows ways to overcome them.
    Notes: Abstract The solution of quasilinear-implicit ODEs using Rosenbrock-type methods may suffer from stability problems despite good stability properties (A- or L-stability) of the underlying method. These problems are caused by the inexact computation of artificially introduced components (transformation to a DAE system). The paper investigates the source of the numerical difficulties and shows modifications to overcome them.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 76
    Electronic Resource
    Electronic Resource
    Springer
    Computing 24 (1980), S. 341-347 
    ISSN: 1436-5057
    Keywords: Numerical analysis ; Volterra integral equations of the second kind ; stability
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Description / Table of Contents: Summary The aim of this paper is to analyse the stability properties of a class of Volterra integral equations of the second kind. Our treatment follows the usual stability analysis, in which the kernel functions belong to a class of test functions restricted in advance. We consider the class of “finitely decomposable” kernels. Stability conditions are derived and compared with the conditions for the simple test equation. It turns out that the new criteria are more restrictive than the conventional conditions. The practical value is tested by numerical experiments with the trapezoidal rule.
    Notes: Abstract The purpose of this paper is to analyse the stability properties of a class of multistep methods for second kind Volterra integral equations. Our approach follows the usual analysis in which the kernel function is a priori restricted to a special class of test functions. We consider the class of finitely decomposable kernels. Stability conditions will be derived and compared with those obtained with the simple test equation. It turns out that the new criteria are more severe than the conventional conditions. The practical value is tested by numerical experiments with the trapezoidal rule.
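    The trapezoidal rule used in the experiments, applied to a second-kind Volterra equation y(t) = g(t) + ∫₀ᵗ K(t,s) y(s) ds, can be sketched as follows (a generic illustration with a simple test kernel, not the paper's test functions):

    ```python
    import numpy as np

    def volterra_trapezoid(g, K, T, n):
        """Trapezoidal rule for y(t) = g(t) + integral_0^t K(t,s) y(s) ds
        on a uniform grid; at each step the implicit value y_i is solved
        for directly since it enters linearly."""
        h = T / n
        t = np.linspace(0.0, T, n + 1)
        y = np.empty(n + 1)
        y[0] = g(t[0])
        for i in range(1, n + 1):
            w = np.full(i + 1, h)            # trapezoid weights on [0, t_i]
            w[0] = w[-1] = h / 2
            s = g(t[i]) + np.dot(w[:-1], K(t[i], t[:i]) * y[:i])
            y[i] = s / (1.0 - w[-1] * K(t[i], t[i]))
        return t, y

    # Test problem with K = 1: y(t) = 1 + integral_0^t y(s) ds, so y = e^t.
    t, y = volterra_trapezoid(lambda t: 1.0,
                              lambda t, s: np.ones_like(s), 1.0, 200)
    print(abs(y[-1] - np.e) < 1e-3)          # True
    ```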
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 77
    Electronic Resource
    Electronic Resource
    Springer
    Computing 57 (1996), S. 281-299 
    ISSN: 1436-5057
    Keywords: 65N15 ; 65N99 ; 35A40 ; Finite volume method ; box scheme ; stability ; error estimates
    Source: Springer Online Journal Archives 1860-2000
    Topics: Computer Science
    Description / Table of Contents: Summary A box method with quadratic basis functions for the discretisation of elliptic boundary value problems is presented. The resulting discretisation matrix is non-symmetric. The stability analysis is based on an elementwise estimate of the scalar product 〈A_h u_h, u_h〉. Sufficient conditions on the geometry of the triangles of the triangulation lead to discrete ellipticity. Under these assumptions an O(h²) error estimate is proved.
    Notes: Abstract The paper presents a box scheme with quadratic basis functions for the discretisation of elliptic boundary value problems. The resulting discretisation matrix is non-symmetric (and also not an M-matrix). The stability analysis is based on an elementwise estimation of the scalar product 〈A_h u_h, u_h〉. Sufficient conditions placed on the triangles of the triangulation lead to discrete ellipticity. Proof of an O(h²) error estimate is given under these conditions.
    Type of Medium: Electronic Resource
    Location Call Number Expected Availability
    BibTip Others were also interested in ...