Stable Adaptive Control Using New Critic Designs

- misc -

Year
1998

Abstract
Classical adaptive control proves total-system stability for control of linear plants, but only for plants meeting very restrictive assumptions. Approximate Dynamic Programming (ADP) has the potential, in principle, to ensure stability without such tight restrictions. It also offers nonlinear and neural extensions for optimal control, with empirically supported links to what is seen in the brain. However, the relevant ADP methods in use today -- TD, HDP, DHP, GDHP -- and the Galerkin-based versions of these all have serious limitations when used here as parallel distributed real-time learning systems; either they do not possess quadratic unconditional stability (to be defined) or they lead to incorrect results in the stochastic case. (ADAC or Q-learning designs do not help.) After explaining these conclusions, this paper describes new ADP designs which overcome these limitations. It also addresses the Generalized Moving Target problem, a common family of static optimization problems, and describes a way to stabilize large-scale economic equilibrium models, such as the old long-term energy model of DOE.

BibTeX
@MISC{Werbos1998SAC,
       title = "Stable Adaptive Control Using New Critic Designs",
       author = "Paul J. Werbos",
       note = "\url{http://arxiv.org/abs/adap-org/9810001}",
       abstract = "Classical adaptive control proves total-system stability for control of linear
         plants, but only for plants meeting very restrictive assumptions. Approximate Dynamic Programming
         (ADP) has the potential, in principle, to ensure stability without such tight restrictions. It also
         offers nonlinear and neural extensions for optimal control, with empirically supported links to what
         is seen in the brain. However, the relevant ADP methods in use today -- TD, HDP, DHP, GDHP -- and
         the Galerkin-based versions of these all have serious limitations when used here as parallel
         distributed real-time learning systems; either they do not possess quadratic unconditional stability
         (to be defined) or they lead to incorrect results in the stochastic case. (ADAC or Q-learning
         designs do not help.) After explaining these conclusions, this paper describes new ADP designs which
         overcome these limitations. It also addresses the Generalized Moving Target problem, a common family
         of static optimization problems, and describes a way to stabilize large-scale economic equilibrium
         models, such as the old long-term energy model of DOE.",
       oai = "oai:arXiv.org:adap-org/9810001",
       subject = "Adaptation and Self-Organizing Systems",
       URL = "http://arxiv.org/abs/adap-org/9810001",
       year = "1998"
}
