Abstract
This paper uses a superharmonic property of a sequence of functions generated by an algorithm to show that these functions converge, in a non-increasing manner, to the optimal value function for the problem. Bounds are given for the loss of optimality if the computational process is terminated at any iteration. The basic procedure is to add a linear term at each iteration, selected by solving a particular optimisation problem, for which primal and dual linear programming formulations are given.
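The role of the superharmonic property can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a tiny, fully observable, discounted, reward-maximising Markov decision process with made-up data (`BETA`, `REWARD`, `TRANS` and all function names are hypothetical). It shows only the standard facts that a function u with u >= Tu, where T is the optimality operator, yields a non-increasing sequence of iterates above the optimal value function, and that the gap u - Tu gives a bound when iteration is stopped early.

```python
# Illustrative only: hypothetical two-state, two-action discounted MDP.
BETA = 0.9  # assumed discount factor
REWARD = [[1.0, 0.0], [0.0, 2.0]]  # REWARD[s][a], made-up numbers
TRANS = [                          # TRANS[s][a][s2], made-up probabilities
    [[0.8, 0.2], [0.1, 0.9]],
    [[0.5, 0.5], [0.3, 0.7]],
]


def bellman(u):
    """Apply T: (Tu)(s) = max_a [ r(s,a) + BETA * sum_s2 p(s2|s,a) * u(s2) ]."""
    return [
        max(
            REWARD[s][a]
            + BETA * sum(p * u[s2] for s2, p in enumerate(TRANS[s][a]))
            for a in range(len(REWARD[s]))
        )
        for s in range(len(REWARD))
    ]


def iterate_from_superharmonic(n_iter=200):
    """Iterate T starting from a constant function that satisfies u >= Tu."""
    r_max = max(max(row) for row in REWARD)
    u = [r_max / (1.0 - BETA)] * len(REWARD)  # superharmonic starting point
    history = [u]
    for _ in range(n_iter):
        u = bellman(u)
        history.append(u)
    return history


def bracket(u):
    """Given superharmonic u, return (lower, upper) bounds on the optimal value.

    With gap = max_s (u - Tu)(s), the function u - gap / (1 - BETA) satisfies
    w <= Tw, so it lies below the optimal value, while u itself lies above it.
    """
    tu = bellman(u)
    gap = max(ui - ti for ui, ti in zip(u, tu))
    return [ui - gap / (1.0 - BETA) for ui in u], list(u)
```

Calling `bracket(history[n])` for any iterate returns an interval that contains the optimal value function, which is the kind of early-termination guarantee the abstract refers to, here in a much simpler setting than the partially observable one treated in the paper.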
Cite this article
White, D.J. A superharmonic approach to solving infinite horizon partially observable Markov decision problems. ZOR - Methods and Models of Operations Research 41, 71–88 (1995). https://doi.org/10.1007/BF01415066
DOI: https://doi.org/10.1007/BF01415066