Negatively isotone optimal policies for random walk type Markov decision processes

White, D. J.

doi:10.1007/BF01837023

Negatively isotone optimal policies for random walk type Markov decision processes

Theoretical Papers
Published: March 1982

Volume 4, pages 41–45, (1982)
Cite this article

Operations-Research-Spektrum Aims and scope Submit manuscript

D. J. White¹

24 Accesses
7 Citations
Explore all metrics

Summary

This paper considers a random walk type Markov decision process in which the state spaceI is an integer subset of IR^m, and the action spaceK is independent ofi εI. The natural order

, overI, and a quasi order,

′, overK, is assumed, together with aconditional convexity assumption on the returns {r ^k_i }, and certain other assumptions about these rewards and the transition probabilities in relationship to the orders

and

′.A negatively isotone policy is one for whichi

i′→δ(i)⊁′)δ(i′) (i.e.δ(i)

′δ(i)′ orδ(i′)

′δi)). It is shown that, under specified conditions, a negatively isotone optimal policy exists. Some consideration is given to computational implications in particular relationship to Howard's policy space method.

Zusammenfassung

Wir betrachten einen Markovschen Entscheidungsprozeß vom random walk Typ. Der ZustandsraumI sei eine Teilmenge des IR^m, wobeii εI ganzzahlige Komponenten habe. Die MengeK der zulässigen Aktionen ini εI sei unabhängig voni εI. Sei

die natürliche Ordnung aufI und

′ sei eine Quasiordnung aufK. Die Erträge {r ^k_i }seienbedingt konvex, darüberhinaus seien weitere Voraussetzungen über diese Erträge und die Übergangswahrscheinlichkeiten in Bezug auf die Ordnungen

und

′ erfüllt. Eine Politik δ heißt negativ isoton, falls ausi

i′ folgtδi⊁′δ(i′) (d. h.δ(i)

′δ(i)′ oderδ(i)′

′δ(i)′). Wir zeigen, daß unter gewissen Voraussetzungen einenegativ isotone optimale Politik existiert: Auch diskutieren wir einige Folgerungen für die Numerik, insbesondere hinsichtlich Howards Politikiteration.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

White DJ (1977) Kernels of preference structures. Econometrica 45:91–100
Google Scholar
Porteus EL (1975) On the optimality of structured policies in countable stage decision processes. Manag Sci 22:148–157
Google Scholar
White CC (1980) The optimality of isotone strategies for Markov decision processes with utility criterion. In: Hartley R, Thomas LC, White DJ (eds) Recent developments in Markov decision processes. Academic Press, London, pp 261–275
Google Scholar
White DJ (1981) Isotone optimal policies for structured Markov decision processes. Eur J Oper Res 7:392–402
Google Scholar
Kalin D (1978) A note on monotone optimal policies for Markov decision processes. Math Prog 15:220–222
Google Scholar
Serfozo RF (1976) Monotone optimal policies for Markov decision processes. Math Prog Study 6:202–215
Google Scholar
Howard RA (1960) Dynamic programming and Markov processes. J Wiley, New York
Google Scholar
Tijms H (1980) An algorithm for average costs denumerable state semi-Markov decision problems with applications to controlled production and queueing systems. In: Hartley R, Thomas LC, White DJ (eds) Recent developments in Markov decision processes. Academic Press, London, pp 143–179
Google Scholar
Bellman R (1957) Dynamic programming. Princeton University Press, New Jersey
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Decision Theory, University of Manchester, Dover Street, M13 9PL, Manchester, England
D. J. White

Authors

D. J. White
View author publications
You can also search for this author in PubMed Google Scholar

Rights and permissions

Reprints and permissions

About this article

Cite this article

White, D.J. Negatively isotone optimal policies for random walk type Markov decision processes. OR Spektrum 4, 41–45 (1982). https://doi.org/10.1007/BF01837023

Download citation

Received: 30 January 1981
Accepted: 26 October 1981
Issue Date: March 1982
DOI: https://doi.org/10.1007/BF01837023

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Negatively isotone optimal policies for random walk type Markov decision processes

Summary

Zusammenfassung

Access this article

Similar content being viewed by others

Closed-form expressions of the run-length distribution of the nonparametric double sampling precedence monitoring scheme

Existence and Uniqueness of Quasi-stationary Distributions for Symmetric Markov Processes with Tightness Property

Non-convex scenario optimization

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Negatively isotone optimal policies for random walk type Markov decision processes

Summary

Zusammenfassung

Access this article

Similar content being viewed by others

Closed-form expressions of the run-length distribution of the nonparametric double sampling precedence monitoring scheme

Existence and Uniqueness of Quasi-stationary Distributions for Symmetric Markov Processes with Tightness Property

Non-convex scenario optimization

References

Author information

Authors and Affiliations

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation