Negatively isotone optimal policies for random walk type Markov decision processes

  • Theoretical Papers
  • Published in: Operations-Research-Spektrum

Summary

This paper considers a random walk type Markov decision process in which the state space I is an integer subset of ℝ^m, and the action space K is independent of i ∈ I. The natural order ≦ over I and a quasi-order ≦′ over K are assumed, together with a conditional convexity assumption on the returns {r_i^k}, and certain other assumptions about these returns and the transition probabilities in relation to the orders ≦ and ≦′. A negatively isotone policy is one for which i ≦ i′ implies δ(i) ⊁′ δ(i′) (i.e. δ(i) ≦′ δ(i′) or δ(i′) ≦′ δ(i)). It is shown that, under specified conditions, a negatively isotone optimal policy exists. Some consideration is given to computational implications, in particular in relation to Howard's policy space method.

Zusammenfassung

We consider a Markov decision process of random walk type. The state space I is a subset of ℝ^m whose elements i ∈ I have integer components. The set K of admissible actions is independent of i ∈ I. Let ≦ be the natural order on I and ≦′ a quasi-order on K. The returns {r_i^k} are assumed to be conditionally convex; in addition, further assumptions are made on these returns and the transition probabilities with respect to the orders ≦ and ≦′. A policy δ is called negatively isotone if i ≦ i′ implies δ(i) ⊁′ δ(i′) (i.e. δ(i) ≦′ δ(i′) or δ(i′) ≦′ δ(i)). We show that under certain assumptions a negatively isotone optimal policy exists. We also discuss some computational consequences, in particular with regard to Howard's policy iteration.


References

  1. White DJ (1977) Kernels of preference structures. Econometrica 45:91–100
  2. Porteus EL (1975) On the optimality of structured policies in countable stage decision processes. Manag Sci 22:148–157
  3. White CC (1980) The optimality of isotone strategies for Markov decision processes with utility criterion. In: Hartley R, Thomas LC, White DJ (eds) Recent developments in Markov decision processes. Academic Press, London, pp 261–275
  4. White DJ (1981) Isotone optimal policies for structured Markov decision processes. Eur J Oper Res 7:392–402
  5. Kalin D (1978) A note on monotone optimal policies for Markov decision processes. Math Prog 15:220–222
  6. Serfozo RF (1976) Monotone optimal policies for Markov decision processes. Math Prog Study 6:202–215
  7. Howard RA (1960) Dynamic programming and Markov processes. J Wiley, New York
  8. Tijms H (1980) An algorithm for average costs denumerable state semi-Markov decision problems with applications to controlled production and queueing systems. In: Hartley R, Thomas LC, White DJ (eds) Recent developments in Markov decision processes. Academic Press, London, pp 143–179
  9. Bellman R (1957) Dynamic programming. Princeton University Press, New Jersey


Cite this article

White, D.J. Negatively isotone optimal policies for random walk type Markov decision processes. OR Spektrum 4, 41–45 (1982). https://doi.org/10.1007/BF01837023
