Complexity of finite-horizon Markov decision process problems

Martin Mundhenk, Judy Goldsmith, Christopher Lusena, Eric Allender

Research output: Contribution to journal › Article › peer-review

128 Scopus citations

Abstract

Controlled stochastic systems occur in science, engineering, manufacturing, social sciences, and many other contexts. If the system is modeled as a Markov decision process (MDP) and will run ad infinitum, the optimal control policy can be computed in polynomial time using linear programming. The problems considered here assume that the time that the process will run is finite and depends on the size of the input. There are many factors that compound the complexity of computing the optimal policy. For instance, if the controller does not have complete information about the state of the system, or if the system is represented in some very succinct manner, the optimal policy is provably not computable in time polynomial in the size of the input. We analyze the computational complexity of evaluating policies and of determining whether a sufficiently good policy exists for an MDP, based on a number of confounding factors, including the observability of the system state, the succinctness of the representation, the type of policy, and even the number of actions relative to the number of states. In almost every case, we show that the decision problem is complete for some known complexity class. Some of these results are familiar from work by Papadimitriou and Tsitsiklis and others, but some, such as our PL-completeness proofs, are surprising. We include proofs of completeness for natural problems in the as yet little-studied class NP^PP.
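To make the two problem types in the abstract concrete, the following is a minimal sketch of policy evaluation and of computing the best achievable value over a finite horizon. It covers only the easy baseline case: a fully observable MDP whose transition table is written out explicitly. It is not the paper's construction and it does not handle partial observability or succinct representations. The toy MDP, the dictionary layout, and the function names `evaluate_policy` and `optimal_value` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed layout (not from the paper):
#   transition[s][a] is a list of (next_state, probability) pairs
#   reward[s][a] is the immediate reward for taking action a in state s
Transition = Dict[int, Dict[str, List[Tuple[int, float]]]]
Reward = Dict[int, Dict[str, float]]


def evaluate_policy(transition: Transition, reward: Reward,
                    policy: Dict[int, str], start: int, horizon: int) -> float:
    """Expected total reward of a stationary policy run for `horizon` steps."""
    values = {s: 0.0 for s in transition}  # value with zero steps remaining
    for _ in range(horizon):
        # One backward step: immediate reward plus expected remaining value.
        values = {
            s: reward[s][policy[s]]
            + sum(p * values[t] for t, p in transition[s][policy[s]])
            for s in transition
        }
    return values[start]


def optimal_value(transition: Transition, reward: Reward,
                  start: int, horizon: int) -> float:
    """Best expected total reward over `horizon` steps (backward induction).

    The maximizing action may differ from step to step, so this is the value
    of the best time-dependent policy, not of the best stationary one.
    """
    values = {s: 0.0 for s in transition}
    for _ in range(horizon):
        values = {
            s: max(
                reward[s][a] + sum(p * values[t] for t, p in transition[s][a])
                for a in transition[s]
            )
            for s in transition
        }
    return values[start]


# Hypothetical two-state MDP used only for illustration.
TOY_TRANSITION: Transition = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.5), (0, 0.5)]},
    1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]},
}
TOY_REWARD: Reward = {
    0: {"stay": 0.0, "go": -1.0},
    1: {"stay": 2.0, "go": 0.0},
}

if __name__ == "__main__":
    policy = {0: "go", 1: "stay"}
    print(evaluate_policy(TOY_TRANSITION, TOY_REWARD, policy, start=0, horizon=2))
    print(optimal_value(TOY_TRANSITION, TOY_REWARD, start=0, horizon=2))
    print(optimal_value(TOY_TRANSITION, TOY_REWARD, start=0, horizon=3))
```

In this toy example the best achievable value from state 0 changes when the horizon grows from 2 to 3, which illustrates why the horizon is part of the problem input. Both functions run in time polynomial in the size of the explicit table and the horizon; the factors the abstract lists (partial observability, succinct representations) are precisely the cases this simple approach does not cover.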

Original language: English
Pages (from-to): 681-720
Number of pages: 40
Journal: Journal of the ACM
Volume: 47
Issue number: 4
State: Published - 2000

Keywords

  • Computational complexity
  • Markov decision processes
  • NP
  • NP^PP
  • PL
  • PSPACE
  • Partially observable Markov decision processes
  • Succinct representations

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Information Systems
  • Hardware and Architecture
  • Artificial Intelligence
