Operations Research
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH
 QUICK SEARCH:   [advanced]


     


OPERATIONS RESEARCH,
Published online in Articles in Advance, July 29, 2009
DOI: 10.1287/opre.1090.0697
This Article
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Google Scholar
Right arrow Articles by Zhang, H.

Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Hao Zhang

Marshall School of Business, University of Southern California, Los Angeles, California 90089
zhanghao{at}usc.edu

This paper presents a novel framework for studying partially observable Markov decision processes (POMDPs) with finite state, action, observation sets, and discounted rewards. The new framework is solely based on future-reward vectors associated with future policies, which is more parsimonious than the traditional framework based on belief vectors. It reveals the connection between the POMDP problem and two computational geometry problems, i.e., finding the vertices of a convex hull and finding the Minkowski sum of convex polytopes, which can help solve the POMDP problem more efficiently. The new framework can clarify some existing algorithms over both finite and infinite horizons and shed new light on them. It also facilitates the comparison of POMDPs with respect to their degree of observability, as a useful structural result.

Subject classifications: dynamic programming; Markov; analysis of algorithms; computational complexity; mathematics; combinatorics; computers/computer science; artificial intelligence.
History: Received January 2008; revision received July 2008; accepted November 2008.







HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH
Copyright © 2009 by INFORMS.