Simplified POMDP Algorithms with Performance Guarantees
Autonomous agents operating in real-world scenarios frequently encounter uncertainty and must make decisions based on incomplete information. This challenge can be formulated mathematically as a partially observable Markov decision process (POMDP). While POMDPs offer a robust framework for planning under uncertainty, computing an optimal plan is computationally intensive and feasible only for relatively simple tasks. In response, the last two decades have witnessed the rise of approximate algorithms, such as tree-search and sampling-based approaches, as leading solutions for more complex POMDP problems. Despite their effectiveness, these algorithms typically offer only probabilistic guarantees or, in some cases, no formal guarantees at all. In our research, we address these limitations by developing a range of simplified algorithms with formal, deterministic guarantees. These simplified algorithms operate on a selected subset of the state and observation spaces considered by state-of-the-art algorithms, while providing mathematical guarantees and computational savings relative to their non-simplified counterparts. Initially, we focused on a belief-dependent reward framework, simplifying the reward calculation by narrowing the observation space. We then applied a simplification to the state space in the context of hybrid-belief, data-association-aware POMDPs, whose state space may otherwise grow exponentially. Finally, we extended our approach to a general POMDP framework, simplifying both the state and observation spaces while providing deterministic guarantees with respect to the optimal solution.
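To make the notion of simplification concrete, the following minimal sketch (not taken from this work; the particle representation, the entropy-based reward, and all names are illustrative assumptions) evaluates a belief-dependent reward once over a full particle set and once over a sampled subset of the states. The resulting gap between the two evaluations is the kind of quantity that deterministic guarantees of a simplified algorithm would bound.

```python
import numpy as np

# Illustrative sketch only: a particle belief over discrete states and a
# belief-dependent reward (negative entropy), evaluated on the full particle
# set and on a sampled subset of the states (the "simplified" evaluation).
rng = np.random.default_rng(0)

def belief_entropy(states, weights, n_states):
    """Entropy of the categorical belief induced by weighted particles."""
    p = np.bincount(states, weights=weights, minlength=n_states)
    p = p / p.sum()
    nz = p[p > 0]
    return -np.sum(nz * np.log(nz))

n_states = 50
particles = rng.integers(0, n_states, size=10_000)   # full particle set
weights = rng.random(10_000)                          # unnormalized weights

# Full (expensive) evaluation of the belief-dependent reward.
r_full = -belief_entropy(particles, weights, n_states)

# Simplified evaluation on a resampled subset of the particles.
idx = rng.choice(len(particles), size=500, replace=False,
                 p=weights / weights.sum())
r_simpl = -belief_entropy(particles[idx], np.ones(len(idx)), n_states)

print(f"full: {r_full:.3f}, simplified: {r_simpl:.3f}, "
      f"gap: {abs(r_full - r_simpl):.3f}")
```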