Novel Class of Expected Value Bounds and Applications in Belief Space Planning
Planning under uncertainty in partially observable domains, often formulated as POMDPs, is an exceedingly difficult problem. Finding the globally optimal solution is intractable for all but the smallest problems as it requires reasoning about realization of the many random variables of the problem. Thus, tractable bounds with formal guarantees are attractive alternative to finding a globally optimal solution. In this paper, motivated by this line of reasoning, we formulate and prove novel probability theory bounds. First, we bound the expectation with respect to the partial expectation (seen to be directly proportional to the conditional expectation) and show that this is a generalization of the Markov inequality. Second, by merging our novel inequalities with Hoeffding’s inequality, we compose an additional novel bound, which allows for bounding expectations with respect to estimators of partial expectations. Finally, we apply these bounds to the context of planning; we prove bounds on the general value function with respect to the partial observation space. We then bound the conditional entropy with respect to the partial observation space and finally, with the use of our novel bounds, leverage the structure of beliefs in POMDPs to allow for reuse in calculations when eliminating certain realizations of the belief topology.
The work is towards M.Sc. degree under the supervision of Associate Professor Vadim Indelman, Department of Aerospace Engineering, Technion – Israel Institute of Technology.