Some Challenges in Planning for Time-Constrained, Dynamic, Multi-Agent, Stochastic Task Environments
For agents operating in uncertain task environments, planning based on Markov Decision Process models can in principle lead to optimal expected performance by finding policies that account for all possible eventualities. Unfortunately, in some task environments, the agents might not have enough time to converge on policies before they must begin taking action. In this talk, I will describe work that we have done, in collaboration with a research team led by Honeywell Labs, on using (and some might argue abusing) Markov Decision Process models for planning activities in time-constrained, dynamically changing, multi-agent, stochastic task environments.
Among the questions we have faced are: What partial policy should an agent form when it has to start executing the policy before it is complete? How can an agent improve its partial policy while in the midst of executing it? What should an agent do if it reaches a state for which its partial policy has no action, or which the policy does not even include? How should an agent respond to new information about its actions and events that could undermine the model underlying the (partial) policy it is in the midst of executing? And how should agents whose local actions can affect each other coordinate their partial policies, as well as their reasoning efforts about their policies?
I'll conclude the talk by describing how some of the insights from the above work are influencing our ongoing investigations (and potential future investigations for new students) into techniques for interaction planning in service-oriented computing environments, for coordinating the evolving schedules of humans, and for controlling the activities of unmanned vehicles in populated worlds.