While autonomous agents promise automated assistance for human teamwork, the usefulness of that assistance is limited by an agent's ability to understand the people it aims to help. No matter how well defined and tightly scoped a task is, people will always bring their diverse preferences, experiences, and biases to bear, even when acting as a team. An agent that can neither represent the heterogeneity across the team it must help nor infer the subjective frame of reference of individual team members is doomed to a one-size-fits-all policy of assistance that is unlikely to maximize task performance. In this work, we present a methodology for first capturing the diversity across different people within an agent's initial hypothesis space and then using observations of a specific person's behavior to update posterior beliefs within that hypothesis space. For the former, we apply inverse reinforcement learning (IRL) to a sample of human behavior at a task and extract reward functions that best explain that behavior. For the latter, the agent employs a Bayesian Theory of Mind: it recursively applies Partially Observable Markov Decision Processes (POMDPs), parameterized by the IRL output, to form beliefs about the type of person it is currently observing. We evaluate the effectiveness of this methodology on a search-and-rescue task in Minecraft and show how different levels of granularity in the hypothesis space affect the accuracy of the agent's inferences about the human players and teams.
Keywords
AI, BEHAVIOR MODELING, HUMAN PERFORMANCE
Additional Keywords
Search and rescue, Inverse reinforcement learning
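The Bayesian belief update sketched in the abstract can be illustrated with a minimal example. This is a hypothetical sketch, not the paper's implementation: it assumes each IRL-extracted reward function induces action values (`q_by_type`) for a "player type", that observed actions follow a Boltzmann (softmax) policy under each type, and that the agent maintains a posterior over types updated by Bayes' rule after each observed action. The two example types, their Q-values, and the action sequence are invented for illustration.

```python
import math

def softmax_policy(q_values, beta=2.0):
    """Boltzmann action distribution for one player type (beta = assumed rationality)."""
    exps = [math.exp(beta * q) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def update_beliefs(prior, q_by_type, observed_action):
    """One Bayesian update: P(type | a) proportional to P(a | type) * P(type)."""
    likelihoods = [softmax_policy(q)[observed_action] for q in q_by_type]
    unnorm = [p * lik for p, lik in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Two hypothetical types: one favors rescuing victims (action 0),
# one favors exploring rooms (action 1).
q_by_type = [[1.0, 0.2], [0.1, 1.2]]
beliefs = [0.5, 0.5]                 # uniform prior over types
for action in [0, 0, 1, 0]:         # observed action sequence
    beliefs = update_beliefs(beliefs, q_by_type, action)
print(beliefs)                       # posterior shifts toward the rescue-focused type
```

Mostly-rescue observations drive the posterior toward the first type; the granularity of the hypothesis space in the paper corresponds to how many such types (reward functions) the agent entertains.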