Main Subcommittee Category: Emerging Concepts and Innovative Technologies
How would you label your submission?: Industry
Abstract:
Artificial intelligence (AI)–based systems show great promise for supporting complex decision-making and planning. AI systems can consider an option space far larger than current human processes can cover. Notably, AI systems, particularly those based on deep reinforcement learning (DRL), have achieved expert-level play in strategy games, generating innovative strategies by exploring numerous courses of action (COAs) to make successful strategic choices in complex scenarios. The opportunity exists for AI systems to help human planning staffs construct more high-quality plans, analyze plan strengths and weaknesses more deeply, and explore a larger number of plan alternatives in a fixed planning time. AI-based modeling and simulation could accelerate COA planning activities that require reasoning across multiple, interconnected domains. Such planning support must consider interacting effects across physical domains (e.g., air, land, sea, undersea) and interacting support functions (e.g., logistics, communications), a massive action space that exceeds the one considered by state-of-the-art game-playing systems. This paper reports on a new DRL approach, called neural program policies (NPPs), which incorporates structure and domain-specific information into policies described by deep neural networks to vastly reduce the action space to a learned, compact, and meaningful set. We first describe the policy domain-specific language (DSL) that abstracts the actions and observations employed by a deep reinforcement learner. Then, we apply NPPs to two problem categories: control problem benchmarks provided by OpenAI Gym and multi-domain reasoning using a StarCraft II simulation extended with sea and undersea domains and novel support functions.
We conclude with results for (1) generating multi-domain COA traces and (2) NPP-based agent performance (e.g., running a surrogate model ~10,000x faster than real time, vastly reducing the action space, and enabling the interpretation of AI-generated COA traces). The resulting system supports a human-AI team paradigm to increase the number and quality of multi-domain plans considered.
Keywords: AI;DEEP LEARNING
Additional Keywords not in the list above: Course of Action Generation, Military Decision Support, Domain Specific Language