Organizations of all sizes and across many industries face an increasing risk from cyberattacks. A deeper understanding of these attacks can support the creation of more effective strategies to mitigate them. Petri Nets with Players, Strategies, and Costs (PNPSC) is an extension of Petri nets specifically designed to model cyberattacks. This formalism has been the basis for a long-running research program consisting of several interconnected research projects. Projects within that program include automatically generating PNPSC nets from the MITRE Common Attack Pattern Enumeration and Classification (CAPEC) database of cyberattack patterns, verifying and validating the models using several complementary methods, composing multiple PNPSC nets into models of realistic computer systems, and using machine learning to improve the strategies of the players present in the formalism. Previously, the strategies employed by these players could select from only a limited number of rates for each transition. This paper focuses on extending the machine learning of these player strategies to use continuous transition rates. Two deep learning algorithms, Deep Q-Network (DQN) and Proximal Policy Optimization (PPO), make use of function approximation to effectively improve player strategies when continuous rates are present. Continuous transition rates allow the real-world decisions represented by the transitions in the models to be expressed with finer granularity. The training performance of these algorithms and the effectiveness of the resulting strategies are compared with those from previous research that used Monte Carlo reinforcement learning.
Keywords
CONCEPTUAL MODELING, CYBER, DEEP LEARNING, DISCRETE EVENT SIMULATION, MACHINE LEARNING
Additional Keywords
Petri nets