Abstract
We investigate whether team performance scores can be automatically predicted from team dialogue features. We use transcriptions of U.S. Navy military training exercises designed to improve decision-making under stress. Subject matter experts scored these exercises on various team performance indicators, such as situation updates, error correction, brevity, and clarity.
In previous work (Georgila et al., I/ITSEC 2024), we presented a dialogue act annotation scheme for this dataset and experiments on automatic dialogue act labeling. Dialogue acts indicate the main purpose of an utterance, e.g., request-information, provide-information, suggestion, command. Here, we continue this work and focus on predicting team performance from manually and automatically extracted dialogue features. To give our models more informative features than dialogue acts alone, we develop a novel annotation scheme that captures lower-level task coordination by marking the initiation and resolution points of events such as commands, suggestions, and requests. Initiating one event can trigger the initiation of another, and so on, producing nested events. These nested events can show how commands, suggestions, and requests propagate down the chain of command (from higher levels to lower levels), while their resolutions propagate back up (from lower levels to higher levels). We use machine learning to build team performance prediction models that outperform baselines for each of the 11 team performance indicators, and we report results for different machine learning methods and dialogue features.
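As an illustration of the nested initiation and resolution structure described above, the following is a minimal sketch, not the annotation format or tooling used in this work. It assumes each annotated mark is a tuple of an action ("init" or "resolve"), an event identifier, an event kind, and a speaker; these field names, the labels, and the speaker names are all hypothetical. The nesting depth of an event is taken to be the number of events still open when it is initiated.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: str
    kind: str          # hypothetical label, e.g. "command", "suggestion", "request"
    initiator: str     # hypothetical speaker name
    depth: int         # number of events still open when this one was initiated
    resolved: bool = False


def nest_events(marks):
    """Track initiation/resolution marks given in dialogue order.

    marks: iterable of (action, event_id, kind, speaker) tuples, where
    action is "init" or "resolve" (an assumed encoding for illustration).
    Returns a dict mapping event_id to Event.
    """
    open_events = []   # ids of events initiated but not yet resolved
    events = {}
    for action, event_id, kind, speaker in marks:
        if action == "init":
            events[event_id] = Event(event_id, kind, speaker, depth=len(open_events))
            open_events.append(event_id)
        elif action == "resolve":
            event = events.get(event_id)
            if event is not None and not event.resolved:
                event.resolved = True
                open_events.remove(event_id)
    return events


# Hypothetical exchange: a command is passed down one level, then
# resolved from the lower level back up to the higher level.
marks = [
    ("init", "e1", "command", "senior"),
    ("init", "e2", "command", "junior"),
    ("resolve", "e2", "command", "operator"),
    ("resolve", "e1", "command", "junior"),
]
events = nest_events(marks)
max_depth = max(e.depth for e in events.values())
unresolved = sum(not e.resolved for e in events.values())
```

Quantities such as `max_depth` and `unresolved` are examples of the kind of summary values that could be derived from such annotations; whether they match the features used in the experiments is not stated here.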
Our contributions are as follows: (1) we develop a novel annotation scheme for lower-level task coordination (initiations and resolutions of events); (2) we use machine learning to predict team performance from both manually annotated and automatically extracted dialogue features; (3) we perform our experiments on real-world data from recorded training exercises; and (4) we advance the state of natural language dialogue processing as a means to understand and predict team performance.