The Air Force Research Laboratory (AFRL) in Mesa, AZ, has an ongoing program of research on training in a 4-ship F-16 Distributed Mission Training (DMT) system. Typically, a team of four pilots comes to the laboratory for a week-long training exercise. They fly together as a 4-ship team on several missions designed to provide exposure to a range of combat scenarios. The missions involve extensive briefing and debriefing sessions in addition to the time in the simulators. Several inter-related projects are underway to assess the effectiveness of the training and to compare different training methods. In the project reported here, we assessed changes in the ways pilots understand important concepts related to the training. Pilots rated the relatedness of all pairs of 21 concepts from the domain of air-to-air engagements both before and after training. A measure of the internal consistency of the ratings (Coherence) and Pathfinder networks were derived from the ratings. Data from a group of the most experienced pilots (experts) provided a point of reference for the less-experienced pilots. At the beginning of the week, Coherence was significantly correlated with previous experience in fighter aircraft, suggesting that providing consistent ratings depends on having a well-developed mental model of the domain. Also, there was a significant correlation between experience level and similarity to the expert reference group at the beginning of the week, supporting the general validity of the measurement methods. There was a significant negative correlation between experience level and change in similarity to experts from pre- to post-training ratings: greater changes were found for the least-experienced pilots. As a result of these changes, correlations with prior experience level were no longer statistically significant at the end of the week.
Further analyses on a group of the least-experienced pilots (novices) led to similar conclusions. In particular, there was a significant difference in mean Coherence between the experts and novices at the beginning of the week but not at the end of the week. Also, novices showed significant pre- to post-training increases in Coherence and in similarity to experts. These measurement methods appear to provide a basis for evaluating conceptual change and should prove to be a useful adjunct to performance-based methods of assessing training.
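
As an illustration of the network measures described above, the following is a minimal sketch of how a Pathfinder network and a similarity-to-expert score might be computed from pairwise ratings. The Pathfinder parameters and the similarity formula are not given here, so the sketch assumes the commonly used PFNET(r = infinity, q = n - 1) variant and defines similarity as shared links divided by the union of links. The function names and the small 4-concept distance matrices are hypothetical and are not data from the study (the study itself used 21 concepts, that is, 21 x 20 / 2 = 210 pairs per rating session). Ratings are assumed to have already been converted to distances, with smaller values meaning more related.

```python
from itertools import combinations


def pathfinder_links(dist):
    """Return the undirected link set of PFNET(r=inf, q=n-1) (assumed variant).

    dist is a symmetric matrix of distances (lower = more related).
    """
    n = len(dist)
    # minimax[i][j]: smallest achievable "largest single step" on any path i -> j
    minimax = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(minimax[i][k], minimax[k][j])
                if via_k < minimax[i][j]:
                    minimax[i][j] = via_k
    # Keep a direct link only if no indirect path has a smaller maximum step
    return {
        (i, j)
        for i, j in combinations(range(n), 2)
        if dist[i][j] <= minimax[i][j]
    }


def link_similarity(links_a, links_b):
    """Shared links divided by all distinct links in either network (assumed)."""
    union = links_a | links_b
    return len(links_a & links_b) / len(union) if union else 1.0


# Hypothetical distance matrices for four concepts (illustration only)
expert = [
    [0, 1, 2, 5],
    [1, 0, 1, 4],
    [2, 1, 0, 1],
    [5, 4, 1, 0],
]
novice = [
    [0, 1, 3, 2],
    [1, 0, 4, 3],
    [3, 4, 0, 1],
    [2, 3, 1, 0],
]

expert_links = pathfinder_links(expert)
novice_links = pathfinder_links(novice)
similarity = link_similarity(novice_links, expert_links)
```

Under these assumptions, computing the same similarity score from a pilot's pre-training and post-training ratings would give one way to quantify change in similarity to the expert reference network.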