In recent years, 3D sensors have become increasingly common, along with algorithms that integrate their measurements over time to produce detailed, high-fidelity 3D models of both indoor and outdoor scenes. As large-scale 3D models become easier and cheaper to produce yet remain prohibitively large and cumbersome, the emphasis has been gradually shifting from model production to the effective storage, transfer, visualization, and processing of these models, as well as their ease of use when a human agent interacts with them. To this end, we propose a novel, fully automated system that reasons about the distinct components of a 3D scene and the contextual interactions between them in order to better understand the scene contents and segment the scene into semantic categories of interest. Imbuing existing 3D models with such semantic attributes is an important first step in the broader 3D scene understanding problem: it enables automatic identification of different objects, parts of objects, or types of terrain, which in turn allows these categories to be targeted separately by simulation frameworks and various downstream processes. We show that, through the use of these semantic attributes, it is possible to i) generate significantly more compact models without drastic degradation in quality or fidelity, enabling deployment on mobile platforms with limited computational capabilities, ii) improve localization accuracy when estimating the full 6-DOF pose of a mobile agent situated in the scene, and iii) provide human agents with richer and smoother interactions with such 3D models during simulations and training scenarios.