According to the DoD, data is becoming a critical strategic asset for future conflicts. The DoD expects data to drive key decisions in areas such as supply chain and battlefield weapons. However, before data can be analyzed and mined for key insights, it first needs to be usable. Collecting and managing data is a continuous challenge for the DoD and many other organizations. Disparate collection and storage systems make generating actionable insights from the data cumbersome. In some cases, data management and configuration can make up the bulk of a data analysis project. As organizations seek to become data driven, they need to think not only about analysis strategies but also about data management, to ensure the data collected is usable. This work describes the development of a scalable data analytics pipeline used to analyze aircraft performance within a larger defensive counter air scenario. Today, air combat exercises and simulations generate large amounts of data with different structures, without any method for combining and analyzing the data to improve the chances of mission success. The architecture was designed to work optimally in an environment that ingests data in many different formats. As designed, this novel data pipeline allows for scalability and for processing optimization and parallelization and, most importantly, provides insulation from data format changes. The method pairs an unstructured data lake with a structured data warehouse to ensure a variety of data sources can be used to discover insights for improved warfighter decision making. This architecture allows raw data formats from many different sources to be ingested, parsed, and stored in a common format so that the analytics techniques are reusable and scalable. The final paper will describe the development and scaling process, providing a tangible example of how to manage data from a complex simulation for ingestion into a data analysis
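As a rough illustration of the lake-plus-warehouse idea described in the abstract, the sketch below keeps raw payloads in their original formats, routes each through a format-specific parser, and loads the results into a single common table that the analysis queries. It is a minimal sketch under stated assumptions, not the authors' implementation: the schema, the field names, the `PARSERS` registry, and the sample records are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import json
import sqlite3

# Hypothetical common schema: (source, entity, time_s, metric, value).
# Each parser turns one raw format into rows of that shape.

def parse_csv(source, raw):
    """Parse a CSV payload with entity,time_s,metric,value columns (assumed layout)."""
    rows = csv.DictReader(io.StringIO(raw))
    return [(source, r["entity"], float(r["time_s"]), r["metric"], float(r["value"]))
            for r in rows]

def parse_json(source, raw):
    """Parse a JSON list of {"id", "t", "metrics": {...}} records (assumed layout)."""
    return [(source, rec["id"], float(rec["t"]), name, float(val))
            for rec in json.loads(raw)
            for name, val in rec["metrics"].items()]

# Supporting a new or changed format means adding or updating one parser here;
# the warehouse schema and the analytics below stay untouched.
PARSERS = {"csv": parse_csv, "json": parse_json}

def load_warehouse(lake, conn):
    """Read every raw payload from the lake and store it in the common table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observations "
        "(source TEXT, entity TEXT, time_s REAL, metric TEXT, value REAL)"
    )
    for source, (fmt, raw) in lake.items():
        conn.executemany(
            "INSERT INTO observations VALUES (?, ?, ?, ?, ?)",
            PARSERS[fmt](source, raw),
        )
    conn.commit()

def mean_by_entity(conn, metric):
    """A reusable analysis that depends only on the common schema."""
    cur = conn.execute(
        "SELECT entity, AVG(value) FROM observations "
        "WHERE metric = ? GROUP BY entity ORDER BY entity",
        (metric,),
    )
    return dict(cur.fetchall())

# Stand-in for the data lake: raw payloads kept in their original formats.
lake = {
    "sim_a": ("csv", "entity,time_s,metric,value\n"
                     "blue1,0.0,speed,200\n"
                     "blue1,1.0,speed,220\n"),
    "sim_b": ("json", '[{"id": "blue2", "t": 0.0, "metrics": {"speed": 180.0}}]'),
}

conn = sqlite3.connect(":memory:")
load_warehouse(lake, conn)
result = mean_by_entity(conn, "speed")
```

Because `mean_by_entity` only ever sees the common table, a change to one source's raw format is absorbed by its parser, which is the kind of insulation from format changes the abstract refers to.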
Developing a Scalable Data Analytics Pipeline
Conference
I/ITSEC 2021
Track
Emerging Concepts and Innovative Technologies