Military units must embrace tools that harness available data to accelerate and improve decision-making. For example, machine learning techniques make it possible for brigade combat teams to forecast readiness rates for their combat fleets and the risk of harmful soldier behaviors, such as suicide.
The U.S. Army must not wait for advanced artificial intelligence (AI) models to make data-informed decisions a reality. Rapid data modeling is a practical way that field grade leaders and commanders can harness datacentric tools for readiness and operational success. Available data science methods offer many possibilities for making quantitative modeling accessible and valuable to commanders and staffs at the tactical level.
Beyond Spreadsheets
Effective data management is fundamental to rapid data modeling. Tactical units at division and below generate data about daily activities, but the data often exists only in spreadsheets within staff sections. Leveraging this data requires linking these isolated “data puddles,” built for specific uses, to form “data ponds” and, eventually, more expansive “data lakes,” or large pools of data. This linkage is a necessary step toward meeting the Army chief information officer’s guidance to adopt data mesh principles and standardize data management across the force.
An example of a data lake at the tactical level is the aggregation of measurements across personnel, supply, equipment readiness and training, known as PSRT, into a larger PSRT dataset. Combined in a model, these individual measurements become variables, revealing how factors such as personnel- and equipment-fill percentages relate to training outcomes such as gunnery scores. Deliberately structuring and cleaning data to enable modeling is what transforms a data lake into a “data warehouse,” a central repository used to turn data into informed decisions.
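As a minimal sketch of linking section-level data puddles into one PSRT table, the Python below joins three hypothetical staff-section extracts on a shared unit key. The unit names, field names and values are invented for illustration; real extracts would come from each section's own spreadsheets.

```python
# Hypothetical extracts from three staff sections, keyed by company.
# All names and numbers are illustrative, not real unit data.
personnel = [
    {"unit": "A Co", "personnel_fill_pct": 92.0},
    {"unit": "B Co", "personnel_fill_pct": 85.0},
    {"unit": "C Co", "personnel_fill_pct": 88.0},
]
equipment = [
    {"unit": "A Co", "equipment_fill_pct": 95.0},
    {"unit": "B Co", "equipment_fill_pct": 90.0},
    {"unit": "C Co", "equipment_fill_pct": 81.0},
]
training = [
    {"unit": "A Co", "gunnery_score": 870},
    {"unit": "B Co", "gunnery_score": 810},
    # C Co has not reported yet; the join should flag the gap, not hide it.
]


def link_puddles(*sources):
    """Join row lists on the shared 'unit' key into one record per unit."""
    combined = {}
    for rows in sources:
        for row in rows:
            record = combined.setdefault(row["unit"], {"unit": row["unit"]})
            record.update(row)
    return combined


def missing_fields(record, required):
    """List the required fields that a unit's record still lacks."""
    return [name for name in required if name not in record]


REQUIRED = ["personnel_fill_pct", "equipment_fill_pct", "gunnery_score"]
psrt = link_puddles(personnel, equipment, training)

for unit, record in sorted(psrt.items()):
    gaps = missing_fields(record, REQUIRED)
    status = "complete" if not gaps else "missing " + ", ".join(gaps)
    print(unit, status)
```

Reporting missing fields instead of silently dropping incomplete units is a deliberate choice in this sketch: cleaning decisions of that kind are part of what turns pooled data into something a model can use.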
Theory Into Action
Data is only one element of rapid data modeling. Quantitative models also require logical frameworks, or theories, for connecting individual variables. For example, consider a hypothesis that heavier vehicle use associated with increased operational tempo is negatively related to fully mission-capable rates for combat equipment. After the variables (such as operational tempo and fully mission-capable rates) are defined and operationalized, and the relationships expected between them are stated, theoretical frameworks can be translated into mathematical representations, or equations. These describe how changes in one variable relate to changes in the other.
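As an illustration, the operational tempo example could be written as a simple linear equation. The notation is an assumption made for this sketch: FMC is the fully mission-capable rate in period t, OPTEMPO is a measure of vehicle use in that period (miles driven, for instance), and the final term captures everything the equation leaves out.

```latex
\mathrm{FMC}_t = \beta_0 + \beta_1 \, \mathrm{OPTEMPO}_t + \varepsilon_t
```

Under this form, the theory predicts a negative value for the coefficient on OPTEMPO: each additional unit of vehicle use is associated with a lower fully mission-capable rate.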
Of the four commonly accepted forms of data analytics models, three meet the needs of most commands: descriptive, diagnostic and predictive.
Descriptive models describe or represent a system or process as it exists. These models help identify meaningful variables in a system, offering insights into what factors correlate with results. A simple method for testing correlations is ordinary least squares regression, a statistical technique that fits a linear model through a set of data points to estimate the relationship between variables and determine whether that relationship is statistically significant. This method can be executed quickly within Microsoft Excel spreadsheets. An example is performing a least squares regression on maintenance data to determine which factors are most strongly associated with equipment readiness.
Diagnostic modeling explores alternate scenarios to build counterfactuals, or alternative outcomes, and estimate the difference between what happened and what might have happened. It moves beyond correlation to estimate causality. Diagnostic models drive impact evaluations, allowing leaders to consider the impact of decisions or programs to determine effectiveness. An example outcome of this method is understanding how adhering to a service schedule improves or degrades equipment readiness.
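One common way to build such a counterfactual is a difference-in-differences comparison, sketched below for the service schedule example. This is an illustrative technique, not necessarily the method any particular unit uses, and it assumes that both groups of companies would have followed the same trend without the change. All figures are invented.

```python
from statistics import mean

# Illustrative FMC rates (percent) per company, before and after a
# hypothetical push to enforce the service schedule. Not real data.
adhered = {"before": [86.0, 88.0, 84.0], "after": [91.0, 92.0, 90.0]}
did_not = {"before": [85.0, 87.0, 86.0], "after": [86.0, 88.0, 87.0]}


def difference_in_differences(treated, comparison):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treated["after"]) - mean(treated["before"])
    comparison_change = mean(comparison["after"]) - mean(comparison["before"])
    return treated_change - comparison_change


def counterfactual_after(treated, comparison):
    """Estimate where the treated group would be without the change,
    assuming it would have followed the comparison group's trend."""
    comparison_change = mean(comparison["after"]) - mean(comparison["before"])
    return mean(treated["before"]) + comparison_change


effect = difference_in_differences(adhered, did_not)
baseline = counterfactual_after(adhered, did_not)

print(f"estimated FMC rate without the change: {baseline:.1f}%")
print(f"estimated effect of adherence: {effect:+.1f} points")
```

The comparison group supplies the "what might have happened" trend; the estimate is only as good as the assumption that the two groups are otherwise comparable.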
Forecasting Outcomes
Predictive models consider patterns in historical data to forecast future outcomes. This allows units to anticipate scenarios, such as potential drops in equipment or personnel readiness levels. Units can use predictive models to forecast battalion operational readiness rates based on planned training and to identify units at risk of harmful behavior based on historical serious incident reporting trends.
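As a minimal sketch of forecasting from historical patterns, the example below applies simple exponential smoothing to an invented series of monthly operational readiness rates and checks the forecast against the most recent months. It is a deliberately basic baseline that uses history alone; a fuller model would add inputs such as planned training. The function names and the smoothing factor are assumptions for illustration.

```python
def exponential_smoothing_forecast(history, alpha):
    """One-step-ahead forecast: a weighted blend favoring recent values."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def backtest(series, alpha, holdout):
    """Mean absolute error of one-step forecasts over the last
    `holdout` points, each made using only the data before it."""
    errors = []
    for i in range(len(series) - holdout, len(series)):
        forecast = exponential_smoothing_forecast(series[:i], alpha)
        errors.append(abs(series[i] - forecast))
    return sum(errors) / len(errors)


# Illustrative monthly operational readiness rates (percent), not real.
or_rate = [91.0, 90.0, 92.0, 89.0, 88.0, 90.0,
           87.0, 86.0, 88.0, 85.0, 84.0, 86.0]

ALPHA = 0.5  # assumed smoothing factor; higher values react faster
next_month = exponential_smoothing_forecast(or_rate, ALPHA)
typical_miss = backtest(or_rate, ALPHA, holdout=4)

print(f"forecast for next month: {next_month:.1f}%")
print(f"average miss over the last 4 months: {typical_miss:.1f} points")
```

Reporting the backtest error alongside the forecast gives a commander a sense of how far off the model has been recently, which matters as much as the forecast itself.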
These tools are powerful models that provide commanders with greater understanding of decisions and risk.
To build these models, staffs require access to basic analytical tools, such as Excel, RStudio or Jupyter Notebook. While staffs commonly use Excel, the latter two also are readily available on the Army Resource Cloud via cPROBE.
Algorithms—step-by-step computational procedures that process data and yield results—serve as the backbone of quantitative modeling, translating data into actionable insights. Understanding the functionality and applicability of algorithms is essential for building effective models.
Moreover, the integration of AI, particularly machine learning, can amplify the precision and predictive power of models, facilitating a more nuanced and dynamic understanding of complex military operations.
Some models are built on AI, and others use AI to improve their accuracy. The most prevalent form of AI in modeling is machine learning, a general term for algorithms that learn patterns from data and generalize those patterns to data they have not seen. At the most advanced levels, machine learning models refine themselves as new data arrives.
However, simpler applications can be just as powerful and are easy to use. Some examples of machine learning models are Random Forest Regressor, Gradient Boosting Regressor and Prophet. These models are ready-made tools that accept data and provide forecasts. They run in cPROBE on many types of data sets and can be employed with limited training.
AI also can improve the accuracy of a model by optimizing its settings. Starting from a simple regression model, a machine learning program can search for the values of tunable settings, known as hyperparameters, that produce the best results. Examples include Bayesian search and random search. Both try different combinations of settings, score the model under each combination and keep the combination that performs best.
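Random search is the simpler of the two to illustrate. The sketch below tunes one setting, the smoothing factor of a basic forecasting model, by drawing random candidate values and keeping the one with the lowest error on recent months. The model, data and function names are assumptions for illustration; Bayesian search would pick each new candidate based on earlier results instead of drawing blindly.

```python
import random


def smooth_forecast(history, alpha):
    """One-step-ahead exponential smoothing forecast."""
    level = history[0]
    for value in history[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def validation_error(series, alpha, holdout=4):
    """Mean absolute one-step error over the last `holdout` points."""
    start = len(series) - holdout
    errors = [abs(series[i] - smooth_forecast(series[:i], alpha))
              for i in range(start, len(series))]
    return sum(errors) / len(errors)


def random_search(series, trials, seed=0):
    """Try `trials` random smoothing factors and keep the best one."""
    rng = random.Random(seed)  # fixed seed so results are repeatable
    best_alpha, best_error = None, float("inf")
    for _ in range(trials):
        alpha = rng.uniform(0.05, 0.95)
        error = validation_error(series, alpha)
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


# Illustrative monthly readiness rates (percent), not real data.
readiness = [91.0, 90.0, 92.0, 89.0, 88.0, 90.0,
             87.0, 86.0, 88.0, 85.0, 84.0, 86.0]

best_alpha, best_error = random_search(readiness, trials=50)
print(f"best smoothing factor found: {best_alpha:.2f}")
print(f"average miss with that setting: {best_error:.2f} points")
```

Scoring each candidate on months the model did not see while forecasting is the key design choice; tuning against the same data used to fit a model tends to flatter its accuracy.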
Practical Applications
Rapid data modeling has a range of applications, from unit risk analysis to resource allocation to predictive maintenance. The key measure of success is whether the resulting models help focus leader attention in the right places, direct resources to their most valuable uses or determine what is effective.
Data analytics can arm commanders with insights that guide distribution of resources, enhancing operational efficiency and effectiveness.
Simply describing the structure of data using descriptive modeling can be valuable. Basic linear regression models uncover hidden correlations within data sets, tying together factors not obviously linked. For example, a brigade in the 4th Infantry Division in March discovered a link between unit culture and retention results at the company, troop and battery levels. This enabled leaders to act on the hypothesis that greater investment in soldier professional development yields increased reenlistments.
Diagnostic modeling enables commanders to determine the effectiveness of programs. By enabling the modeling of counterfactual outcomes, data analytics allows leaders to estimate what might happen if a program or course of action is not pursued. Predictive modeling can leverage historical PSRT data through machine learning algorithms to forecast likely outcomes. Predictive models can enable proactive maintenance schedules and resource optimization. By simulating possible scenarios and future outcomes, rapid data modeling can provide insights that guide decision-making and maximize resources.
Teach Skills
Three main factors can help the Army harness the power of rapid data modeling. First, greater data literacy is needed at all levels. Modeling does not require an advanced education or specialty training. To lower the entry cost, basic data skills must be taught in professional military education, starting with captains career courses. Additionally, leaders must improve their understanding of what data and models reveal. Reading a chart that predicts a future state is simple; judging whether the data and model behind the chart are accurate and relevant is harder, and it should be second nature to decision-makers in a datacentric force. This is the difference between using data and employing it effectively for results.
Second, data management is critical for modeling that supports decisions. This involves turning isolated, disconnected data puddles into larger data lakes. Standardized data collection and consolidation in a cloud-based system are critical to enable rapid data modeling. The VAULTIS (visible, accessible, understandable, linked, trustworthy, interoperable, secure) framework discussed in the 2020 DoD Data Strategy remains a valuable way for units to approach data management plans.
Third, leadership is critical. Integrating quantitative modeling into decision-making requires champions who generate insights and trust using quantitative methods. As battalion commanders, both authors of this article started brigade-level data teams nested within the 4th Infantry Division’s leader-development framework emphasizing creativity and innovation. We have started to make progress on predictive Stryker maintenance and the prevention of harmful behaviors. However, such efforts will not maintain momentum without a demand from higher commanders that emphasizes the importance of data analytics and utilizes results to inform decisions.
Adoption of rapid data modeling by tactical units is pivotal for a more agile, efficient and effective datacentric force. Such models empower leaders at all levels to harness the strength of data, enhancing decision-making accuracy and operational readiness.
As the military continues to navigate a complex global and budgetary landscape, the interplay of data, modeling and decision-making will be a defining factor in improving outcomes of military operations.
***
Lt. Col. Jon Bate is the commander of the 2nd Battalion, 23rd Infantry Regiment, 1st Stryker Brigade Combat Team, 4th Infantry Division, Fort Carson, Colorado. A Goodpaster Scholar in the Advanced Strategic Planning and Policy Program, School of Advanced Military Studies, he has a master’s in public policy from the Harvard Kennedy School, Massachusetts, and holds a doctorate in political science from Stanford University, California.
Lt. Col. Nate Platz is the commander of the 704th Brigade Support Battalion, 2nd Stryker Brigade Combat Team, 4th Infantry Division. A Major General James Wright Fellow at the College of William and Mary, Virginia, he has an MBA from the College of William and Mary.