Abstract
Traditionally, artificial intelligence (AI) algorithms are designed to align with ground truth or the consensus of trusted human decision-makers. While effective in domains where experts share similar backgrounds and reach agreement, this approach encounters challenges in complex fields like medical triage, where decisions are inherently difficult and often lack a single correct answer. Beyond expertise, individual attributes influence human decision-making. These Key Decision-Maker Attributes (KDMAs) are domain-agnostic characteristics that shape decision-making in specific contexts and can lead experts to disagree on difficult choices.
This research aims to develop models of KDMAs and their influence on human decision-making in triage, ultimately guiding the design of AI systems that align with trusted human decisions. Specifically, this paper explores challenges in modeling AI behavior relative to human decision-making in domains where expert consensus is not always achievable. Key obstacles include the lack of an established ground truth for mapping attributes to decisions, the absence of best practices for representing complex decisions in a machine-interpretable format, and the difficulty of obtaining high-quality, consistent data to inform ground truth development. Additionally, data collection methodology is crucial for eliciting meaningful information and understanding the influence of contextual factors on KDMAs and their interdependence.
To address these challenges, we developed a ground truth methodology by correlating validated psychometric tools with a set of hypothesized KDMAs. A scenario-based design process was implemented to elicit decision-making based on specific KDMAs. Human triage decisions were collected and analyzed using machine learning techniques to identify relationships between KDMAs and decision outcomes. The resulting model maps KDMAs to decisions, generating KDMA profiles that offer a multidimensional representation of decision-makers’ attributes. Results will be presented from two separate studies conducted over a 12-month period, providing insight into the challenges of this approach and support for its feasibility.