Abstract
The dominant narrative of artificial intelligence as either a replacement for or an augmentation of human judgment has obscured a more productive possibility: that the greatest value of AI in organizational decision-making may come neither from AI accuracy in isolation nor from human judgment alone, but from the structured interaction between the two. Human-AI judgment complementarity, the capacity of human and AI decision-makers to make different, non-overlapping errors in ways that enable superior joint performance, represents a paradigm shift in how organizations should think about AI deployment. Realizing this potential requires solving a set of interconnected challenges: understanding when and why human and AI judgments diverge; enabling humans to rely appropriately on AI advice while retaining the ability to override it when necessary; calibrating trust so that it reflects the AI's actual reliability rather than a miscalibrated impression of it; and designing organizational processes that make productive disagreement, rather than consensus, the goal of human-AI collaboration. Drawing on three foundational references and twelve supplementary citations spanning human-AI teaming science, complementarity theory, appropriate reliance research, and organizational AI design, this paper develops a comprehensive framework for understanding and cultivating human-AI judgment complementarity in organizational settings. The analysis shows that the metacognitive calibration crisis identified by the MIRROR benchmark, the structural auditing constraints revealed by the Verification Tax, and the practical demand for explainability in human resource analytics are not merely technical challenges but fundamental barriers to achieving the complementary potential of human-AI teams.
References
1. Xu, G., Murthy, S. V., & Jia, B. (2025). Enhancing intuitive decision-making and reliance through human–AI collaboration: A review. Informatics, 12(4), 135. https://doi.org/10.3390/informatics12040135
2. Cetinkaya, N. E., & Krämer, N. (2026). Between transparency and trust: Identifying key factors in AI system perception. Behaviour & Information Technology, 45(5), 840–854. https://doi.org/10.1080/0144929X.2025.2533358
3. Gonzalez, C., & Heidari, H. (2025). A cognitive approach to human–AI complementarity in dynamic decision-making. Nature Reviews Psychology, 4, 808–822. https://doi.org/10.1038/s44159-025-00499-x
4. Wang, J. Z. (2026). The Verification Tax: Fundamental Limits of AI Auditing in the Rare-Error Regime. arXiv preprint arXiv:2604.12951. https://doi.org/10.48550/arXiv.2604.12951
5. Bansal, G., Nushi, B., Kamar, E., Lasecki, W. S., Weld, D. S., & Horvitz, E. (2019). Beyond accuracy: The role of mental models in human-AI team performance. Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, 7(1), 2–11. https://doi.org/10.1609/hcomp.v7i1.5285
6. Tariq, S., Chhetri, M. B., Nepal, S., & Paris, C. (2025). A²C: A modular multi-stage collaborative decision framework for human–AI teams. Expert Systems with Applications, 282, 127318. https://doi.org/10.1016/j.eswa.2025.127318
7. McInerney, T. (2026). The algorithmic construction of epistemic injustice. In M. L. Flear, C. Davies-Tyrie, & D. Wincott (Eds.), Socio-Legal Studies of Epistemic Injustice and Spaces and Places (pp. 123–154). Palgrave Macmillan. https://doi.org/10.1007/978-3-032-07581-9_5
8. Hemmer, P., Schemmer, M., Kühl, N., Vössing, M., & Satzger, G. (2025). Complementarity in human-AI collaboration: Concept, sources, and evidence. European Journal of Information Systems, 34(6), 979–1002. https://doi.org/10.1080/0960085X.2025.2475962
9. Wang, J. Z. (2026). MIRROR: A Hierarchical Benchmark for Metacognitive Calibration in Large Language Models. arXiv preprint arXiv:2604.19809. https://doi.org/10.48550/arXiv.2604.19809
10. Bei, J., Liu, Z., Huang, J., Wang, X., & Yang, P. (2025, December). Strategic Human Resource Analytics with Explainable Artificial Intelligence: An Interpretable Prediction Framework for Employee Promotion to Support Managerial Decision-Making. In Proceedings of the 2025 6th International Conference on Computer Science and Management Technology (ICCSMT 2025) (pp. 77–82). ACM. https://doi.org/10.1145/3795154.3795166
11. Gonzalez, C. (2026). Toward a science of human–AI teaming for decision making: A complementarity framework. PNAS Nexus, 5(3), pgag030. https://doi.org/10.1093/pnasnexus/pgag030
12. Inkpen, K., Chappidi, S., Mallari, K., Nushi, B., Ramesh, D., Michelucci, P., Mandava, V., Vepřek, L. H., & Quinn, G. (2023). Advancing Human-AI Complementarity: The impact of user expertise and algorithmic tuning on joint decision making. ACM Transactions on Computer-Human Interaction, 30(5), 1–29. https://doi.org/10.1145/3534561
13. Schoeffer, J., De-Arteaga, M., & Kühl, N. (2024). Explanations, fairness, and appropriate reliance in human-AI decision-making. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24) (Article 836, pp. 1–18). ACM. https://doi.org/10.1145/3613904.3642621
14. Fügener, A., Walzner, D. D., & Gupta, A. (2025). Roles of artificial intelligence in collaboration with humans: Automation, augmentation, and the future of work. Management Science, 72(1), 538–557. https://doi.org/10.1287/mnsc.2024.05684
15. Pu, J., Chang, Y., Gao, S., Bao, S., Yan, K., Sun, X., Carvalhais, N., & Myneni, R. B. (2025). MCI GPP: Ensembling a global model- and climate-independent gross primary productivity for 2001–2023. Scientific Data, 12, 1965. https://doi.org/10.1038/s41597-025-06218-8
16. Chang, Y., Winkler, A. J., Noori, A., Knyazikhin, Y., & Myneni, R. B. (2025). Precipitation leads the long-term vegetation increase in the conterminous United States drylands. Environmental Research Letters, 20(4), 044006. https://doi.org/10.1088/1748-9326/adb985
17. Dai, Y., Chen, Z., Pradeepkumar, J., Matsubara, Y., Sun, J., Sakurai, Y., & Dong, Y. (2026). EpiGraph: Building Generalists for Evidence-Intensive Epilepsy Reasoning in the Wild. arXiv preprint arXiv:2605.09505. https://doi.org/10.48550/arXiv.2605.09505
18. Lu, Y., Hu, K., & Zhang, L. (2026). S³G: Stock State Space Graph for Enhanced Stock Trend Prediction. In ICASSP 2026–2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 4081–4085). IEEE. https://doi.org/10.48550/arXiv.2603.24236
