Uncertainty-Aware Breast Ultrasound Explanation Cards: A Visual Communication Framework for Image-Based AI Diagnostic Support Using BreastMNIST_224

Tong Ye; Xiaohan Chang; Eric Zhong

doi:10.51903/ijgd.v3i2.3701

Authors

Tong Ye Computer Science, Northeastern University, CA, USA
Xiaohan Chang Computer Science, University of Connecticut, CT, USA
Eric Zhong Computer Science, USC, CA, USA

DOI:

https://doi.org/10.51903/ijgd.v3i2.3701

Keywords:

Breast cancer detection, uncertainty visualization, explainable artificial intelligence, UI/UX, visual communication, diagnostic interface, calibration, WDBC, BreastMNIST, explanation cards

Abstract

AI-assisted diagnostic interfaces should communicate more than a class label. They need to show the predicted risk, the uncertainty around that risk, the visual evidence that influenced the model, the limits of that evidence, and the appropriate next action. This paper presents an uncertainty-aware explanation-card framework for breast ultrasound decision-support screens. The empirical study was conducted on BreastMNIST_224, the 224 × 224 MedMNIST+ breast ultrasound benchmark, using its official train, validation, and test splits of 546, 78, and 156 images. The positive class was defined as malignant. Five image classifiers were trained on downsampled image grids, and the model selected for the card was an RBF-kernel SVM with Platt-scaled probability outputs. On the official test split, the selected model achieved AUROC = 0.867 and AUPRC = 0.728. An operating threshold of 0.254, chosen on the validation split, gave accuracy = 0.769, sensitivity = 0.833, and specificity = 0.746; calibration was summarized by Brier score = 0.125 and expected calibration error (ECE) = 0.068. The explanation card pairs the malignant-risk probability with a risk tier, an uncertainty band, occlusion-sensitivity heatmap evidence, a limitation statement, and a review cue. In the held-out test set, the conservative Low-risk tier contained six cases, none of them malignant; all seven false negatives fell in the Review tier rather than in the Low-risk tier. These findings support a prototype-level visual communication framework in which image evidence is shown together with uncertainty and safeguards, while diagnostic authority remains with the clinician.
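The test-split figures in the abstract can be cross-checked with a little arithmetic, and the two calibration measures are short enough to write out. The following is a minimal sketch, not the authors' code. It assumes that the seven false negatives and the rounded sensitivity and specificity all refer to the 156-image test split at the 0.254 threshold. The function names are illustrative, and the ECE binning (10 equal-width bins over the malignant probability) is an assumption, since the abstract does not state the binning scheme.

```python
from typing import Sequence, Tuple


def confusion_from_summary(
    n_total: int, false_negatives: int, sensitivity: float, specificity: float
) -> Tuple[int, int, int, int]:
    """Recover integer confusion counts (TP, FP, TN, FN) from rounded metrics."""
    # sensitivity = TP / (TP + FN)  =>  positives = FN / (1 - sensitivity)
    positives = round(false_negatives / (1.0 - sensitivity))
    negatives = n_total - positives
    tp = positives - false_negatives
    # specificity = TN / negatives
    tn = round(specificity * negatives)
    fp = negatives - tn
    return tp, fp, tn, false_negatives


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """ECE over equal-width probability bins (assumed binning, see lead-in)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    n = len(labels)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        frac_pos = sum(y for _, y in members) / len(members)
        # Weight each bin's calibration gap by its share of the samples.
        ece += len(members) / n * abs(mean_prob - frac_pos)
    return ece


# Consistency check using only numbers reported in the abstract.
tp, fp, tn, fn = confusion_from_summary(156, 7, 0.833, 0.746)
accuracy = (tp + tn) / 156
print(tp, fp, tn, fn, round(accuracy, 3))  # 35 29 85 7 0.769
```

Under these assumptions the reported values are mutually consistent: seven false negatives at sensitivity 0.833 imply 42 malignant test cases, which leaves 114 non-malignant cases, and the resulting counts reproduce the reported accuracy of 0.769.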

