LLM-Style Explainable E-Commerce Recommendation Cards: A UI/UX Design Framework for Trust-Calibrated Product Recommendation

Boning Zhang; Yuxuan Ren; Jocelyn Zou

doi:10.51903/ijgd.v3i2.3697

Authors

Boning Zhang, Computer Science, Georgetown University, DC, USA
Yuxuan Ren, Chemical Engineering, University of Washington, WA, USA
Jocelyn Zou, Information Experience Design, Pratt Institute, NY, USA

DOI:

https://doi.org/10.51903/ijgd.v3i2.3697

Keywords:

explainable recommendation, e-commerce, LLM applications, UI/UX design, visual communication, trust calibration, Amazon Reviews 2023, human-centered AI

Abstract

This paper presents and empirically evaluates a UI/UX design framework for explainable e-commerce recommendation cards. The framework addresses a practical visual-communication problem: product lists can be useful but opaque, while explanation-heavy cards can create unwarranted confidence when the system has weak evidence. The revised study therefore uses the term LLM-style for the language condition and treats it as a grounded card-generation and confidence-display policy rather than as evidence of a black-box large-language-model recommender. Experiments were conducted on Amazon Reviews'23 All_Beauty raw reviews and item metadata, together with the Beauty_and_Personal_Care 5-core benchmark split as a larger same-domain warm-user check. The All_Beauty review file contains 701,528 review records from 631,986 users and 112,565 parent items, and the metadata file contains 112,590 parent items with near-complete title and image coverage. On the sparse All_Beauty all-user test, Recall@10 remained low for all methods, with the LLM-style reciprocal-rank reranker reaching 0.007945. On the All_Beauty warm-user slice, the same reranker reached Recall@10 of 0.008079. On the larger Beauty_and_Personal_Care 5-core test, it reached Recall@10 of 0.021463, improving over popularity and last-item co-history baselines but still indicating modest recommendation effectiveness. Card-level evaluation on All_Beauty shows that the LLM-style explanation plus confidence card achieved the highest confidence-discrimination AUC (0.700), while the review-evidence card offered a simpler evidence-forward alternative. The results support an interface-oriented conclusion: recommendation cards should separate ranking quality, grounded evidence, and confidence display, and UI/UX claims should be framed as proxy-based evidence until validated with a controlled user study.
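The two headline metrics in the abstract, Recall@10 and confidence-discrimination AUC, can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so the code below assumes the standard ones: Recall@K as the fraction of a user's held-out items found in the top-K list, and AUC as the probability that a card for a correct recommendation displays higher confidence than a card for an incorrect one. The function names `recall_at_k` and `discrimination_auc` are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of a user's held-out items that appear in the top-k list.

    Assumed standard definition; the paper's exact protocol may differ.
    """
    if not relevant:
        return 0.0
    # Use a set so a duplicated item in the ranking is not counted twice.
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def discrimination_auc(confidences: Sequence[float], is_hit: Sequence[bool]) -> float:
    """Probability that a randomly chosen hit card shows higher confidence
    than a randomly chosen miss card, with ties counted as 0.5.

    This is the pairwise (Mann-Whitney) form of ROC AUC, assumed here as
    the meaning of "confidence-discrimination AUC".
    """
    pos = [c for c, h in zip(confidences, is_hit) if h]
    neg = [c for c, h in zip(confidences, is_hit) if not h]
    if not pos or not neg:
        raise ValueError("need at least one hit and one miss")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data for illustration only; not results from the paper.
    print(recall_at_k(["a", "b", "c"], {"b", "z"}, k=2))
    print(discrimination_auc([0.9, 0.8, 0.3, 0.2], [True, False, True, False]))
```

Under these definitions an AUC of 0.5 means the displayed confidence carries no information about whether the recommendation was a hit. A value such as the reported 0.700 would mean that a hit card outranks a miss card on confidence in roughly 70% of pairs.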
