Structured Visual Brief Interfaces for Advertising Design: A UI/UX Framework for Turning Creative Intentions into Designer-Editable Graphic Design Cards
DOI:
https://doi.org/10.51903/ijgd.v4i1.3702
Keywords:
advertising brief, visual communication, UI/UX, OpenCOLE, LLM-Compatible Design
Abstract
This paper proposes and evaluates a UI/UX framework for advertising and graphic design interfaces. The framework converts a short creative intention into an editable structured brief card with eight slots: headline, sub-heading, visual object, background mood, call to action, audience, design risk, and brand tone. The motivation is that ordinary text prompts are easy to write but weak as design coordination artifacts, because they mix marketing intent, visual instructions, copy hierarchy, and review cues in a single statement. OpenCOLE provides an appropriate setting because its schema contains intention, description, keywords, and heading/sub-heading fields for graphic design generation (Inoue et al., 2024). The study compares an ordinary text prompt with a structured brief-card representation on the OpenCOLE Parquet dataset of 23,419 rows. The card is populated by deterministic rules; the study therefore evaluates an LLM-compatible interface layer rather than a live model-driven generation system. Across the full dataset, the structured card increased the structured-brief adequacy score from 0.299 to 0.878, a mean gain of 0.579 that was significant under a paired Wilcoxon signed-rank test (p < .001). Intent coverage, keyword recall, heading hierarchy, CTA recognizability, semantic consistency, audience explicitness, and risk explicitness all improved. A deterministically filtered advertising-oriented subset of 19,681 rows showed the same pattern, with the mean score increasing from 0.313 to 0.881. The metrics evaluate whether a brief representation preserves and exposes the information needed before production; they do not measure final design quality, designer preference, or user satisfaction. The results support the argument that generative design workflows should expose structured, designer-editable brief cards rather than relying on prompt expansion alone.
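To make the eight-slot card concrete, the following is a minimal Python sketch of how such a card could be represented and filled by deterministic rules. It is an illustration under stated assumptions, not the rules used in the study: the record keys (`intention`, `keywords`, `heading`, `sub_heading`), the `CTA_VERBS` list, the `card_from_record` function, and the `filled_ratio` check are all hypothetical, and `filled_ratio` is a simple completeness count rather than the structured-brief adequacy score reported above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BriefCard:
    """Eight editable slots named in the abstract; None marks an empty slot."""
    headline: Optional[str] = None
    sub_heading: Optional[str] = None
    visual_object: Optional[str] = None
    background_mood: Optional[str] = None
    call_to_action: Optional[str] = None
    audience: Optional[str] = None
    design_risk: Optional[str] = None
    brand_tone: Optional[str] = None

    def filled_ratio(self) -> float:
        """Share of slots that hold a non-empty value (hypothetical check,
        not the paper's adequacy metric)."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(bool(v) for v in values) / len(values)


# Assumed verb list for spotting call-to-action copy; purely illustrative.
CTA_VERBS = ("shop", "buy", "book", "join", "learn", "visit", "order", "sign up")


def card_from_record(record: dict) -> BriefCard:
    """Fill a card from a record using simple deterministic rules.

    The record keys are assumed, simplified names; the real dataset
    columns may be named differently.
    """
    headings = list(record.get("heading") or [])
    sub_headings = list(record.get("sub_heading") or [])
    keywords = list(record.get("keywords") or [])

    # Rule: the first text line that starts with a CTA verb becomes the CTA.
    cta = next(
        (t for t in headings + sub_headings if t.lower().startswith(CTA_VERBS)),
        None,
    )
    return BriefCard(
        headline=headings[0] if headings else None,
        sub_heading=sub_headings[0] if sub_headings else None,
        visual_object=keywords[0] if keywords else None,
        call_to_action=cta,
        # Slots with no rule here stay empty so a designer can fill them in.
    )


example = card_from_record({
    "intention": "Promote a summer sale for a coffee shop",
    "keywords": ["coffee", "summer"],
    "heading": ["Summer Sale"],
    "sub_heading": ["Shop now and save 20%"],
})
```

Leaving unfilled slots as `None` rather than guessing keeps the card honest about what the source text does and does not state, which matches the abstract's emphasis on exposing missing information to the designer before production.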
References
Agrawala, M., Li, W., & Berthouzoz, F. (2011). Design principles for visual communication. Communications of the ACM, 54(4), 60–69. https://doi.org/10.1145/1924421.1924439
Bertin, J. (1983). Semiology of graphics: Diagrams, networks, maps. University of Wisconsin Press.
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., ... Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901.
Chen, Y., & Li, M. (2025). From hand-drawn sketches to interactive web prototypes: A reproducible vision-language approach with structural and visual consistency evaluation. Journal of Technology Informatics and Engineering, 4(2), 364–384. https://doi.org/10.51903/jtie.v4i2.490
CyberAgent AI Lab. (2024). cyberagent/opencole [Data set]. Hugging Face. https://huggingface.co/datasets/cyberagent/opencole
Gupta, K., Achille, A., Lazarow, J., Davis, L., Mahadevan, V., & Shrivastava, A. (2021). LayoutTransformer: Layout generation and completion with self-attention. Proceedings of the IEEE/CVF International Conference on Computer Vision, 1004–1014.
Heer, J., Bostock, M., & Ogievetsky, V. (2010). A tour through the visualization zoo. Communications of the ACM, 53(6), 59–67. https://doi.org/10.1145/1743546.1743567
Inoue, N., Kikuchi, K., Simo-Serra, E., Otani, M., & Yamaguchi, K. (2023). LayoutDM: Discrete diffusion model for controllable layout generation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10167–10176.
Inoue, N., Masui, K., Shimoda, W., & Yamaguchi, K. (2024). OpenCOLE: Towards reproducible automatic graphic design generation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 8131–8135.
Jia, P., Li, C., Liu, Z., Shen, Y., Chen, X., Yuan, Y., Zheng, Y., Chen, D., Li, J., Xie, X., & others. (2023). COLE: A hierarchical generation framework for graphic design. arXiv. https://arxiv.org/abs/2311.16974
Jiang, S., Wang, Z., Hertzmann, A., Jin, H., & Fu, Y. (2019). Visual font pairing. IEEE Transactions on Multimedia, 21(8), 2086–2097.
Kotler, P., & Keller, K. L. (2016). Marketing management (15th ed.). Pearson.
Landa, R. (2016). Advertising by design: Generating and designing creative ideas across media (3rd ed.). Wiley.
Li, J., Yang, J., Hertzmann, A., Zhang, J., & Xu, T. (2019). LayoutGAN: Generating graphic layouts with wireframe discriminators. International Conference on Learning Representations.
Liu, H., Li, C., Wu, Q., & Lee, Y. J. (2023). Visual instruction tuning. Advances in Neural Information Processing Systems, 36, 34892–34916.
Lok, S., & Feiner, S. (2001). A survey of automated layout techniques for information presentations. Smart Graphics, 214–229.
Meggs, P. B., & Purvis, A. W. (2016). Meggs’ history of graphic design (6th ed.). Wiley.
Nielsen, J. (1994). Usability engineering. Morgan Kaufmann.
Norman, D. A. (2013). The design of everyday things (Rev. ed.). Basic Books.
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, S., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J., & Lowe, R. (2022). Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems, 35, 27730–27744.
Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., & Rombach, R. (2023). SDXL: Improving latent diffusion models for high-resolution image synthesis. arXiv. https://arxiv.org/abs/2307.01952
Qiu, Q., Wang, X., & Otani, M. (2023). Multimodal color recommendation in vector graphic documents. Proceedings of the 31st ACM International Conference on Multimedia, 1231–1240.
Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, D. (2022). High-resolution image synthesis with latent diffusion models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10684–10695.
Rosenfeld, L., Morville, P., & Arango, J. (2015). Information architecture: For the web and beyond (4th ed.). O’Reilly Media.
Shneiderman, B., Plaisant, C., Cohen, M., Jacobs, S., Elmqvist, N., & Diakopoulos, N. (2016). Designing the user interface: Strategies for effective human-computer interaction (6th ed.). Pearson.
Xu, H., Chen, Y., & Med, A. (2025). Automatic detection and explanation of dark patterns from interface microcopy: Empirical comparison of BERT-style encoders, RoBERTa-style encoders, and LLM-style decoders on the ec-darkpattern dataset. Journal of Technology Informatics and Engineering, 4(3), 590–612. https://doi.org/10.51903/jtie.v4i3.491
Yamaguchi, K. (2021). CanvasVAE: Learning to generate vector graphic documents. Proceedings of the IEEE/CVF International Conference on Computer Vision, 5481–5489.
Yang, X., Mei, T., Xu, Y.-Q., Rui, Y., & Li, S. (2016). Automatic generation of visual-textual presentation layout. ACM Transactions on Multimedia Computing, Communications, and Applications, 12(2), 1–22.
Yuan, L.-P., Zhou, Z., Zhao, J., Guo, Y., Du, F., & Qu, H. (2021). InfoColorizer: Interactive recommendation of color palettes for infographics. IEEE Transactions on Visualization and Computer Graphics, 28(12), 4252–4266.
Zhao, N., Cao, Y., & Lau, R. W. H. (2018). Modeling fonts in context: Font prediction on web designs. Computer Graphics Forum, 37(7), 385–395.
License
Copyright (c) 2026 Jinyi Mu, Yifei Lu, Evelyn Hwang

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.









