id |
ecaade2024_117 |
authors |
Su, Xinyu; Luo, Jianhe; Liu, Zidong; Yan, Gaoliang |
year |
2024 |
title |
Text to Terminal: A framework for generating airport terminal layout with large-scale language-image models |
source |
Kontovourkis, O, Phocas, MC and Wurzer, G (eds.), Data-Driven Intelligence - Proceedings of the 42nd Conference on Education and Research in Computer Aided Architectural Design in Europe (eCAADe 2024), Nicosia, 11-13 September 2024, Volume 1, pp. 469–478 |
doi |
https://doi.org/10.52842/conf.ecaade.2024.1.469
|
summary |
Large-scale language-image (LLI) models present novel opportunities for architectural design by facilitating its multimodal process via text-image interactions. However, the inherent two-dimensionality of their outputs restricts their utility in architectural practice. Airport terminals are characterized by flexible, patterned forms, and most of their design operations occur at the master-plan level, which makes them a promising application area for LLI models. We propose a workflow that, in the early design phase, employs a fine-tuned Stable Diffusion model to generate terminal design solutions from textual descriptions and a site image, followed by a quantitative evaluation from an architectural expert's viewpoint. We created our dataset by collecting satellite images of 295 airport terminals worldwide and annotating them in terms of size and form. Using Terminal 2 of Zhengzhou Xinzheng International Airport as a case study, we scored the original and generated solutions on three airside evaluation metrics, verifying the validity of the proposed method. Our study bridges image generation and expert architectural design assessments, providing valuable insights into the practical application of LLI models in architectural practice and introducing a new method for the intelligent design of large-scale public buildings. |
keywords |
Multimodal Machine Learning, Diffusion Model, Text-to-Architecture, Airport Terminal Configuration Design, Post-Generation Evaluation |
series |
eCAADe |
email |
|
full text |
file.pdf (2,061,756 bytes) |
references |
|
Chen, X., Huang, X. and Wang, J. (2023)
Configuration Design Method of Airport Terminals based on Green and Low-carbon Operation: A Case Study based on Jieyang Chaoshan Airport and Guangzhou Baiyun Airport Terminal 2
, South Architecture, (5), pp. 68-74
|
|
|
|
de Neufville, R. (1975)
Designing the Airport Terminal
, Transportation Research Board Special Report. Conference on Airport Landside Capacity, Transportation Systems Center, Federal Aviation Administration, U.S. Department of Transportation
|
|
|
|
Guida, G. (2023)
Multimodal Architecture: Applications of Language in a Machine Learning Aided Design Process
, CAADRIA 2023: Human-Centric, Ahmedabad, pp. 561-570. Available at: https://doi.org/10.52842/conf.caadria.2023.2.561
|
|
|
|
Hu, E.J. et al. (2021)
LoRA: Low-Rank Adaptation of Large Language Models
, International Conference on Learning Representations
|
|
|
|
Kim, F.C., Johanes, M. and Huang, J. (2023)
Text2Form Diffusion: Framework for learning curated architectural vocabulary
, eCAADe 2023: Digital Design Reconsidered, Graz, Austria, pp. 79-88. Available at: https://doi.org/10.52842/conf.ecaade.2023.1.079
|
|
|
|
Li, X. (2017)
Research on Systematic Design Methods of Hub-airport Terminal
, PhD thesis. Xi'an University of Architecture and Technology
|
|
|
|
Paananen, V., Oppenlaender, J. and Visuri, A. (2023)
Using text-to-image generation for architectural design ideation
, International Journal of Architectural Computing [Preprint]. Available at: https://doi.org/10.1177/14780771231222783
|
|
|
|
Podell, D. et al. (2023)
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
, The Twelfth International Conference on Learning Representations
|
|
|
|
Ramesh, A. et al. (2022)
Hierarchical Text-Conditional Image Generation with CLIP Latents
, arXiv. Available at: https://doi.org/10.48550/arXiv.2204.06125
|
|
|
|
Ren, B. and Yang, H. (2019)
Designing an Air-Land Integrated Comprehensive Transportation Hub: On the Design of T2 Terminal and GTC Project of Zhengzhou Xinzheng International Airport
, Architectural Journal, (9), pp. 90-93
|
|
|
|
Rombach, R., Blattmann, A., Lorenz, D., Esser, P. and Ommer, B. (2022)
High-resolution image synthesis with latent diffusion models
, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10684-10695
|
|
|
|
Transportation Research Board (2010)
Airport Passenger Terminal Planning and Design, Volume 1: Guidebook
, Washington, D.C.: Transportation Research Board. Available at: https://doi.org/10.17226/22964
|
|
|
|
Zhang, S. et al. (2023)
Text-to-Garden: Generating Traditional Chinese Garden Design From Text-Descriptions at Scale With Multimodal Machine Learning
, CAADRIA 2023: Human-Centric, Ahmedabad, pp. 79-88. Available at: https://doi.org/10.52842/conf.caadria.2023.1.079
|
|
|
|
last changed |
2024/11/17 22:05 |
|