id |
caadria2024_197 |
authors |
Xia, Shengtao, Cheng, Yiming and Tian, Runjia |
year |
2024 |
title |
ARCHICLIP: Enhanced Contrastive Language–Image Pre-training Model With Architectural Prior Knowledge |
source |
Nicole Gardner, Christiane M. Herr, Likai Wang, Hirano Toshiki, Sumbul Ahmad Khan (eds.), ACCELERATED DESIGN - Proceedings of the 29th CAADRIA Conference, Singapore, 20-26 April 2024, Volume 1, pp. 69–78 |
doi |
https://doi.org/10.52842/conf.caadria.2024.1.069 |
summary |
In the rapidly evolving field of Generative AI, architects and designers increasingly rely on generative models in their workflows. While previous efforts have focused on functional or building-performance aspects, designers often prioritize novelty in architectural design, which requires machines to evaluate abstract qualities. This article aims to enhance architectural style classification using CLIP, a Contrastive Language-Image Pre-training method. The proposed workflow fine-tunes the CLIP model on a dataset of over 1 million architecture-specific image-text pairs. The dataset includes project descriptions and tags, aiming to capture spatial quality. Fine-tuned CLIP models outperform pre-trained ones on architecture-specific tasks, suggesting potential applications in training diffusion models, guiding generative models, and developing specialized search engines for architecture. Although the dataset still awaits review by human designers, this research offers a promising avenue for advancing generative tools in architectural design. |
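As a rough illustration of the task such a fine-tuned model is meant to serve, the sketch below performs zero-shot architectural style classification with a generic CLIP checkpoint through the Hugging Face transformers API. The checkpoint name, image path, and style labels are illustrative assumptions rather than the authors' setup; a fine-tuned ARCHICLIP checkpoint would simply replace the model name.

# Minimal sketch (not the authors' code): zero-shot architectural style
# classification with a CLIP model. Checkpoint, image path, and labels are
# illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model_name = "openai/clip-vit-base-patch32"  # a fine-tuned ARCHICLIP checkpoint would be swapped in here
model = CLIPModel.from_pretrained(model_name)
processor = CLIPProcessor.from_pretrained(model_name)

styles = ["brutalist architecture", "gothic architecture", "deconstructivist architecture"]
image = Image.open("building.jpg")  # hypothetical input image

inputs = processor(text=[f"a photo of {s}" for s in styles],
                   images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)
probs = outputs.logits_per_image.softmax(dim=-1)  # similarity of the image to each style prompt
for s, p in zip(styles, probs[0].tolist()):
    print(f"{s}: {p:.3f}")

The same pattern extends to retrieval: encoding a text query and ranking a corpus of project images by similarity is the basis of the specialized search-engine application mentioned above.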
keywords |
machine learning, generative design, Contrastive Language-Image Pre-training, artificial intelligence |
series |
CAADRIA |
email |
|
full text |
file.pdf (1,190,364 bytes) |
references |
Cherti, M., Beaumont, R., Wightman, R., Wortsman, M., Ilharco, G., Gordon, C., ... & Jitsev, J. (2023). Reproducible scaling laws for contrastive language-image learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2818-2829) |
Larochelle, H., Erhan, D., & Bengio, Y. (2008). Zero-data learning of new tasks. AAAI (Vol. 1, No. 2, p. 3) |
Phelan, N., Davis, D., & Anderson, C. (2017). Evaluating architectural layouts with neural networks. Proceedings of the Symposium on Simulation for Architecture and Urban Design (pp. 1-7) |
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., ... & Sutskever, I. (2021). Learning transferable visual models from natural language supervision. International Conference on Machine Learning (pp. 8748-8763). PMLR |
Sharma, P., Ding, N., Goodman, S., & Soricut, R. (2018). Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset for Automatic Image Captioning. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/p18-1238 |
Xu, Z., Tao, D., Zhang, Y., Wu, J., & Tsoi, A. C. (2014). Architectural style classification using multinomial latent logistic regression. Computer Vision - ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I (pp. 600-615). Springer International Publishing |
Zheng, H., Moosavi, V., & Akbarzadeh, M. (2020). Machine learning assisted evaluations in structural design and construction. Automation in Construction, 119, 103346 |
last changed |
2024/11/17 22:05 |
|