Feature Extraction for Real Estate Images and Titles with LLMs
Abstract
Images and titles on web platforms often carry rich latent information about the objects they describe. Real estate websites are a clear example: listing images and titles convey details that help users make decisions. However, these unstructured elements cannot be used directly in downstream machine learning tasks, because their content is not available as explicit, machine-readable features. This work transforms listing images and titles into structured, tabular representations suitable for analytical and predictive modeling. To this end, we propose a modular framework built on state-of-the-art large language models (LLMs) that incorporates ReAct, LLM-as-a-Judge, and few-shot prompting techniques. We evaluate the framework on a real-world real estate dataset and compare it with BERT- and CLIP-based baselines. Experimental results show that the framework achieves up to a 44.26% improvement in recall for listing attributes such as the presence of a balcony or the furnishing status of a property.
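To make the idea of turning a listing title into a tabular row concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-attribute schema (`has_balcony`, `is_furnished`), a hypothetical JSON reply format from the model, and illustrative helper names (`build_few_shot_prompt`, `to_row`, `recall`). No real LLM API is called; the model reply is passed in as a plain string.

```python
import json

# Hypothetical schema: the actual attribute list used in the paper is not given here.
SCHEMA = {"has_balcony": bool, "is_furnished": bool}


def build_few_shot_prompt(title, examples):
    """Assemble a few-shot prompt: worked examples first, then the new title."""
    lines = ["Extract the listing attributes as JSON with keys: " + ", ".join(SCHEMA)]
    for ex_title, ex_attrs in examples:
        lines.append(f"Title: {ex_title}")
        lines.append(f"Attributes: {json.dumps(ex_attrs)}")
    lines.append(f"Title: {title}")
    lines.append("Attributes:")
    return "\n".join(lines)


def to_row(raw_response):
    """Parse a model reply into a fixed-schema row.

    Missing keys, wrongly typed values, and unparseable replies map to None,
    so every listing yields the same columns.
    """
    try:
        data = json.loads(raw_response)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {
        key: data.get(key) if isinstance(data.get(key), expected) else None
        for key, expected in SCHEMA.items()
    }


def recall(y_true, y_pred):
    """Recall for one binary attribute: TP / (TP + FN). None counts as a miss."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

Mapping every failure case to `None` keeps the output rectangular, which is what a downstream tabular model needs. Recall is then computed per attribute column against labeled listings.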
Details
Primary Language: English
Subjects: Natural Language Processing, Artificial Intelligence (Other)
Journal Section: Research Article
Authors
Afra Arslan* (ORCID: 0009-0006-4857-5155), Türkiye
Tan Doruk Yetki (ORCID: 0009-0000-7304-2605), Türkiye
Arda Yücel (ORCID: 0009-0003-2926-9257), Türkiye
Hacer Turgut (ORCID: 0000-0002-7680-0878), Türkiye
Ömür Bali (ORCID: 0009-0005-4907-649X), Türkiye
Early Pub Date: June 19, 2026
Publication Date: June 30, 2026
Submission Date: November 24, 2025
Acceptance Date: May 5, 2026
Published in Issue: Year 2026, Volume 9, Number 3
