Research Article

Feature Extraction for Real Estate Images and Titles with LLMs

Volume: 9 Number: 3 June 30, 2026

Abstract

Images and titles often carry rich latent information about the objects they describe, particularly on web-based platforms. Real estate websites are a clear example: listing images and titles convey details that help users make decisions. However, these unstructured elements cannot be used directly in downstream machine learning tasks, because their contextual meaning is not directly interpretable. This work transforms listing images and titles into structured, tabular representations suitable for analytical and predictive modeling. To this end, we propose a modular framework based on state-of-the-art large language models (LLMs). The framework incorporates ReAct, LLM-as-a-Judge, and few-shot prompting techniques. Its performance is evaluated on a real-world real estate dataset and compared with BERT- and CLIP-based baselines. Experimental results show that the framework improves recall by up to 44.26% for listing attributes such as the presence of a balcony or the furnishing status of a property.
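
As a rough illustration of the title-to-table step described in the abstract, the sketch below builds a few-shot prompt for a listing title, parses a model reply into a fixed-schema row, and computes recall for one binary attribute. It is not the authors' implementation: the attribute names, the example titles, and the helper functions build_prompt, parse_row and recall are all hypothetical, and no particular LLM API is assumed (the model call itself is left out).

```python
import json

# Hypothetical attribute schema; the paper's actual attribute list is not given here.
ATTRIBUTES = ["has_balcony", "is_furnished"]

# Hypothetical few-shot examples, invented for illustration (not from the dataset).
FEW_SHOT = [
    ("Furnished 2+1 flat with balcony near metro",
     {"has_balcony": True, "is_furnished": True}),
    ("Empty studio, city centre",
     {"has_balcony": False, "is_furnished": False}),
]


def build_prompt(title: str) -> str:
    """Assemble a few-shot prompt that asks for one JSON object per title."""
    lines = [f"Extract these attributes as JSON: {', '.join(ATTRIBUTES)}.", ""]
    for example_title, labels in FEW_SHOT:
        lines.append(f"Title: {example_title}")
        lines.append(f"JSON: {json.dumps(labels)}")
        lines.append("")
    lines.append(f"Title: {title}")
    lines.append("JSON:")
    return "\n".join(lines)


def parse_row(raw_reply: str) -> dict:
    """Turn a model reply into a fixed-schema row.

    Fields that are missing or not boolean become None, so every row
    has the same columns regardless of what the model returned.
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    return {
        name: data[name] if isinstance(data.get(name), bool) else None
        for name in ATTRIBUTES
    }


def recall(y_true: list, y_pred: list) -> float:
    """Recall = true positives / all actual positives for one binary attribute."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

In such a setup, each parsed row would become one record of the tabular dataset, and recall would be computed per attribute against manually labeled listings. Mapping invalid replies to None rather than False keeps extraction failures distinguishable from genuine negatives.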

Keywords

Supporting Institution

TUBITAK

Project Number

124E135

Details

Primary Language

English

Subjects

Natural Language Processing, Artificial Intelligence (Other)

Journal Section

Research Article

Early Pub Date

June 19, 2026

Publication Date

June 30, 2026

Submission Date

November 24, 2025

Acceptance Date

May 5, 2026

Published in Issue

Year 2026 Volume: 9 Number: 3

APA
Arslan, A., Yetki, T. D., Yücel, A., Turgut, H., Bali, Ö., Işıklar Alptekin, G., & Orman, G. K. (2026). Feature Extraction for Real Estate Images and Titles with LLMs. Sakarya University Journal of Computer and Information Sciences, 9(3), 690-699. https://doi.org/10.35377/saucis...1829206
AMA
1. Arslan A, Yetki TD, Yücel A, et al. Feature Extraction for Real Estate Images and Titles with LLMs. SAUCIS. 2026;9(3):690-699. doi:10.35377/saucis.1829206
Chicago
Arslan, Afra, Tan Doruk Yetki, Arda Yücel, et al. 2026. “Feature Extraction for Real Estate Images and Titles With LLMs”. Sakarya University Journal of Computer and Information Sciences 9 (3): 690-99. https://doi.org/10.35377/saucis. 1829206.
EndNote
Arslan A, Yetki TD, Yücel A, Turgut H, Bali Ö, Işıklar Alptekin G, Orman GK (June 1, 2026) Feature Extraction for Real Estate Images and Titles with LLMs. Sakarya University Journal of Computer and Information Sciences 9 3 690–699.
IEEE
[1] A. Arslan et al., “Feature Extraction for Real Estate Images and Titles with LLMs”, SAUCIS, vol. 9, no. 3, pp. 690–699, June 2026, doi: 10.35377/saucis...1829206.
ISNAD
Arslan, Afra - Yetki, Tan Doruk - Yücel, Arda - Turgut, Hacer - Bali, Ömür - Işıklar Alptekin, Gülfem - Orman, Günce Keziban. “Feature Extraction for Real Estate Images and Titles With LLMs”. Sakarya University Journal of Computer and Information Sciences 9/3 (June 1, 2026): 690-699. https://doi.org/10.35377/saucis. 1829206.
JAMA
1. Arslan A, Yetki TD, Yücel A, Turgut H, Bali Ö, Işıklar Alptekin G, Orman GK. Feature Extraction for Real Estate Images and Titles with LLMs. SAUCIS. 2026;9:690–699.
MLA
Arslan, Afra, et al. “Feature Extraction for Real Estate Images and Titles With LLMs”. Sakarya University Journal of Computer and Information Sciences, vol. 9, no. 3, June 2026, pp. 690-9, doi:10.35377/saucis. 1829206.
Vancouver
1. Afra Arslan, Tan Doruk Yetki, Arda Yücel, Hacer Turgut, Ömür Bali, Gülfem Işıklar Alptekin, Günce Keziban Orman. Feature Extraction for Real Estate Images and Titles with LLMs. SAUCIS. 2026 Jun. 1;9(3):690-9. doi:10.35377/saucis. 1829206

 

The papers in this journal are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.