Research Article

From a Single RGB Image to 3D Reconstruction and BIM-Compatible Objects: An Application for Digital Twinning Projects

Year 2026, Volume: 38, Issue: 1, 138–151, 20.03.2026
https://doi.org/10.7240/jeps.1772740
https://izlik.org/JA72ZC53TX

Abstract

Digital twinning projects for smart campuses are often carried out using dense sensor networks (IoT), advanced building and energy management systems (BMS/EMS), and expensive 3D scanning infrastructures such as LiDAR or multi-camera rigs. This study presents a practical, sensor-free approach that starts from a single RGB image and integrates 2D scene parsing and monocular depth estimation into 3D reconstruction, object-level segmentation, and automatic BIM transfer. Boundary-sensitive semantic maps generated by SERNet-Former_v2 and single-image depth outputs are reprojected into point clouds. Planar extraction then identifies major architectural elements (walls, floors, ceilings), while openings (doors, windows) and structural components (columns) are segmented at the object level. The pipeline produces BIM-compatible geometry and parameters. The method is evaluated on benchmark datasets such as ScanNet and the TUM CMS Indoor Point Cloud, which underpin comprehensive digital twinning efforts such as TUM2TWIN, a project whose smart-campus goal is to make the TUM campuses among the best-surveyed places. Under the single-image setting, 3D accuracy reaches 78.8%, while 2D→3D projection consistency achieves 81.9%. Depth estimation is evaluated by RMSE, and 3D reconstruction by ICP-RMSE and 3D RMSE, with a minimal reconstruction error of 0.02 RMSE. The entire workflow runs on an A100-40GB GPU, processing single inputs with 512×512-pixel crops in under 6 s per sample. The resulting BIM-compliant spatial parameters will serve as programmatic inputs to semantic models and Model Predictive Control (MPC) schemes, enabling cost-efficient and scalable early-stage assessments. Without requiring domain adaptation, this workflow offers a rapid and effective alternative for field deployment and digital documentation in architecture and engineering applications.
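The back-projection step at the core of this pipeline — lifting a per-pixel depth map and its semantic label map into a labeled point cloud through a pinhole camera model — can be sketched as follows. This is a minimal illustration, not the authors' implementation; the intrinsics (fx, fy, cx, cy) and the toy inputs are placeholder assumptions.

```python
import numpy as np

def backproject(depth, labels, fx, fy, cx, cy):
    """Lift an HxW depth map and its HxW semantic label map into an
    (N, 3) point cloud with an (N,) label vector, via a pinhole model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx          # X = (u - cx) * Z / fx
    y = (v - cy) * z / fy          # Y = (v - cy) * Z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    lab = labels.reshape(-1)
    valid = pts[:, 2] > 0          # drop pixels with no depth estimate
    return pts[valid], lab[valid]

# Toy example: a flat surface 2 m away, every pixel labeled class 1 ("wall").
depth = np.full((4, 4), 2.0)
labels = np.ones((4, 4), dtype=int)
pts, lab = backproject(depth, labels, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(pts.shape)  # (16, 3): each valid pixel becomes one labeled 3D point
```

In the full pipeline, the per-point labels would carry the SERNet-Former_v2 class predictions, so downstream plane fitting can be restricted to architecturally relevant classes.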

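Planar extraction of walls, floors, and ceilings in pipelines of this kind is commonly driven by RANSAC plane fitting (Fischler & Bolles, 1981, cited below). A minimal sketch, assuming a simple point-to-plane inlier threshold; the threshold and iteration count here are illustrative, not the paper's settings.

```python
import numpy as np

def ransac_plane(points, n_iters=200, threshold=0.02, rng=None):
    """Fit the dominant plane of an (N, 3) point cloud with RANSAC.
    Returns ((unit normal n, offset d) with n·p + d = 0, inlier mask)."""
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    best_plane = None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(n)
        if norm < 1e-9:                   # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n @ sample[0]
        dist = np.abs(points @ n + d)     # point-to-plane distances
        inliers = dist < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers, best_plane = inliers, (n, d)
    return best_plane, best_inliers

# Toy example: a noisy floor plane near z = 0 plus scattered outliers.
rng = np.random.default_rng(0)
floor = np.column_stack([rng.uniform(0, 5, 500),
                         rng.uniform(0, 5, 500),
                         rng.normal(0, 0.005, 500)])
outliers = rng.uniform(0, 5, (50, 3))
(n, d), mask = ransac_plane(np.vstack([floor, outliers]), rng=1)
print(abs(n[2]))  # close to 1: the recovered normal is vertical (a floor)
```

Repeating this on the remaining points (after removing each plane's inliers) yields the major planar elements; their orientations then separate floors and ceilings (vertical normals) from walls (horizontal normals).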
Ethics Statement

This article has not been published or submitted elsewhere for publication. Research and publication ethics have been followed, and all sources used have been properly cited. Since the study does not involve human or animal subjects, no ethics committee approval is required.

Supporting Institution

Hacettepe University

References

  • Erişen, S. (2023). An empirical study of the technoparks in Turkey in investigating the challenges and potential of designing intelligent spaces. Sustainability, 15(13), 10150.
  • Erişen, S. (2023). A systematic approach to optimizing energy-efficient automated systems with learning models for thermal comfort control in indoor spaces. Buildings, 13(7), 1824.
  • Balaji, B., Bhattacharya, A., Fierro, G., Gao, J., Gluck, J., Hong, D., Johansen, A., Koh, J., Ploennigs, J., Agarwal, Y., Culler, D., Gupta, R. K., Agarwal, R., et al. (2016). Brick: Towards a unified metadata schema for buildings. Proceedings of the 3rd ACM International Conference on Systems for Energy-Efficient Built Environments (BuildSys).
  • Project Haystack. (2019). Project Haystack: Open source initiative for semantic tagging of data.
  • Tang, P., Huber, D., Akinci, B., Lipman, R., & Lytle, A. (2010). Automatic reconstruction of as-built building information models from laser-scanned point clouds: A review. Automation in Construction, 19(7), 829–843.
  • Volk, R., Stengel, J., & Schultmann, F. (2014). Building information modeling (BIM) for existing buildings—Literature review and future needs. Automation in Construction, 38, 109–127.
  • Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Niessner, M. (2017). ScanNet: Richly-annotated 3D reconstructions of indoor scenes. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5828–5839.
  • Chang, A. X., Dai, A., Funkhouser, T., Halber, M., Niessner, M., Savva, M., Song, S., Zeng, A., & Zhang, Y. (2017). Matterport3D: Learning from RGB-D data in indoor environments. Proceedings of the International Conference on 3D Vision (3DV).
  • Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alonso, I., Alvarez, J. M., & Lu, T. (2021). SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in Neural Information Processing Systems (NeurIPS).
  • Cheng, B., Misra, I., Schwing, A., Kirillov, A., & Girdhar, R. (2022). Masked-attention mask transformer for universal image segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
  • Bhat, S. F., Birkl, R., Wofk, D., Wonka, P., & Müller, M. (2023). ZoeDepth: Zero-shot transfer by combining relative and metric depth. arXiv preprint arXiv:2302.12288.
  • Yin, W., Zhang, C., Chen, H., Cai, Z., Yu, G., Wang, K., Chen, X., & Shen, C. (2023). Metric3D: Towards zero-shot metric 3D prediction from a single image. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 9043–9053.
  • Wysocki, O., Schwab, B., Biswanath, M. K., Zhang, Q., Zhu, J., Froech, T., Heeramaglore, M., et al. (2025). TUM2TWIN: Introducing the large-scale multimodal urban digital twin benchmark dataset. arXiv:2505.07396. https://tum2t.win/
  • buildingSMART International. (2023). Industry Foundation Classes (IFC) specification (IFC 4.x series).
  • buildingSMART International. (2021). ifcJSON: A JSON serialization for IFC.
  • Erişen, S. (2025). Efficient segmentation using attention-fusion modules with dense predictions. IEEE Access, 13, 107552–107565.
  • Erişen, S., Mehranfar, M., & Borrmann, A. (2025, May). A multi-task framework for 2D scene parsing and 3D reconstruction in indoor space documentation. EG-ICE 2025 Workshop on Intelligent Computing in Engineering.
  • Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3D semantic parsing of large-scale indoor spaces. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1534–1543.
  • Mehranfar, M., Vega-Torres, M., Braun, A., & Borrmann, A. (2024). Automated data-driven method for creating digital building models from dense point clouds and images through semantic segmentation and parametric model fitting. Advanced Engineering Informatics, 62, 102862.
  • Wan, L., Rossa, F., Welfonder, T., Petrova, E., & Pauwels, P. (2025). Enabling scalable model predictive control design for building HVAC systems using semantic data modelling. Automation in Construction, 170, 105929.
  • Erişen, S., Mehranfar, M., & Borrmann, A. (2025). Single image to semantic BIM: Domain-adapted 3D reconstruction and annotations via multi-task deep learning. Remote Sensing, 17(16), 2910.
  • Curless, B., & Levoy, M. (1996). A volumetric method for building complex models from range images. Proceedings of SIGGRAPH, 303–312.
  • Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.
  • Kazhdan, M., Bolitho, M., & Hoppe, H. (2006). Poisson surface reconstruction. Eurographics Symposium on Geometry Processing (SGP), 61–70.
  • Delaunay, B. (1934). Sur la sphère vide. Izvestia Akademii Nauk SSSR, Otdelenie Matematicheskikh i Estestvennykh Nauk, 7(6), 793–800.
  • Besl, P. J., & McKay, N. D. (1992). A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2), 239–256.
  • Zhou, B., Zhao, H., Puig, X., Fidler, S., Barriuso, A., & Torralba, A. (2017). Scene parsing through ADE20K dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 633–641.
  • Song, S., Lichtenberg, S. P., & Xiao, J. (2015). SUN RGB-D: A RGB-D scene understanding benchmark suite. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 567–576.
  • Silberman, N., Hoiem, D., Kohli, P., & Fergus, R. (2012). Indoor segmentation and support inference from RGBD images. Proceedings of the European Conference on Computer Vision (ECCV), 746–760.
  • Erişen, S., Mehranfar, M., & Borrmann, A. (2025). 2D scene parsing dataset of TUM CMS Indoor Point Cloud. mediaTUM. doi: 10.14459/2025mp1779501
  • Erişen, S. (2021). Incremental transformation of spatial intelligence from smart systems to sensorial infrastructures. Building Research and Information, 49(1), 113–126.


Details

Primary Language Turkish
Subjects Image Processing, Deep Learning, Neural Networks, Modelling and Simulation, Numerical Modelling in Civil Engineering, Construction Engineering, Architectural Engineering
Section Research Article
Authors

Serdar Erişen 0000-0002-7192-0889

Submission Date 28 August 2025
Acceptance Date 23 January 2026
Publication Date 20 March 2026
DOI https://doi.org/10.7240/jeps.1772740
IZ https://izlik.org/JA72ZC53TX
Published Issue Year 2026, Volume: 38, Issue: 1

Cite

APA Erişen, S. (2026). Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama. International Journal of Advances in Engineering and Pure Sciences, 38(1), 138-151. https://doi.org/10.7240/jeps.1772740
AMA 1. Erişen S. Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama. JEPS. 2026;38(1):138-151. doi:10.7240/jeps.1772740
Chicago Erişen, Serdar. 2026. “Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama”. International Journal of Advances in Engineering and Pure Sciences 38 (1): 138-51. https://doi.org/10.7240/jeps.1772740.
EndNote Erişen S (01 March 2026) Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama. International Journal of Advances in Engineering and Pure Sciences 38 1 138–151.
IEEE [1] S. Erişen, “Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama”, JEPS, vol. 38, no. 1, pp. 138–151, Mar. 2026, doi: 10.7240/jeps.1772740.
ISNAD Erişen, Serdar. “Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama”. International Journal of Advances in Engineering and Pure Sciences 38/1 (01 March 2026): 138-151. https://doi.org/10.7240/jeps.1772740.
JAMA 1. Erişen S. Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama. JEPS. 2026;38:138–151.
MLA Erişen, Serdar. “Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama”. International Journal of Advances in Engineering and Pure Sciences, vol. 38, no. 1, March 2026, pp. 138-51, doi:10.7240/jeps.1772740.
Vancouver 1. Erişen S. Tek RGB Görüntüden 3B Rekonstrüksiyon ve BIM-Uyumlu Nesnelere: Dijital İkiz Projeleri için Bir Uygulama. JEPS. 01 March 2026;38(1):138-51. doi:10.7240/jeps.1772740