Uzaktan Algılama Görüntülerinden Bina Çıkarımında FPN-ResNeXt50 ve VGG16-UNet Modellerinin Karşılaştırılması

Serhat Şenses; Emine Tanır Kayıkçı

doi:10.48123/rsgis.1622130

Research Article

Comparison of FPN-ResNeXt50 and VGG16-UNet Models for Building Extraction from Remote Sensing Images

Year 2025, Volume: 6, Issue: 2, 144-167, 27.09.2025

Abstract

Building extraction from remote sensing images supplies key data for a wide range of fields, such as land use, climate and environmental research, disaster monitoring and prevention, and zoning applications. In this study, image segmentation and classification were applied to five urban textures from Chicago, Austin, Tirol, Vienna, and Kitsap, which follow different urban planning models and include both urban and rural settlements. The results obtained with different artificial neural networks and segmentation algorithms for extracting buildings from the images were compared. The building extraction success rates of two models trained on the Inria dataset were examined. Buildings were extracted using the FPN architecture with ResNeXt50 as the backbone, and the results were compared with those obtained by combining U-Net with VGG16. The ResNeXt50 and FPN model gave the best extraction result, with 96.74% accuracy on the test data.
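The abstract reports accuracy on the test data without stating how it was computed. As a minimal sketch, assuming the figure is pixel accuracy (the share of pixels whose predicted building/background label matches the reference mask), the calculation could look as follows. The function name `pixel_accuracy` and the two 4x4 masks are hypothetical and are not taken from the article.

```python
def pixel_accuracy(predicted, reference):
    """Share of pixels whose predicted label matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, ref_row in zip(predicted, reference):
        if len(pred_row) != len(ref_row):
            raise ValueError("mask rows must have the same length")
        for p, r in zip(pred_row, ref_row):
            correct += int(p == r)  # count matching pixels
            total += 1
    if total == 0:
        raise ValueError("masks must not be empty")
    return correct / total


# Hypothetical 4x4 masks (illustration only): 1 = building, 0 = background.
reference_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
predicted_mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]

accuracy = pixel_accuracy(predicted_mask, reference_mask)
print(f"pixel accuracy: {accuracy:.2%}")  # 14 of 16 pixels match: 87.50%
```

Background pixels usually outnumber building pixels in aerial tiles, so pixel accuracy is often read alongside an overlap measure such as intersection over union. Whether the article does so is not stated in the abstract.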

Keywords

Deep learning, Artificial neural networks, Segmentation, Classification

The article lists 40 references in total.

Details

Primary Language	Turkish
Subjects	Photogrammetry and Remote Sensing
Section	Research Articles
Authors	Serhat Şenses 0000-0002-9648-5844 Emine Tanır Kayıkçı 0000-0001-8259-5543
Publication Date	27 September 2025
Submission Date	17 January 2025
Acceptance Date	11 September 2025
Published in Issue	Year 2025 Volume: 6 Issue: 2

Cite

APA	Şenses, S., & Tanır Kayıkçı, E. (2025). Uzaktan Algılama Görüntülerinden Bina Çıkarımında FPN-ResNeXt50 ve VGG16-UNet Modellerinin Karşılaştırılması. Türk Uzaktan Algılama ve CBS Dergisi, 6(2), 144-167. https://doi.org/10.48123/rsgis.1622130

Turkish Journal of Remote Sensing and GIS (Türk Uzaktan Algılama ve CBS Dergisi) is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.