Comparison of Different U-Net Models for Building Extraction from High-Resolution Aerial Imagery

Fırat Erdem; Uğur Avdan

doi:10.30897/ijegeo.684951

Research Article

Year 2020, Volume: 7 Issue: 3, 221 - 227, 06.12.2020

Cited By: 14

Abstract

Building extraction from high-resolution aerial imagery plays an important role in geospatial applications such as urban planning, telecommunication, disaster monitoring, navigation, updating geographic databases, and urban dynamic monitoring. Automatic building extraction is a challenging task because buildings in different regions have different spectral and geometric properties. Classical image processing techniques are therefore not sufficient for automatic building extraction from high-resolution aerial imagery. Deep learning and semantic segmentation models, which have gained popularity in recent years, have been used for automatic object extraction from high-resolution images. In this study, the U-Net model, which was originally developed for biomedical image processing, was used for building extraction. The encoder part of the U-Net model was modified with the Vgg16, InceptionResNetV2, and DenseNet121 convolutional neural networks, so that building extraction was performed using Vgg16 U-Net, InceptionResNetV2 U-Net, and DenseNet121 U-Net models. In a fourth method, the results obtained from the three U-Net models were combined by majority voting to obtain the final result. This study aims to compare the performance of these four methods in building extraction from high-resolution aerial imagery. Images of Chicago from the Inria Aerial Image Labeling Dataset were used in the study. The images have 0.3 m spatial resolution, 8-bit radiometric resolution, and three bands (red, green, and blue). The images consist of 36 tiles, which were divided into image subsets of 512 × 512 pixels, giving a total of 2715 image subsets. Of these, 80% (2172 image subsets) were used for training and 20% (543 image subsets) for testing. To evaluate the accuracy of the methods, the F1 score of the building class was employed. The F1 scores for the building class on the test images were 0.866, 0.860, 0.856, and 0.877 for the Vgg16 U-Net, InceptionResNetV2 U-Net, DenseNet121 U-Net, and majority voting methods, respectively.
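
The voting step and the evaluation metric described in the abstract can be illustrated with a minimal sketch. It assumes that each model's output has already been thresholded to a binary mask (1 = building, 0 = background) and that the vote is taken pixel by pixel; the abstract does not state the exact voting rule or threshold, so these are assumptions. The function names and the toy 2 × 3 masks are hypothetical and are not taken from the study.

```python
from typing import List

Mask = List[List[int]]  # 2-D binary mask: 1 = building, 0 = background


def majority_vote(masks: List[Mask]) -> Mask:
    """Pixel-wise vote: a pixel is building if more than half of the masks say so."""
    if not masks:
        raise ValueError("need at least one mask")
    n_models = len(masks)
    rows, cols = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[r][c] for m in masks) > n_models else 0 for c in range(cols)]
        for r in range(rows)
    ]


def f1_building(pred: Mask, truth: Mask) -> float:
    """F1 score of the building class: 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    # Convention chosen here (an assumption): F1 is 0.0 when there are
    # no building pixels in either the prediction or the reference.
    return 2 * tp / denom if denom else 0.0


# Toy data only: three hypothetical model outputs and a reference mask.
truth = [[1, 1, 0], [0, 1, 0]]
mask_vgg16 = [[1, 1, 0], [0, 0, 0]]      # misses one building pixel
mask_inception = [[1, 0, 0], [0, 1, 1]]  # one miss, one false alarm
mask_densenet = [[1, 1, 1], [0, 1, 0]]   # one false alarm

combined = majority_vote([mask_vgg16, mask_inception, mask_densenet])
for name, mask in [("Vgg16", mask_vgg16), ("InceptionResNetV2", mask_inception),
                   ("DenseNet121", mask_densenet), ("majority vote", combined)]:
    print(f"{name}: F1 = {f1_building(mask, truth):.3f}")
```

In this toy case each model makes a different error, so the vote cancels them out. This only shows why combining models can help; it says nothing about how large the gain is on real imagery.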

Keywords

Aerial imagery, building extraction, deep learning, remote sensing, semantic segmentation, U-Net

Details

Primary Language	English
Subjects	Engineering
Journal Section	Research Articles
Authors	Fırat Erdem (ORCID 0000-0002-6163-1979), Uğur Avdan (ORCID 0000-0001-7873-9874)
Publication Date	December 6, 2020
Published in Issue	Year 2020 Volume: 7 Issue: 3

Cite

APA	Erdem, F., & Avdan, U. (2020). Comparison of Different U-Net Models for Building Extraction from High-Resolution Aerial Imagery. International Journal of Environment and Geoinformatics, 7(3), 221-227. https://doi.org/10.30897/ijegeo.684951