Demand for up-to-date, accurate building maps is increasing in modern geographical applications, driven by needs in sustainable urban planning, sprawl monitoring, natural hazard mitigation, crisis management, smart city initiatives, and the development of climate-resilient urban environments. Unregulated urbanization and settlement growth poses multifaceted challenges, including ecological imbalance, loss of arable land, and increased risk of drought. Recent advances in remote sensing and artificial intelligence, particularly when applied to very high-resolution satellite imagery and aerial photography, offer promising means of rapidly acquiring precise building maps. This study investigates the performance of an ensemble deep learning framework comprising the DeepLabV3+, UNet++, Pix2pix, Feature Pyramid Network (FPN), and Pyramid Scene Parsing Network (PSPNet) architectures for the semantic segmentation of buildings. The Wuhan University Aerial Building Dataset, which has a spatial resolution of 0.3 meters, is used for both training and testing. The intersection over union (IoU) reached 90.22% for DeepLabV3+, 91.01% for UNet++, 83.50% for Pix2pix, 88.90% for FPN, 88.20% for PSPNet, and 91.06% for the ensemble model. These results indicate the potential of integrating diverse deep learning architectures to improve the precision of building semantic segmentation.
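To make the reported metric concrete, the following minimal sketch shows how intersection over union (IoU) can be computed for a binary building mask, and one possible way to fuse the outputs of several segmentation models. The abstract does not state how the ensemble combines its members, so the averaging-and-thresholding rule here is an assumption, and the names `iou`, `ensemble_mask`, and `threshold` are hypothetical and used only for illustration.

```python
from typing import List, Sequence

Mask = List[List[int]]        # 1 = building pixel, 0 = background
ProbMap = List[List[float]]   # per-pixel building probability in [0, 1]


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union for the building class (label 1)."""
    inter = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p == 1 and t == 1) else 0
            union += 1 if (p == 1 or t == 1) else 0
    # Convention chosen for this sketch: two empty masks agree perfectly.
    return inter / union if union else 1.0


def ensemble_mask(prob_maps: Sequence[ProbMap], threshold: float = 0.5) -> Mask:
    """Average per-pixel probabilities across models, then threshold.

    Assumption: simple soft voting; the source does not specify the fusion rule.
    """
    n = len(prob_maps)
    rows = len(prob_maps[0])
    cols = len(prob_maps[0][0])
    return [
        [
            1 if sum(m[r][c] for m in prob_maps) / n >= threshold else 0
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

In practice the probability maps would come from the trained networks (for example, the sigmoid or softmax outputs of each architecture) and the IoU would be accumulated over the whole test set rather than a single tile.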
Primary Language | English |
---|---|
Subjects | Photogrammetry and Remote Sensing |
Journal Section | Research Article |
Authors | |
Early Pub Date | March 14, 2025 |
Publication Date | |
Submission Date | November 19, 2024 |
Acceptance Date | February 7, 2025 |
Published in Issue | Year 2025 Volume: 10 Issue: 3 |