Single-Image HDR Reconstruction with Attention-Driven Autoencoder
Year 2025, Volume: 15, Issue: 3, 45-54, 29.09.2025
Cevher Renk, Serdar Çiftçi
Abstract
High Dynamic Range (HDR) imaging enables the representation of details in both bright and dark areas of a scene, aligning closely with human visual perception. Traditional multi-exposure HDR methods face challenges such as ghosting, hardware dependency, and high processing costs. This study adopts a baseline model that synthesizes multi-exposure representations from a single Standard Dynamic Range (SDR) image using a U-Net-like autoencoder, and enhances it by integrating five distinct attention mechanisms: Spatial, Channel, Bottleneck, Squeeze-and-Excitation, and Self-Attention. Each attention module is individually embedded into the encoder and decoder layers to form separate model variants, all trained independently on the DrTMO dataset. Quantitative evaluations based on SSIM, PSNR, and LPIPS demonstrate that the Spatial Attention variant delivers the best performance across all metrics. The results highlight that incorporating attention mechanisms into autoencoder-based HDR reconstruction architectures significantly enhances both structural fidelity and perceptual image quality, making such architectures promising for efficient single-image HDR synthesis.
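To make the architectural idea more concrete, the sketch below shows, in PyTorch, how a spatial attention block of the kind described above might be embedded into a single encoder stage of a U-Net-like autoencoder. This is a minimal illustrative sketch, not the authors' implementation: the module names, kernel size, channel counts, and the CBAM-style average/max pooling formulation are all assumptions.

```python
# Minimal, illustrative PyTorch sketch (NOT the authors' code): a CBAM-style
# spatial attention block wrapped around one hypothetical U-Net encoder stage.
# Module names, kernel size, and channel counts are assumptions for illustration.
import torch
import torch.nn as nn


class SpatialAttention(nn.Module):
    """Per-pixel attention map computed from channel-wise average and max pooling."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg_map = torch.mean(x, dim=1, keepdim=True)        # (B, 1, H, W)
        max_map, _ = torch.max(x, dim=1, keepdim=True)      # (B, 1, H, W)
        attn = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                     # spatially reweighted features


class AttentiveEncoderStage(nn.Module):
    """One encoder stage: two convolutions -> spatial attention -> 2x downsampling."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.attention = SpatialAttention()
        self.down = nn.MaxPool2d(2)

    def forward(self, x: torch.Tensor):
        skip = self.attention(self.block(x))  # attended features also feed the skip connection
        return self.down(skip), skip


if __name__ == "__main__":
    stage = AttentiveEncoderStage(3, 64)
    sdr = torch.randn(1, 3, 256, 256)         # a single SDR input image
    pooled, skip = stage(sdr)
    print(pooled.shape, skip.shape)           # (1, 64, 128, 128), (1, 64, 256, 256)
```

Applying attention before the skip connection means both the downsampling path and the decoder receive spatially reweighted features; whether the baseline model places the module exactly there is an assumption of this sketch.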
References
[1] P.-H. Le, Q. Le, R. Nguyen, and B.-S. Hua, “Single-image HDR reconstruction by multi-exposure generation,” in Proc. IEEE/CVF Winter Conf. Appl. Comput. Vis. (WACV), 2023, pp. 4063–4072.
[2] Z. Liu, Y. Wang, B. Zeng, and S. Liu, “Ghost-free high dynamic range imaging with context-aware transformer,” in Proc. Eur. Conf. Comput. Vis. (ECCV), Cham, Switzerland: Springer, Oct. 2022, pp. 344–360.
[3] Y.-L. Liu, W.-S. Lai, Y.-S. Chen, Y.-L. Kao, M.-H. Yang, Y.-Y. Chuang, and J.-B. Huang, “Single-image HDR reconstruction by learning to reverse the camera pipeline,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2020, pp. 1651–1660.
[4] X. Chen, Y. Liu, Z. Zhang, Y. Qiao, and C. Dong, “HDRUNet: Single image HDR reconstruction with denoising and dequantization,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 354–363.
[5] K.-Y. Chen and J.-J. Leou, “Low light image enhancement using autoencoder-based deep neural networks,” in Image Processing, Computer Vision, and Pattern Recognition and Information and Knowledge Engineering, Las Vegas, NV, USA: Springer, 2025, pp. 73–84. doi: 10.1007/978-3-031-85933-5_6.
[6] A. Sharif, S. M. Naqvi, M. Biswas, and S. Kim, “A two-stage deep network for high dynamic range image reconstruction,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 550–559.
[7] S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y. Wang, et al., “Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 6881–6890.
[8] Q. Yan, T. Hu, G. Chen, W. Dong, and Y. Zhang, “Boosting HDR image reconstruction via semantic knowledge transfer,” arXiv preprint arXiv:2503.15361, 2025.
[9] J. Ma and H. Zhang, “HDR-DANet: Single HDR image reconstruction via dual attention,” Multimedia Syst., vol. 31, no. 1, pp. 1–12, 2025.
[10] S. Y. Kim, J. Oh, and M. Kim, “JSI-GAN: GAN-based joint super-resolution and inverse tone-mapping with pixel-wise task-specific filters for UHD HDR video,” in Proc. AAAI Conf. Artif. Intell., vol. 34, no. 7, Apr. 2020, pp. 11287–11295.
[11] Y. I. Park, J. W. Song, and S. J. Kang, “65‐3: Invited paper: Deep learning‐based image enhancement for HDR imaging,” in SID Symp. Dig. Tech. Papers, vol. 53, no. 1, Jun. 2022, pp. 865–868.
[12] A. de Santana Correia and E. L. Colombini, “Attention, please! A survey of neural attention models in deep learning,” Artif. Intell. Rev., vol. 55, no. 8, pp. 6037–6124, 2022.
[13] J. Shen and T. Wu, “Learning spatially-adaptive squeeze-excitation networks for few-shot image synthesis,” in Proc. IEEE Int. Conf. Image Process. (ICIP), Oct. 2023, pp. 2855–2859.
[14] Z. Zhang, Z. Xu, X. Gu, and J. Xiong, “Cross-CBAM: A lightweight network for scene segmentation,” arXiv preprint arXiv:2306.02306, 2023.
[15] S. Mekruksavanich and A. Jitpattanakul, “Hybrid convolution neural network with channel attention mechanism for sensor-based human activity recognition,” Sci. Rep., vol. 13, no. 1, p. 12067, 2023.
[16] A. Srinivas, T.-Y. Lin, N. Parmar, J. Shlens, P. Abbeel, and A. Vaswani, “Bottleneck transformers for visual recognition,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2021, pp. 16519–16529.
[17] H. Y. Lin, Y. R. Lin, W. C. Lin, and C. C. Chang, “Reconstructing high dynamic range image from a single low dynamic range image using histogram learning,” Appl. Sci., vol. 14, no. 21, p. 9847, 2024.
[18] H. Zhang, K. Zu, J. Lu, Y. Zou, and D. Meng, “EPSANet: An efficient pyramid squeeze attention block on convolutional neural network,” in Proc. Asian Conf. Comput. Vis. (ACCV), 2022, pp. 1161–1177.
[19] B. Zhang, S. Gu, B. Zhang, J. Bao, D. Chen, F. Wen, et al., “StyleSwin: Transformer-based GAN for high-resolution image generation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2022, pp. 11304–11314.
[20] Y. Al Najjar, “Comparative analysis of image quality assessment metrics: MSE, PSNR, SSIM and FSIM,” Int. J. Sci. Res. (IJSR), vol. 13, no. 3, pp. 110–114, 2024.
[21] M. Azimi, “PU21: A novel perceptually uniform encoding for adapting existing quality metrics for HDR,” in Proc. Picture Coding Symp. (PCS), Jun. 2021, pp. 1–5.
[22] A. Ghildyal and F. Liu, “Shift-tolerant perceptual similarity metric,” in Proc. Eur. Conf. Comput. Vis. (ECCV), Cham, Switzerland: Springer, Oct. 2022, pp. 91–107.
Ethical Statement
This study did not involve any experiments, surveys, or human or animal trials requiring ethics committee approval. All data used in the article were obtained from publicly available sources; no private or personal data were used.
The author(s) declare that the study was prepared in accordance with all academic and ethical rules, and that no unethical conduct such as plagiarism, fabrication, falsification, duplicate publication, salami slicing, unfair authorship, or similar practices took place.
All contributing authors have approved the content of the article and have been informed about the submission process. There is no conflict of interest.
Supporting Institution
This study did not receive financial support from any institution or organization.