Research Article

A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet

Year 2025, Volume: 4, Issue: 3, 557-579, 20.10.2025
https://doi.org/10.62520/fujece.1653548

Abstract

Image-to-image translation is one of the major image processing tasks in computer vision and underpins many applications, such as style transfer and image enhancement. This study introduces a novel approach to image-to-image translation based on a conditional generative adversarial network (conditional GAN) with a new hybrid generator that combines the U-Net and ResNet architectures. Because the two architectures are highly compatible, the combination lets the model benefit from the advantages of both. The discriminator uses the PatchGAN architecture for patch-wise discrimination. The model was evaluated with SSIM and PSNR, two standard image quality metrics, and the results were compared with previous work that uses the same evaluation criteria and datasets. In addition, a public survey was conducted in which participants were asked to choose, between an output of the proposed model and that of another study, the image that most closely resembled the target image. Both the evaluation metrics and the survey indicate that the proposed image-to-image translation method outperforms the previous studies.
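
The abstract describes a hybrid generator that combines U-Net skip connections with ResNet residual blocks inside a Pix2Pix-style conditional GAN, together with a PatchGAN discriminator, but gives no implementation details. The PyTorch sketch below is a minimal, hypothetical illustration of one common way such a hybrid could be assembled; the class names, layer counts, and channel sizes are assumptions for illustration only, not the authors' actual architecture.

```python
# Minimal sketch (assumptions, not the paper's exact design): a Pix2Pix-style
# generator with U-Net skip connections and ResNet residual blocks at the
# bottleneck, plus a PatchGAN-style discriminator. Layer sizes are illustrative.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """ResNet-style block: two 3x3 convolutions with an identity shortcut."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)  # identity shortcut


class HybridGenerator(nn.Module):
    """U-Net encoder/decoder whose bottleneck is a stack of residual blocks."""
    def __init__(self, in_ch=3, out_ch=3, base=64, n_res=4):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 4, 2, 1),
                                  nn.LeakyReLU(0.2, inplace=True))
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 4, 2, 1),
                                  nn.InstanceNorm2d(base * 2),
                                  nn.LeakyReLU(0.2, inplace=True))
        self.bottleneck = nn.Sequential(*[ResidualBlock(base * 2) for _ in range(n_res)])
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(base * 2, base, 4, 2, 1),
                                  nn.InstanceNorm2d(base),
                                  nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(nn.ConvTranspose2d(base * 2, out_ch, 4, 2, 1),
                                  nn.Tanh())

    def forward(self, x):
        e1 = self.enc1(x)            # kept for the U-Net skip connection
        e2 = self.enc2(e1)
        b = self.bottleneck(e2)      # ResNet blocks refine the bottleneck features
        d2 = self.dec2(b)
        return self.dec1(torch.cat([d2, e1], dim=1))  # skip: concatenate encoder features


class PatchDiscriminator(nn.Module):
    """PatchGAN: one real/fake logit per overlapping patch of the (input, output) pair."""
    def __init__(self, in_ch=6, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, 2, 1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base, base * 2, 4, 2, 1), nn.InstanceNorm2d(base * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(base * 2, 1, 4, 1, 1),  # patch-wise logits
        )

    def forward(self, condition, image):
        return self.net(torch.cat([condition, image], dim=1))
```

For the reported metrics, SSIM and PSNR between a generated image and its target can be computed with standard implementations; the snippet below uses scikit-image as an example (an assumption about tooling, since the paper does not state which implementation was used), with random arrays standing in for real images.

```python
# Hedged example: SSIM and PSNR with scikit-image on placeholder arrays.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

target = np.random.rand(256, 256, 3).astype(np.float32)  # stand-in target image
generated = np.clip(target + 0.05 * np.random.randn(256, 256, 3), 0, 1).astype(np.float32)

ssim = structural_similarity(target, generated, channel_axis=-1, data_range=1.0)
psnr = peak_signal_noise_ratio(target, generated, data_range=1.0)
print(f"SSIM = {ssim:.4f}, PSNR = {psnr:.2f} dB")
```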

Ethics Statement

Ethics committee approval is not required for this article. There is no conflict of interest with any person or institution in the prepared article.

References

  • H. Hoyez, C. Schockaert, J. Rambach, B. Mirbach, and D. Stricker, “Unsupervised image-to-image translation: A review,” Sensors, vol. 22, no. 21, p. 8540, 2022.
  • P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Image-to-image translation with conditional adversarial networks,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Honolulu, HI, USA, 2017.
  • E. U. R. Mohammed, N. R. Soora, and S. W. Mohammed, “A comprehensive literature review on convolutional neural networks,” Computer Science Publications, 2022.
  • A. Kamil and T. Shaikh, “Literature review of generative models for image-to-image translation problems,” in Proc. Int. Conf. Comput. Intell. Knowl. Economy (ICCIKE), Dubai, United Arab Emirates, 2019.
  • M. Mirza and S. Osindero, “Conditional generative adversarial nets,” arXiv preprint, arXiv:1411.1784, 2014.
  • C. Koç and F. Özyurt, “An examination of synthetic images produced with DCGAN according to the size of data and epoch,” Firat Univ. J. Exp. Comput. Eng., vol. 2, no. 1, pp. 32–37, 2023.
  • I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Adv. Neural Inf. Process. Syst. (NeurIPS), vol. 27, pp. 2672–2680, 2014.
  • G. Perarnau, J. Van De Weijer, B. Raducanu, and J. M. Álvarez, “Invertible conditional GANs for image editing,” arXiv preprint, arXiv:1611.06355, 2016.
  • A. Odena, C. Olah, and J. Shlens, “Conditional image synthesis with auxiliary classifier GANs,” in Proc. Int. Conf. Mach. Learn. (ICML), Sydney, Australia, 2017.
  • O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Med. Image Comput. Comput.-Assist. Interv. – MICCAI 2015, Munich, Germany, 2015.
  • Y. Ji, H. Zhang, and Q. M. J. Wu, “Saliency detection via conditional adversarial image-to-image network,” Neurocomputing, vol. 316, pp. 357–368, 2018.
  • X. Mao, S. Wang, L. Zheng, and Q. Huang, “Semantic invariant cross-domain image generation with generative adversarial networks,” Neurocomputing, vol. 293, pp. 55–63, 2018.
  • Y. Gan, J. Gong, M. Ye, Y. Qian, and K. Liu, “Unpaired cross-domain image translation with augmented auxiliary domain information,” Neurocomputing, vol. 316, pp. 112–123, 2018.
  • S. Mo, M. Cho, and J. Shin, “InstaGAN: Instance-aware image-to-image translation,” arXiv preprint, arXiv:1812.10889, 2018.
  • Y. Cho, R. Malav, G. Pandey, and A. Kim, “DehazeGAN: Underwater haze image restoration using unpaired image-to-image translation,” IFAC-PapersOnLine, vol. 52, no. 21, pp. 82–85, 2019.
  • D. Yang, S. Hong, Y. Jang, T. Zhao, and H. Lee, “Diversity-sensitive conditional generative adversarial networks,” arXiv preprint, arXiv:1901.09024, 2019.
  • M.-Y. Liu, X. Huang, A. Mallya, T. Karras, T. Aila, J. Lehtinen, and J. Kautz, “Few-shot unsupervised image-to-image translation,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), Seoul, South Korea, 2019.
  • W. Xu, S. Keshmiri, and G. Wang, “Toward learning a unified many-to-many mapping for diverse image translation,” Pattern Recognit., vol. 93, pp. 570–580, 2019.
  • L. Ye, B. Zhang, M. Yang, and W. Lian, “Triple-translation GAN with multi-layer sparse representation for face image synthesis,” Neurocomputing, vol. 358, pp. 294–308, 2019.
  • Y. Choi, Y. Uh, J. Yoo, and J.-W. Ha, “StarGAN v2: Diverse image synthesis for multiple domains,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Seattle, WA, USA, 2020.
  • P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, “Pix2Pix datasets,” UC Berkeley, Feb. 9, 2017. [Online]. Available: https://efrosgans.eecs.berkeley.edu/pix2pix/datasets/. [Accessed: Apr. 2, 2024].
  • K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, 2016.
  • Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, 2004.
  • S. Mallat, “Compression,” in A Wavelet Tour of Signal Processing, 3rd ed., Boston, MA, USA: Academic Press, 2009, pp. 481–533.

Details

Primary Language: English
Subjects: Automated Software Engineering, Software Architecture, Reinforcement Learning, Software Engineering (Other)
Section: Research Article
Authors

Khaled Al Hariri 0009-0007-2419-8321

Muhammet Paşaoğlu 0009-0008-0929-7740

Erkut Arıcan 0000-0003-4528-3203

Publication Date: October 20, 2025
Submission Date: March 7, 2025
Acceptance Date: July 31, 2025
Published Issue: Year 2025, Volume: 4, Issue: 3

Cite

APA Al Hariri, K., Paşaoğlu, M., & Arıcan, E. (2025). A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet. Firat University Journal of Experimental and Computational Engineering, 4(3), 557-579. https://doi.org/10.62520/fujece.1653548
AMA Al Hariri K, Paşaoğlu M, Arıcan E. A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet. Firat University Journal of Experimental and Computational Engineering. October 2025;4(3):557-579. doi:10.62520/fujece.1653548
Chicago Al Hariri, Khaled, Muhammet Paşaoğlu, and Erkut Arıcan. “A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet”. Firat University Journal of Experimental and Computational Engineering 4, no. 3 (October 2025): 557-79. https://doi.org/10.62520/fujece.1653548.
EndNote Al Hariri K, Paşaoğlu M, Arıcan E (01 October 2025) A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet. Firat University Journal of Experimental and Computational Engineering 4 3 557–579.
IEEE K. Al Hariri, M. Paşaoğlu, and E. Arıcan, “A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet”, Firat University Journal of Experimental and Computational Engineering, vol. 4, no. 3, pp. 557–579, 2025, doi: 10.62520/fujece.1653548.
ISNAD Al Hariri, Khaled et al. “A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet”. Firat University Journal of Experimental and Computational Engineering 4/3 (October 2025), 557-579. https://doi.org/10.62520/fujece.1653548.
JAMA Al Hariri K, Paşaoğlu M, Arıcan E. A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet. Firat University Journal of Experimental and Computational Engineering. 2025;4:557–579.
MLA Al Hariri, Khaled et al. “A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet”. Firat University Journal of Experimental and Computational Engineering, vol. 4, no. 3, 2025, pp. 557-79, doi:10.62520/fujece.1653548.
Vancouver Al Hariri K, Paşaoğlu M, Arıcan E. A Hybrid Conditional GAN Design for Image-to-Image Translation Integrating U-Net and ResNet. Firat University Journal of Experimental and Computational Engineering. 2025;4(3):557-79.