Research Article

Image-to-Image Translation with CNN Based Perceptual Similarity Metrics

Volume: 9, Issue: 1, June 6, 2024

Abstract

Image-to-image translation is the process of transforming an image from one domain into another. Generative Adversarial Networks (GANs) and Convolutional Neural Networks (CNNs) are widely used in image translation. This study aims to find the most effective loss function for GAN architectures and thereby synthesize better images. To this end, experimental results were obtained by changing the loss function of the Pix2Pix method, one of the basic GAN architectures. The existing loss function used in the Pix2Pix method is the Mean Absolute Error (MAE), also called the L_1 metric. In this study, the convolution-based perceptual similarity metrics CONTENT, LPIPS, and DISTS were applied in the loss function of the Pix2Pix architecture, and their effect on image-to-image translation was examined. In addition, the effects on image-to-image translation were analyzed when each perceptual similarity metric was combined with the original L_1 loss at a rate of 50% (L_1_CONTENT, L_1_LPIPS, and L_1_DISTS). Performance analyses of the methods were performed on the Cityscapes, Denim2Mustache, Maps, and Papsmear datasets. Visual results were analyzed with conventional (FSIM, HaarPSI, MS-SSIM, PSNR, SSIM, VIFp, and VSI) and up-to-date (FID and KID) image comparison metrics. The results show that convolution-based methods yield better results than conventional methods when used in the loss function of GAN architectures. They also indicate that the LPIPS and DISTS methods can be used in the loss function of GAN architectures in the future.
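The combined losses named in the abstract (L_1_CONTENT, L_1_LPIPS, and L_1_DISTS) can be illustrated with a minimal sketch. It assumes that "at a rate of 50%" means an equal-weight sum of the L_1 term and the perceptual term; the abstract does not give the exact formula, and the adversarial term of Pix2Pix is omitted here. The names l1_loss, blended_loss, and toy_perceptual are hypothetical, and toy_perceptual is only a stand-in for a CNN-based metric, not an implementation of CONTENT, LPIPS, or DISTS.

```python
from typing import Callable, Sequence

# A grayscale image as rows of pixel values in [0, 1] (an assumption for this sketch).
Image = Sequence[Sequence[float]]


def l1_loss(generated: Image, target: Image) -> float:
    """Mean Absolute Error (the L_1 metric) over all pixels."""
    diffs = [
        abs(g - t)
        for g_row, t_row in zip(generated, target)
        for g, t in zip(g_row, t_row)
    ]
    return sum(diffs) / len(diffs)


def toy_perceptual(generated: Image, target: Image) -> float:
    """Hypothetical stand-in for a perceptual metric: difference of mean intensity.

    A real setup would call a pretrained CNN-based metric here instead.
    """
    def mean(img: Image) -> float:
        return sum(sum(row) for row in img) / sum(len(row) for row in img)

    return abs(mean(generated) - mean(target))


def blended_loss(
    generated: Image,
    target: Image,
    perceptual: Callable[[Image, Image], float],
    l1_weight: float = 0.5,
) -> float:
    """Weighted sum of L_1 and a perceptual distance (assumed 50/50 by default)."""
    l1_term = l1_loss(generated, target)
    perceptual_term = perceptual(generated, target)
    return l1_weight * l1_term + (1.0 - l1_weight) * perceptual_term


if __name__ == "__main__":
    fake = [[0.0, 1.0], [1.0, 0.0]]
    real = [[0.0, 1.0], [0.0, 1.0]]
    print(l1_loss(fake, real))
    print(blended_loss(fake, real, toy_perceptual))
```

Passing the perceptual distance in as a callable keeps the blend independent of which metric is plugged in, so the same function could describe any of the three combined variants under the stated assumption.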

Details

Primary Language

English

Subjects

Image Processing, Deep Learning

Journal Section

Research Article

Publication Date

June 6, 2024

Submission Date

January 31, 2024

Acceptance Date

March 13, 2024

Published in Issue

Year 2024, Volume: 9, Issue: 1

APA
Altun Güven, S., Şahin, E., & Talu, M. F. (2024). Image-to-Image Translation with CNN Based Perceptual Similarity Metrics. Computer Science, 9(1), 84-98. https://doi.org/10.53070/bbd.1429596
AMA
1. Altun Güven S, Şahin E, Talu MF. Image-to-Image Translation with CNN Based Perceptual Similarity Metrics. JCS. 2024;9(1):84-98. doi:10.53070/bbd.1429596
Chicago
Altun Güven, Sara, Emrullah Şahin, and Muhammed Fatih Talu. 2024. “Image-to-Image Translation With CNN Based Perceptual Similarity Metrics”. Computer Science 9 (1): 84-98. https://doi.org/10.53070/bbd.1429596.
EndNote
Altun Güven S, Şahin E, Talu MF (June 1, 2024) Image-to-Image Translation with CNN Based Perceptual Similarity Metrics. Computer Science 9 1 84–98.
IEEE
[1] S. Altun Güven, E. Şahin, and M. F. Talu, “Image-to-Image Translation with CNN Based Perceptual Similarity Metrics”, JCS, vol. 9, no. 1, pp. 84–98, June 2024, doi: 10.53070/bbd.1429596.
ISNAD
Altun Güven, Sara - Şahin, Emrullah - Talu, Muhammed Fatih. “Image-to-Image Translation With CNN Based Perceptual Similarity Metrics”. Computer Science 9/1 (June 1, 2024): 84-98. https://doi.org/10.53070/bbd.1429596.
JAMA
1. Altun Güven S, Şahin E, Talu MF. Image-to-Image Translation with CNN Based Perceptual Similarity Metrics. JCS. 2024;9:84–98.
MLA
Altun Güven, Sara, et al. “Image-to-Image Translation With CNN Based Perceptual Similarity Metrics”. Computer Science, vol. 9, no. 1, June 2024, pp. 84-98, doi:10.53070/bbd.1429596.
Vancouver
1. Sara Altun Güven, Emrullah Şahin, Muhammed Fatih Talu. Image-to-Image Translation with CNN Based Perceptual Similarity Metrics. JCS. 2024 Jun. 1;9(1):84-98. doi:10.53070/bbd.1429596

The Creative Commons Attribution 4.0 International License is applied to all research papers published by JCS, and a Digital Object Identifier (DOI) is assigned to each published paper.