Research Article

A Siamese Neural Network-Based Perceptual Quality Metric for Image and Video Assessment

Volume: 5 Number: 1 June 29, 2026

Abstract

Accurately assessing the perceptual similarity of visual data is critical in many multimedia applications, including video coding, streaming, and image restoration. Traditional pixel-based metrics such as Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM) focus primarily on low-level structural differences, so they are limited in their ability to capture the high-level semantic and contextual distortions that humans perceive. This limitation often results in inaccurate quality estimates, particularly for subtle texture losses or compression-induced visual artifacts. To address these challenges, this study introduces Neural Siamese Perceptual Quality (NSPQ), a new, fully neural, training-free evaluation metric. NSPQ passes the reference and distorted images through a pre-trained AlexNet-based Siamese neural network, compares the resulting feature embeddings, and translates the distance between them into a perceptual similarity score. This approach shifts the focus of visual quality assessment from pixel-level differences to high-level feature representations, providing estimates that align better with human visual perception. The proposed method was evaluated on compressed natural video sequences from the Xiph.org dataset and on a variety of distortions in the BAPPS dataset. Experimental results demonstrate that NSPQ achieves higher perceptual accuracy than traditional metrics. In the Xiph.org scenario, NSPQ reached an average score of 99%, surpassing SSIM (96%), normalized PSNR (39%), and VMAF (56%). In the BAPPS dataset, NSPQ obtained an average score of 85% in the p0 scenario, outperforming SSIM (65%). In the p1 scenario, NSPQ reflected distortion severity more accurately, with an average score of 38% that remained below the constant SSIM value of 65%.
Overall, the findings indicate that NSPQ can serve not only as a complementary quality measure but also as a primary evaluation metric in applications such as video encoding, image restoration, and quality control.
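
The pipeline described in the abstract (shared feature extractor, embedding distance, similarity score) can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_embed` is a hypothetical stand-in for the pre-trained AlexNet features, and the mapping `1 / (1 + distance)` is an assumed way of turning a distance into a score, since the abstract does not state the actual formula.

```python
import math
from typing import Callable, List, Sequence

# Grayscale image as rows of pixel intensities in [0, 1], at least 2x2.
Image = Sequence[Sequence[float]]


def toy_embed(image: Image) -> List[float]:
    """Hypothetical stand-in for a pre-trained feature extractor.

    Returns [mean intensity, mean horizontal gradient, mean vertical gradient].
    The paper uses AlexNet features instead; this is only for illustration.
    """
    rows, cols = len(image), len(image[0])
    mean = sum(sum(row) for row in image) / (rows * cols)
    h_grad = sum(
        abs(row[c + 1] - row[c]) for row in image for c in range(cols - 1)
    ) / (rows * (cols - 1))
    v_grad = sum(
        abs(image[r + 1][c] - image[r][c])
        for r in range(rows - 1)
        for c in range(cols)
    ) / ((rows - 1) * cols)
    return [mean, h_grad, v_grad]


def siamese_similarity(
    reference: Image,
    distorted: Image,
    embed: Callable[[Image], List[float]] = toy_embed,
) -> float:
    """Score in (0, 1]; 1.0 means the two embeddings are identical."""
    # Siamese structure: the SAME embedding function (shared weights)
    # processes both inputs, so the features are directly comparable.
    ref_features = embed(reference)
    dist_features = embed(distorted)
    distance = math.dist(ref_features, dist_features)
    # Assumed distance-to-score mapping (not taken from the paper).
    return 1.0 / (1.0 + distance)


reference = [[0.0, 1.0], [1.0, 0.0]]
mild = [[0.1, 0.9], [0.9, 0.1]]    # slightly reduced contrast
severe = [[0.5, 0.5], [0.5, 0.5]]  # all texture lost

print(siamese_similarity(reference, reference))
print(siamese_similarity(reference, mild))
print(siamese_similarity(reference, severe))
```

In this sketch the score falls as the distortion removes more of the reference texture. Replacing `toy_embed` with features from a pre-trained network would move the comparison from hand-crafted statistics to learned representations, which is the shift the abstract describes.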

Keywords

Image Quality Assessment, PSNR, SSIM, VMAF, AlexNet, Siamese Neural Networks

Supporting Institution

The Scientific and Technological Research Council of Türkiye (TÜBİTAK)

Project Number

5249902

Acknowledgments

This work is supported by The Scientific and Technological Research Council of Türkiye (TÜBİTAK) 1515 Frontier R&D Laboratories Support Program for Türk Telekom 6G R&D Lab under project number 5249902.

APA
Toprak, A. G., Mercan, Ö. B., & Özdem, M. (2026). A Siamese Neural Network-Based Perceptual Quality Metric for Image and Video Assessment. Sivas Cumhuriyet Üniversitesi Bilim Ve Teknoloji Dergisi, 5(1), 40-52. https://doi.org/10.69560/cujast.1843247