Research Article
Automatic generation of in-vehicle images: StyleGAN-ADA vs. MSG-GAN

Year 2025, Volume: 5 Issue: 1, 23 - 31, 30.06.2025
https://doi.org/10.62189/ci.1261718

Abstract

Deep learning-based methodologies are a key component of autonomous driving. To be successful, these models require large amounts of training data, which are difficult, time-consuming, and expensive to collect. This study assesses the effectiveness of Generative Adversarial Networks (GANs) in generating high-quality training images for in-vehicle applications from a limited dataset. Two advanced GAN architectures were compared on their ability to produce realistic in-vehicle RGB images. The results showed that StyleGAN-ADA outperformed MSG-GAN, generating images with better fidelity and accuracy and making it the more suitable choice for scenarios with limited data. However, challenges such as mode collapse and long training times, particularly for high-resolution images, were identified. The models’ reliance on the quality and diversity of the training dataset also limits their effectiveness in real-world applications. This research highlights the potential of GANs to mitigate data scarcity in autonomous driving and points to future approaches for optimizing these models.
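The fidelity comparison described above is commonly quantified with the Fréchet Inception Distance (FID) of Heusel et al. [27], which measures the distance between Gaussians fitted to Inception features of real and generated images. As a minimal illustrative sketch (not the authors' evaluation code), FID can be computed from precomputed feature means and covariances:

```python
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    """Fréchet Inception Distance between two feature distributions [27].

    mu1, mu2      : mean feature vectors of real / generated images
    sigma1, sigma2: corresponding feature covariance matrices
    """
    diff = mu1 - mu2
    # Matrix square root of the covariance product; disp=False returns
    # (sqrtm, error_estimate) instead of printing a warning.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from numerics
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions yield an FID of zero; lower values indicate generated images whose feature statistics are closer to the real data. In practice the features would come from a pretrained Inception network evaluated on both image sets.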

Supporting Institution

FCT - Fundação para a Ciência e Tecnologia

Project Number

UIDB/00319/2020

Thanks

This work has been supported by FCT - Fundação para a Ciência e Tecnologia within the scope of the R&D Units project: UIDB/00319/2020.

References

  • [1] Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, Canada, 8–13 December 2014; MIT Press: Cambridge, MA, USA, 2014; pp. 2672–2680.
  • [2] Perez, L.; Wang, J. The effectiveness of data augmentation in image classification using deep learning. arXiv 2017, arXiv:1712.04621. https://doi.org/10.48550/arXiv.1712.04621.
  • [3] Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved techniques for training GANs. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Curran Associates Inc.: Red Hook, NY, USA, 2016; pp. 2234–2242.
  • [4] Lin, Z.; Khetan, A.; Fanti, G.; Oh, S. PacGAN: The power of two samples in generative adversarial networks. In Proceedings of the 31st Conference on Neural Information Processing Systems, Montreal, Canada, 3–8 December 2018; Curran Associates Inc.: Red Hook, NY, USA, 2018; pp. 1498–1507.
  • [5] Arjovsky, M.; Bottou, L. Towards principled methods for training generative adversarial networks. arXiv 2017, arXiv:1701.04862. https://doi.org/10.48550/arXiv.1701.04862.
  • [6] Zhang, D.; Khoreva, A. PA-GAN: Improving GAN training by progressive augmentation. arXiv 2019, arXiv:1901.10422. https://doi.org/10.48550/arXiv.1901.10422.
  • [7] LeCun, Y.; Cortes, C.; Burges, C.J. MNIST handwritten digit database. AT&T Labs. 2010. Available online: http://yann.lecun.com/exdb/mnist (accessed on 18 Dec 2024).
  • [8] Liu, Z.; Luo, P.; Wang, X.; Tang, X. Deep learning face attributes in the wild. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 3730–3738. https://doi.org/10.1109/ICCV.2015.425.
  • [9] Zhu, K.; Liu, X.; Yang, H. A survey of generative adversarial networks. In Proceedings of the Chinese Automation Congress, Xi'an, China, 30 November–2 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2768–2773. https://doi.org/10.1109/CAC.2018.8623645.
  • [10] Yi, X.; Walia, E.; Babyn, P. Generative adversarial network in medical imaging: a review. Med. Image Anal. 2019, 58, 101552. https://doi.org/10.1016/j.media.2019.101552.
  • [11] Turhan, C.G.; Bilge, H.S. Recent trends in deep generative models: a review. In Proceedings of the 3rd International Conference on Computer Science and Engineering, Sarajevo, Bosnia and Herzegovina, 20–23 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 574–579. https://doi.org/10.1109/UBMK.2018.8566353.
  • [12] Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2016, arXiv:1511.06434. https://doi.org/10.48550/arXiv.1511.06434.
  • [13] Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. arXiv 2018, arXiv:1710.10196. https://doi.org/10.48550/arXiv.1710.10196.
  • [14] Karnewar, A.; Wang, O. MSG-GAN: Multi-scale gradients for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 7799–7808. https://doi.org/10.1109/CVPR42600.2020.00782.
  • [15] Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015; pp. 234–241. https://doi.org/10.1007/978-3-319-24574-4_28.
  • [16] Deepak, S.; Ameer, P.M. MSG-GAN based synthesis of brain MRI with meningioma for data augmentation. In Proceedings of the 2020 IEEE International Conference on Electronics, Computing and Communication Technologies, Bangalore, India, 2–4 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6. https://doi.org/10.1109/CONECCT50063.2020.9198672.
  • [17] Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 5767–5777.
  • [18] Mescheder, L.; Geiger, A.; Nowozin, S. Which training methods for GANs do actually converge? In Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; PMLR: 2018; pp. 3481–3490.
  • [19] Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 4401–4410.
  • [20] Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 8110–8119. https://doi.org/10.1109/CVPR42600.2020.00813.
  • [21] Karras, T.; Aittala, M.; Hellsten, J.; Laine, S.; Lehtinen, J.; Aila, T. Training generative adversarial networks with limited data. In Proceedings of the 34th International Conference on Neural Information Processing Systems, New York, USA, 6–12 December 2020; Curran Associates Inc.: Red Hook, NY, USA, 2020; pp. 12104–12114.
  • [22] Dixe, S.; Leite, J.; Azadi, S.; Faria, P.; Mendes, J.; Fonseca, J.C.; Borges, J.; Queiros, S. In-car damage dirt and stain estimation with RGB images. In Proceedings of the 13th International Conference on Agents and Artificial Intelligence, Vienna, Austria, 4–6 February 2021; Scitepress: 2021; pp. 672–679.
  • [23] Faria, P.; Dixe, S.; Leite, J.; Azadi, S.; Mendes, J.; Fonseca, J.C.; Borges, J.; Queiros, S. In-car state classification with RGB images. In Proceedings of the 20th International Conference on Intelligent Systems Design and Applications, [Location unknown], 12–15 December 2020; Springer: Cham, Switzerland, 2020; pp. 435–445.
  • [24] Xu, Q.; Huang, G.; Yuan, Y.; Guo, C.; Sun, Y.; Wu, F.; Zhang, C.; Lin, D. An empirical study on evaluation metrics of generative adversarial networks. arXiv 2018, arXiv:1806.07755. https://doi.org/10.48550/arXiv.1806.07755.
  • [25] Gretton, A.; Borgwardt, K.; Rasch, M.; Schölkopf, B.; Smola, A. A kernel method for the two-sample-problem. In Proceedings of the 20th International Conference on Neural Information Processing Systems, British Columbia, Canada, 4–9 December 2006; MIT Press: Cambridge, MA, USA, 2006; pp. 513–520.
  • [26] Lopez-Paz, D.; Oquab, M. Revisiting classifier two-sample tests. arXiv 2017, arXiv:1610.06545. https://doi.org/10.48550/arXiv.1610.06545.
  • [27] Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6629–6640.

Details

Primary Language English
Subjects Image Processing, Distributed Systems and Algorithms, Autonomous Agents and Multiagent Systems
Journal Section Research Articles
Authors

Sahar Azadi 0000-0001-7002-8496

Sandra Dixe 0000-0003-4595-3828

Joao Leite 0000-0003-1452-7842

Joao Borges 0000-0002-5880-033X

Sandro Queiros 0000-0001-5259-1891

Jaime Fonseca 0000-0001-6703-3278

Project Number UIDB/00319/2020
Early Pub Date May 18, 2025
Publication Date June 30, 2025
Submission Date January 29, 2024
Acceptance Date January 16, 2025
Published in Issue Year 2025 Volume: 5 Issue: 1

Cite

Vancouver Azadi S, Dixe S, Leite J, Borges J, Queiros S, Fonseca J. Automatic generation of in-vehicle images: StyleGAN-ADA vs. MSG-GAN. Computers and Informatics. 2025;5(1):23-31.

Computers and Informatics is licensed under CC BY-NC 4.0