Performance Analysis of CNN Channel Attention Modules for Image Classification Task

Mehmet Sarıgül

doi:10.21605/cukurovaumfd.1273688

Research Article

Year 2023, Volume: 38 Issue: 1, 35 - 40, 30.03.2023

Abstract

Increasing the representational power of convolutional neural networks, a popular class of deep learning models, has recently become an active research topic. Channel attention is a common strategy for this purpose: a module placed after the convolution operation exploits the relationships between channels. Several successful channel attention modules have recently been proposed. This article analyzes the performance of three popular channel attention structures, Squeeze-and-Excitation Networks (SeNet), Efficient Channel Attention Networks (Eca-Net), and the Convolutional Block Attention Module (CBAM), on the image classification task using five different image datasets. According to the results, SeNet was the most successful channel attention module, outperforming the others in the majority of the experiments. In experiments with the ResNet18 and ResNet34 base models, the SeNet module achieved the highest performance on three of the five datasets. With the ResNet50 base model, SeNet achieved the highest accuracy on all datasets.
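
As an illustration of the channel attention strategy summarized above, the following is a minimal sketch of a squeeze-and-excitation style gate in plain Python. It is not the code used in the article: the function name `se_channel_attention`, the toy weights `w1` and `w2`, and the 2-channel input are assumptions chosen for illustration. A real module would learn the weights and operate on tensors inside a ResNet block.

```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def matvec(w, v):
    # w is a list of rows, v is a vector; returns the product w @ v
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]


def se_channel_attention(x, w1, w2):
    """Apply a squeeze-and-excitation style gate to a feature map.

    x  : nested lists shaped [C][H][W] (output of a convolution)
    w1 : reduction weights shaped [C // r][C]   (hypothetical values)
    w2 : expansion weights shaped [C][C // r]   (hypothetical values)
    """
    # Squeeze: global average pooling gives one scalar per channel
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck MLP followed by a sigmoid gate in (0, 1)
    s = sigmoid(matvec(w2, relu(matvec(w1, z))))
    # Scale: multiply every value in channel c by its gate s[c]
    y = [[[s_c * v for v in row] for row in ch] for ch, s_c in zip(x, s)]
    return y, s


# Toy input: two 2x2 channels with means 1.0 and 2.0
x = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
w1 = [[0.5, 0.5]]      # reduce 2 channels to 1 hidden unit
w2 = [[1.0], [-1.0]]   # expand back to 2 channel gates
y, s = se_channel_attention(x, w1, w2)
print(s)
```

With these toy weights the first channel is emphasized and the second suppressed. In a trained network the gates depend on the learned weights and on the input image.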

Keywords

Convolutional neural networks, Channel attention, Image classification

References

⦁ Hu, J., Shen, L., Sun, G., 2018. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 7132-7141.
⦁ Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., Hu, Q., 2020. Supplementary Material for ‘ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks’. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, Seattle, WA, USA, 13-19.
⦁ Woo, S., Park, J., Lee, J.Y., Kweon, I.S., 2018. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision, 3-19.
⦁ He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778.
⦁ Krizhevsky, A., Sutskever, I., Hinton, G.E., 2017. ImageNet Classification with Deep Convolutional Neural Networks. Communications of the ACM, 60(6), 84-90.
⦁ Peterson, L.E., 2009. K-Nearest Neighbor. Scholarpedia, 4(2), 1883.
⦁ Gardner, M.W., Dorling, S.R., 1998. Artificial Neural Networks (The Multilayer Perceptron)-A Review of Applications in the Atmospheric Sciences. Atmospheric Environment, 32(14-15), 2627-2636.
⦁ Afkham, H.M., Targhi, A.T., Eklundh, J.O., Pronobis, A., 2008. Joint Visual Vocabulary for Animal Classification. In 2008 19th International Conference on Pattern Recognition, 1-4.
⦁ Wang, J., Markert, K., Everingham, M., 2009. Learning Models for Object Recognition from Natural Language Descriptions. In BMVC, 1, 2.
⦁ Nilsback, M.E., Zisserman, A., 2006. A Visual Vocabulary for Flower Classification. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2, 1447-1454.
⦁ Lazebnik, S., Schmid, C., Ponce, J., 2005. A Maximum Entropy Framework for Part-based Texture and Object Recognition. In Tenth IEEE International Conference on Computer Vision, 1, 832-838.
⦁ Lazebnik, S., Schmid, C., Ponce, J., 2004. Semi-local Affine Parts for Object Recognition. In British Machine Vision Conference, 779-788.

There are 12 citations in total.

Details

Primary Language	English
Subjects	Engineering
Journal Section	Articles
Authors	Mehmet Sarıgül 0000-0001-7323-6864
Publication Date	March 30, 2023
Published in Issue	Year 2023 Volume: 38 Issue: 1

Cite

APA	Sarıgül, M. (2023). Performance Analysis of CNN Channel Attention Modules for Image Classification Task. Çukurova Üniversitesi Mühendislik Fakültesi Dergisi, 38(1), 35-40. https://doi.org/10.21605/cukurovaumfd.1273688
