Which pooling method is better: Max, Avg, or Concat (Max, Avg)

Yahya Doğan

doi:10.33769/aupse.1356138

Research Article

Year 2024, Volume: 66 Issue: 1, 95 - 117, 14.06.2024

Abstract

Pooling is a non-linear operation that aggregates the values in a given region into a single value. It removes extraneous detail from feature maps while keeping the overall information. The resulting reduction in feature-map size lowers computational cost and helps prevent overfitting by discarding irrelevant data. In CNN models, max pooling and average pooling are commonly used. Max pooling selects the highest value within the pooling region, which helps preserve the essential features of the image; however, it ignores the other values in the region, resulting in a significant loss of information. Average pooling computes the mean of the values within the pooling region, which reduces this loss; however, because it does not emphasize critical pixels, it may lose significant features. To examine the performance of pooling methods, this study experimentally analyzes multiple models (shallow and deep), datasets (CIFAR-10, CIFAR-100, and SVHN), and pool sizes (e.g. $2 \times 2$, $3 \times 3$, $10 \times 10$). The study also investigates the effectiveness of combining the two approaches, namely Concat (Max, Avg), to minimize information loss. The experimental results demonstrate that pooling methods have a considerable impact on model performance, and that the results vary with the model and pool size. These findings provide an important guideline for selecting pooling methods in the design of CNNs.
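The three pooling variants compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the article's implementation: the function names `pool2d` and `concat_max_avg` are hypothetical, the stride is assumed to equal the pool size, and Concat (Max, Avg) is assumed to join the two pooled maps along the channel axis, which the abstract does not specify.

```python
import numpy as np


def pool2d(x, size, mode):
    """Pool an (H, W, C) feature map with non-overlapping size x size windows."""
    h, w, c = x.shape
    # Assumption: stride equals the pool size; trailing rows/columns that do
    # not fill a whole window are dropped.
    oh, ow = h // size, w // size
    windows = x[:oh * size, :ow * size, :].reshape(oh, size, ow, size, c)
    if mode == "max":
        # Keep only the largest activation in each window.
        return windows.max(axis=(1, 3))
    if mode == "avg":
        # Replace each window by the mean of its activations.
        return windows.mean(axis=(1, 3))
    raise ValueError(f"unknown mode: {mode!r}")


def concat_max_avg(x, size):
    """Stack the max-pooled and average-pooled maps along the channel axis."""
    # Assumption: channel-wise concatenation, so the channel count doubles.
    return np.concatenate(
        [pool2d(x, size, "max"), pool2d(x, size, "avg")], axis=-1
    )


# A single-channel 4x4 feature map holding the values 0..15.
fmap = np.arange(16, dtype=np.float64).reshape(4, 4, 1)
max_out = pool2d(fmap, 2, "max")
avg_out = pool2d(fmap, 2, "avg")
both = concat_max_avg(fmap, 2)

print(max_out[..., 0])  # largest value of each 2x2 window
print(avg_out[..., 0])  # mean value of each 2x2 window
print(both.shape)       # spatial size halves, channel count doubles
```

Under these assumptions the concatenated output keeps both the strongest activation and the mean of every window, at the cost of doubling the number of channels passed to the next layer.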

Keywords

Pooling, deep learning, convolutional neural network


There are 34 citations in total.

Details

Primary Language	English
Subjects	Information Systems (Other)
Journal Section	Research Articles
Authors	Yahya Doğan 0000-0003-1529-6118
Early Pub Date	April 7, 2024
Publication Date	June 14, 2024
Submission Date	September 6, 2023
Acceptance Date	November 17, 2023
Published in Issue	Year 2024 Volume: 66 Issue: 1

Cite

APA	Doğan, Y. (2024). Which pooling method is better: Max, Avg, or Concat (Max, Avg). Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering, 66(1), 95-117. https://doi.org/10.33769/aupse.1356138
AMA	Doğan Y. Which pooling method is better: Max, Avg, or Concat (Max, Avg). Commun.Fac.Sci.Univ.Ank.Series A2-A3: Phys.Sci. and Eng. June 2024;66(1):95-117. doi:10.33769/aupse.1356138
Chicago	Doğan, Yahya. “Which Pooling Method Is Better: Max, Avg, or Concat (Max, Avg)”. Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering 66, no. 1 (June 2024): 95-117. https://doi.org/10.33769/aupse.1356138.
EndNote	Doğan Y (June 1, 2024) Which pooling method is better: Max, Avg, or Concat (Max, Avg). Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering 66 1 95–117.
IEEE	Y. Doğan, “Which pooling method is better: Max, Avg, or Concat (Max, Avg)”, Commun.Fac.Sci.Univ.Ank.Series A2-A3: Phys.Sci. and Eng., vol. 66, no. 1, pp. 95–117, 2024, doi: 10.33769/aupse.1356138.
ISNAD	Doğan, Yahya. “Which Pooling Method Is Better: Max, Avg, or Concat (Max, Avg)”. Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering 66/1 (June 2024), 95-117. https://doi.org/10.33769/aupse.1356138.
JAMA	Doğan Y. Which pooling method is better: Max, Avg, or Concat (Max, Avg). Commun.Fac.Sci.Univ.Ank.Series A2-A3: Phys.Sci. and Eng. 2024;66:95–117.
MLA	Doğan, Yahya. “Which Pooling Method Is Better: Max, Avg, or Concat (Max, Avg)”. Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering, vol. 66, no. 1, 2024, pp. 95-117, doi:10.33769/aupse.1356138.
Vancouver	Doğan Y. Which pooling method is better: Max, Avg, or Concat (Max, Avg). Commun.Fac.Sci.Univ.Ank.Series A2-A3: Phys.Sci. and Eng. 2024;66(1):95-117.

Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.