Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review

Mirza Farhan Azhari; Gibran Maulana Syah; Sri Nurdiati; Elis Khatizah; Mohamad Khoirun Najib

doi:10.47000/tjmcs.1689564

Research Article

Year 2025, Volume: 17, Issue: 2, 472-484, 30.12.2025

https://izlik.org/JA92KC44GG

Abstract

The classification of animal vocalizations through bioacoustic analysis has become a crucial tool in wildlife monitoring and conservation. This study presents a Systematic Literature Review (SLR) focused on the use of Convolutional Neural Networks (CNNs) for animal sound classification, synthesizing insights from 21 peer-reviewed journal articles published between 2016 and 2024. The review investigates the effectiveness of CNN models, the impact of different architectures (e.g., ResNet, VGG, MobileNet, Xception), commonly used evaluation metrics, and the advantages and limitations of CNN-based approaches. Results show that CNNs consistently outperform traditional classifiers by leveraging spectrogram-based feature extraction techniques and deep feature learning, achieving high classification accuracy across birds, mammals, amphibians, and marine species. Lightweight CNNs provide viable alternatives for real-time and resource-constrained applications, while ensemble learning and transfer learning further enhance model performance. Nonetheless, challenges persist, including computational cost, interpretability, and domain-specific generalization. This review underscores CNNs' growing role in ecological research and highlights areas for future improvement, particularly in data-scarce environments. The findings contribute to guiding future implementations and innovations in automated animal sound classification systems.

Keywords

deep learning, feature extraction, PRISMA, transfer learning, wavelet

Details

Primary Language	English
Subjects	Artificial Intelligence (Other)
Journal Section	Research Article
Authors	Mirza Farhan Azhari (0009-0001-4917-842X); Gibran Maulana Syah (0009-0007-4815-6855); Sri Nurdiati (0000-0001-9571-7060); Elis Khatizah (0000-0003-4132-1495); Mohamad Khoirun Najib (0000-0002-4372-4661)
Submission Date	May 7, 2025
Acceptance Date	October 13, 2025
Publication Date	December 30, 2025
DOI	https://doi.org/10.47000/tjmcs.1689564
IZ	https://izlik.org/JA92KC44GG
Published in Issue	Year 2025 Volume: 17 Issue: 2

Cite

APA	Azhari, M. F., Syah, G. M., Nurdiati, S., Khatizah, E., & Najib, M. K. (2025). Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review. Turkish Journal of Mathematics and Computer Science, 17(2), 472-484. https://doi.org/10.47000/tjmcs.1689564
AMA	1. Azhari MF, Syah GM, Nurdiati S, Khatizah E, Najib MK. Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review. TJMCS. 2025;17(2):472-484. doi:10.47000/tjmcs.1689564
Chicago	Azhari, Mirza Farhan, Gibran Maulana Syah, Sri Nurdiati, Elis Khatizah, and Mohamad Khoirun Najib. 2025. “Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review”. Turkish Journal of Mathematics and Computer Science 17 (2): 472-84. https://doi.org/10.47000/tjmcs.1689564.
EndNote	Azhari MF, Syah GM, Nurdiati S, Khatizah E, Najib MK (December 1, 2025) Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review. Turkish Journal of Mathematics and Computer Science 17 2 472–484.
IEEE	[1] M. F. Azhari, G. M. Syah, S. Nurdiati, E. Khatizah, and M. K. Najib, “Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review”, TJMCS, vol. 17, no. 2, pp. 472–484, Dec. 2025, doi: 10.47000/tjmcs.1689564.
ISNAD	Azhari, Mirza Farhan - Syah, Gibran Maulana - Nurdiati, Sri - Khatizah, Elis - Najib, Mohamad Khoirun. “Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review”. Turkish Journal of Mathematics and Computer Science 17/2 (December 1, 2025): 472-484. https://doi.org/10.47000/tjmcs.1689564.
JAMA	1. Azhari MF, Syah GM, Nurdiati S, Khatizah E, Najib MK. Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review. TJMCS. 2025;17:472–484.
MLA	Azhari, Mirza Farhan, et al. “Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review”. Turkish Journal of Mathematics and Computer Science, vol. 17, no. 2, Dec. 2025, pp. 472-84, doi:10.47000/tjmcs.1689564.
Vancouver	1. Mirza Farhan Azhari, Gibran Maulana Syah, Sri Nurdiati, Elis Khatizah, Mohamad Khoirun Najib. Bioacoustic Sound Classification Based on Noise and Vocalizations Using Convolutional Neural Networks: A Comprehensive Systematic Review. TJMCS. 2025 Dec. 1;17(2):472-84. doi:10.47000/tjmcs.1689564
