Research Article

Real-Time Auditory Scene Analysis using Continual Learning in Real Environments

August 15, 2020
  • Barış Bayram
  • Gökhan İnce *

Abstract

Continual learning for scene analysis is a continuous process of incrementally learning distinct events, actions, and even noise models from past experiences using different sensory modalities. In this paper, an Auditory Scene Analysis (ASA) approach based on a continual learning system is developed to incrementally learn the acoustic events in a dynamically changing domestic environment. Events that are salient sound sources are localized by a Sound Source Localization (SSL) method so that the signals of the localized source can be processed robustly in a domestic scene where multiple sources can co-exist. For real-time ASA, audio patterns are segmented from the acoustic signal stream of the localized source, audio features are extracted, and a feature set is constructed for each pattern. Continual learning is performed on these feature sets with a time-series algorithm, the Hidden Markov Model (HMM). The learning process is investigated through a variety of experiments that evaluate the performance of Unknown Event Detection (UED), Acoustic Event Recognition (AER), and continual learning using a Hierarchical HMM algorithm. The Hierarchical HMM consists of two layers: 1) a lower layer in which AER is performed using one HMM per event together with event-wise likelihood thresholds; and 2) an upper layer in which UED is achieved by a single HMM with a suspicion threshold, operating on the audio features and the proto symbols produced by the lower-layer HMMs. We verified the effectiveness of the proposed system, which is capable of continual learning, AER, and UED, in terms of False-Positive Rates, True-Positive Rates, recognition accuracy, and computational time, to meet the demands of learning multiple events in real time. The AER system achieved high accuracy and a short retraining time in real-time ASA with nine different sounds.
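
The lower-layer idea described in the abstract (one HMM per event, with an event-wise likelihood threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete observation symbols, toy hand-set parameters, a per-symbol normalization of the log-likelihood, and hypothetical names (`log_likelihood`, `recognize`, `MODELS`, `THRESHOLDS`). The upper-layer HMM and its suspicion threshold are not modeled here.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.

    obs must be a non-empty list of symbol indices.
    """
    n = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        total += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return total

def recognize(obs, models, thresholds):
    """Return (best event or None, per-event scores).

    None means the best per-symbol log-likelihood fell below that
    event's threshold (assumed convention for this sketch).
    """
    scores = {name: log_likelihood(obs, *p) for name, p in models.items()}
    best = max(scores, key=scores.get)
    if scores[best] / len(obs) < thresholds[best]:
        return None, scores
    return best, scores

# Toy two-state, three-symbol models (illustrative values only).
START = [1.0, 0.0]
TRANS = [[0.7, 0.3], [0.0, 1.0]]
MODELS = {
    "door":   (START, TRANS, [[0.80, 0.15, 0.05], [0.15, 0.80, 0.05]]),
    "kettle": (START, TRANS, [[0.15, 0.80, 0.05], [0.80, 0.15, 0.05]]),
}
THRESHOLDS = {"door": -2.0, "kettle": -2.0}  # per-symbol log-likelihood

if __name__ == "__main__":
    for seq in ([0, 0, 1, 1], [1, 1, 0, 0], [2, 2, 2, 2]):
        label, _ = recognize(seq, MODELS, THRESHOLDS)
        print(seq, "->", label)
```

In this sketch, a sequence that no event model explains well is rejected rather than forced into the nearest class. In a continual-learning setting, such a rejected segment would presumably be a candidate for training a new event model; that step is not shown.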

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Authors

Barış Bayram
0000-0002-5588-577X
Türkiye

Gökhan İnce *
0000-0002-0034-030X
Türkiye

Publication Date

August 15, 2020

Submission Date

June 28, 2020

Acceptance Date

August 10, 2020

Published in Issue

Year 2020

APA
Bayram, B., & İnce, G. (2020). Real-Time Auditory Scene Analysis using Continual Learning in Real Environments. Avrupa Bilim Ve Teknoloji Dergisi, 215-226. https://doi.org/10.31590/ejosat.779710
