Filtering Videos by Classification with Deep Learning (Videoların Derin Öğrenme ile Sınıflandırılarak Filtrelenmesi)

Murat Kazanç; Tolga Ensari; Mustafa Dağtekin

doi:10.31590/ejosat.952481

Conference Paper

Year 2021, Issue 26 - Ejosat Special Issue 2021 (HORA), pp. 338-342, July 31, 2021

Abstract (translated from Turkish)

In this study, a model that detects and classifies unwanted objects such as tobacco products, alcoholic beverages, and weapons was developed using convolutional neural networks (CNN), a deep learning method, together with transfer learning. The model was converted to TensorFlow.js and packaged as a web browser extension. The extension takes snapshots from the videos being watched and classifies them with the trained model. Classification results deemed necessary were saved to Firebase Realtime Database, a cloud service provided by Google. Using this database, videos in which harmful content had previously been detected were blocked. The last 25 detections made in the browser can be viewed by the user for informational purposes. When necessary, information about a video can be added to the database, and playback of videos recorded in the database can be filtered. The system was tested on both physical devices and an emulator. Two approaches were taken to build the network architecture of the CNN-based deep learning model. In the first, the entire network was designed by the authors; this model has 7,752,707 parameters and achieved 86.75% training accuracy and 88.02% test accuracy. In the second, transfer learning was applied with MobileNetV2, a model whose success has been demonstrated in the literature. With its output layers modified, this model has 593,155 trainable parameters out of 2,852,675 in total and achieved 65.34% training accuracy and 50.35% test accuracy. The study found that the CNN model designed by the authors would be more efficient for filtering video content.

Keywords

Convolutional neural networks (CNN), deep learning, video filtering, transfer learning
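
The parameter counts reported for the MobileNetV2 variant indicate how much of that network was updated during training. The short Python check below is only arithmetic on the figures quoted in the abstract. The variable names are illustrative, and treating every non-trainable parameter as frozen is an assumption, since the abstract does not list the layers.

```python
# Figures quoted in the abstract for the MobileNetV2 transfer-learning model.
TOTAL_PARAMS = 2_852_675
TRAINABLE_PARAMS = 593_155

# Assumption: every parameter that is not trainable was kept fixed during training.
frozen_params = TOTAL_PARAMS - TRAINABLE_PARAMS
trainable_share = TRAINABLE_PARAMS / TOTAL_PARAMS

# Gap between the reported training and test accuracy, in percentage points.
TRAIN_ACC, TEST_ACC = 65.34, 50.35
accuracy_gap = round(TRAIN_ACC - TEST_ACC, 2)

print(f"frozen parameters: {frozen_params:,}")
print(f"trainable share:   {trainable_share:.1%}")
print(f"train-test gap:    {accuracy_gap} points")
```

Under that assumption, roughly one fifth of the parameters were trained, and test accuracy trails training accuracy by about 15 percentage points.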

Filtering Videos by Classification with Deep Learning

Abstract

In this study, a model that detects and classifies unwanted objects such as tobacco products, alcoholic beverages, and weapons was developed using convolutional neural networks (CNN), a deep learning method, together with transfer learning. The model was converted to TensorFlow.js and packaged as a web browser extension. With this extension, snapshots were taken from the videos being watched and classified using the trained model. Classification results deemed necessary were recorded in Firebase Realtime Database, a cloud service provided by Google. Using this database, videos in which harmful content had previously been detected were blocked. The last 25 detections made in the browser can be viewed by the user for informational purposes. If necessary, information about a video can be added to the database, and videos saved in the database can be filtered. The system was tested on both physical devices and an emulator. Two approaches were taken to build the network architecture of the CNN-based model. In the first, the entire network was designed by the authors; this model has 7,752,707 parameters and achieved 84.84% training accuracy and 79.77% test accuracy. In the second, transfer learning was applied with MobileNetV2, a model whose success has been demonstrated in the literature. With its output layers modified, this model has 593,155 trainable parameters out of 2,852,675 in total and achieved 65.34% training accuracy and 50.35% test accuracy. The study found that the CNN model designed by the authors would be more efficient for filtering video content.

Keywords

Convolutional neural networks (CNN), deep learning, video filtering, transfer learning
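
The filtering flow described in the abstract (classify a snapshot, record flagged videos, block videos already recorded, keep the last 25 detections) can be sketched as follows. This is a minimal Python model of the control flow only, not the authors' implementation, which runs as a browser extension with TensorFlow.js and Firebase. The names `VideoFilter` and `demo_classifier`, the label strings, and the 0.8 confidence threshold are all hypothetical, and a plain dictionary stands in for the cloud database.

```python
from collections import deque
from typing import Callable, Deque, Dict, Tuple

# Hypothetical label set; the abstract names the categories but not the exact labels.
UNWANTED = {"tobacco", "alcohol", "weapon"}


class VideoFilter:
    """In-memory stand-in for the extension's blocklist and detection log."""

    def __init__(self, classify: Callable[[bytes], Tuple[str, float]],
                 threshold: float = 0.8, history: int = 25) -> None:
        self.classify = classify           # snapshot -> (label, confidence)
        self.threshold = threshold         # assumed cut-off, not taken from the paper
        self.blocked: Dict[str, str] = {}  # video id -> label (a cloud database in the paper)
        self.recent: Deque[Tuple[str, str, float]] = deque(maxlen=history)

    def is_blocked(self, video_id: str) -> bool:
        """Return True if the video was flagged earlier."""
        return video_id in self.blocked

    def inspect(self, video_id: str, snapshot: bytes) -> bool:
        """Classify one snapshot; record and block the video if it is flagged."""
        label, confidence = self.classify(snapshot)
        if label in UNWANTED and confidence >= self.threshold:
            self.blocked[video_id] = label
            self.recent.append((video_id, label, confidence))
            return True
        return False


def demo_classifier(snapshot: bytes) -> Tuple[str, float]:
    # Placeholder for the trained CNN: flags snapshots containing the bytes b"gun".
    return ("weapon", 0.95) if b"gun" in snapshot else ("other", 0.99)


video_filter = VideoFilter(demo_classifier)
video_filter.inspect("video-1", b"frame with gun")
video_filter.inspect("video-2", b"harmless frame")
print(video_filter.is_blocked("video-1"), video_filter.is_blocked("video-2"))
```

A bounded `deque` is used for the detection log so that older entries are dropped automatically once 25 detections are stored.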

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Murat Kazanç (ORCID 0000-0002-8405-0181); Tolga Ensari (ORCID 0000-0003-0896-3058); Mustafa Dağtekin (ORCID 0000-0002-0797-9392)
Publication Date	July 31, 2021
Published in Issue	Year 2021, Issue 26 - Ejosat Special Issue 2021 (HORA)

Cite This Article

APA	Kazanç, M., Ensari, T., & Dağtekin, M. (2021). Videoların Derin Öğrenme ile Sınıflandırılarak Filtrelenmesi. Avrupa Bilim Ve Teknoloji Dergisi(26), 338-342. https://doi.org/10.31590/ejosat.952481
