Conference Paper

Videoların Derin Öğrenme ile Sınıflandırılarak Filtrelenmesi

Year 2021, Issue: 26 - Ejosat Special Issue 2021 (HORA), 338 - 342, 31.07.2021
https://doi.org/10.31590/ejosat.952481

Abstract

In this study, a model that detects and classifies unwanted objects such as tobacco products, alcoholic beverages, and weapons was developed using convolutional neural networks (CNN), a deep learning method, together with transfer learning. The model was converted to TensorFlow.js and packaged as a web browser extension. The extension takes snapshots from the videos being watched and classifies them with the trained model. Classification results deemed necessary to keep were recorded in Firebase's Realtime Database, a cloud service provided by Google. Using this database, videos in which harmful content had previously been detected were blocked. The last 25 detections made in the browser can be viewed by the user for informational purposes. When necessary, a video's information can be added to the database, and the viewing of videos recorded in the database can be filtered. The developed system was tested on both physical devices and an emulator. Two approaches were adopted for building the network structure of the CNN-based deep learning model. In the first, the entire network was designed by the authors. This model has 7,752,707 parameters and achieved 86.75% training accuracy and 88.02% test accuracy. In the second, MobileNetV2, a model whose success is established in the literature, was used through transfer learning. With its output layers modified, this model has 593,155 trainable parameters out of 2,852,675 in total and achieved 65.34% training accuracy and 50.35% test accuracy. The study concluded that the custom CNN model would be more efficient for filtering video content.
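The parameter counts reported above indicate how much of the MobileNetV2 variant was left untrained during transfer learning. The following is a small sketch of that arithmetic using only the figures stated in the abstract. The variable names are illustrative, and treating every non-trainable parameter as a frozen pretrained weight is an assumption, since the abstract does not break the count down.

```python
# Figures reported in the abstract for the MobileNetV2 transfer-learning model.
total_params = 2_852_675
trainable_params = 593_155

# Assumption: everything that is not trainable is counted as frozen here.
frozen_params = total_params - trainable_params
trainable_share = trainable_params / total_params

# Parameter count reported for the custom CNN, for comparison.
custom_cnn_params = 7_752_707
size_ratio = custom_cnn_params / total_params

print(f"frozen parameters: {frozen_params:,}")
print(f"trainable share of MobileNetV2 variant: {trainable_share:.1%}")
print(f"custom CNN size relative to MobileNetV2 variant: {size_ratio:.2f}x")
```

Roughly one fifth of the transfer-learning model's parameters were updated during training, while the custom CNN is between two and three times larger in total parameter count.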

Filtering Videos by Classification with Deep Learning

Abstract

In this study, a model that detects and classifies unwanted objects such as tobacco products, alcoholic beverages, and weapons was developed using convolutional neural networks (CNN), a deep learning method, together with transfer learning. The model was converted to TensorFlow.js and packaged as a browser add-on. With this add-on, snapshots were taken from the videos being watched and classified using the trained model. Classification results deemed necessary to keep were recorded in Firebase's Realtime Database, a cloud service provided by Google. Using this database, videos in which harmful content had previously been detected were blocked. The last 25 detections made in the browser can be viewed by the user for informational purposes. If necessary, information about a video can be added to the database, and videos saved in the database can be filtered. The developed system was tested on both physical devices and an emulator. Two approaches were adopted for building the network structure of the CNN-based model. In the first, the entire network was designed by the authors. This model has 7,752,707 parameters and achieved 84.84% training accuracy and 79.77% test accuracy. In the second, MobileNetV2, a model whose success is established in the literature, was used through transfer learning. With its output layers modified, this model has 593,155 trainable parameters out of 2,852,675 in total and achieved 65.34% training accuracy and 50.35% test accuracy. The study found that the custom CNN model would be more efficient for filtering video content.
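The filtering workflow described above (classify a snapshot, record flagged videos, block videos already recorded, and keep the last 25 detections for the user) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the actual add-on runs in the browser with TensorFlow.js and Firebase, whereas this sketch uses Python with an in-memory dictionary standing in for the database. The label names, the 0.8 threshold, and the `VideoFilter` class are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical label names and decision threshold; the abstract names the
# unwanted categories but gives neither label strings nor a threshold.
UNWANTED = {"tobacco", "alcohol", "weapon"}
THRESHOLD = 0.8


@dataclass
class VideoFilter:
    # In-memory stand-in for the database of flagged videos.
    flagged: dict = field(default_factory=dict)
    # Bounded history: only the 25 most recent detections are retained.
    recent: deque = field(default_factory=lambda: deque(maxlen=25))

    def is_blocked(self, video_id: str) -> bool:
        """Return True if an earlier detection flagged this video."""
        return video_id in self.flagged

    def handle_snapshot(self, video_id: str, scores: dict) -> bool:
        """Take per-class scores for one snapshot; return True if flagged."""
        label, score = max(scores.items(), key=lambda item: item[1])
        if label in UNWANTED and score >= THRESHOLD:
            self.flagged[video_id] = label
            self.recent.append((video_id, label, score))
            return True
        return False


video_filter = VideoFilter()
video_filter.handle_snapshot(
    "video-a", {"weapon": 0.93, "alcohol": 0.04, "tobacco": 0.03}
)
print(video_filter.is_blocked("video-a"))
print(video_filter.is_blocked("video-b"))
```

A bounded `deque` is used for the history so that older detections are discarded automatically once more than 25 have been recorded.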

Details

Primary Language: Turkish
Subjects: Engineering
Section: Articles
Authors

Murat Kazanç 0000-0002-8405-0181

Tolga Ensari 0000-0003-0896-3058

Mustafa Dağtekin 0000-0002-0797-9392

Publication Date: July 31, 2021
Published in Issue: Year 2021, Issue: 26 - Ejosat Special Issue 2021 (HORA)

Cite

APA Kazanç, M., Ensari, T., & Dağtekin, M. (2021). Videoların Derin Öğrenme ile Sınıflandırılarak Filtrelenmesi. Avrupa Bilim Ve Teknoloji Dergisi(26), 338-342. https://doi.org/10.31590/ejosat.952481