Research Article

Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units

Issue: 35, 7 May 2022

Abstract

Recurrent neural networks have recently emerged as a useful tool in computer vision and language modeling tasks such as image and video captioning. The main limitation of these networks is the difficulty of preserving gradient flow as the network gets deeper. We propose a video captioning approach that uses residual connections to overcome this limitation: the connections maintain gradient flow by carrying information through the layers from bottom to top as additive features. Experimental evaluations on the MSVD dataset indicate that the proposed approach generates accurate captions relative to state-of-the-art results. In addition, the proposed approach is integrated into our custom-designed Android application, WeCapV2, which can generate captions without an internet connection.
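The additive residual idea described in the abstract can be illustrated with a small sketch. The abstract does not specify the exact architecture, so everything below is an assumption made for illustration: the class names `GRUCell` and `ResidualGRUStack` are hypothetical, biases are omitted, the input and hidden sizes are set equal so that a layer's input can be added directly to its output, and the gate equations follow one common GRU convention. It is not the authors' implementation.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; the scale is an arbitrary illustrative choice."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """A plain GRU cell (hypothetical helper, biases omitted for brevity)."""

    def __init__(self, input_size, hidden_size, rng):
        # One input matrix W and one recurrent matrix U per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: init_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: init_matrix(hidden_size, hidden_size, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # The new state interpolates between the old state and the candidate.
        return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


class ResidualGRUStack:
    """Stacked GRU layers where each layer's input is added to its output."""

    def __init__(self, size, num_layers, seed=0):
        rng = random.Random(seed)
        self.size = size
        self.cells = [GRUCell(size, size, rng) for _ in range(num_layers)]

    def forward(self, sequence):
        states = [[0.0] * self.size for _ in self.cells]
        outputs = []
        for x in sequence:
            layer_input = x
            for i, cell in enumerate(self.cells):
                states[i] = cell.step(layer_input, states[i])
                # Residual connection: pass input + output up to the next layer,
                # so information (and gradients) have an additive shortcut.
                layer_input = [a + b for a, b in zip(layer_input, states[i])]
            outputs.append(layer_input)
        return outputs


# Usage: five time steps of 4-dimensional placeholder "frame features".
stack = ResidualGRUStack(size=4, num_layers=3)
frames = [[0.1 * t] * 4 for t in range(5)]
outputs = stack.forward(frames)
print(len(outputs), len(outputs[0]))  # prints: 5 4
```

Because each layer contributes an additive term on top of its input, the top-layer output equals the bottom-layer input plus the sum of the per-layer hidden states. That identity path is what lets gradients reach lower layers without passing through every gate nonlinearity.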

Keywords

Details

Primary Language

English

Subjects

Engineering

Section

Research Article

Publication Date

7 May 2022

Submission Date

11 February 2022

Acceptance Date

24 March 2022

Published in Issue

Year 2022, Issue: 35

Cite

APA
Aydın, S., Çaylı, Ö., Kılıç, V., & Onan, A. (2022). Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units. Avrupa Bilim ve Teknoloji Dergisi, 35, 380-386. https://doi.org/10.31590/ejosat.1071835
AMA
1.Aydın S, Çaylı Ö, Kılıç V, Onan A. Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units. EJOSAT. 2022;(35):380-386. doi:10.31590/ejosat.1071835
Chicago
Aydın, Selman, Özkan Çaylı, Volkan Kılıç, and Aytuğ Onan. 2022. “Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units”. Avrupa Bilim ve Teknoloji Dergisi, no. 35: 380-86. https://doi.org/10.31590/ejosat.1071835.
EndNote
Aydın S, Çaylı Ö, Kılıç V, Onan A (01 May 2022) Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units. Avrupa Bilim ve Teknoloji Dergisi 35 380–386.
IEEE
[1] S. Aydın, Ö. Çaylı, V. Kılıç, and A. Onan, “Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units”, EJOSAT, no. 35, pp. 380–386, May 2022, doi: 10.31590/ejosat.1071835.
ISNAD
Aydın, Selman - Çaylı, Özkan - Kılıç, Volkan - Onan, Aytuğ. “Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units”. Avrupa Bilim ve Teknoloji Dergisi. 35 (01 May 2022): 380-386. https://doi.org/10.31590/ejosat.1071835.
JAMA
1. Aydın S, Çaylı Ö, Kılıç V, Onan A. Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units. EJOSAT. 2022;(35):380–386.
MLA
Aydın, Selman, et al. “Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units”. Avrupa Bilim ve Teknoloji Dergisi, no. 35, May 2022, pp. 380-6, doi:10.31590/ejosat.1071835.
Vancouver
1. Selman Aydın, Özkan Çaylı, Volkan Kılıç, Aytuğ Onan. Sequence-to-Sequence Video Captioning with Residual Connected Gated Recurrent Units. EJOSAT. 01 May 2022;(35):380-6. doi:10.31590/ejosat.1071835
