Research Article

Evaluation of Convolutional Networks for Event Camera Face Pose Alignment

Year 2025, Volume: 13 Issue: 2, 22 - 30, 31.05.2025
https://doi.org/10.21541/apjess.1417068

Abstract

Event cameras offer substantial advantages over conventional video cameras: efficiency, extremely high temporal resolution, low latency, and high dynamic range. These benefits have led to applications in various vision domains, and recently event cameras have been applied to facial recognition tasks as well. However, while significant advantages of event cameras in some facial processing tasks have been demonstrated, the initial stage in almost any such task, i.e., face alignment, is not on par with conventional cameras. This study investigates face alignment convolutional networks for event camera processing with regard to both performance and complexity. Our aim is event camera face pose alignment that can serve as an efficient preprocessor for facial tasks. Therefore, we comparatively evaluate simple convolutional coordinate regression against a hybrid of coordinate and heatmap regression, known as pixel-in-pixel regression. Our experimental results reveal the superior performance of the hybrid method. However, we also show that if there is a computation bottleneck, simple convolutional coordinate regression is preferable for its low resource requirements, though at the expense of some performance loss.
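As a rough illustration of the two output formats compared above, the sketch below decodes a single landmark in both ways. It is not the authors' implementation: the function names, the single-landmark setup, the normalized outputs, and the `stride` parameter are assumptions made for this example, and the pixel-in-pixel variant is simplified to a coarse score map plus in-cell offsets.

```python
# Illustrative sketch only; names and conventions here are hypothetical.

def decode_direct(coords_norm, image_size):
    """Plain coordinate regression: the network outputs (x, y) in [0, 1]."""
    x, y = coords_norm
    return x * image_size, y * image_size


def decode_pip(score_map, offset_x, offset_y, stride):
    """Hybrid decoding: pick the best coarse grid cell, then add its offset."""
    height = len(score_map)
    width = len(score_map[0])
    # Arg-max over the coarse score map (the heatmap part).
    best_row, best_col = max(
        ((r, c) for r in range(height) for c in range(width)),
        key=lambda rc: score_map[rc[0]][rc[1]],
    )
    # Offsets in [0, 1] place the landmark inside the chosen cell
    # (the coordinate-regression part).
    x = (best_col + offset_x[best_row][best_col]) * stride
    y = (best_row + offset_y[best_row][best_col]) * stride
    return x, y


# Toy 2 x 2 grid with an assumed stride of 32 pixels.
scores = [[0.1, 0.2], [0.05, 0.9]]
off_x = [[0.0, 0.0], [0.0, 0.25]]
off_y = [[0.0, 0.0], [0.0, 0.5]]
print(decode_direct((0.5, 0.25), 64))
print(decode_pip(scores, off_x, off_y, 32))
```

The direct variant needs only a small regression head, which is consistent with its lower resource requirements, while the hybrid variant keeps a spatial map and refines the location within the selected cell.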

Supporting Institution

YAŞAR ÜNİVERSİTESİ

Project Number

BAP112

Thanks

This work was supported by the Yaşar University Project Evaluation Commission, Turkey, for the project “Dynamic Facial Analysis with Neuromorphic Camera” [grant number: BAP112].

References

  • G. Gallego et al., “Event-Based Vision: A Survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 1, pp. 154–180, Jan. 2022, doi: 10.1109/TPAMI.2020.3008413.
  • G. Tan, Y. Wang, H. Han, Y. Cao, F. Wu, and Z.-J. Zha, “Multi-grained Spatio-Temporal Features Perceived Network for Event-based Lip-Reading,” in 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA: IEEE, Jun. 2022, pp. 20062–20071. doi: 10.1109/CVPR52688.2022.01946.
  • G. Moreira, A. Graca, B. Silva, P. Martins, and J. Batista, “Neuromorphic Event-based Face Identity Recognition,” in 2022 26th International Conference on Pattern Recognition (ICPR), Montreal, QC, Canada: IEEE, Aug. 2022, pp. 922–929. doi: 10.1109/ICPR56361.2022.9956236.
  • A. Savran, “Fully Convolutional Event-camera Voice Activity Detection Based on Event Intensity,” in 2023 Innovations in Intelligent Systems and Applications Conference (ASYU), Sivas, Turkiye: IEEE, Oct. 2023, pp. 1–6. doi: 10.1109/ASYU58738.2023.10296754.
  • A. Savran, “Multi-timescale boosting for efficient and improved event camera face pose alignment,” Computer Vision and Image Understanding, vol. 236, p. 103817, Nov. 2023, doi: 10.1016/j.cviu.2023.103817.
  • A. Savran and C. Bartolozzi, “Face Pose Alignment with Event Cameras,” Sensors, vol. 20, no. 24, p. 7079, Dec. 2020, doi: 10.3390/s20247079.
  • Z.-H. Feng, J. Kittler, M. Awais, and X.-J. Wu, “Rectified Wing Loss for Efficient and Robust Facial Landmark Localisation with Convolutional Neural Networks,” Int J Comput Vis, vol. 128, no. 8–9, pp. 2126–2145, Sep. 2020, doi: 10.1007/s11263-019-01275-0.
  • H. Jin, S. Liao, and L. Shao, “Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild,” Int J Comput Vis, vol. 129, no. 12, pp. 3174–3194, Dec. 2021, doi: 10.1007/s11263-021-01521-4.
  • B. Browatzki and C. Wallraven, “3FabRec: Fast Few-Shot Face Alignment by Reconstruction,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA: IEEE, Jun. 2020, pp. 6109–6119. doi: 10.1109/CVPR42600.2020.00615.
  • Y. Sun, X. Wang, and X. Tang, “Deep Convolutional Network Cascade for Facial Point Detection,” in 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA: IEEE, Jun. 2013, pp. 3476–3483. doi: 10.1109/CVPR.2013.446.
  • Y. Wu, T. Hassner, K. Kim, G. Medioni, and P. Natarajan, “Facial Landmark Detection with Tweaked Convolutional Neural Networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 40, no. 12, pp. 3067–3074, Dec. 2018, doi: 10.1109/TPAMI.2017.2787130.
  • S. Honari, P. Molchanov, S. Tyree, P. Vincent, C. Pal, and J. Kautz, “Improving Landmark Localization with Semi-Supervised Learning,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA: IEEE, Jun. 2018, pp. 1546–1555. doi: 10.1109/CVPR.2018.00167.
  • A. Kumar et al., “LUVLi Face Alignment: Estimating Landmarks’ Location, Uncertainty, and Visibility Likelihood,” in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA: IEEE, Jun. 2020, pp. 8233–8243. doi: 10.1109/CVPR42600.2020.00826.
  • X. Dong and Y. Yang, “Teacher Supervises Students How to Learn From Partially Labeled Images for Facial Landmark Detection,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South): IEEE, Oct. 2019, pp. 783–792. doi: 10.1109/ICCV.2019.00087.
  • X. Wang, L. Bo, and L. Fuxin, “Adaptive Wing Loss for Robust Face Alignment via Heatmap Regression,” arXiv:1904.07399 [cs], May 2020, Accessed: Mar. 27, 2022. [Online]. Available: http://arxiv.org/abs/1904.07399
  • M. Cannici, M. Ciccone, A. Romanoni, and M. Matteucci, “Asynchronous Convolutional Networks for Object Detection in Neuromorphic Cameras,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA: IEEE, Jun. 2019, pp. 1656–1665. doi: 10.1109/CVPRW.2019.00209.
  • F. Paredes-Valles and G. C. H. E. de Croon, “Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA: IEEE, Jun. 2021, pp. 3445–3454. doi: 10.1109/CVPR46437.2021.00345.
  • D. Gehrig, A. Loquercio, K. Derpanis, and D. Scaramuzza, “End-to-End Learning of Representations for Asynchronous Event-Based Data,” in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea (South): IEEE, Oct. 2019, pp. 5632–5642. doi: 10.1109/ICCV.2019.00573.
  • E. Perot, P. de Tournemire, D. Nitti, J. Masci, and A. Sironi, “Learning to Detect Objects with a 1 Megapixel Event Camera.” arXiv, Dec. 09, 2020. Accessed: Apr. 24, 2024. [Online]. Available: http://arxiv.org/abs/2009.13436
  • A. Kugele, T. Pfeil, M. Pfeiffer, and E. Chicca, “How Many Events Make an Object? Improving Single-frame Object Detection on the 1 Mpx Dataset,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada: IEEE, Jun. 2023, pp. 3913–3922. doi: 10.1109/CVPRW59228.2023.00406.
  • C. Boretti, P. Bich, F. Pareschi, L. Prono, R. Rovatti, and G. Setti, “PEDRo: an Event-based Dataset for Person Detection in Robotics,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada: IEEE, Jun. 2023, pp. 4065–4070. doi: 10.1109/CVPRW59228.2023.00426.
  • G. Goyal, F. Di Pietro, N. Carissimi, A. Glover, and C. Bartolozzi, “MoveEnet: Online High-Frequency Human Pose Estimation with an Event Camera,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada: IEEE, Jun. 2023, pp. 4024–4033. doi: 10.1109/CVPRW59228.2023.00420.
  • P. R. Gantier Cadena, Y. Qian, C. Wang, and M. Yang, “Sparse-E2VID: A Sparse Convolutional Model for Event-Based Video Reconstruction Trained with Real Event Noise,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada: IEEE, Jun. 2023, pp. 4150–4158. doi: 10.1109/CVPRW59228.2023.00437.
  • L. Berlincioni et al., “Neuromorphic Event-based Facial Expression Recognition.” arXiv, Apr. 13, 2023. Accessed: Apr. 24, 2024. [Online]. Available: http://arxiv.org/abs/2304.06351
  • H. Bulzomi, M. Schweiker, A. Gruel, and J. Martinet, “End-to-end Neuromorphic Lip Reading,” in 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada: IEEE, Jun. 2023, pp. 4101–4108. doi: 10.1109/CVPRW59228.2023.00431.
  • A. Savran, R. Tavarone, B. Higy, L. Badino, and C. Bartolozzi, “Energy and Computation Efficient Audio-Visual Voice Activity Detection Driven by Event-Cameras,” in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), Xi’an: IEEE, May 2018, pp. 333–340. doi: 10.1109/FG.2018.00055.
  • A. Savran, “Temporal Convolutional Networks for Efficient Voice Activity Detection with Event Camera,” Journal of Intelligent Systems: Theory and Applications, vol. 7, no. 2, pp. 102–115, Sep. 2024, doi: 10.38016/jista.1400047.
  • A. Savran, “Comparison of Timing Strategies for Face Pose Alignment with Event Camera,” in 2023 8th International Conference on Computer Science and Engineering (UBMK), Burdur, Turkiye: IEEE, Sep. 2023, pp. 97–101. doi: 10.1109/UBMK59864.2023.10286582.
  • K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition.” arXiv, Dec. 10, 2015. Accessed: Jan. 09, 2024. [Online]. Available: http://arxiv.org/abs/1512.03385
  • S. Xie, R. Girshick, P. Dollar, Z. Tu, and K. He, “Aggregated Residual Transformations for Deep Neural Networks,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI: IEEE, Jul. 2017, pp. 5987–5995. doi: 10.1109/CVPR.2017.634.
  • J. Hu, L. Shen, and G. Sun, “Squeeze-and-Excitation Networks,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA: IEEE, Jun. 2018, pp. 7132–7141. doi: 10.1109/CVPR.2018.00745.
  • A. G. Howard et al., “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications.” arXiv, Apr. 16, 2017. Accessed: Apr. 29, 2024. [Online]. Available: http://arxiv.org/abs/1704.04861
  • M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT: IEEE, Jun. 2018, pp. 4510–4520. doi: 10.1109/CVPR.2018.00474.

There are 33 citations in total.

Details

Primary Language English
Subjects Deep Learning, Machine Vision
Journal Section Research Article
Authors

Burhan Burak Oral 0000-0002-9635-078X

Alptuğ Çakıcı 0009-0001-1251-8793

Arman Savran 0000-0001-5142-6384

Project Number BAP112
Submission Date January 9, 2024
Acceptance Date January 16, 2025
Early Pub Date May 30, 2025
Publication Date May 31, 2025
Published in Issue Year 2025 Volume: 13 Issue: 2

Cite

IEEE B. B. Oral, A. Çakıcı, and A. Savran, “Evaluation of Convolutional Networks for Event Camera Face Pose Alignment”, APJESS, vol. 13, no. 2, pp. 22–30, 2025, doi: 10.21541/apjess.1417068.

Academic Platform Journal of Engineering and Smart Systems