Evaluation of Convolutional Networks for Event Camera Face Pose Alignment

Burhan Burak Oral; Alptuğ Çakıcı; Arman Savran

doi:10.21541/apjess.1417068

Abstract

Event cameras offer substantial advantages over conventional video cameras owing to their efficiency, extremely high temporal resolution, low latency, and high dynamic range. These benefits have led to applications in various vision domains, and recently they have been applied to facial recognition tasks as well. However, while significant advantages of event cameras in some facial processing tasks have been demonstrated, the initial stage in almost any such task, i.e., face alignment, is not yet on par with conventional cameras. This study investigates face alignment convolutional networks for event camera processing in terms of both performance and complexity. Our aim is event camera face pose alignment that can be used as an efficient preprocessor for facial tasks. Therefore, we comparatively evaluate simple convolutional coordinate regression against a hybrid of coordinate and heatmap regression, known as pixel-in-pixel regression. Our experimental results reveal the superior performance of the hybrid method. However, we also show that if there is a computation bottleneck, simple convolutional coordinate regression is preferable for its low resource requirements, though at the expense of some performance loss.
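To make the contrast between the two regression schemes concrete, the following is a minimal sketch of how their outputs might be decoded into landmark coordinates. It is not the authors' implementation: the function names, the normalized-output convention for coordinate regression, the offsets expressed as fractions of a grid cell, and the `stride` parameter are all assumptions chosen for illustration, following the general idea of pixel-in-pixel regression (a coarse score map that selects a grid cell, plus offsets that refine the position inside that cell).

```python
from typing import List, Tuple

Grid = List[List[float]]  # one H x W map for a single landmark (assumed layout)


def decode_coordinate_regression(
    outputs: List[float], image_size: int
) -> List[Tuple[float, float]]:
    """Direct regression: the network is assumed to emit normalized
    (x, y) pairs in [0, 1], one pair per landmark."""
    return [
        (outputs[i] * image_size, outputs[i + 1] * image_size)
        for i in range(0, len(outputs), 2)
    ]


def decode_pixel_in_pixel(
    score: Grid, off_x: Grid, off_y: Grid, stride: int
) -> Tuple[float, float]:
    """Hybrid decoding for one landmark: pick the best-scoring grid cell,
    then refine with the offsets predicted for that cell (assumed to be
    fractions of a cell), and scale back to input pixels."""
    cells = ((r, c) for r in range(len(score)) for c in range(len(score[0])))
    best_row, best_col = max(cells, key=lambda rc: score[rc[0]][rc[1]])
    x = (best_col + off_x[best_row][best_col]) * stride
    y = (best_row + off_y[best_row][best_col]) * stride
    return x, y


if __name__ == "__main__":
    # Toy 2 x 2 score map with stride 32: the best cell is row 1, column 0.
    score = [[0.1, 0.2], [0.9, 0.3]]
    off_x = [[0.0, 0.0], [0.25, 0.0]]
    off_y = [[0.0, 0.0], [0.5, 0.0]]
    print(decode_pixel_in_pixel(score, off_x, off_y, stride=32))
    print(decode_coordinate_regression([0.5, 0.25], image_size=64))
```

In this sketch the coordinate regression head needs only a small vector of outputs, whereas the hybrid head produces three low-resolution maps per landmark, which is one way to picture the trade-off between resource requirements and accuracy discussed in the abstract.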

Keywords

Supporting Institution

YAŞAR ÜNİVERSİTESİ

Project Number

BAP112

Thanks

This work was supported by the Yaşar University Project Evaluation Commission, Turkey, for the project “Dynamic Facial Analysis with Neuromorphic Camera” [grant number: BAP112].

Details

Primary Language

English

Subjects

Deep Learning, Machine Vision

Journal Section

Research Article

Authors

Burhan Burak Oral *
0000-0002-9635-078X
Türkiye

Alptuğ Çakıcı
0009-0001-1251-8793
Türkiye

Arman Savran
0000-0001-5142-6384
Türkiye

Early Pub Date

May 30, 2025

Publication Date

May 31, 2025

Submission Date

January 9, 2024

Acceptance Date

January 16, 2025

Published in Issue

Year 2025 Volume: 13 Number: 2

DOI

https://doi.org/10.21541/apjess.1417068

IZ

https://izlik.org/JA52TG35NR

Cite

APA

Oral, B. B., Çakıcı, A., & Savran, A. (2025). Evaluation of Convolutional Networks for Event Camera Face Pose Alignment. Academic Platform Journal of Engineering and Smart Systems, 13(2), 22-30. https://doi.org/10.21541/apjess.1417068

AMA

1. Oral BB, Çakıcı A, Savran A. Evaluation of Convolutional Networks for Event Camera Face Pose Alignment. APJESS. 2025;13(2):22-30. doi:10.21541/apjess.1417068

Chicago

Oral, Burhan Burak, Alptuğ Çakıcı, and Arman Savran. 2025. “Evaluation of Convolutional Networks for Event Camera Face Pose Alignment”. Academic Platform Journal of Engineering and Smart Systems 13 (2): 22-30. https://doi.org/10.21541/apjess.1417068.

EndNote

Oral BB, Çakıcı A, Savran A (May 1, 2025) Evaluation of Convolutional Networks for Event Camera Face Pose Alignment. Academic Platform Journal of Engineering and Smart Systems 13 2 22–30.

IEEE

[1] B. B. Oral, A. Çakıcı, and A. Savran, “Evaluation of Convolutional Networks for Event Camera Face Pose Alignment”, APJESS, vol. 13, no. 2, pp. 22–30, May 2025, doi: 10.21541/apjess.1417068.

ISNAD

Oral, Burhan Burak - Çakıcı, Alptuğ - Savran, Arman. “Evaluation of Convolutional Networks for Event Camera Face Pose Alignment”. Academic Platform Journal of Engineering and Smart Systems 13/2 (May 1, 2025): 22-30. https://doi.org/10.21541/apjess.1417068.

JAMA

1. Oral BB, Çakıcı A, Savran A. Evaluation of Convolutional Networks for Event Camera Face Pose Alignment. APJESS. 2025;13:22–30.

MLA

Oral, Burhan Burak, et al. “Evaluation of Convolutional Networks for Event Camera Face Pose Alignment”. Academic Platform Journal of Engineering and Smart Systems, vol. 13, no. 2, May 2025, pp. 22-30, doi:10.21541/apjess.1417068.

Vancouver

1. Burhan Burak Oral, Alptuğ Çakıcı, Arman Savran. Evaluation of Convolutional Networks for Event Camera Face Pose Alignment. APJESS. 2025 May 1;13(2):22-30. doi:10.21541/apjess.1417068