A Region Covariances-based Visual Attention Model for RGB-D Images

Erkut Erdem

Year 2016, Volume: 4 Issue: 4, 128 - 134, 06.12.2016

Abstract

Existing computational models of visual attention generally employ simple image features such as color, intensity or orientation to generate a saliency map that highlights the image parts that attract human attention. Interestingly, most of these models do not process any depth information and operate only on standard two-dimensional RGB images. Depth processing through stereo vision, however, is a key characteristic of the human visual system. In line with this observation, we propose to extend two state-of-the-art static saliency models that depend on region covariances so that they also process the additional depth information available in RGB-D images. We evaluate our proposed models on the NUS-3D benchmark dataset using several evaluation metrics. Our results reveal that using the additional depth information improves saliency prediction in a statistically significant manner, giving more accurate saliency maps.
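To make the idea of a region covariance descriptor on RGB-D data concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not state which per-pixel features the models use, so the feature vector `[R, G, B, D, x, y]`, the function name `region_covariance`, and the square-region parameters are all assumptions chosen for illustration.

```python
import numpy as np

def region_covariance(rgb, depth, top, left, size):
    """Covariance descriptor of a square region of an RGB-D image.

    Per-pixel feature vector (an assumption, not the paper's feature set):
    [R, G, B, D, x, y].
    """
    # Colour and depth values of the region, one row per pixel.
    patch_rgb = rgb[top:top + size, left:left + size].reshape(-1, 3)
    patch_d = depth[top:top + size, left:left + size].reshape(-1, 1)
    # Pixel coordinates in the same row-major order as the patches.
    ys, xs = np.mgrid[top:top + size, left:left + size]
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)
    feats = np.hstack([patch_rgb, patch_d, coords]).astype(np.float64)
    # Columns are feature dimensions, rows are pixels.
    return np.cov(feats, rowvar=False)

# Hypothetical random data standing in for a real RGB-D image.
rng = np.random.default_rng(0)
rgb = rng.random((32, 32, 3))
depth = rng.random((32, 32))
C = region_covariance(rgb, depth, top=8, left=8, size=8)
print(C.shape)
```

Under these assumptions, adding depth simply enlarges the descriptor by one row and column, so the covariance captures how depth co-varies with colour and position inside the region.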

Keywords

Visual attention; Visual saliency; Depth saliency; RGB-D images; Region covariances


Details

Section	Research Article
Authors	Erkut Erdem
Publication Date	December 6, 2016
Published in Issue	Year 2016 Volume: 4 Issue: 4

Cite

APA	Erdem, E. (2016). A Region Covariances-based Visual Attention Model for RGB-D Images. International Journal of Intelligent Systems and Applications in Engineering, 4(4), 128-134.
AMA	Erdem E. A Region Covariances-based Visual Attention Model for RGB-D Images. International Journal of Intelligent Systems and Applications in Engineering. December 2016;4(4):128-134.
Chicago	Erdem, Erkut. “A Region Covariances-Based Visual Attention Model for RGB-D Images”. International Journal of Intelligent Systems and Applications in Engineering 4, no. 4 (December 2016): 128-34.
EndNote	Erdem E (01 December 2016) A Region Covariances-based Visual Attention Model for RGB-D Images. International Journal of Intelligent Systems and Applications in Engineering 4 4 128–134.
IEEE	E. Erdem, “A Region Covariances-based Visual Attention Model for RGB-D Images”, International Journal of Intelligent Systems and Applications in Engineering, vol. 4, no. 4, pp. 128–134, 2016.
ISNAD	Erdem, Erkut. “A Region Covariances-Based Visual Attention Model for RGB-D Images”. International Journal of Intelligent Systems and Applications in Engineering 4/4 (December 2016), 128-134.
JAMA	Erdem E. A Region Covariances-based Visual Attention Model for RGB-D Images. International Journal of Intelligent Systems and Applications in Engineering. 2016;4:128–134.
MLA	Erdem, Erkut. “A Region Covariances-Based Visual Attention Model for RGB-D Images”. International Journal of Intelligent Systems and Applications in Engineering, vol. 4, no. 4, 2016, pp. 128-34.
Vancouver	Erdem E. A Region Covariances-based Visual Attention Model for RGB-D Images. International Journal of Intelligent Systems and Applications in Engineering. 2016;4(4):128-34.
