Research Article

Attentive Sequential Auto-Encoding Towards Unsupervised Object-centric Scene Modeling

Volume: 10 Number: 4 December 30, 2022
Abstract

This paper describes an unsupervised sequential auto-encoding model for multi-object scenes. The proposed model uses an attention-based formulation trained with reconstruction-driven losses. At its core, the model iteratively writes regions onto a canvas in a differentiable manner. To direct attention to objects and/or their parts, the model combines a convolutional localization network, a region-level bottleneck auto-encoder, and a loss term that encourages reconstruction within a limited number of iterations. An extended version of the model adds a background modeling component intended to handle scenes with complex backgrounds. The model is evaluated on two datasets: a synthetic dataset constructed by composing MNIST digit instances, and the MS-COCO dataset. The model achieves high reconstruction ability on the MNIST-based scenes, and the extended model shows promising results on the complex and challenging MS-COCO scenes.
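
The idea of iteratively writing regions onto a canvas can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the write operator below uses separable Gaussian filter banks (a DRAW-style attention write, swapped in here as an assumption), and the names `filterbank`, `write_patch`, and `compose` are hypothetical. The localization network and the region-level auto-encoder are omitted; patch contents and locations are simply passed in.

```python
import numpy as np

def filterbank(center, stride, sigma, n, size):
    """Return an (n, size) bank of row-normalized 1-D Gaussian filters."""
    # Filter means are spaced `stride` pixels apart, centered on `center`.
    mu = center + (np.arange(n) - (n - 1) / 2.0) * stride
    grid = np.arange(size)
    f = np.exp(-0.5 * ((grid[None, :] - mu[:, None]) / sigma) ** 2)
    return f / (f.sum(axis=1, keepdims=True) + 1e-8)

def write_patch(canvas, patch, cx, cy, stride, sigma=0.5):
    """Additively write a square patch onto the canvas at (cx, cy).

    Every operation is smooth in cx, cy and stride, so the same
    computation would be differentiable in an autodiff framework.
    """
    n = patch.shape[0]
    h, w = canvas.shape
    fy = filterbank(cy, stride, sigma, n, h)  # (n, h)
    fx = filterbank(cx, stride, sigma, n, w)  # (n, w)
    return canvas + fy.T @ patch @ fx

def compose(patches, locations, canvas_shape):
    """Build a scene by writing one region per iteration."""
    canvas = np.zeros(canvas_shape)
    for patch, (cx, cy, stride) in zip(patches, locations):
        canvas = write_patch(canvas, patch, cx, cy, stride)
    return canvas

# Two 4x4 patches written at different locations on a 16x16 canvas.
patches = [np.ones((4, 4)), np.ones((4, 4))]
locations = [(4.0, 4.0, 1.0), (11.0, 11.0, 1.0)]
canvas = compose(patches, locations, (16, 16))
```

Because each filter row is normalized, a write approximately preserves the total intensity of the patch, which makes it easy to check that both regions landed on the canvas.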

Supporting Institution

TUBITAK

Project Number

116E445

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

December 30, 2022

Submission Date

July 2, 2022

Acceptance Date

November 15, 2022

Published in Issue

Year 2022 Volume: 10 Number: 4

APA
Çetin, Y. D., & Cinbiş, R. G. (2022). Attentive Sequential Auto-Encoding Towards Unsupervised Object-centric Scene Modeling. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, 10(4), 1127-1142. https://doi.org/10.29109/gujsc.1139701

e-ISSN: 2147-9526