EN
Attentive Sequential Auto-Encoding Towards Unsupervised Object-centric Scene Modeling
Abstract
This paper describes an unsupervised sequential auto-encoding model targeting multi-object scenes. The proposed model uses an attention-based formulation, with reconstruction-driven losses. The main model relies on iteratively writing regions onto a canvas, in a differentiable manner. To enforce attention to objects and/or parts, the model uses a convolutional localization network, a region level bottleneck auto-encoder and a loss term that encourages reconstruction within a limited number of iterations. An extended version of the model incorporates a background modeling component that aims at handling scenes with complex backgrounds. The model is evaluated on two separate datasets: a synthetic dataset that is constructed by composing MNIST digit instances together, and the MS-COCO dataset. The model achieves high reconstruction ability on MNIST based scenes. The extended model shows promising results on the complex and challenging MS-COCO scenes.
Keywords
Supporting Institution
TUBITAK
Project Number
116E445
References
- Goodfellow I. J., Pouget-Abadie J., Mirza M., Xu B., Warde-Farley D., Ozair S., Courville A., & Bengio Y. (2014). Generative Adversarial Networks. Advances in Neural Information Processing Systems.
- Arjovsky M., Chintala S., & Bottou L. (2017). Wasserstein GAN. ArXiv:1701.07875 [Cs, Stat].
- Karras T., Laine S., & Aila T. (2019). A Style-Based Generator Architecture for Generative Adversarial Networks. Proc. CVPR.
- Kingma Diederik P., & Welling, M. (2014). Auto-Encoding Variational Bayes. International Conference on Learning Representations.
- Rezende D. J., Mohamed S., & Wierstra D. (2014). Stochastic Backpropagation and Approximate Inference in Deep Generative Models. ArXiv:1401.4082.
- Li Y., Swersky K., & Zemel R. (2015). Generative Moment Matching Networks. PMLR.
- Dinh L., Sohl-Dickstein J., & Bengio S. (2016). Density estimation using Real NVP.
- Kobyzev I., Prince S. J., & Brubaker M. A. (2020). Normalizing flows: An introduction and review of current methods. IEEE transactions on pattern analysis and machine intelligence, 43(11), 3964-3979.
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Publication Date
December 30, 2022
Submission Date
July 2, 2022
Acceptance Date
November 15, 2022
Published in Issue
Year 2022 Volume: 10 Number: 4
