A Coarse-to-Fine Model for 3D Pose Estimation and Sub-category Recognition

Introduction

Although object detection, 3D pose estimation, and sub-category recognition are highly correlated tasks, they are usually addressed independently of each other because of the huge space of parameters. To jointly model all of these tasks, we propose a coarse-to-fine hierarchical representation, where each level of the hierarchy represents objects at a different level of granularity. The hierarchical representation prevents the performance loss that is often caused by the increase in the number of parameters (as we consider more tasks to model), and the joint modeling enables resolving ambiguities that exist in independent modeling of these tasks. We augment the PASCAL3D+ [1] dataset with annotations for these tasks and show that our hierarchical model is effective in joint modeling of object detection, 3D pose estimation, and sub-category recognition.

Publication

Roozbeh Mottaghi, Yu Xiang, and Silvio Savarese. A Coarse-to-Fine Model for 3D Pose Estimation and Sub-category Recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015. bibtex, pdf, supp. material

Annotations and 3D CAD Models

The annotations used in our experiments are here.

The 3D CAD models used in our experiments are here.

References

[1] Y. Xiang, R. Mottaghi, and S. Savarese. Beyond PASCAL: A Benchmark for 3D Object Detection in the Wild. In WACV, 2014.

Acknowledgements

We acknowledge the support of ONR grant N00014-13-1-0761 and NSF CAREER 1054127.

Contact : roozbeh [at] cs.stanford.edu

Last update : 4/28/2015