A Coarse-to-Fine Model for 3D Pose Estimation and Sub-category Recognition


Despite the fact that object detection, 3D pose estimation, and sub-category recognition are highly correlated tasks, they are usually addressed independently from each other because of the huge space of parameters. To jointly model all of these tasks, we propose a coarse-to-fine hierarchical representation, where each level of the hierarchy represents objects at a different level of granularity. The hierarchical representation prevents performance loss, which is often caused by the increase in the number of parameters (as we consider more tasks to model), and the joint modeling enables resolving ambiguities that exist in independent modeling of these tasks. We augment PASCAL3D+ [1] dataset with annotations for these tasks and show that our hierarchical model is effective in joint modeling of object detection, 3D pose estimation, and sub-category recognition.


Annotations and 3D CAD Models

  • The annoations used in our experiments are here.
  • The 3D CAD models used in our experiments are here.


  • Y. Xiang, R. Mottaghi, and S. Savarese. Beyond pascal: A benchmark for 3d object detection in the wild. In WACV,2014.


  • We acknowledge the support of ONR grant N00014-13-1-0761 and NSF CAREER 1054127.

    Contact : roozbeh [at( cs.stanford.edu

    Last update : 4/28/2015