
EPro-PnP

📢 NEWS: We have released EPro-PnP-v2. An updated preprint can be found on arXiv.

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
In CVPR 2022 (Oral, Best Student Paper). [paper][video]
Hansheng Chen*1,2, Pichao Wang†2, Fan Wang2, Wei Tian†1, Lu Xiong1, Hao Li2

1Tongji University, 2Alibaba Group
*Part of work done during an internship at Alibaba Group.
†Corresponding Authors: Pichao Wang, Wei Tian.

Introduction

EPro-PnP is a probabilistic Perspective-n-Points (PnP) layer for end-to-end 6DoF pose estimation networks. In essence, it is a continuous counterpart of the widely used categorical Softmax layer, and it is theoretically generalizable to other learning models with nested optimization.

Given the layer input: an $n$-point correspondence set $X = \{ x_i^{3D}, x_i^{2D}, w_i^{2D} \mid i = 1 \cdots n \}$ consisting of 3D object coordinates $x_i^{3D}$, 2D image coordinates $x_i^{2D}$, and 2D weights $w_i^{2D}$, a conventional PnP solver searches for an optimal pose $y^\ast$ (a rigid transformation in SE(3)) that minimizes the weighted reprojection error. Previous work tries to backpropagate through the PnP operation, yet $y^\ast$ is inherently non-differentiable due to the inner $\mathrm{argmin}$ operation. This leads to convergence issues if all the components in $X$ must be learned by the network.
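For concreteness, the sketch below evaluates such a weighted reprojection cost for a single pose hypothesis. It is a minimal PyTorch illustration under our own naming, assuming a yaw-only 4-DoF pose parameterization $[t_x, t_y, t_z, \theta]$ (as commonly used in driving scenes); it is not the repository's implementation.

```python
import torch

def weighted_reproj_cost(pose, x3d, x2d, w2d, K):
    """Weighted reprojection cost of one pose hypothesis (illustrative).

    pose: (4,) tensor [tx, ty, tz, theta] -- yaw-only 4-DoF pose (an assumption)
    x3d:  (n, 3) 3D object coordinates
    x2d:  (n, 2) 2D image coordinates
    w2d:  (n, 2) 2D weights
    K:    (3, 3) camera intrinsic matrix
    """
    t, theta = pose[:3], pose[3]
    c, s = torch.cos(theta), torch.sin(theta)
    zero, one = torch.zeros_like(c), torch.ones_like(c)
    # rotation about the camera y-axis (yaw)
    R = torch.stack([
        torch.stack([c, zero, s]),
        torch.stack([zero, one, zero]),
        torch.stack([-s, zero, c]),
    ])
    x_cam = x3d @ R.T + t                                # object -> camera frame
    x_img = x_cam @ K.T                                  # apply intrinsics
    x_img = x_img[:, :2] / x_img[:, 2:].clamp(min=1e-6)  # perspective division
    r = w2d * (x_img - x2d)                              # elementwise-weighted residuals
    return 0.5 * (r ** 2).sum()
```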

In contrast, our probabilistic PnP layer outputs a posterior distribution of pose, whose probability density can be derived for proper backpropagation. The distribution is approximated via Monte Carlo sampling. With EPro-PnP, the correspondences can be learned from scratch altogether by minimizing the KL divergence between the predicted and target pose distribution.
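The sketch below illustrates the resulting training objective with a plain Monte Carlo estimate. Up to a constant, minimizing the KL divergence amounts to minimizing $L = \mathrm{cost}(y_{gt}) + \log \int \exp(-\mathrm{cost}(y))\, dy$, where the integral is estimated by importance sampling. The function and variable names here are illustrative; the actual method estimates the integral with an adaptive multiple importance sampling (AMIS) scheme rather than the single-proposal estimator shown here.

```python
import math
import torch

def monte_carlo_pose_loss(cost_fn, y_gt, y_samples, log_q):
    """KL-based pose loss, L = cost(y_gt) + log Z, with the normalizer
    Z estimated by importance sampling:
    Z ~= mean_i exp(-cost(y_i)) / q(y_i), with y_i drawn from proposal q.

    cost_fn:   maps (S, d) pose samples to (S,) reprojection costs
    y_gt:      (d,) target pose
    y_samples: (S, d) samples from the proposal distribution q
    log_q:     (S,) log densities of the samples under q
    """
    cost_gt = cost_fn(y_gt.unsqueeze(0)).squeeze(0)
    log_w = -cost_fn(y_samples) - log_q                  # log[exp(-cost) / q]
    log_Z = torch.logsumexp(log_w, dim=0) - math.log(y_samples.shape[0])
    return cost_gt + log_Z

# Toy check: for cost(y) = ||y||^2 / 2 in d dims, log Z should approach
# (d / 2) * log(2 * pi), so the loss at y_gt = 0 approaches that constant.
d, S = 2, 4096
cost_fn = lambda y: 0.5 * (y ** 2).sum(dim=-1)
q = torch.distributions.MultivariateNormal(torch.zeros(d), 2.0 * torch.eye(d))
y = q.sample((S,))
loss = monte_carlo_pose_loss(cost_fn, torch.zeros(d), y, q.log_prob(y))
```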

Models

V1 models in this repository

EPro-PnP-6DoF for 6DoF pose estimation

EPro-PnP-Det for 3D object detection

New V2 models

EPro-PnP-Det v2: state-of-the-art monocular 3D object detector

Main differences from v1b:

  • Use GaussianMixtureNLLLoss as auxiliary coordinate regression loss (see the sketch after this list)
  • Add auxiliary depth and bbox losses
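The sketch below shows what a Gaussian mixture NLL for coordinate regression can look like. It is a generic diagonal-covariance version under our own naming; the repository's GaussianMixtureNLLLoss may differ in parameterization.

```python
import torch

def gaussian_mixture_nll(target, mu, log_sigma, logit_pi):
    """Negative log-likelihood of `target` under a diagonal Gaussian mixture.

    target:    (N, D) regression targets (e.g. object coordinates)
    mu:        (N, K, D) predicted means of K mixture components
    log_sigma: (N, K, D) predicted log standard deviations
    logit_pi:  (N, K) unnormalized mixture weights
    """
    log_pi = torch.log_softmax(logit_pi, dim=-1)                # (N, K)
    comp = torch.distributions.Normal(mu, log_sigma.exp())
    log_prob = comp.log_prob(target.unsqueeze(1)).sum(dim=-1)   # (N, K)
    return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()
```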

At the time of submission (Aug 30, 2022), EPro-PnP-Det v2 ranks 1st among all camera-based single-frame object detection models on the official nuScenes benchmark (test split, without extra data).

| Method | TTA | Backbone | NDS | mAP | mATE | mASE | mAOE | mAVE | mAAE | Schedule |
|---|---|---|---|---|---|---|---|---|---|---|
| EPro-PnP-Det v2 (ours) | Y | R101 | 0.490 | 0.423 | 0.547 | 0.236 | 0.302 | 1.071 | 0.123 | 12 ep |
| PETR | N | Swin-B | 0.483 | 0.445 | 0.627 | 0.249 | 0.449 | 0.927 | 0.141 | 24 ep |
| BEVDet-Base | Y | Swin-B | 0.482 | 0.422 | 0.529 | 0.236 | 0.395 | 0.979 | 0.152 | 20 ep |
| EPro-PnP-Det v2 (ours) | N | R101 | 0.481 | 0.409 | 0.559 | 0.239 | 0.325 | 1.090 | 0.115 | 12 ep |
| PolarFormer | N | R101 | 0.470 | 0.415 | 0.657 | 0.263 | 0.405 | 0.911 | 0.139 | 24 ep |
| BEVFormer-S | N | R101 | 0.462 | 0.409 | 0.650 | 0.261 | 0.439 | 0.925 | 0.147 | 24 ep |
| PETR | N | R101 | 0.455 | 0.391 | 0.647 | 0.251 | 0.433 | 0.933 | 0.143 | 24 ep |
| EPro-PnP-Det v1 | Y | R101 | 0.453 | 0.373 | 0.605 | 0.243 | 0.359 | 1.067 | 0.124 | 12 ep |
| PGD | Y | R101 | 0.448 | 0.386 | 0.626 | 0.245 | 0.451 | 1.509 | 0.127 | 24+24 ep |
| FCOS3D | Y | R101 | 0.428 | 0.358 | 0.690 | 0.249 | 0.452 | 1.434 | 0.124 | - |

EPro-PnP-6DoF v2 for 6DoF pose estimation

Main differences from v1b:

  • Fix w2d scale handling (very important; see the sketch after this list)
  • Improve network initialization
  • Adjust loss weights
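As a rough illustration of the first point, one way to make the weight scale well-behaved is to decouple it from the spatial distribution of the weights. The sketch below is our assumed version of such a scheme (spatial softmax times a learned positive global scale); it is not copied from the repository, whose exact handling may differ.

```python
import torch

def normalized_w2d(w2d_logits, log_scale):
    """Decoupled weight handling (an assumed scheme, not the repo's exact code):
    the spatial softmax fixes the *distribution* of the weights, while a
    separately predicted positive scale sets their overall *magnitude*.

    w2d_logits: (B, 2, H, W) raw weight logits from the network head
    log_scale:  (B, 2) predicted log of the global weight scale
    """
    B, C, H, W = w2d_logits.shape
    w = torch.softmax(w2d_logits.flatten(2), dim=-1).view(B, C, H, W)
    return w * log_scale.exp()[..., None, None]
```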

With these updates the v2 model can be trained without 3D models to achieve better performance (ADD 0.1d = 93.83) than GDRNet (ADD 0.1d = 93.6), unleashing the full potential of simple end-to-end training.

Use EPro-PnP in Your Own Model

We provide a demo of the usage of the EPro-PnP layer.

Citation

If you find this project useful in your research, please consider citing:

@inproceedings{epropnp,
  author = {Hansheng Chen and Pichao Wang and Fan Wang and Wei Tian and Lu Xiong and Hao Li},
  title = {EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2022}
}