The implementation of the paper 'ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation' (CVPR2022). ArXiv
- Ubuntu 18.04
- CUDA 11.1
- Python 3.6
- bop_toolkit
- PyTorch 1.10
- torchvision 0.11.0
- opencv-python
- Progressive-X
Download with git clone --recurse-submodules so that bop_toolkit will also be cloned.
- Download the dataset from BOP benchmark
- Download the required ZebraPose ground truth folders from owncloud. The folders are models_GT_color, XX_GT (e.g. train_real_GT and test_GT) and models (models is optional, only needed if you want to generate the GT from scratch).
- The expected data structure:
    .
    └── BOP ROOT PATH/
        ├── lmo
        ├── ycbv/
        │   ├── models
        │   ├── models_eval
        │   ├── models_fine
        │   ├── test
        │   ├── train_pbr
        │   ├── train_real
        │   ├── ...               #(other files from BOP page)
        │   ├── models_GT_color   #(from last step)
        │   ├── train_pbr_GT      #(from last step)
        │   ├── train_real_GT     #(from last step)
        │   ├── test_GT           #(from last step)
        │   ├── train_pbr_GT_v2   #(from last step, for symmetry aware training)
        │   ├── train_real_GT_v2  #(from last step, for symmetry aware training)
        │   └── test_GT_v2        #(from last step, for symmetry aware training)
        └── tless
- Download the 3 pretrained ResNet backbones and save them under zebrapose/pretrained_backbone/resnet. Download the pretrained EfficientNet from "https://download.pytorch.org/models/efficientnet_b4_rwightman-7eb33cd5.pth" and save it under zebrapose/pretrained_backbone/efficientnet
- (Optional) Instead of downloading the ground truth, you can also generate it from scratch; see Generate_GT.md for details.
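Before training, it can help to confirm that the folders from the expected data structure above are actually in place. The following is a minimal sketch and not part of the repository: the helper name `missing_bop_folders` and the two name tuples are assumptions made for illustration, with the folder names copied from the ycbv example in the tree. Other datasets may ship a different set of splits, so pass your own list where needed.

```python
from pathlib import Path

# Folder names copied from the ycbv example in the tree above (illustrative only).
YCBV_EXPECTED = (
    "models", "models_eval", "models_fine",
    "test", "train_pbr", "train_real",
    "models_GT_color", "train_pbr_GT", "train_real_GT", "test_GT",
)
# Extra folders that the tree marks as needed for symmetry aware training.
YCBV_EXPECTED_SYM = ("train_pbr_GT_v2", "train_real_GT_v2", "test_GT_v2")


def missing_bop_folders(bop_root, dataset, expected=YCBV_EXPECTED):
    """Return the expected sub-folders of <bop_root>/<dataset> that do not exist."""
    dataset_dir = Path(bop_root) / dataset
    return [name for name in expected if not (dataset_dir / name).is_dir()]
```

For example, `missing_bop_folders("/path/to/BOP_ROOT", "ycbv")` returns an empty list when every folder is present, where `/path/to/BOP_ROOT` is a placeholder for your own BOP root path.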
Adjust the paths in the config files, and train the network with train.py, e.g.
python train.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape
The script saves the last 3 checkpoints and the best checkpoint, as well as a tensorboard log. To enable symmetry aware training, add --sym_aware_training True
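Since train.py takes a single --obj_name, training several objects means calling it once per object. The sketch below shows one way to assemble those calls; it is not part of the repository. The function names `build_train_cmd` and `train_objects` are hypothetical, and only the flags shown in the command above (--cfg, --obj_name, --sym_aware_training) are used.

```python
import subprocess
import sys


def build_train_cmd(cfg, obj_name, sym_aware=False):
    """Assemble the train.py command line for one object."""
    cmd = [sys.executable, "train.py", "--cfg", cfg, "--obj_name", obj_name]
    if sym_aware:
        cmd += ["--sym_aware_training", "True"]
    return cmd


def train_objects(cfg, obj_names, sym_aware=False, dry_run=True):
    """Print one training command per object, and run it unless dry_run is set."""
    for obj_name in obj_names:
        cmd = build_train_cmd(cfg, obj_name, sym_aware)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the loop at the first failing training job.
            subprocess.run(cmd, check=True)
```

Calling `train_objects("config/config_BOP/lmo/exp_lmo_BOP.txt", ["ape"])` only prints the command; pass `dry_run=False` from the directory that contains train.py to launch it. Object names must match those expected by the chosen config.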
For most datasets, a specific object occurs only once in a test image.
python test.py --cfg config/config_BOP/lmo/exp_lmo_BOP.txt --obj_name ape --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --eval_output_path path/to/save/the/evaluation/report
To use ICP for refinement, use --use_icp True
For datasets like tless, the number of instances of a specific object is unknown in the test stage.
python test_vivo.py --cfg config/config_BOP/tless/exp_tless_BOP.txt --ckpt_file path/to/the/best/checkpoint --ignore_bit 0 --obj_name obj01 --eval_output_path path/to/save/the/evaluation/report
To use ICP for refinement, use --use_icp True
Download our trained model from this link. Progressive-X does not allow setting a random seed through its Python API, so the ADD results can vary by +/- 0.5%.
Merge the .csv files generated in the last step using tools_for_BOP/merge_csv.py, e.g.
python merge_csv.py --input_dir /dir/to/pose_result_bop/lmo --output_fn zebrapose_lmo-test.csv
Then evaluate the merged file following the instructions of bop_toolkit
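To make the merge step concrete, here is a minimal sketch of what concatenating per-object result files could look like. It is not the repository's merge_csv.py, whose internals are not described here. The function name `merge_csv_files` and the `has_header` switch are assumptions, and the sketch assumes that all input files share the same columns.

```python
import csv
from pathlib import Path


def merge_csv_files(input_dir, output_fn, has_header=False):
    """Concatenate all .csv files in input_dir into output_fn.

    Returns the number of data rows written. If has_header is True, the first
    row of each file is treated as a header and written only once.
    """
    output_path = Path(output_fn).resolve()
    # Sort for a reproducible order and never read the output file itself.
    files = sorted(
        p for p in Path(input_dir).glob("*.csv") if p.resolve() != output_path
    )
    n_rows = 0
    header_written = False
    with open(output_path, "w", newline="") as out_f:
        writer = csv.writer(out_f)
        for path in files:
            with open(path, newline="") as in_f:
                rows = list(csv.reader(in_f))
            if has_header and rows:
                if not header_written:
                    writer.writerow(rows[0])
                    header_written = True
                rows = rows[1:]
            writer.writerows(rows)
            n_rows += len(rows)
    return n_rows
```

The output file is excluded from the inputs, so writing the merged file into the same folder and re-running the merge does not duplicate rows.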
The results were reported with the same checkpoints. We fixed a bug that only influences the inference results:
The PnP solver requires the Bbox size to calculate the 2D pixel location in the original image. We modified the Bbox size in the dataloader, but did not propagate this modification to the PnP solver. If you remove get_final_Bbox from the dataloader, you will get the results reported in v1.
The bug has more influence if we resize the Bbox using crop_square_resize. After fixing the bug, we used crop_square_resize for the BOP challenge (instead of the crop_resize used in the config files in config_paper). We think this resize method should work better since it does not introduce distortion. However, we did not compare the resize methods experimentally.
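The role of the Bbox in the bug described above can be illustrated with a small sketch. It is not the repository's code: `square_bbox` and `crop_to_image` are hypothetical helpers, the Bbox is assumed to be given as (x, y, w, h) in pixels, the square crop is assumed to share its center with the original Bbox, and pixel-center conventions are ignored. The point is that mapping a crop pixel back to the original image depends on the Bbox that was actually used for cropping, so a PnP solver fed with a different Bbox receives shifted 2D locations.

```python
def square_bbox(bbox):
    """Expand (x, y, w, h) to a square with the same center (assumed behaviour)."""
    x, y, w, h = bbox
    side = max(w, h)
    return (x + (w - side) / 2.0, y + (h - side) / 2.0, side, side)


def crop_to_image(u, v, bbox, crop_size):
    """Map pixel (u, v) of the resized crop back to original image coordinates."""
    x, y, w, h = bbox
    crop_w, crop_h = crop_size
    # Each crop pixel covers w / crop_w image pixels horizontally and
    # h / crop_h vertically; the two scales differ for a non-square Bbox.
    return (x + u * w / crop_w, y + v * h / crop_h)


bbox = (100, 50, 40, 80)   # detected Bbox, taller than wide
crop = (128, 128)          # network input resolution (illustrative value)

# Top-left crop pixel mapped with the original and with the squared Bbox.
print(crop_to_image(0, 0, bbox, crop))               # (100.0, 50.0)
print(crop_to_image(0, 0, square_bbox(bbox), crop))  # (80.0, 50.0)
```

In this example the same crop pixel lands 20 pixels apart depending on which Bbox is used. With the squared Bbox the horizontal and vertical scales are equal (80 / 128 each), which is the sense in which a square crop avoids distortion.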
The original code has been developed together with Mahdi Saleh. Some code is adapted from Pix2Pose, SingleShotPose, GDR-Net, and Deeplabv3.
@article{su2022zebrapose,
title={ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation},
author={Su, Yongzhi and Saleh, Mahdi and Fetzer, Torben and Rambach, Jason and Navab, Nassir and Busam, Benjamin and Stricker, Didier and Tombari, Federico},
journal={arXiv preprint arXiv:2203.09418},
year={2022}
}