[CVPR 2020] Demo, training, and evaluation code for joint hand-object pose estimation in sparsely annotated videos
Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, and Cordelia Schmid
git clone https://github.com/hassony2/handobjectconsist
cd handobjectconsist
conda env create --file=environment.yml
conda activate handobject_env
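Optionally, sanity-check that the environment resolved PyTorch, which the training and demo code relies on (this is just a quick check, not part of the repository):

```python
# Optional check: confirm PyTorch imports and report whether CUDA is
# available (training is much faster on a GPU).
import torch

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```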
Go to the MANO website.
Create an account by clicking Sign Up and providing your information.
Download Models and Code (the downloaded file should have the format mano_v*_*.zip). Note that all code and data from this download fall under the MANO license.
Unzip the archive and copy the contents of the models folder into the assets/mano folder.
Your structure should look like this:
handobjectconsist/
  assets/
    mano/
      MANO_LEFT.pkl
      MANO_RIGHT.pkl
      fhb_skel_centeridx9.pkl
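To confirm the assets landed in the right place, you can run a quick check (a hypothetical helper, not part of the repository):

```python
# Hypothetical sanity check: verify the MANO assets are where the code
# expects them.
import os

required = [
    "assets/mano/MANO_LEFT.pkl",
    "assets/mano/MANO_RIGHT.pkl",
    "assets/mano/fhb_skel_centeridx9.pkl",
]
for path in required:
    print("ok" if os.path.isfile(path) else "MISSING", path)
```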
Download the First-Person Hand Action Benchmark (FPHAB) dataset to the data/fhbhands folder.
Unzip the Object_models:
unzip data/fhbhands/Object_models.zip -d data/fhbhands
Unzip the MANO fits:
tar -xvf assets/fhbhands_fits.tgz -C assets/
Download the pre-trained models:
wget https://github.com/hassony2/handobjectconsist/releases/download/v0.2/releasemodels.zip
unzip releasemodels.zip
Optionally, resize the images (this speeds up training!):
python reduce_fphab.py
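The script produces the Video_files_480 folder listed below. As a rough illustration, the resize amounts to something like the following (a minimal sketch; the actual logic, including the exact target resolution convention, lives in reduce_fphab.py):

```python
# Minimal sketch of the optional resize step, assuming the FPHAB color frames
# are .jpeg files under Video_files/. See reduce_fphab.py for the real logic.
from pathlib import Path
from PIL import Image

src_root = Path("data/fhbhands/Video_files")
dst_root = Path("data/fhbhands/Video_files_480")

for img_path in src_root.rglob("*.jpeg"):
    dst_path = dst_root / img_path.relative_to(src_root)
    dst_path.parent.mkdir(parents=True, exist_ok=True)
    with Image.open(img_path) as img:
        # Scale so the smaller image dimension becomes 480 pixels (assumption).
        scale = 480 / min(img.size)
        new_size = (round(img.width * scale), round(img.height * scale))
        img.resize(new_size).save(dst_path)
```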
Your structure should look like this:
data/
  fhbhands/
    Video_files/
    Video_files_480/  # Optional, created by the reduce_fphab.py script
    Subjects_info/
    Object_models/
    Hand_pose_annotation_v1/
    Object_6D_pose_annotation_v1_1/
assets/
  fhbhands_fits/
releasemodels/
  fphab/
  ...
Note that all results in our paper are reported on a subset of the current dataset, which was published as an early release; additionally, we used synthetic data which is not released. The results are therefore not directly comparable with the final published results, which are reported on the v2 version of the dataset.
After submission, I retrained a baseline model on the current dataset (the official release of HO3D, which I refer to as HO3D-v2). You can get the model from the release models downloaded above (releasemodels/ho3dv2/realonly).
Evaluate the pre-trained model:
Download the pre-trained models (see above).
Extract the pre-trained models: unzip releasemodels.zip
Run the evaluation code and generate the CodaLab submission file:
python evalho3dv2.py --resume releasemodels/ho3dv2/realonly/checkpoint_200.pth --val_split test --json_folder jsonres/res
This will create a pred.zip file ready for upload to the CodaLab challenge.
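If you want to inspect the archive before uploading, a quick (hypothetical) check:

```python
# Hypothetical pre-upload check: list the contents of the generated
# submission archive.
import zipfile

with zipfile.ZipFile("pred.zip") as zf:
    for info in zf.infolist():
        print(f"{info.filename}: {info.file_size} bytes")
```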
Retrain the model:
Download the HO3D-v2 dataset.
Launch training using python trainmeshreg.py, providing all arguments as in releasemodels/ho3dv2/realonly/opt.txt.
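To see the exact configuration used for the released model, print the recorded options (assuming opt.txt is a plain-text dump of the training flags):

```python
# Print the recorded training options for the released HO3D-v2 model so you
# can mirror them on the command line.
with open("releasemodels/ho3dv2/realonly/opt.txt") as f:
    print(f.read())
```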
Run the demo on the FPHAB dataset.
python visualize.py
This script loads three models and visualizes their predictions on samples from the test split of FPHAB.
Run the training code
Train a baseline model on the entire FPHAB dataset (100% of the data is supervised with 3D annotations):
python trainmeshreg.py --freeze_batchnorm --workers 8 --block_rot
Train in the sparsely supervised setting in two steps.
Step 1: train a baseline model on a small fraction (here 0.625%) of the annotated data:
python trainmeshreg.py --freeze_batchnorm --workers 8 --fraction 0.00625 --eval_freq 50
Step 1 will have produced a trained model, saved in a subdirectory of checkpoints/fhbhands_train_mini1/{date_you_launched_training}/.
Step 2 will resume training from this model and further train with the additional photometric consistency loss on the frames for which the ground-truth annotations are not used:
python trainmeshwarp.py --freeze_batchnorm --consist_gt_refs --workers 8 --fraction 0.00625 --resume checkpoints/path/to/saved/checkpoint.pth
For comparison, you can run the same step with the photometric consistency loss disabled (supervised data term only):
python trainmeshwarp.py --freeze_batchnorm --consist_gt_refs --workers 8 --fraction 0.00625 --resume checkpoints/path/to/saved/checkpoint.pth --lambda_data 1 --lambda_consist 0
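As a rough illustration of what the consistency term computes, here is a minimal sketch (not the repository's implementation; in the paper the warp between frames is induced by the predicted hand-object meshes, and the actual loss lives in trainmeshwarp.py):

```python
# Minimal sketch of a photometric consistency term: compare a reference frame
# warped into the current view against the current frame, over valid pixels.
import torch


def photometric_consistency(warped_ref: torch.Tensor,
                            target: torch.Tensor,
                            valid_mask: torch.Tensor) -> torch.Tensor:
    """Masked L1 difference between the warped reference and the target image."""
    diff = (warped_ref - target).abs() * valid_mask
    return diff.sum() / valid_mask.sum().clamp(min=1)
```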
If you find this code useful for your research, consider citing our paper:
@INPROCEEDINGS{hasson20_handobjectconsist,
title = {Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction},
author = {Hasson, Yana and Tekin, Bugra and Bogo, Federica and Laptev, Ivan and Pollefeys, Marc and Schmid, Cordelia},
booktitle = {CVPR},
year = {2020}
}
Thanks to Samira Kaviani for spotting that the splits in Table 2 are different: I had previously filtered out frames in which the hand is further than 10 cm away from the object! I will rerun the results beginning in September and update them here.
For this project, we relied on research code from several other projects.
I would especially like to thank Shreyas Hampali for advice on the HO-3D dataset and Guillermo Garcia-Hernando for advice on the FPHAB dataset.
I would also like to thank Mihai Dusmanu, Yann Labbé and Thomas Eboli for helpful discussions and proofreading!