[ICCV 2023] Official implementation of the paper "Neural Interactive Keypoint Detection"
This is the official PyTorch implementation of our ICCV 2023 paper "Neural Interactive Keypoint Detection."
Jie Yang, Ailing Zeng, Feng Li, Shilong Liu, Ruimao Zhang, Lei Zhang
Keywords: Multi-person 2D pose estimation, Human-in-the-loop, Interactive model
We first propose an interactive keypoint detection task for efficient keypoint annotation.
We present the first neural interactive keypoint detection framework, Click-Pose, an end-to-end baseline to annotate multi-person 2D keypoints given an image.
Click-Pose is more than 10 times faster than manual annotation. Importantly, it significantly alleviates model bias in out-of-domain annotation (e.g., on Human-Art), reducing the time required by 83% compared to state-of-the-art model annotation (ViTPose) with manual correction.
Model | Backbone | Lr schd | mAP | AP50 | AP75 | APM | APL | Time (ms) | Download |
---|---|---|---|---|---|---|---|---|---|
ED-Pose | ResNet-50 | 60e | 71.7 | 89.7 | 78.8 | 66.2 | 79.7 | 51 | GitHub, Model |
Click-Pose | ResNet-50 | 40e | 73.0 | 90.4 | 80.0 | 68.1 | 80.5 | 48 | Google Drive |
Model | Backbone | mAP | APM | APL | Download |
---|---|---|---|---|---|
ED-Pose | ResNet-50 | 37.5 | 7.6 | 41.1 | GitHub, Model |
Click-Pose | ResNet-50 | 40.5 | 8.3 | 44.2 | Google Drive |
Model | Backbone | mAP | AP50 | AP75 | Download |
---|---|---|---|---|---|
ED-Pose | ResNet-50 | 31.4 | 39.5 | 35.1 | GitHub, Model |
Click-Pose | ResNet-50 | 33.9 | 43.4 | 37.5 | Google Drive |
Note that the models are trained on the COCO train2017 set and tested on the COCO val2017 set, the Human-Art val set, and the OCHuman test set.
Model | Backbone | NoC@85 | NoC@90 | NoC@95 | Download |
---|---|---|---|---|---|
ViTPose | ViT-Huge | 1.46 | 2.15 | 2.87 | GitHub, Model |
Click-Pose | ResNet-50 | 0.95 | 1.48 | 1.97 | Google Drive |
Model | Backbone | NoC@85 | NoC@90 | NoC@95 | Download |
---|---|---|---|---|---|
ViTPose | ViT-Huge | 9.12 | 9.79 | 10.13 | GitHub, Model |
Click-Pose | ResNet-50 | 4.82 | 5.81 | 6.45 | Google Drive |
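To make the gap in the two NoC tables easier to read, the short Python sketch below computes how much lower the Click-Pose NoC values are relative to ViTPose. The numbers are copied from the tables; the function and dictionary names are illustrative only and are not part of this repository.

```python
# NoC values copied from the two NoC tables above
# (ViTPose with ViT-Huge vs. Click-Pose with ResNet-50).
NOC_TABLES = {
    "first NoC table": {
        "ViTPose": {"NoC@85": 1.46, "NoC@90": 2.15, "NoC@95": 2.87},
        "Click-Pose": {"NoC@85": 0.95, "NoC@90": 1.48, "NoC@95": 1.97},
    },
    "second NoC table": {
        "ViTPose": {"NoC@85": 9.12, "NoC@90": 9.79, "NoC@95": 10.13},
        "Click-Pose": {"NoC@85": 4.82, "NoC@90": 5.81, "NoC@95": 6.45},
    },
}


def relative_reduction(baseline, ours):
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


def summarize(tables):
    """Return {table name: {metric: reduction in percent, rounded to 0.1}}."""
    summary = {}
    for name, table in tables.items():
        summary[name] = {
            metric: round(
                relative_reduction(table["ViTPose"][metric],
                                   table["Click-Pose"][metric]), 1)
            for metric in table["ViTPose"]
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(NOC_TABLES).items():
        print(name, row)
```

Lower NoC means fewer clicks are needed, so a larger percentage here means less manual correction.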
We use ED-Pose as our codebase. We tested our models with python=3.7.3, pytorch=1.9.0, and cuda=11.1. Other versions may work as well.
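As a quick sanity check before installing, the following Python sketch compares the local environment against the tested versions listed above. It is a convenience helper written for this README, not a script shipped with the repository; `installed_versions` and `report` are hypothetical names.

```python
import importlib.util
import platform

# Versions the README states the models were tested with.
TESTED = {"python": "3.7.3", "pytorch": "1.9.0", "cuda": "11.1"}


def installed_versions():
    """Collect the local Python, PyTorch and CUDA versions (None if absent)."""
    versions = {"python": platform.python_version(), "pytorch": None, "cuda": None}
    if importlib.util.find_spec("torch") is not None:
        import torch
        versions["pytorch"] = torch.__version__
        versions["cuda"] = torch.version.cuda  # None on CPU-only builds
    return versions


def report(found, tested=TESTED):
    """Return one human-readable line per component."""
    lines = []
    for name, want in tested.items():
        have = found.get(name)
        if have is None:
            status = "not found"
        elif have.startswith(want):  # e.g. "1.9.0+cu111" starts with "1.9.0"
            status = "matches tested version"
        else:
            status = "differs from tested version"
        lines.append(f"{name}: found {have}, tested {want} ({status})")
    return lines


if __name__ == "__main__":
    print("\n".join(report(installed_versions())))
```

A "differs" line is only informational, since other versions may work as well.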
git clone https://github.com/IDEA-Research/Click-Pose.git
cd Click-Pose
Install PyTorch and torchvision by following the instructions at https://pytorch.org/get-started/locally/.
# an example:
conda install -c pytorch pytorch torchvision
pip install -r requirements.txt
cd models/clickpose/ops
python setup.py build install
# unit test (should see all checking is True)
python test.py
cd ../../..
For COCO data, please download from COCO download. The coco_dir should look like this:
|-- Click-Pose
`-- |-- coco_dir
    `-- |-- annotations
        |   |-- person_keypoints_train2017.json
        |   `-- person_keypoints_val2017.json
        `-- images
            |-- train2017
            |   |-- 000000000009.jpg
            |   |-- 000000000025.jpg
            |   |-- 000000000030.jpg
            |   |-- ...
            `-- val2017
                |-- 000000000139.jpg
                |-- 000000000285.jpg
                |-- 000000000632.jpg
                |-- ...
For Human-Art data, please download from Human-Art download. The humanart_dir should look like this:
|-- Click-Pose
`-- |-- humanart_dir
    `-- |-- annotations
        |   |-- training_humanart.json
        |   |-- validation_humanart.json
        `-- images
            |-- 2D_virtual_human
                |-- ...
            |-- 3D_virtual_human
                |-- ...
            |-- real_human
                |-- ...
For CrowdPose data, please download from CrowdPose download. The crowdpose_dir should look like this:
|-- Click-Pose
`-- |-- crowdpose_dir
    `-- |-- json
        |   |-- crowdpose_train.json
        |   |-- crowdpose_val.json
        |   |-- crowdpose_trainval.json (generated by util/crowdpose_concat_train_val.py)
        |   `-- crowdpose_test.json
        `-- images
            |-- 100000.jpg
            |-- 100001.jpg
            |-- 100002.jpg
            |-- 100003.jpg
            |-- 100004.jpg
            |-- 100005.jpg
            |-- ...
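The crowdpose_trainval.json file above is produced by util/crowdpose_concat_train_val.py. For readers who want to see what such a merge involves, here is a rough Python sketch. It is not the repository's script: it assumes the CrowdPose files are COCO-style JSON with `images` and `annotations` lists and that image and annotation ids do not collide between the two splits. The function names are hypothetical.

```python
import json


def concat_coco_style(train, val):
    """Merge two COCO-style annotation dicts (assumed format, see lead-in).

    Shared metadata such as "categories" is taken from `train`.
    """
    merged = {k: v for k, v in train.items() if k not in ("images", "annotations")}
    merged["images"] = list(train["images"]) + list(val["images"])
    merged["annotations"] = list(train["annotations"]) + list(val["annotations"])
    return merged


def concat_files(train_path, val_path, out_path):
    """Read two JSON files, merge them, and write the result to out_path."""
    with open(train_path) as f:
        train = json.load(f)
    with open(val_path) as f:
        val = json.load(f)
    with open(out_path, "w") as f:
        json.dump(concat_coco_style(train, val), f)
```

When in doubt, prefer the script in util/, which is what the directory listing refers to.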
For OCHuman data, please download from OCHuman download. The ochuman_dir should look like this:
|-- Click-Pose
`-- |-- ochuman_dir
    `-- |-- annotations
        `-- images
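Before launching training or evaluation, it can help to confirm that a dataset directory matches the layouts listed above. The Python sketch below checks only the entries named in this README; `EXPECTED` and `missing_entries` are hypothetical names, and the generated crowdpose_trainval.json is deliberately not required.

```python
import tempfile
from pathlib import Path

# Paths (relative to each dataset root) named in the directory listings above.
EXPECTED = {
    "coco": [
        "annotations/person_keypoints_train2017.json",
        "annotations/person_keypoints_val2017.json",
        "images/train2017",
        "images/val2017",
    ],
    "humanart": [
        "annotations/training_humanart.json",
        "annotations/validation_humanart.json",
        "images",
    ],
    "crowdpose": [
        "json/crowdpose_train.json",
        "json/crowdpose_val.json",
        "json/crowdpose_test.json",
        "images",
    ],
    "ochuman": ["annotations", "images"],
}


def missing_entries(dataset, root):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED[dataset] if not (root / rel).exists()]


if __name__ == "__main__":
    # Demo on an empty temporary directory: every expected entry is missing.
    with tempfile.TemporaryDirectory() as tmp:
        for name in EXPECTED:
            print(name, "missing:", missing_entries(name, tmp))
```

For real use, call `missing_entries("coco", "/path/to/your/coco_dir")` and expect an empty list.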
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--output_dir "logs/ClickPose_Model-Only" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
--dataset_file="coco"
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--output_dir "logs/ClickPose_Neural_Interactive" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
--dataset_file="coco"
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--output_dir "logs/ClickPose_Model-Only_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
--dataset_file="coco" \
--pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
--eval
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_coco.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="coco" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
export CLICKPOSE_COCO_PATH=/path/to/your/coco_dir
export CLICKPOSE_NoC_Test="TRUE"
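for CLICKPOSE_Click_Number in {1..17}
do
export CLICKPOSE_Click_Number  # expose the loop variable to main.py (assumed to read it from the environment)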
python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="coco" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
done
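The fixed-click sweep above can also be driven from Python, which makes it explicit that the click count is handed to each run through the environment. This is a sketch under one assumption: that main.py reads `CLICKPOSE_Click_Number` from the environment, as the loop variable name suggests. `build_runs` is a hypothetical helper; the flags and paths are copied from the command above.

```python
import os
import shlex
import sys

# Options copied from the fixed-click evaluation command above.
OPTIONS = [
    "batch_size=4", "epochs=100", "lr_drop=80", "use_ema=TRUE",
    "human_feedback=TRUE", "feedback_loop_NOC_test=FALSE",
    "feedback_inference=TRUE", "only_correction=FALSE", "num_select=20",
]


def build_runs(coco_dir, checkpoint, max_clicks=17):
    """Return one (env, argv) pair per click count from 1 to max_clicks."""
    runs = []
    for clicks in range(1, max_clicks + 1):
        env = dict(
            os.environ,
            CLICKPOSE_COCO_PATH=coco_dir,
            CLICKPOSE_NoC_Test="TRUE",
            CLICKPOSE_Click_Number=str(clicks),  # assumed to be read by main.py
        )
        argv = [
            sys.executable, "-m", "torch.distributed.launch",
            "--nproc_per_node=4", "--master_port", "3458", "main.py",
            "--output_dir", "logs/ClickPose_Neural_Interactive_eval",
            "-c", "config/clickpose.cfg.py",
            "--options", *OPTIONS,
            "--dataset_file=coco",
            "--pretrain_model_path", checkpoint,
            "--eval",
        ]
        runs.append((env, argv))
    return runs


if __name__ == "__main__":
    # Dry run: print the commands instead of launching them.
    for env, argv in build_runs("/path/to/your/coco_dir",
                                "./models/ClickPose_interactive_R50.pth"):
        clicks = env["CLICKPOSE_Click_Number"]
        print(f"CLICKPOSE_Click_Number={clicks}",
              " ".join(shlex.quote(a) for a in argv))
```

To actually launch a run, pass each pair to `subprocess.run(argv, env=env, check=True)` from the repository root.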
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--output_dir "logs/ClickPose_Model-Only_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
--dataset_file="humanart" \
--pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
--eval
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_humanart.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="humanart" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
export CLICKPOSE_HumanArt_PATH=/path/to/your/humanart_dir
export CLICKPOSE_NoC_Test="TRUE"
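for CLICKPOSE_Click_Number in {1..17}
do
export CLICKPOSE_Click_Number  # expose the loop variable to main.py (assumed to read it from the environment)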
python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="humanart" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
done
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
python -m torch.distributed.launch --nproc_per_node=4 main.py \
--output_dir "logs/ClickPose_Model-Only_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=FALSE feedback_loop_NOC_test=FALSE feedback_inference=FALSE only_correction=FALSE \
--dataset_file="ochuman" \
--pretrain_model_path "./models/ClickPose_model_only_R50.pth" \
--eval
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
export CLICKPOSE_NoC_Test="TRUE"
export CLICKPOSE_SAVE_PATH="./NoC_95_ochuman.json"
export NoC_thr=0.95
python -m torch.distributed.launch --nproc_per_node=1 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=1 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=TRUE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="ochuman" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
export CLICKPOSE_OCHuman_PATH=/path/to/your/ochuman_dir
export CLICKPOSE_NoC_Test="TRUE"
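for CLICKPOSE_Click_Number in {1..17}
do
export CLICKPOSE_Click_Number  # expose the loop variable to main.py (assumed to read it from the environment)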
python -m torch.distributed.launch --nproc_per_node=4 --master_port 3458 main.py \
--output_dir "logs/ClickPose_Neural_Interactive_eval" \
-c config/clickpose.cfg.py \
--options batch_size=4 epochs=100 lr_drop=80 use_ema=TRUE human_feedback=TRUE feedback_loop_NOC_test=FALSE feedback_inference=TRUE only_correction=FALSE num_select=20 \
--dataset_file="ochuman" \
--pretrain_model_path "./models/ClickPose_interactive_R50.pth" \
--eval
done
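The commands above differ mainly in a handful of `--options` switches. The Python sketch below collects those combinations in one place, as read from the commands in this README, and formats them as the `key=VALUE` strings the commands use. The mode names and the `options_for` helper are made up for illustration; only the option keys and values come from the commands.

```python
# Switches that change between the four kinds of run shown above.
MODES = {
    # Model-only training and evaluation.
    "model_only": {
        "human_feedback": False,
        "feedback_loop_NOC_test": False,
        "feedback_inference": False,
    },
    # Neural interactive training.
    "interactive_train": {
        "human_feedback": True,
        "feedback_loop_NOC_test": False,
        "feedback_inference": False,
    },
    # NoC evaluation (the commands above use batch_size=1 and one process).
    "noc_eval": {
        "human_feedback": True,
        "feedback_loop_NOC_test": True,
        "feedback_inference": True,
        "num_select": 20,
    },
    # Evaluation with a fixed number of clicks (1 to 17).
    "fixed_click_eval": {
        "human_feedback": True,
        "feedback_loop_NOC_test": False,
        "feedback_inference": True,
        "num_select": 20,
    },
}


def _fmt(value):
    """Render booleans as TRUE/FALSE, everything else via str()."""
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    return str(value)


def options_for(mode, batch_size=4):
    """Return the list of key=VALUE strings to pass after --options."""
    opts = {"batch_size": batch_size, "epochs": 100, "lr_drop": 80, "use_ema": True}
    opts.update(MODES[mode])
    opts["only_correction"] = False
    return [f"{key}={_fmt(value)}" for key, value in opts.items()]


if __name__ == "__main__":
    for mode in MODES:
        batch = 1 if mode == "noc_eval" else 4
        print(f"{mode}: {' '.join(options_for(mode, batch_size=batch))}")
```

Generating the strings from booleans keeps the spelling of each switch consistent across runs.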
If you find this repository useful for your work, please consider citing it as follows:
@inproceedings{yang2023neural,
title={Neural Interactive Keypoint Detection},
author={Yang, Jie and Zeng, Ailing and Li, Feng and Liu, Shilong and Zhang, Ruimao and Zhang, Lei},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={15122--15132},
year={2023}
}
@inproceedings{yang2022explicit,
title={Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation},
author={Yang, Jie and Zeng, Ailing and Liu, Shilong and Li, Feng and Zhang, Ruimao and Zhang, Lei},
booktitle={The Eleventh International Conference on Learning Representations},
year={2022}
}