3D Face Recognition, implemented with PyTorch
This is a 3D Face Recognizer, a.k.a. 2.5D Face Recognizer in many cases, implemented with PyTorch
.
Another implementation with tensorflow can be found there.
Given a pair of an RGB image and a depth image, that is, a four-dimensional image, the recognizer needs to recognize the face in the image.
The following is a detailed description of the dataset.
Dataset is available from here.
The data set contains 403,067 pairs of face images of 1,208 people. Each pair of face images is registered and contains an RGB image and a depth image.
I use ResNet50
models (CNN) for feature extraction, whose input channel is slighted modified so that 4D channel images can be used as data inputs.
I am about to compare the performance of some different classification loss functions, such as softmax
, triplet loss
, etc.
Below are some of the trained models and their accuracy on the dataset. The list will continue to be updated, please keep your attention.
Model Name | Architecture | Accuracy | Descriptions |
---|---|---|---|
RGB-ResNet50-from-imagenet.pkl | RGB ResNet50 | 94.47% | Pretrained on ImageNet.
|
RGB-D-ResNet50-from-scratch.pkl | RGB-D ResNet50 | 88.36% | Training from scratch.
|
RGB-D-ResNet50-from-imagenet.pkl | RGB-D ResNet50 | 94.64% | Pretrained on ImageNet.
|
pip install -r requirements
The model should be fed with images of fixed size, therefore we need to perform face alignment first.
Please refer to the code of davidsandberg/facenet.
# align.sh
export PYTHONPATH=${PWD}/src
python preprocess/align/align_dataset_mtcnn.py \
--input_dir /mnt/sdb/vggface3 \
--output_dir /mnt/sdb/vggface3_align \
--image_size 182 \
--margin 44 \
--random_order \
--thread_num 3 \
--gpu_memory_fraction 0.88
To split the whole dataset randomly into 3 sub-datasets, (i.e., training dataset, evaluation dataset, test dataset), by generating 3 corresponding csv files to record the paths and labels of each images.
python preprocess/get_dataset_csv.py
After that, the file structure of the data set is as follows.
vggface3d
|── train.csv
|── eval.csv
|── test.csv
|── dirty.csv
├── n000853
│ ├── 0001_03.npy
│ ├── 0001_03.png
│ ├── 0002_01.npy
│ ├── 0002_01.png
│ ├── 0003_01.npy
│ ├── 0003_01.png
│ ├── 0004_01.npy
│ ├── ......
Multi-GPU training will be supported soon.
python train_softmax.py --train_dataset_csv '~/vggface3d_sm/train.csv' \
--eval_dataset_csv '~/vggface3d_sm/eval.csv' \
--pretrained_on_imagenet \
--input_channels 4 \
--num_of_classes 1200 \
--num_of_epochs 50 \
--num_of_workers 8 \
--log_base_dir './logs'
usage: train_softmax.py [-h] [--train_dataset_csv TRAIN_DATASET_CSV]
[--eval_dataset_csv EVAL_DATASET_CSV]
[--pretrained_on_imagenet]
[--pretrained_model_path PRETRAINED_MODEL_PATH]
[--pretrained_optim_path PRETRAINED_OPTIM_PATH]
[--input_channels INPUT_CHANNELS]
[--num_of_classes NUM_OF_CLASSES]
[--num_of_epochs NUM_OF_EPOCHS]
[--image_size IMAGE_SIZE] [--batch_size BATCH_SIZE]
[--num_of_workers NUM_OF_WORKERS]
[--logs_base_dir LOGS_BASE_DIR]
optional arguments:
-h, --help show this help message and exit
--train_dataset_csv TRAIN_DATASET_CSV
The path of csv file where to write paths of training
images.
--eval_dataset_csv EVAL_DATASET_CSV
The path of csv file where to write paths of
validation images.
--pretrained_on_imagenet
(bool) Whether to load the imagenet pretrained model.
--pretrained_model_path PRETRAINED_MODEL_PATH
Load a pretrained model before training starts.
--pretrained_optim_path PRETRAINED_OPTIM_PATH
Load a optimizer before training starts.
--input_channels INPUT_CHANNELS
Number of channels of the first input layer.
--num_of_classes NUM_OF_CLASSES
Number of channels of the last output layer.
--num_of_epochs NUM_OF_EPOCHS
Number of epochs to run.
--image_size IMAGE_SIZE
Image size (height, width) in pixels.
--batch_size BATCH_SIZE
Number of images to process in a batch.
--num_of_workers NUM_OF_WORKERS
Number of subprocesses to use for data loading.
--logs_base_dir LOGS_BASE_DIR
Directory where to write event logs and save models.
python evaluation.py \
--pretrained_model_path ./RGB-D-ResNet50-from-scratch.pkl \
--num_of_workers 8
usage: evaluation.py [-h] [--test_dataset_csv TEST_DATASET_CSV]
[--pretrained_model_path PRETRAINED_MODEL_PATH]
[--input_channels INPUT_CHANNELS]
[--num_of_classes NUM_OF_CLASSES]
[--image_size IMAGE_SIZE] [--batch_size BATCH_SIZE]
[--num_of_workers NUM_OF_WORKERS]
optional arguments:
-h, --help show this help message and exit
--test_dataset_csv TEST_DATASET_CSV
The path of csv file where to write paths of test
images.
--pretrained_model_path PRETRAINED_MODEL_PATH
The path of the pretrained model.
--input_channels INPUT_CHANNELS
Number of channels of the first input layer.
--num_of_classes NUM_OF_CLASSES
Number of channels of the last output layer.
--image_size IMAGE_SIZE
Image size (height, width) in pixels.
--batch_size BATCH_SIZE
Number of images to process in a batch.
--num_of_workers NUM_OF_WORKERS
Number of subprocesses to use for data loading.
Available soon.