Implementation for SphereFace: Deep Hypersphere Embedding for Face Recognition (CVPR 2017)
By Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj and Le Song
SphereFace is released under the MIT License (refer to the LICENSE file for details).
2022.4.10: If you are looking for an easy-to-use and well-performing PyTorch implementation of SphereFace, we now have it! Check out our official SphereFace PyTorch re-implementation here.
2018.8.14: We recommend an interesting ECCV 2018 paper that comprehensively evaluates SphereFace (A-Softmax) on widely used face datasets, together with its proposed noise-controlled IMDb-Face dataset. Interested users can try training SphereFace on the IMDb-Face dataset. Take a look here.
2018.5.23: A new SphereFace+ that explicitly enhances the inter-class separability has been introduced in our technical report. Check it out here. Code is released here.
2018.2.1: As requested, the prototxt files for SphereFace-64 are released.
2018.1.27: We updated the appendix of our SphereFace paper with useful experiments and analysis. Take a look here. The content contains:
2018.1.20: We updated some resources to summarize the current advances in angular margin learning. Take a look here.
The repository contains the entire pipeline (including all preprocessing steps) for deep face recognition with SphereFace. The recognition pipeline consists of three major steps: face detection, face alignment, and face recognition.
SphereFace is a recently proposed face recognition method. It was initially described in an arXiv technical report and then published in CVPR 2017. The most up-to-date paper with more experiments can be found at arXiv or here. To facilitate face recognition research, we give an example of training on CASIA-WebFace and testing on LFW using the 20-layer CNN architecture described in the paper (i.e., SphereFace-20).
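For reference, the angular-margin function ψ(θ) at the heart of the A-Softmax loss can be sketched in a few lines of plain Python. The function below follows the piecewise definition from the paper (the reported experiments use m = 4):

```python
import math

def a_softmax_psi(theta, m=4):
    """Piecewise angular-margin function psi(theta) from the A-Softmax loss:

        psi(theta) = (-1)**k * cos(m * theta) - 2 * k
        for theta in [k*pi/m, (k+1)*pi/m],  k = 0, ..., m-1.

    psi is monotonically decreasing on [0, pi] and lower-bounds cos(theta),
    which is what imposes the angular margin m between classes."""
    k = min(int(theta * m / math.pi), m - 1)  # segment that theta falls in
    return (-1) ** k * math.cos(m * theta) - 2 * k

# At theta = 0 there is no penalty: psi(0) == cos(0).
print(a_softmax_psi(0.0))  # 1.0
```

During training, the logit of the target class is computed with ψ(θ) in place of cos(θ), so the network must push θ smaller than a plain softmax would require.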
In SphereFace, our network architectures use residual units as building blocks, but are quite different from the standard ResNets (e.g., BatchNorm is not used, PReLU replaces ReLU, different initializations, etc.). We proposed 4-layer, 20-layer, 36-layer and 64-layer architectures for face recognition (details can be found in the paper and prototxt files). We provide the 20-layer architecture as an example here. If our proposed architectures also help your research, please consider citing our paper.
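As a small illustration of one of these differences, PReLU keeps a scaled version of negative inputs instead of zeroing them out as ReLU does. A minimal sketch (here the slope `a` is a fixed constant for illustration; in the networks it is a learned per-channel parameter):

```python
def prelu(x, a=0.25):
    """PReLU activation: identity for positive inputs, slope `a` for
    negative ones. Unlike ReLU, negative activations are only scaled
    down, not discarded, so gradients still flow for x < 0."""
    return x if x > 0 else a * x

print(prelu(2.0))   # 2.0
print(prelu(-2.0))  # -0.5
```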
SphereFace achieves state-of-the-art verification performance (previously No. 1) in the MegaFace Challenge under the small training set protocol.
If you find SphereFace useful in your research, please consider citing:
@InProceedings{Liu_2017_CVPR,
title = {SphereFace: Deep Hypersphere Embedding for Face Recognition},
author = {Liu, Weiyang and Wen, Yandong and Yu, Zhiding and Li, Ming and Raj, Bhiksha and Song, Le},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2017}
}
Another closely related previous work of ours appeared in ICML'16 (more):
@InProceedings{Liu_2016_ICML,
title = {Large-Margin Softmax Loss for Convolutional Neural Networks},
author = {Liu, Weiyang and Wen, Yandong and Yu, Zhiding and Yang, Meng},
booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
year = {2016}
}
Requirements: Matlab; Caffe and matcaffe (see: Caffe installation instructions); MTCNN (see: MTCNN - face detection & alignment) and the Pdollar toolbox (see: Piotr's Image & Video Matlab Toolbox).

Clone the SphereFace repository. We'll call the directory that you cloned SphereFace into $SPHEREFACE_ROOT.
git clone --recursive https://github.com/wy1iu/sphereface.git
Build Caffe and matcaffe
cd $SPHEREFACE_ROOT/tools/caffe-sphereface
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
make all -j8 && make matcaffe
After successfully completing the installation, you are ready to run all the following experiments.
Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/preprocess/
Download the training set (CASIA-WebFace) and the test set (LFW), and place them in data/.
mv /your_path/CASIA_WebFace data/
./code/get_lfw.sh
tar xvf data/lfw.tgz -C data/
Please make sure that the directory data/ contains both datasets.
Detect faces and facial landmarks in the CASIA-WebFace and LFW datasets using MTCNN (see: MTCNN - face detection & alignment).
# In Matlab Command Window
run code/face_detect_demo.m
This will create a file dataList.mat in the directory result/.
Align faces to a canonical pose using similarity transformation.
# In Matlab Command Window
run code/face_align_demo.m
This will create two folders (CASIA-WebFace-112X96/ and lfw-112X96/) in the directory result/, containing the aligned face images.
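The idea behind the alignment step can be sketched in plain Python. A 2D similarity transform (uniform scale, rotation, translation) is conveniently written as z ↦ a·z + b over complex numbers; the sketch below solves it exactly from two point correspondences, whereas face_align_demo.m fits it to all five MTCNN landmarks in a least-squares sense. The coordinates used here are illustrative, not the exact canonical positions from the script:

```python
def similarity_from_two_points(src, dst):
    """Solve the 2D similarity transform z -> a*z + b that maps two source
    points onto two target points. Points are (x, y) tuples treated as
    complex numbers; `a` encodes rotation + uniform scale, `b` translation."""
    s1, s2 = complex(*src[0]), complex(*src[1])
    t1, t2 = complex(*dst[0]), complex(*dst[1])
    a = (t2 - t1) / (s2 - s1)  # rotation and scale
    b = t1 - a * s1            # translation
    return a, b

def apply_similarity(a, b, pt):
    z = a * complex(*pt) + b
    return (z.real, z.imag)

# Map detected eye centers (hypothetical values) onto fixed positions
# inside a 112x96 crop (illustrative target coordinates).
a, b = similarity_from_two_points(
    src=[(80.0, 100.0), (140.0, 102.0)],            # detected eyes
    dst=[(30.2946, 51.6963), (65.5318, 51.5014)],   # canonical eyes
)
```

Once `a` and `b` are known, the same transform is applied to the whole image so that every face lands in the same canonical pose.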
Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/train/
Get a list of training images and labels.
mv ../preprocess/result/CASIA-WebFace-112X96 data/
# In Matlab Command Window
run code/get_list.m
The aligned face images in the folder CASIA-WebFace-112X96/ are moved from the preprocess folder to the train folder. A list CASIA-WebFace-112X96.txt is created in the directory data/ for the subsequent training.
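The list format produced by get_list.m is one "image path, integer label" pair per line, with one label per identity folder. A hypothetical Python equivalent (the actual script is Matlab; the function name here is illustrative):

```python
import os

def write_image_list(root, out_path):
    """Write one '<path> <label>' line per aligned image under `root`,
    assigning a consecutive integer label to each identity folder."""
    with open(out_path, "w") as f:
        for label, identity in enumerate(sorted(os.listdir(root))):
            id_dir = os.path.join(root, identity)
            if not os.path.isdir(id_dir):
                continue  # skip stray files at the top level
            for name in sorted(os.listdir(id_dir)):
                f.write(f"{os.path.join(id_dir, name)} {label}\n")
```

Caffe's ImageData layer consumes exactly this "path label" format.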
Train the sphereface model.
./code/sphereface_train.sh 0,1
After training, a model sphereface_model_iter_28000.caffemodel and a corresponding log file sphereface_train.log are placed in the directory result/sphereface/.
Note: In this part, we assume you are in the directory $SPHEREFACE_ROOT/test/
Get the pair list of LFW (view 2).
mv ../preprocess/result/lfw-112X96 data/
./code/get_pairs.sh
Make sure that the LFW dataset and pairs.txt are in the directory data/.
Extract deep features and test on LFW.
# In Matlab Command Window
run code/evaluation.m
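Under the standard LFW protocol, each image pair is scored by comparing the two deep features, typically via cosine similarity, and thresholding the score (see evaluation.m for the exact scoring used here). A minimal pure-Python sketch of the similarity measure:

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors; a verification
    system declares 'same person' when this exceeds a threshold chosen
    on the training folds."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)
```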
Finally we have the sphereface_model.caffemodel, the extracted features pairs.mat in the folder result/, and the accuracy on LFW as follows:
fold | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | AVE |
---|---|---|---|---|---|---|---|---|---|---|---|
ACC | 99.33% | 99.17% | 98.83% | 99.50% | 99.17% | 99.83% | 99.17% | 98.83% | 99.83% | 99.33% | 99.30% |
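The AVE column is simply the mean of the ten fold accuracies:

```python
# Per-fold LFW accuracies from the table above (in %).
fold_acc = [99.33, 99.17, 98.83, 99.50, 99.17,
            99.83, 99.17, 98.83, 99.83, 99.33]
average = sum(fold_acc) / len(fold_acc)
print(f"{average:.2f}%")  # 99.30%
```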
Following the instructions, we went through the entire pipeline 5 times. The accuracies on LFW are shown below. Generally, we report the average, but we release model-3 here.
Experiment | #1 | #2 | #3 (released) | #4 | #5 |
---|---|---|---|---|---|
ACC | 99.24% | 99.20% | 99.30% | 99.27% | 99.13% |
Other intermediate results:
Please click the image to watch the YouTube video. For Youku users, click here.
Details:
Backward gradient
Lambda and Note for training (When the loss becomes 87)
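The magic number 87 has a simple numerical explanation: Caffe's SoftmaxWithLoss clamps the predicted probability at single-precision FLT_MIN before taking the log, so when the true-class probability underflows to zero the cross-entropy loss saturates at -log(FLT_MIN) ≈ 87.3. A quick check:

```python
import math

# Smallest positive normalized single-precision float (FLT_MIN).
FLT_MIN = 1.17549435e-38

# When the softmax probability of the true class underflows, the
# cross-entropy loss clamps at this ceiling instead of diverging.
loss_ceiling = -math.log(FLT_MIN)
print(loss_ceiling)  # ~87.34
```

Seeing the loss pinned at this value therefore means the softmax probabilities have collapsed, not that training is progressing slowly.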
According to recent advances, using feature normalization with a tunable scaling parameter s can significantly improve the performance of SphereFace on the MegaFace challenge.
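A sketch of that normalize-then-scale step: the feature is projected onto the unit hypersphere and then multiplied by s (the default s = 64 below is a common choice in follow-up work, not a value prescribed by this repository):

```python
import math

def normalize_and_scale(x, s=64.0):
    """Feature normalization with a tunable scale: map the feature onto
    the unit hypersphere, then stretch it to radius s so that all
    features contribute logits of identical magnitude."""
    norm = math.sqrt(sum(v * v for v in x))
    return [s * v / norm for v in x]
```

After this step, only the angular direction of the feature carries identity information, which matches the hypersphere interpretation of SphereFace.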
Difficulties in convergence - When you encounter difficulties in convergence (it may appear if you use SphereFace in another dataset), usually there are a few easy ways to address it.
L-Softmax loss and SphereFace present a promising framework for angular representation learning, which has been shown to be very effective in deep face recognition. We are excited that our work has inspired many well-performing methods (and loss functions). We list a few of them for your reference (not fully up-to-date):
To evaluate the effectiveness of an angular margin learning method, you may consider using the angular Fisher score proposed in Appendix E of our SphereFace paper.
Disclaimer: Some of these methods may not necessarily be inspired by us, but we still list them due to their relevance and excellence.
Questions can also be left as issues in the repository. We will be happy to answer them.