Knowledge Distillation Via TF2.0

Code for recent knowledge distillation algorithms and benchmark results, implemented with the TF2.0 low-level API

Project README

Knowledge_distillation_via_TF2.0

  • I am currently fixing all known issues and refining the code, so it should be easier than before to understand how each KD method works.
  • All algorithms have been re-implemented, but they still need further verification with hyperparameter tuning.
    • The algorithms that already have experimental results below have been confirmed.
  • This repository will be an upgraded version of my previous benchmark repository (link).

Implemented Knowledge Distillation Methods

Knowledge defined by the neural response of a hidden layer or the output layer of the network
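For example, the simplest output-response case is Hinton-style soft-logits distillation. The sketch below is a minimal illustration written with plain TF2 ops, not this repository's implementation; the function name, the temperature value, and the epsilon are my own choices.

```python
import tensorflow as tf

def soft_logits_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between temperature-softened teacher and student distributions."""
    t_prob = tf.nn.softmax(teacher_logits / temperature, axis=1)
    s_log_prob = tf.nn.log_softmax(student_logits / temperature, axis=1)
    kl = tf.reduce_sum(t_prob * (tf.math.log(t_prob + 1e-8) - s_log_prob), axis=1)
    # The T^2 factor keeps gradient magnitudes comparable to the hard-label loss.
    return tf.reduce_mean(kl) * temperature ** 2
```

Hidden-layer methods (e.g. FitNet, AT, FT) follow the same pattern but compare intermediate feature maps instead of output logits.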

Experimental Results

  • I use WResNet-40-4 and WResNet-16-4 as the teacher and the student network, respectively.
  • All algorithms are trained with the same configuration, which is described in "train_w_distillation.py", and only each algorithm's hyper-parameters are tuned (a generic sketch of such a training step is shown after this list). I tried only a few settings to get acceptable performance, so these results are probably not optimal.
  • Although some of the algorithms combine their loss with soft-logits in the original papers, I used only each proposed knowledge distillation loss to make a fair comparison.
  • Initialization-based methods give far higher performance at the starting point but poorer performance at the end due to overfitting. Therefore, initialized students should be combined with a regularization method such as soft-logits.
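As a rough illustration of how an algorithm's distillation loss plugs into the shared training configuration, here is a generic TF2 training step. This is an assumption-laden sketch, not the code in "train_w_distillation.py": the function name train_step, the kd_loss_fn and kd_weight arguments, and the SGD settings are all illustrative.

```python
import tensorflow as tf

cross_entropy = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9, nesterov=True)

@tf.function
def train_step(student, teacher, images, labels, kd_loss_fn, kd_weight=1.0):
    # The teacher is frozen; it only provides targets for the KD term.
    teacher_logits = teacher(images, training=False)
    with tf.GradientTape() as tape:
        student_logits = student(images, training=True)
        ce = cross_entropy(labels, student_logits)
        kd = kd_loss_fn(student_logits, teacher_logits)  # e.g. soft_logits_loss above
        loss = ce + kd_weight * kd
    grads = tape.gradient(loss, student.trainable_variables)
    optimizer.apply_gradients(zip(grads, student.trainable_variables))
    return loss
```

Swapping kd_loss_fn (and, for feature-based methods, passing intermediate activations instead of logits) is what changes between algorithms; the rest of the configuration stays fixed for the comparison above.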

Training/Validation accuracy

Methods      | Full Dataset  | 50% Dataset   | 25% Dataset   | 10% Dataset
             | Last Accuracy | Last Accuracy | Last Accuracy | Last Accuracy
Teacher      | 78.59         | -             | -             | -
Student      | 76.25         | -             | -             | -
Soft_logits  | 76.57         | -             | -             | -
FitNet       | 75.78         | -             | -             | -
AT           | 78.14         | -             | -             | -
FSP          | 76.08         | -             | -             | -
DML          | -             | -             | -             | -
KD_SVD       | -             | -             | -             | -
FT           | 77.30         | -             | -             | -
AB           | 76.52         | -             | -             | -
RKD          | 77.69         | -             | -             | -
VID          | -             | -             | -             | -
MHGD         | -             | -             | -             | -
CO           | 78.54         | -             | -             | -

Plan to do

  • Check all the algorithms.
  • Do experiments.