Knowledge_distillation_via_TF2.0

  • I am currently fixing issues and refining the code, so it should be easier than before to understand how each KD method works.
  • All algorithms have been re-implemented, but they still need further checking and hyperparameter tuning.
  • Note that some algorithms perform poorly with my configuration. For example, for FitNet, multi-task learning works much better than the hint-based initialization; nevertheless, I have followed the authors' procedure (see the sketch after this list).
  • This repository will be an upgraded version of my previous benchmark repository (link).
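
As context for the FitNet remark above, here is a minimal sketch of the hint loss used in FitNet's initialization stage. This is not the repository's implementation; the feature shapes and the 1x1-convolution regressor are illustrative assumptions.

```python
import tensorflow as tf

# Hypothetical 1x1-convolution regressor that maps the student's channel count
# (here 32) to the teacher's (here 64); the real shapes depend on the networks used.
regressor_kernel = tf.Variable(
    tf.random.truncated_normal([1, 1, 32, 64], stddev=0.05))

def hint_loss(student_feature, teacher_feature):
    # FitNet "hint": L2 distance between the teacher's hidden feature map
    # and the regressed student feature map.
    regressed = tf.nn.conv2d(student_feature, regressor_kernel,
                             strides=1, padding='SAME')
    return tf.reduce_mean(tf.square(regressed - teacher_feature))
```

In the authors' two-stage procedure, this loss is minimized alone to initialize the student's lower layers before training on the task loss; the multi-task alternative mentioned above would instead minimize the task loss plus a weighted hint loss jointly throughout training.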

Implemented Knowledge Distillation Methods

Knowledge is defined by the neural response of a hidden layer or the output layer of the network.
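
As a concrete example of output-layer knowledge, below is a minimal sketch of a Hinton-style soft-logits loss in TensorFlow 2. The temperature `T` and weight `alpha` are illustrative hyperparameters, not the repository's settings.

```python
import tensorflow as tf

def soft_logits_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.1):
    # Standard cross-entropy with the ground-truth (hard) labels.
    hard_loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=labels, logits=student_logits))
    # KL divergence between the temperature-softened teacher and student
    # distributions, scaled by T^2 as in Hinton et al.
    teacher_prob = tf.nn.softmax(teacher_logits / T)
    student_log_prob = tf.nn.log_softmax(student_logits / T)
    soft_loss = tf.reduce_mean(tf.reduce_sum(
        teacher_prob * (tf.math.log(teacher_prob + 1e-8) - student_log_prob),
        axis=-1)) * T ** 2
    return alpha * hard_loss + (1.0 - alpha) * soft_loss
```

Hidden-layer methods (e.g. FitNet, AT, FSP) instead define their losses on intermediate feature maps.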

Experimental Results

Network architecture

Training/Validation accuracy

Reported values are the last accuracy of each run.

| Methods | Full Dataset | 50% Dataset | 25% Dataset | 10% Dataset |
| --- | --- | --- | --- | --- |
| Teacher | 78.59 | - | - | - |
| Student | 76.25 | - | - | - |
| Soft_logits | 76.57 | - | - | - |
| FitNet | 75.04 | - | - | - |
| AT | 78.14 | - | - | - |
| FSP | 76.47 | - | - | - |
| DML | - | - | - | - |
| KD_SVD | - | - | - | - |
| KD_EID | - | - | - | - |
| FT | - | - | - | - |
| AB | - | - | - | - |
| RKD | - | - | - | - |
| VID | - | - | - | - |
| MHGD | - | - | - | - |
| CO | - | - | - | - |

Plan to do

  • Check all the algorithms.
  • Do the experiments.

About

Code for recent knowledge distillation algorithms and benchmark results, implemented with the TF2.0 low-level API.
