knowledge distillation papers
Early Papers
Model Compression, Cristian Bucilă, Rich Caruana, Alexandru Niculescu-Mizil, 2006
Distilling the Knowledge in a Neural Network, Geoffrey Hinton, Oriol Vinyals, Jeff Dean, 2015 (see the loss sketch after this list)
Knowledge Acquisition from Examples Via Multiple Models, Pedro Domingos, 1997
Combining labeled and unlabeled data with co-training, A. Blum, T. Mitchell, 1998
Using A Neural Network to Approximate An Ensemble of Classifiers, Xinchuan Zeng, Tony R. Martinez, 2000
Do Deep Nets Really Need to be Deep?, Lei Jimmy Ba, Rich Caruana, 2014
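The recipe shared by most papers in this list is the softened-softmax loss from Hinton et al., 2015: the student matches the teacher's temperature-scaled output distribution in addition to the ground-truth labels. Below is a minimal sketch, assuming PyTorch; the tensor names and the `T`/`alpha` defaults are illustrative choices, not values prescribed by any one paper.

```python
# Minimal sketch of the softened-softmax distillation loss
# ("Distilling the Knowledge in a Neural Network", Hinton et al., 2015).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: KL divergence between temperature-softened distributions.
    # The T*T factor keeps the soft-target gradients on the same scale as
    # the hard-label term, as recommended in the paper.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```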
Recommended Papers
FitNets: Hints for Thin Deep Nets, Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, Yoshua Bengio, 2015 (see the hint-loss sketch after this list)
Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer, Sergey Zagoruyko, Nikos Komodakis, 2016
A Gift from Knowledge Distillation: Fast Optimization, Network Minimization and Transfer Learning, Junho Yim, Donggyu Joo, Jihoon Bae, Junmo Kim, 2017
Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang
Born Again Neural Networks, Tommaso Furlanello, Zachary C. Lipton, Michael Tschannen, Laurent Itti, Anima Anandkumar, 2018
Net2Net: Accelerating Learning Via Knowledge Transfer, Tianqi Chen, Ian Goodfellow, Jonathon Shlens, 2016
Unifying distillation and privileged information, David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik, 2015
Distillation as a Defense to Adversarial Perturbations against Deep Neural Networks, Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, Ananthram Swami, 2016
Large scale distributed neural network training through online distillation, Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton, 2018
Deep Mutual Learning, Ying Zhang, Tao Xiang, Timothy M. Hospedales, Huchuan Lu, 2017
Learning Loss for Knowledge Distillation with Conditional Adversarial Networks, Zheng Xu, Yen-Chang Hsu, Jiawei Huang, 2017
Data-Free Knowledge Distillation for Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, Thad Starner, 2017
Quantization Mimic: Towards Very Tiny CNN for Object Detection, Yi Wei, Xinyu Pan, Hongwei Qin, Wanli Ouyang, Junjie Yan, 2018
Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
Moonshine: Distilling with Cheap Convolutions, Elliot J. Crowley, Gavin Gray, Amos Storkey, 2017
Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving, Jiaolong Xu, Peng Wang, Heng Yang, Antonio M. López, 2018
Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net, Zihao Liu, Qi Liu, Tao Liu, Yanzhi Wang, Wujie Wen, 2017
Improved Knowledge Distillation via Teacher Assistant: Bridging the Gap Between Student and Teacher, Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Hassan Ghasemzadeh, 2019
ResKD: Residual-Guided Knowledge Distillation, Xuewei Li, Songyuan Li, Bourahla Omar, Xi Li, 2020
Rethinking Data Augmentation: Self-Supervision and Self-Distillation, Hankook Lee, Sung Ju Hwang, Jinwoo Shin, 2019
MSD: Multi-Self-Distillation Learning via Multi-classifiers within Deep Neural Networks, Yunteng Luan, Hanyu Zhao, Zhi Yang, Yafei Dai, 2019
Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation, Linfeng Zhang, Jiebo Song, Anni Gao, Jingwei Chen, Chenglong Bao, Kaisheng Ma, 2019
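Several of the recommended papers (FitNets, attention transfer, factor transfer) distill intermediate features rather than logits. Below is a minimal sketch of a FitNets-style hint loss, assuming PyTorch; the 1x1 convolutional regressor and the channel arguments are illustrative, not the papers' exact architectures.

```python
# Minimal sketch of a FitNets-style hint loss (Romero et al., 2015):
# an L2 match between a teacher "hint" layer and a regressed student
# "guided" layer.
import torch
import torch.nn as nn

class HintLoss(nn.Module):
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        # The regressor maps the student feature map to the teacher's width,
        # so thin students can still imitate wide teacher features.
        self.regressor = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, student_feat, teacher_feat):
        # Teacher features are fixed targets, so gradients are detached.
        return nn.functional.mse_loss(self.regressor(student_feat),
                                      teacher_feat.detach())
```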
2016
Cross Modal Distillation for Supervision Transfer, Saurabh Gupta, Judy Hoffman, Jitendra Malik, CVPR 2016
Deep Model Compression: Distilling Knowledge from Noisy Teachers, Bharat Bhusan Sau, Vineeth N. Balasubramanian, 2016
Knowledge Distillation for Small-footprint Highway Networks, Liang Lu, Michelle Guo, Steve Renals, 2016
Sequence-Level Knowledge Distillation, Yoon Kim, Alexander M. Rush, EMNLP 2016 (notes: deeplearning-papernotes)
Recurrent Neural Network Training with Dark Knowledge Transfer, Zhiyuan Tang, Dong Wang, Zhiyong Zhang, 2016
Face Model Compression by Distilling Knowledge from Neurons, Ping Luo, Zhenyao Zhu, Ziwei Liu, Xiaogang Wang, Xiaoou Tang, 2016
Distilling Word Embeddings: An Encoding Approach, Lili Mou, Ran Jia, Yan Xu, Ge Li, Lu Zhang, Zhi Jin, CIKM 2016
2017
Data Distillation: Towards Omni-Supervised Learning, Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, Kaiming He, CVPR 2018
Knowledge Projection for Deep Neural Networks, Zhi Zhang, Guanghan Ning, Zhihai He, 2017
Like What You Like: Knowledge Distill via Neuron Selectivity Transfer, Zehao Huang, Naiyan Wang, 2017
Data-Free Knowledge Distillation for Deep Neural Networks, Raphael Gontijo Lopes, Stefano Fenu, Thad Starner, 2017
DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, 2017
Adapting Models to Signal Degradation using Distillation, Jong-Chyi Su, Subhransu Maji, BMVC 2017
Cross-lingual Distillation for Text Classification, Ruochen Xu, Yiming Yang, ACL 2017, code
2018
Learning Global Additive Explanations for Neural Nets Using Model Distillation, Sarah Tan, Rich Caruana, Giles Hooker, Paul Koch, Albert Gordo, 2018
YASENN: Explaining Neural Networks via Partitioning Activation Sequences, Yaroslav Zharov, Denis Korzhenkov, Pavel Shvechikov, Alexander Tuzhilin, 2018
Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results, Antti Tarvainen, Harri Valpola, 2018
Local Affine Approximators for Improving Knowledge Transfer, Suraj Srinivas, François Fleuret, 2018
Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?, Shilin Zhu, Xin Dong, Hao Su, 2018
Probabilistic Knowledge Transfer for deep representation learning, Nikolaos Passalis, Anastasios Tefas, 2018
Knowledge Transfer via Distillation of Activation Boundaries Formed by Hidden Neurons, Byeongho Heo, Minsik Lee, Sangdoo Yun, Jin Young Choi, 2018
Paraphrasing Complex Network: Network Compression via Factor Transfer, Jangho Kim, SeongUk Park, Nojun Kwak, NeurIPS 2018
KDGAN: Knowledge Distillation with Generative Adversarial Networks, Xiaojie Wang, Rui Zhang, Yu Sun, Jianzhong Qi, NeurIPS 2018
Distilling Knowledge for Search-based Structured Prediction, Yijia Liu, Wanxiang Che, Huaipeng Zhao, Bing Qin, Ting Liu, ACL 2018
2019
Learning Efficient Detector with Semi-supervised Adaptive Distillation, Shitao Tang, Litong Feng, Zhanghui Kuang, Wenqi Shao, Quanquan Li, Wei Zhang, Yimin Chen, 2019
Dataset Distillation, Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros, 2019
Relational Knowledge Distillation, Wonpyo Park, Dongju Kim, Yan Lu, Minsu Cho, 2019
Knowledge Adaptation for Efficient Semantic Segmentation, Tong He, Chunhua Shen, Zhi Tian, Dong Gong, Changming Sun, Youliang Yan, 2019
A Comprehensive Overhaul of Feature Distillation, Byeongho Heo, Jeesoo Kim, Sangdoo Yun, Hyojin Park, Nojun Kwak, Jin Young Choi, 2019, code
Towards Understanding Knowledge Distillation, Mary Phuong, Christoph Lampert, ICML 2019
Knowledge Distillation from Internal Representations, Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Edward Guo, 2019
Knowledge Flow: Improve Upon Your Teachers, Iou-Jen Liu, Jian Peng, Alexander G. Schwing, 2019
Similarity-Preserving Knowledge Distillation, Frederick Tung, Greg Mori, 2019
Correlation Congruence for Knowledge Distillation, Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang, 2019
Variational Information Distillation for Knowledge Transfer, Sungsoo Ahn, Shell Xu Hu, Andreas Damianou, Neil D. Lawrence, Zhenwen Dai, 2019
Knowledge Distillation via Instance Relationship Graph, Yufan Liu, Jiajiong Cao, Bing Li, Chunfeng Yuan, Weiming Hu, Yangxi Li, Yunqiang Duan, CVPR 2019
Structured Knowledge Distillation for Semantic Segmentation, Yifan Liu, Changyong Shu, Jingdong Wang, Chunhua Shen, CVPR 2019
Zero-Shot Cross-Lingual Abstractive Sentence Summarization through Teaching Generation and Attention, Xiangyu Duan, Mingming Yin, Min Zhang, Boxing Chen, Weihua Luo, ACL 2019, code
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks, Raphael Tang, Yao Lu, Linqing Liu, Lili Mou, Olga Vechtomova, Jimmy Lin, arXiv 2019
Multilingual Neural Machine Translation with Knowledge Distillation, Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu, ICLR 2019
BAM! Born-Again Multi-Task Networks for Natural Language Understanding, Kevin Clark, Minh-Thang Luong, Urvashi Khandelwal, Christopher D. Manning, Quoc V. Le, ACL 2019
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding, Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao, arXiv 2019
Exploiting the Ground-Truth: An Adversarial Imitation Based Knowledge Distillation Approach for Event Detection, AAAI 2019
2020
Dreaming to Distill: Data-free Knowledge Transfer via DeepInversion, Hongxu Yin, Pavlo Molchanov, Zhizhong Li, Jose M. Alvarez, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz, 2020
Reducing the Teacher-Student Gap via Spherical Knowledge Distillation, Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai, 2020
Data-Free Adversarial Distillation, Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song, 2020
Contrastive Representation Distillation, Yonglong Tian, Dilip Krishnan, Phillip Isola, ICLR 2020, code
StyleGAN2 Distillation for Feed-forward Image Manipulation, Yuri Viazovetskyi, Vladimir Ivashkin, Evgeny Kashin, ECCV 2020
Distilling Knowledge from Graph Convolutional Networks, Yiding Yang, Jiayan Qiu, Mingli Song, Dacheng Tao, Xinchao Wang, CVPR 2020
Self-supervised Knowledge Distillation for Few-shot Learning, Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Mubarak Shah, 2020, code
Online Knowledge Distillation with Diverse Peers, Defang Chen, Jian-Ping Mei, Can Wang, Yan Feng, Chun Chen, AAAI 2020
Intra-class Feature Variation Distillation for Semantic Segmentation, Yukang Wang, Wei Zhou, Tao Jiang, Xiang Bai, Yongchao Xu, ECCV 2020
Exclusivity-Consistency Regularized Knowledge Distillation for Face Recognition, Xiaobo Wang, Tianyu Fu, Shengcai Liao, Shuo Wang, Zhen Lei, Tao Mei, ECCV 2020
Improving Face Recognition from Hard Samples via Distribution Distillation Loss, Yuge Huang, Pengcheng Shen, Ying Tai, Shaoxin Li, Xiaoming Liu, Jilin Li, Feiyue Huang, Rongrong Ji, ECCV 2020
Distilling Knowledge Learned in BERT for Text Generation, Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu, ACL 2020, code
2021
Dataset Distillation with Infinitely Wide Convolutional Networks, Timothy Nguyen, Roman Novak, Lechao Xiao, Jaehoon Lee, 2021
Dataset Meta-Learning from Kernel Ridge-Regression, Timothy Nguyen, Zhourong Chen, Jaehoon Lee, 2021
Up to 100× Faster Data-free Knowledge Distillation, Gongfan Fang, Kanya Mo, Xinchao Wang, Jie Song, Shitao Bei, Haofei Zhang, Mingli Song, 2021
Robustness and Diversity Seeking Data-Free Knowledge Distillation, Pengchao Han, Jihong Park, Shiqiang Wang, Yejun Liu, 2021
Data-Free Knowledge Transfer: A Survey, Yuang Liu, Wei Zhang, Jun Wang, Jianyong Wang, 2021
Undistillable: Making A Nasty Teacher That CANNOT teach students, Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang, ICLR 2021
QuPeD: Quantized Personalization via Distillation with Applications to Federated Learning, Kaan Ozkara, Navjot Singh, Deepesh Data, Suhas Diggavi, NeurIPS 2021
KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation, Yongfei Liu, Chenfei Wu, Shao-yen Tseng, Vasudev Lal, Xuming He, Nan Duan
Online Knowledge Distillation for Efficient Pose Estimation, Zheng Li, Jingwen Ye, Mingli Song, Ying Huang, Zhigeng Pan, ICCV 2021
Does Knowledge Distillation Really Work?, Samuel Stanton, Pavel Izmailov, Polina Kirichenko, Alexander A. Alemi, Andrew Gordon Wilson, NeurIPS 2021
Hierarchical Self-supervised Augmented Knowledge Distillation, Chuanguang Yang, Zhulin An, Linhang Cai, Yongjun Xu, IJCAI 2021
DarkGAN: Exploiting Knowledge Distillation for Comprehensible Audio Synthesis With GANs, Javier Nistal, Stefan Lattner, Gaël Richard, ISMIR 2021
On Self-Distilling Graph Neural Network, Yuzhao Chen, Yatao Bian, Xi Xiao, Yu Rong, Tingyang Xu, Junzhou Huang, IJCAI 2021
Graph-Free Knowledge Distillation for Graph Neural Networks, Xiang Deng, Zhongfei Zhang, IJCAI 2021
Self Supervision to Distillation for Long-Tailed Visual Recognition, Tianhao Li, Limin Wang, Gangshan Wu, ICCV 2021
Cross-Layer Distillation with Semantic Calibration, Defang Chen, Jian-Ping Mei, Yuan Zhang, Can Wang, Zhe Wang, Yan Feng, Chun Chen, AAAI 2021
Channel-wise Knowledge Distillation for Dense Prediction, Changyong Shu, Yifan Liu, Jianfei Gao, Zheng Yan, Chunhua Shen, ICCV 2021
Training data-efficient image transformers & distillation through attention, Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou, ICML 2021
Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation, Li Liu, Qingle Huang, Sihao Lin, Hongwei Xie, Bing Wang, Xiaojun Chang, Xiaodan Liang, ICCV 2021, code
torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation, Yoshitomo Matsubara, International Workshop on Reproducible Research in Pattern Recognition 2021, code
2022
LGD: Label-guided Self-distillation for Object Detection, Peizhen Zhang, Zijian Kang, Tong Yang, Xiangyu Zhang, Nanning Zheng, Jian Sun, AAAI 2022
MonoDistill: Learning Spatial Features for Monocular 3D Object Detection, Anonymous, ICLR 2022
Bag of Instances Aggregation Boosts Self-supervised Distillation, Haohang Xu, Jiemin Fang, Xiaopeng Zhang, Lingxi Xie, Xinggang Wang, Wenrui Dai, Hongkai Xiong, Qi Tian, ICLR 2022
Meta Learning for Knowledge Distillation, Wangchunshu Zhou, Canwen Xu, Julian McAuley, 2022
Focal and Global Knowledge Distillation for Detectors, Zhendong Yang, Zhe Li, Xiaohu Jiang, Yuan Gong, Zehuan Yuan, Danpei Zhao, Chun Yuan, CVPR 2022
Self-Distilled StyleGAN: Towards Generation from Internet Photos, Ron Mokady, Michal Yarom, Omer Tov, Oran Lang, Daniel Cohen-Or, Tali Dekel, Michal Irani, Inbar Mosseri, 2022
Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation, Gang Li, Xiang Li, Yujie Wang, Shanshan Zhang, Yichao Wu, Ding Liang, AAAI 2022
Decoupled Knowledge Distillation, Borui Zhao, Quan Cui, Renjie Song, Yiyu Qiu, Jiajun Liang, CVPR 2022, code
Graph Flow: Cross-layer Graph Flow Distillation for Dual-Efficient Medical Image Segmentation, Wenxuan Zou, Muyi Sun, 2022
Dataset Distillation by Matching Training Trajectories, George Cazenavette, Tongzhou Wang, Antonio Torralba, Alexei A. Efros, Jun-Yan Zhu, CVPR 2022
Knowledge Distillation with the Reused Teacher Classifier, Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen, CVPR 2022
Self-Distillation from the Last Mini-Batch for Consistency Regularization, Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo, CVPR 2022, code
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers, Xianing Chen, Qiong Cao, Yujie Zhong, Shenghua Gao, CVPR 2022
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning, Lin Zhang, Li Shen, Liang Ding, Dacheng Tao, Ling-Yu Duan, CVPR 2022
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection, Yi Wei, Zibu Wei, Yongming Rao, Jiaxin Li, Jiwen Lu, Jie Zhou, 2022
Localization Distillation for Dense Object Detection, Zhaohui Zheng, Rongguang Ye, Ping Wang, Dongwei Ren, Wangmeng Zuo, Qibin Hou, Ming-Ming Cheng, CVPR 2022, code
Localization Distillation for Object Detection, Zhaohui Zheng, Rongguang Ye, Qibin Hou, Dongwei Ren, Ping Wang, Wangmeng Zuo, Ming-Ming Cheng, 2022, code
Cross-Image Relational Knowledge Distillation for Semantic Segmentation, Chuanguang Yang, Helong Zhou, Zhulin An, Xue Jiang, Yongjun Xu, Qian Zhang, CVPR 2022, code
Knowledge distillation: A good teacher is patient and consistent, Lucas Beyer, Xiaohua Zhai, Amélie Royer, Larisa Markeeva, Rohan Anil, Alexander Kolesnikov, CVPR 2022
Spot-adaptive Knowledge Distillation, Jie Song, Ying Chen, Jingwen Ye, Mingli Song, TIP 2022, code
MSDN: Mutually Semantic Distillation Network for Zero-Shot Learning, Shiming Chen, Ziming Hong, Guo-Sen Xie, Wenhan Yang, Qinmu Peng, Kai Wang, Jian Zhao, Xinge You, CVPR 2022
Knowledge Distillation via the Target-aware Transformer, Sihao Lin, Hongwei Xie, Bing Wang, Kaicheng Yu, Xiaojun Chang, Xiaodan Liang, Gang Wang, CVPR 2022
PointDistiller: Structured Knowledge Distillation Towards Efficient and Compact 3D Detection, Linfeng Zhang, Runpei Dong, Hung-Shuo Tai, Kaisheng Ma, arXiv 2022, code
Wavelet Knowledge Distillation: Towards Efficient Image-to-Image Translation, Linfeng Zhang, Xin Chen, Xiaobing Tu, Pengfei Wan, Ning Xu, Kaisheng Ma, CVPR 2022
Contrastive Learning Rivals Masked Image Modeling in Fine-tuning via Feature Distillation, Yixuan Wei, Han Hu, Zhenda Xie, Zheng Zhang, Yue Cao, Jianmin Bao, Dong Chen, Baining Guo, Tech Report 2022, code
BERT Learns to Teach: Knowledge Distillation with Meta Learning, Wangchunshu Zhou, Canwen Xu, Julian McAuley, ACL 2022, code
Nearest Neighbor Knowledge Distillation for Neural Machine Translation, Zhixian Yang, Renliang Sun, Xiaojun Wan, NAACL 2022
Knowledge Condensation Distillation, Chenxin Li, Mingbao Lin, Zhiyuan Ding, Nie Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Liujuan Cao, ECCV 2022, code
Masked Generative Distillation, Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan, ECCV 2022, code
DTG-SSOD: Dense Teacher Guidance for Semi-Supervised Object Detection, Gang Li, Xiang Li, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang
Distilled Dual-Encoder Model for Vision-Language Understanding, Zekun Wang, Wenhui Wang, Haichao Zhu, Ming Liu, Bing Qin, Furu Wei, code
Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection, Hongyu Zhou, Zheng Ge, Songtao Liu, Weixin Mao, Zeming Li, Haiyan Yu, Jian Sun, ECCV 2022, code
Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation, Zhiwei Hao, Jianyuan Guo, Ding Jia, Kai Han, Yehui Tang, Chao Zhang, Han Hu, Yunhe Wang
TinyViT: Fast Pretraining Distillation for Small Vision Transformers, Kan Wu, Jinnian Zhang, Houwen Peng, Mengchen Liu, Bin Xiao, Jianlong Fu, Lu Yuan, ECCV 2022
Self-slimmed Vision Transformer, Zhuofan Zong, Kunchang Li, Guanglu Song, Yali Wang, Yu Qiao, Biao Leng, Yu Liu, ICLR 2022
KD-MVS: Knowledge Distillation Based Self-supervised Learning for MVS, Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang, ECCV 2022, code
Rethinking Data Augmentation for Robust Visual Question Answering, Long Chen, Yuhang Zheng, Jun Xiao, ECCV 2022, code
ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval, Yuxiang Lu, Yiding Liu, Jiaxiang Liu, Yunsheng Shi, Zhengjie Huang, Shikun Feng, Yu Sun, Hao Tian, Hua Wu, Shuaiqiang Wang, Dawei Yin, Haifeng Wang
Prune Your Model Before Distill It, Jinhyuk Park, Albert No, ECCV 2022, code
Efficient One Pass Self-distillation with Zipf's Label Smoothing, Jiajun Liang, Linze Li, Zhaodong Bing, Borui Zhao, Yao Tang, Bo Lin, Haoqiang Fan, ECCV 2022, code
R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis, ECCV 2022, code
D3Former: Debiased Dual Distilled Transformer for Incremental Learning, Abdelrahman Mohamed, Rushali Grandhe, KJ Joseph, Salman Khan, Fahad Khan, code
SdAE: Self-distillated Masked Autoencoder, Yabo Chen, Yuchen Liu, Dongsheng Jiang, Xiaopeng Zhang, Wenrui Dai, Hongkai Xiong, Qi Tian, ECCV 2022, code
MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition, Chuanguang Yang, Zhulin An, Helong Zhou, Linhang Cai, Xiang Zhi, Jiwen Wu, Yongjun Xu, Qian Zhang, ECCV 2022, code
Mind the Gap in Distilling StyleGANs, Guodong Xu, Yuenan Hou, Ziwei Liu, Chen Change Loy, ECCV 2022, code
HIRE: Distilling high-order relational knowledge from heterogeneous graph neural networks, Jing Liu, Tongya Zheng, Qinfen Hao, Neurocomputing
A Fast Knowledge Distillation Framework for Visual Recognition, Zhiqiang Shen, Eric Xing, ECCV 2022, code
Knowledge Distillation from A Stronger Teacher, Tao Huang, Shan You, Fei Wang, Chen Qian, Chang Xu, NeurIPS 2022, code
ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval, Nicola Messina, Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, Fabrizio Falchi, Giuseppe Amato, Rita Cucchiara, CBMI 2022, code
Towards Efficient 3D Object Detection with Knowledge Distillation, Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi, NeurIPS 2022, code
Pro-KD: Progressive Distillation by Following the Footsteps of the Teacher, Mehdi Rezagholizadeh, Aref Jafari, Puneeth Salad, Pranav Sharma, Ali Saheb Pasand, Ali Ghodsi, COLING 2022
Noisy Self-Knowledge Distillation for Text Summarization, Yang Liu, Sheng Shen, Mirella Lapata, arXiv 2021
On Distillation of Guided Diffusion Models, Chenlin Meng, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans, arXiv 2022
ViTKD: Practical Guidelines for ViT feature knowledge distillation, Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li, arXiv 2022, code
Self-Regulated Feature Learning via Teacher-free Feature Distillation, Lujun Li, ECCV 2022, code
DETRDistill: A Universal Knowledge Distillation Framework for DETR-families, Jiahao Chang, Shuo Wang, Guangkai Xu, Zehui Chen, Chenhongyi Yang, Feng Zhao, arXiv 2022
Learning to Explore Distillability and Sparsability: A Joint Framework for Model Compression, Yufan Liu, Jiajiong Cao, Bing Li, Weiming Hu, Stephen Maybank, TPAMI 2022
Revisiting Label Smoothing and Knowledge Distillation Compatibility: What was Missing?, Keshigeyan Chandrasegaran, Ngoc-Trung Tran, Yunqing Zhao, Ngai-Man Cheung, ICML 2022
2023
Curriculum Temperature for Knowledge Distillation, Zheng Li, Xiang Li, Lingfeng Yang, Borui Zhao, Renjie Song, Lei Luo, Jun Li, Jian Yang, AAAI 2023, code
Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling, Xin Ma, Chang Liu, Chunyu Xie, Long Ye, Yafeng Deng, Xiangyang Ji, arXiv 2023, code
README Source: lhyfst/knowledge-distillation-papers