Research Article

Improved Knowledge Distillation with Dynamic Network Pruning

Volume: 10 Number: 3 September 30, 2022

Abstract

Deploying convolutional neural networks on mobile or embedded devices is often hindered by limited memory and computational resources. This is particularly problematic for the most successful networks, which tend to be very large and slow at inference. Many approaches to compressing neural networks have been developed, based on pruning, regularization, quantization, or distillation. In this paper, we propose Knowledge Distillation with Dynamic Pruning (KDDP), which trains a dynamically pruned, compact student network under the guidance of a large teacher network. In KDDP, we train the student network with supervision from the teacher network while applying L1 regularization to the neuron activations in a fully connected layer. We then prune the inactive neurons, so the method automatically determines the final size of the student model. We evaluate the compression rate and accuracy of the resulting networks on an image classification dataset and compare them with the results obtained by Knowledge Distillation (KD). Compared to KD, our method yields higher accuracy and more compact models.
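The training signal and pruning step described in the abstract can be illustrated with a small sketch. The abstract does not give the exact loss, so the code below assumes a Hinton-style distillation objective (hard-label cross-entropy plus a temperature-softened teacher term) with an added L1 penalty on the fully connected layer's activations. The names `kddp_loss`, `inactive_neurons`, and `prune_fc_layer`, and the values of `temperature`, `alpha`, `l1_weight`, and `threshold`, are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kddp_loss(student_logits, teacher_logits, label, fc_activations,
              temperature=4.0, alpha=0.5, l1_weight=1e-3):
    """Assumed per-sample loss: hard-label CE + distillation + L1 on activations."""
    # Cross-entropy of the student against the ground-truth label.
    ce = -math.log(softmax(student_logits)[label])
    # Cross-entropy between softened teacher and student distributions
    # (assumption: Hinton-style soft targets, scaled by temperature squared).
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kd = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))
    # L1 penalty pushes fully connected activations towards zero.
    l1 = sum(abs(a) for a in fc_activations)
    return (1 - alpha) * ce + alpha * temperature ** 2 * kd + l1_weight * l1


def inactive_neurons(activation_batch, threshold=1e-3):
    """Indices of neurons whose mean absolute activation is below threshold."""
    count = len(activation_batch)
    width = len(activation_batch[0])
    means = [sum(abs(row[j]) for row in activation_batch) / count
             for j in range(width)]
    return [j for j, m in enumerate(means) if m < threshold]


def prune_fc_layer(weights_in, weights_out, drop):
    """Remove pruned neurons from a layer and from the layer that reads it.

    weights_in: one row of incoming weights per neuron.
    weights_out: one row per next-layer unit, one column per neuron.
    """
    drop = set(drop)
    kept_in = [row for j, row in enumerate(weights_in) if j not in drop]
    kept_out = [[w for j, w in enumerate(row) if j not in drop]
                for row in weights_out]
    return kept_in, kept_out
```

In this sketch the layer width after pruning is simply the number of neurons that stay above the threshold, which is one way the final student size could be determined by training rather than fixed in advance.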

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

September 30, 2022

Submission Date

July 6, 2022

Acceptance Date

August 31, 2022

Published in Issue

Year 2022 Volume: 10 Number: 3

APA
Şener, E., & Akbaş, E. (2022). Improved Knowledge Distillation with Dynamic Network Pruning. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, 10(3), 650-665. https://doi.org/10.29109/gujsc.1141648
AMA
1. Şener E, Akbaş E. Improved Knowledge Distillation with Dynamic Network Pruning. GUJS Part C. 2022;10(3):650-665. doi:10.29109/gujsc.1141648
Chicago
Şener, Eren, and Emre Akbaş. 2022. “Improved Knowledge Distillation with Dynamic Network Pruning”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji 10 (3): 650-65. https://doi.org/10.29109/gujsc.1141648.
EndNote
Şener E, Akbaş E (September 1, 2022) Improved Knowledge Distillation with Dynamic Network Pruning. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 10 3 650–665.
IEEE
[1] E. Şener and E. Akbaş, “Improved Knowledge Distillation with Dynamic Network Pruning”, GUJS Part C, vol. 10, no. 3, pp. 650–665, Sept. 2022, doi: 10.29109/gujsc.1141648.
ISNAD
Şener, Eren - Akbaş, Emre. “Improved Knowledge Distillation with Dynamic Network Pruning”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım ve Teknoloji 10/3 (September 1, 2022): 650-665. https://doi.org/10.29109/gujsc.1141648.
JAMA
1. Şener E, Akbaş E. Improved Knowledge Distillation with Dynamic Network Pruning. GUJS Part C. 2022;10:650–665.
MLA
Şener, Eren, and Emre Akbaş. “Improved Knowledge Distillation with Dynamic Network Pruning”. Gazi Üniversitesi Fen Bilimleri Dergisi Part C: Tasarım Ve Teknoloji, vol. 10, no. 3, Sept. 2022, pp. 650-65, doi:10.29109/gujsc.1141648.
Vancouver
1. Eren Şener, Emre Akbaş. Improved Knowledge Distillation with Dynamic Network Pruning. GUJS Part C. 2022 Sep. 1;10(3):650-65. doi:10.29109/gujsc.1141648


    e-ISSN:2147-9526