This is a collection of papers on reducing model size and on ASIC/FPGA accelerators for machine learning, especially deep neural network applications. (Inspired by Neural-Networks-on-Silicon)
Tutorials:
Hardware Accelerator: Efficient Processing of Deep Neural Networks. (link)
Model Compression: Model Compression and Acceleration for Deep Neural Networks. (link)