A PyTorch implementation for PyramidNets (Deep Pyramidal Residual Networks, https://arxiv.org/abs/1610.02915)
This repository contains a PyTorch implementation for the paper: Deep Pyramidal Residual Networks (CVPR 2017, Dongyoon Han*, Jiwhan Kim*, and Junmo Kim, (equally contributed by the authors*)). The code in this repository is based on the example provided in PyTorch examples and the nice implementation of Densely Connected Convolutional Networks.
Two other implementations with LuaTorch and Caffe are provided:
To train additive PyramidNet-200 (alpha=300 with bottleneck) on ImageNet-1k dataset with 8 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python train.py --data ~/dataset/ILSVRC/Data/CLS-LOC/ --net_type pyramidnet --lr 0.05 --batch_size 128 --depth 200 -j 16 --alpha 300 --print-freq 1 --expname PyramidNet-200 --dataset imagenet --epochs 100
To train additive PyramidNet-110 (alpha=48 without bottleneck) on CIFAR-10 dataset with a single-GPU:
CUDA_VISIBLE_DEVICES=0 python train.py --net_type pyramidnet --alpha 64 --depth 110 --no-bottleneck --batch_size 32 --lr 0.025 --print-freq 1 --expname PyramidNet-110 --dataset cifar10 --epochs 300
To train additive PyramidNet-164 (alpha=48 with bottleneck) on CIFAR-100 dataset with 4 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --net_type pyramidnet --alpha 48 --depth 164 --batch_size 128 --lr 0.5 --print-freq 1 --expname PyramidNet-164 --dataset cifar100 --epochs 300
Thanks to the implementation, which support the TensorBoard to track training progress efficiently, all the experiments can be tracked with tensorboard_logger.
Tensorboard_logger can be installed with
pip install tensorboard_logger
Deep convolutional neural networks (DCNNs) have shown remarkable performance in image classification tasks in recent years. Generally, deep neural network architectures are stacks consisting of a large number of convolution layers, and they perform downsampling along the spatial dimension via pooling to reduce memory usage. At the same time, the feature map dimension (i.e., the number of channels) is sharply increased at downsampling locations, which is essential to ensure effective performance because it increases the capability of high-level attributes. Moreover, this also applies to residual networks and is very closely related to their performance. In this research, instead of using downsampling to achieve a sharp increase at each residual unit, we gradually increase the feature map dimension at all the units to involve as many locations as possible. This is discussed in depth together with our new insights as it has proven to be an effective design to improve the generalization ability. Furthermore, we propose a novel residual unit capable of further improving the classification accuracy with our new network architecture. Experiments on benchmark CIFAR datasets have shown that our network architecture has a superior generalization ability compared to the original residual networks.
We provide a simple schematic illustration to compare the several network architectures, which have (a) basic residual units, (b) bottleneck, (c) wide residual units, and (d) our pyramidal residual units, and (e) our pyramidal bottleneck residual units, as follows:
The results are readily reproduced, which show the same performances as those reproduced with A LuaTorch implementation for PyramidNets.
Comparison of the state-of-the-art networks by [Top-1 Test Error Rates VS # of Parameters]:
Network Type | Alpha | # of Params | Top-1 err(%) | Top-5 err(%) | Model File |
---|---|---|---|---|---|
ResNet-101 (Caffe model) | - | 44.7M | 23.6 | 7.1 | Original Model |
ResNet-101 (Luatorch model) | - | 44.7M | 22.44 | 6.21 | Original Model |
PyramidNet-v1-101 | 360 | 42.5M | 21.98 | 6.20 | Download |
Please cite our paper if PyramidNets are used:
@article{DPRN,
title={Deep Pyramidal Residual Networks},
author={Han, Dongyoon and Kim, Jiwhan and Kim, Junmo},
journal={IEEE CVPR},
year={2017}
}
If this implementation is useful, please cite or acknowledge this repository on your work.
Dongyoon Han ([email protected]), Jiwhan Kim ([email protected]), Junmo Kim ([email protected])