A TensorFlow implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks (http://arxiv.org/pdf/1312.6082.pdf)
Accuracy: 93.45% on the test dataset after about 14 hours of training
[Figures: Training | Test]
A digit value of "10" means there is no digit at that position
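The "10" convention can be illustrated with a small decoding sketch. It assumes the model outputs a predicted length plus one class per digit position, as in the referenced paper; the function name `decode_number` and the constant `NO_DIGIT` are hypothetical and not taken from this repository.

```python
NO_DIGIT = 10  # class index that stands for "no digit" (per the note above)


def decode_number(length, digits):
    """Turn a predicted length and per-position digit classes into a string."""
    # Keep only the first `length` positions and skip any "no digit" class.
    kept = [d for d in digits[:length] if d != NO_DIGIT]
    return "".join(str(d) for d in kept)


print(decode_number(3, [1, 7, 5, 10, 10]))  # prints 175
```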
Python 2.7
TensorFlow
h5py
In Ubuntu:
$ sudo apt-get install libhdf5-dev
$ sudo pip install h5py
Clone the source code
$ git clone https://github.com/potterhsu/SVHNClassifier
$ cd SVHNClassifier
Download the SVHN dataset (Format 1)
Extract it into the `data` folder; your folder structure should then look like this:
SVHNClassifier
    - data
        - extra
            - 1.png
            - 2.png
            - ...
            - digitStruct.mat
        - test
            - 1.png
            - 2.png
            - ...
            - digitStruct.mat
        - train
            - 1.png
            - 2.png
            - ...
            - digitStruct.mat
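Before converting, it may help to confirm that the layout above is in place. The following is a minimal sketch, not part of this repository: `missing_dataset_files` is a hypothetical helper, and checking only `1.png` and `digitStruct.mat` in each split is an assumption about what is worth verifying.

```python
import os

SPLITS = ("extra", "test", "train")


def missing_dataset_files(data_dir):
    """Return the expected paths that are absent under data_dir."""
    missing = []
    for split in SPLITS:
        split_dir = os.path.join(data_dir, split)
        # Each split needs its annotation file and at least the first image.
        for name in ("digitStruct.mat", "1.png"):
            path = os.path.join(split_dir, name)
            if not os.path.isfile(path):
                missing.append(path)
    return missing


for path in missing_dataset_files("./data"):
    print("missing: {}".format(path))
```

An empty result means every checked file was found; anything printed points at a split that was not extracted into place.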
(Optional) Take a glance at original images with bounding boxes
Open `draw_bbox.ipynb` in Jupyter
Convert to TFRecords format
$ python convert_to_tfrecords.py --data_dir ./data
(Optional) Test for reading TFRecords files
Open `read_tfrecords_sample.ipynb` in Jupyter
Open `donkey_sample.ipynb` in Jupyter
Train
$ python train.py --data_dir ./data --train_logdir ./logs/train
Retrain if needed
$ python train.py --data_dir ./data --train_logdir ./logs/train2 --restore_checkpoint ./logs/train/latest.ckpt
Evaluate
$ python eval.py --data_dir ./data --checkpoint_dir ./logs/train --eval_logdir ./logs/eval
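For intuition about the reported accuracy, here is a sketch of whole-number accuracy, where a prediction counts only if every digit matches. This README does not state how `eval.py` computes its metric, so the per-sequence definition is an assumption borrowed from the referenced paper, and `sequence_accuracy` is a hypothetical name.

```python
def sequence_accuracy(predictions, labels):
    """Fraction of examples whose predicted number matches the label exactly."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    # A partially correct number (e.g. "42" vs "48") counts as wrong.
    correct = sum(1 for p, t in zip(predictions, labels) if p == t)
    return float(correct) / len(labels)


print(sequence_accuracy(["175", "42", "9"], ["175", "48", "9"]))
```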
Visualize
$ tensorboard --logdir ./logs
(Optional) Try to make an inference
Open `inference_sample.ipynb` in Jupyter
Open `inference_outside_sample.ipynb` in Jupyter
$ python inference.py --image /path/to/image.jpg --restore_checkpoint ./logs/train/latest.ckpt
Clean
$ rm -rf ./logs
or
$ rm -rf ./logs/train2
or
$ rm -rf ./logs/eval