A single handwritten digit classifier, using the MNIST dataset, in pure NumPy. The multilayer neural network implementation is a modified version of Michael Nielsen's code from his book Neural Networks and Deep Learning.
If you are familiar with the basics of neural networks, feel free to skip this section. For total beginners who landed up here before reading anything about neural networks:

Note: activation functions other than the sigmoid are also in use, but this much is sufficient for a beginner for now.

Nielsen's book and Stanford's Machine Learning course by Prof. Andrew Ng are recommended as good resources for beginners. At times it got confusing for me while referring to both resources:
MATLAB data structures are 1-indexed, while NumPy's are 0-indexed. Some parameters of a neural network are not defined for the input layer, so the indices in the code drift away from the mathematical equations in the book. For example, following the book's code, the bias vector of the second layer of the network is referred to as bias[0], because the input layer (the first layer) has no bias vector. I found this off-by-one convention inconvenient to work with.
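To make the off-by-one issue concrete, here is a small sketch (the variable names are illustrative, not taken from either codebase): Nielsen's list of bias vectors starts at the second layer, while padding index 0 with a dummy entry keeps list indices aligned with the layer numbers used in the math.

```python
import numpy as np

# Network with sizes [2, 3, 1]: the input layer (layer 0) has no biases.

# Nielsen's convention: biases[0] belongs to the SECOND layer,
# so biases[l] is off by one from b^l in the equations.
nielsen_biases = [np.zeros((3, 1)), np.zeros((1, 1))]

# Padded convention: a dummy entry at index 0 makes biases[l]
# literally the bias vector of layer l.
padded_biases = [np.zeros((0, 0)), np.zeros((3, 1)), np.zeros((1, 1))]

print(nielsen_biases[1].shape)  # bias of layer 2 lives at index 1
print(padded_biases[2].shape)   # bias of layer 2 lives at index 2
```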
I am fond of scikit-learn's API style, hence my class has a similar code structure. While it theoretically resembles the book and Stanford's course, you can find simple methods such as fit, predict and validate to train, test and validate the model respectively.
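As a sketch of that interface (only the method names fit, predict and validate come from the text above; the constructor signature, data shapes and the stubbed bodies are my assumptions), the class looks roughly like:

```python
import numpy as np

class Network:
    """Sketch of the scikit-learn-style interface described above.
    The real class trains with backpropagation; here the methods are
    stubs so only the shape of the API is shown."""

    def __init__(self, sizes):
        self.sizes = sizes          # e.g. [784, 30, 10] for MNIST

    def fit(self, X, y):            # train on examples X with labels y
        return self

    def predict(self, X):           # one predicted digit per example
        return np.zeros(len(X), dtype=int)

    def validate(self, X, y):       # fraction of correct predictions
        return float(np.mean(self.predict(X) == y))

net = Network([784, 30, 10]).fit(np.random.rand(5, 784), np.zeros(5, dtype=int))
print(net.validate(np.random.rand(5, 784), np.zeros(5, dtype=int)))  # 1.0 for this stub
```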
I have followed a particular convention in indexing quantities. Dimensions of quantities are listed according to this figure.
The network layout is specified by a list of layer sizes, e.g.

```
sizes = [2, 3, 1]
```

weights is a list of weight matrices (numpy.ndarrays). weights[l] is the matrix of weights entering the lth layer of the network (denoted as w^l). weights[0] is redundant, and it follows that weights[1] is the collection of weights entering layer 1, and so on:

```
weights = [ [[]],          # weights[0]: redundant (input layer)
            [[a, b],
             [c, d],
             [e, f]],      # weights[1]: 3x2, entering layer 1
            [[p, q, r]] ]  # weights[2]: 1x3, entering layer 2
```
biases is a list of bias vectors (numpy.ndarrays). biases[l] is the vector of biases of the neurons in the lth layer of the network (denoted as b^l). biases[0] is redundant, and it follows that biases[1] is the vector of biases of the neurons of layer 1, and so on:

```
biases = [ [[],
            []],     # biases[0]: redundant (input layer)
           [[0],
            [1],
            [2]],    # biases[1]: 3x1, biases of layer 1
           [[0]] ]   # biases[2]: 1x1, bias of layer 2
```
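With dummy entries at index 0, the weights and biases lists for sizes = [2, 3, 1] can be built in a couple of lines. This is a sketch of the convention, not the repository's actual initialization code:

```python
import numpy as np

sizes = [2, 3, 1]

# weights[l] has shape (sizes[l], sizes[l-1]); index 0 is a dummy entry.
weights = [np.array([[]])] + [np.random.randn(n, m)
                              for m, n in zip(sizes[:-1], sizes[1:])]

# biases[l] has shape (sizes[l], 1); index 0 is a dummy entry.
biases = [np.array([[]])] + [np.random.randn(n, 1) for n in sizes[1:]]

print([w.shape for w in weights[1:]])  # [(3, 2), (1, 3)]
print([b.shape for b in biases[1:]])   # [(3, 1), (1, 1)]
```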
zs and activations follow the same convention: zs[0] is redundant, and zs has the same shapes as biases. activations is similar, except that activations[0] corresponds to x, the input training example.

To train and test the neural network, run:

```
python main.py
```
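The way zs and activations mirror biases can be seen in a feedforward pass. This is a sketch under the conventions above, not the code in main.py:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

sizes = [2, 3, 1]
rng = np.random.default_rng(0)
weights = [None] + [rng.standard_normal((n, m))
                    for m, n in zip(sizes[:-1], sizes[1:])]
biases = [None] + [rng.standard_normal((n, 1)) for n in sizes[1:]]

x = rng.standard_normal((sizes[0], 1))  # one training example
activations = [x]                       # activations[0] is the input x
zs = [None]                             # zs[0] is redundant, like biases[0]
for l in range(1, len(sizes)):
    z = weights[l] @ activations[l - 1] + biases[l]
    zs.append(z)
    activations.append(sigmoid(z))

print(zs[1].shape, zs[2].shape)  # same shapes as biases[1], biases[2]
```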