This repository contains my solutions to the assignments for Stanford's CS231n "Convolutional Neural Networks for Visual Recognition" course (Spring 2020).
Stanford's CS231n is one of the best ways to dive into Deep Learning in general and Computer Vision in particular. If you plan to excel in another subfield of Deep Learning (say, Natural Language Processing or Reinforcement Learning), I still recommend that you start with CS231n, because it helps build intuition, fundamental understanding, and hands-on skills. Beware, the course is very challenging!
To motivate you to work hard, here are two applications that you'll implement in Assignment 3: Style Transfer and Class Visualization.
For the one on the left, you take a base image and a style image and apply the "style" to the base image (reminds you of Prisma and Artisto, right?). The example on the right is a random image, gradually perturbed so that a neural network classifies it more and more confidently as a gorilla. DIY Deep Dream, isn't it? It's all math under the hood, and it's cool to figure out how it all works. CS231n will get you to this understanding: it'll be a hard but exciting journey from a simple kNN implementation to these fascinating applications.

If you think these two applications are eye-catching, take another look at the picture above: a Convolutional Neural Network classifying images. Those are the basics of how machines can "see" the world. The course will teach you both how to build such an algorithm from scratch and how to use modern tools to run state-of-the-art models for your tasks.
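To make the Class Visualization idea above a bit more concrete, here is a minimal sketch of it: start from a small random "image" and repeatedly nudge its pixels along the gradient of the target class's log-probability. This is not the assignment solution. It assumes a hypothetical 3-class linear softmax classifier over a 4-pixel image (the `WEIGHTS` table and the `visualize_class` helper are made up for illustration), whereas the assignment works with a pretrained CNN and, as far as I recall, also adds regularization and jitter.

```python
import math
import random

# Hypothetical toy "network": a linear softmax classifier with 3 classes
# over a 4-pixel image. These weights are invented for illustration only.
WEIGHTS = [
    [0.5, -0.2, 0.1, 0.0],
    [-0.3, 0.4, 0.2, -0.1],
    [0.1, 0.1, -0.4, 0.6],
]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_probs(image, weights=WEIGHTS):
    """Class probabilities of the toy linear classifier for one image."""
    scores = [sum(w * x for w, x in zip(row, image)) for row in weights]
    return softmax(scores)


def visualize_class(target, steps=100, lr=0.5, seed=0, weights=WEIGHTS):
    """Gradient ascent on the image to raise the probability of `target`.

    Returns the final image and the target-class probability after each step.
    """
    rng = random.Random(seed)
    # Start from a small random image.
    image = [rng.uniform(-0.1, 0.1) for _ in weights[0]]
    history = [class_probs(image, weights)[target]]
    for _ in range(steps):
        probs = class_probs(image, weights)
        # d log p[target] / d image[j] = W[target][j] - sum_k p[k] * W[k][j]
        grad = [
            weights[target][j] - sum(p * row[j] for p, row in zip(probs, weights))
            for j in range(len(image))
        ]
        # Ascent step: move the pixels in the direction that raises the score.
        image = [x + lr * g for x, g in zip(image, grad)]
        history.append(class_probs(image, weights)[target])
    return image, history
```

Calling `visualize_class(target=2)` returns the perturbed image together with the target-class probability recorded after every step, so you can watch the classifier grow more confident. With a real CNN the gradient would come from backpropagation (for example, autograd in PyTorch) instead of the closed-form expression used here.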
Find course notes and assignments here and be sure to check out the video lectures for Winter 2016 and Spring 2017!
Assignments have been completed using both TensorFlow and PyTorch.
Q1: k-Nearest Neighbor Classifier
Q2: Training a Support Vector Machine
Q3: Implement a Softmax classifier
Q5: Higher Level Representations: Image Features
Q1: Fully-connected Neural Network
Q3: Dropout
Q5: PyTorch / TensorFlow v2 on CIFAR-10 / TensorFlow v1 (Tweaked TFv1 model)
| Model | Training Accuracy (%) | Test Accuracy (%) |
|---|---|---|
| Base network | 92.86 | 88.90 |
| VGG-16 | 99.98 | 93.16 |
| VGG-19 | 99.98 | 93.24 |
| ResNet-18 | 99.99 | 93.73 |
| ResNet-101 | 99.99 | 93.76 |
Q1: Image Captioning with Vanilla RNNs
Q2: Image Captioning with LSTMs
Q3: Network Visualization: Saliency maps, Class Visualization, and Fooling Images (PyTorch / TensorFlow v2 / TensorFlow v1)
Q4: Style Transfer (PyTorch / TensorFlow v2 / TensorFlow v1)
Q5: Generative Adversarial Networks (PyTorch / TensorFlow v2 / TensorFlow v1)
For some parts of the 3rd assignment, you'll need GPUs. Kaggle Kernels or Google Colaboratory will do.
I recognize how much time and effort people spend on building intuition, understanding new concepts, and debugging assignments. The solutions uploaded here are for reference only. They are meant to unblock you if you get stuck somewhere. Please do not copy any part of the solutions as-is (the assignments are fairly easy if you read the instructions carefully).