Image classification of wildflowers using deep residual learning and convolutional neural nets
Images © Jennifer Waller 2017
This project uses convolutional neural nets to classify images of wildflowers found in Colorado's Front Range.
Accurate identification of wildflowers is a task with relevance to both recreation and environmental management. Currently, there are several mobile apps designed to identify flowers using images; the best of these (e.g., https://identify.plantnet-project.org/) is connected with an annual international competition for advancing techniques for plant classification from images. However, none of the extant plant identification apps are particularly accurate for identification of flowers in North America.
Example: Pl@ntNet's attempt to identify Delphinium nuttallianum, a plant commonly seen blooming in Colorado in early spring:
It seems reasonable that a model trained primarily on images of flora prevalent in the Front Range of Colorado would be more likely to correctly identify images of local wildflowers than global apps trained on flora located primarily in other regions of the world. The primary aim of this project is to develop a model for classification of wildflowers native to the Front Range in Colorado. A secondary aim is to develop a model that, in future, could take advantage of metadata provided by users of a mobile app while photographing wildflowers in order to provide more accurate classifications.
Initially, I planned to collect images via web scraping. However, my preliminary efforts suggested that web scraping would be very time intensive as most websites with images of wildflowers have only a few images of each species. Additionally, when considering ways to improve upon existing flower identification apps, it seemed to me that having photographs tagged with date/time and GPS location could be potentially useful. In the long term, historical GPS and date/time information could be used to improve prediction of flower species; each species is more common in particular areas/elevations and at particular times of the year. More immediately, GPS information will permit clustering of photos by location, which will allow me to cluster images within observations (i.e., one plant = one observation), a strategy employed in the 2015 LifeCLEF challenge (for a summary, see http://ceur-ws.org/Vol-1391/157-CR.pdf). For all these reasons, I chose to collect photographs of local wildflowers using my iPhone and a point and shoot camera. I also gathered mobile phone photos from friends and family.
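The clustering-by-location idea can be sketched in a few lines. This is a hypothetical illustration (the coordinates and the 10-meter radius are made up, not from my data) using scikit-learn's DBSCAN with the haversine metric on (lat, lon) pulled from photo EXIF data:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical (lat, lon) coordinates from photo EXIF data, in degrees:
# three shots of one plant a few meters apart, plus one plant across town.
coords_deg = np.array([
    [40.01500, -105.27050],
    [40.01500, -105.27055],
    [40.01505, -105.27050],
    [39.73920, -104.99030],
])

# The haversine metric expects radians; eps is an angular distance, so a
# ground distance is divided by the Earth's radius (~6,371 km).
eps_meters = 10.0
db = DBSCAN(eps=eps_meters / 6_371_000, min_samples=1, metric="haversine")
labels = db.fit_predict(np.radians(coords_deg))

# Photos sharing a label are treated as one observation (one plant).
print(labels)
```

With `min_samples=1`, every photo joins a cluster, so nothing is discarded as noise; the first three photos end up in one cluster and the distant photo in another.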
Basic hand-written CNN using Keras with Theano backend, trained on photos taken with my iPhone 6s.
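The architecture itself isn't reproduced in this writeup; below is a minimal sketch of what such a hand-written Keras CNN might look like. The layer sizes, input shape, and class count are illustrative assumptions, and it is written against the current tf.keras API rather than the original Theano backend:

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 13             # illustrative; not the actual species count
INPUT_SHAPE = (128, 128, 3)  # assumed input size

# A small hand-written CNN: stacked conv/pool blocks feeding a dense head.
model = models.Sequential([
    layers.Input(shape=INPUT_SHAPE),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```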
Data: For this model, I was pickier with images than in later attempts (see below); I only included images that were in focus and I removed images that were very similar.
Results: Accuracy was .88. Misclassified images were most commonly mistaken for Penstemon virens (suggesting that I needed more photos of Penstemon virens) or contained a lot of foliage. The latter seemed to be due to the relative infrequency of zoomed-out, foliage-heavy images in the data set generally. To resolve this issue, I considered adding more zoomed-out images, using higher-resolution images, or cropping the images. The foliage-related misclassification issue is demonstrated by the images below:
Next Steps: A brief review of the literature related to image classification for flowers brought me to publications from recent successful teams in the PlantCLEF (http://www.imageclef.org/lifeclef/2016/plant) annual competition. I was particularly interested in the possibility of using a deep residual network based on work from Šulc and colleagues (http://cmp.felk.cvut.cz/~mishkdmy/papers/CMP-CLEF-2016.pdf).
The current standard for plant identification is fine-tuning very deep networks trained on large datasets of images (e.g., ImageNet (http://www.image-net.org/)). One of the newer advances in deep networks is He and colleagues' residual neural network, ResNet (https://arxiv.org/abs/1512.03385). Deep networks have been of great interest to computer vision researchers because neural networks with more layers can recognize more features than those with fewer layers. Recognizing more features is very useful for differentiating objects with a lot of visual complexity, like flowers. However, traditional 'plain' networks suffer from a degradation problem when they have many layers: adding depth actually increases training error, so they underfit the training data. Residual networks differ from traditional deep networks because each block is trained to learn the residual F(x) = H(x) - x instead of the full mapping H(x). ResNet also passes the identity mapping past the convolutional layers via shortcut connections and adds it back to the block's output; this lets extra layers fall back to the identity and reduces the chance of degradation.
Image from He et al., 2015 paper: https://arxiv.org/abs/1512.03385
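The residual idea fits in a few lines of code: the block's weight layers model F(x), and the shortcut adds x back, so the block outputs H(x) = F(x) + x. A toy NumPy sketch (weights and sizes are arbitrary, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))  # a batch of 4 feature vectors

# Toy "weight layers" for the residual branch F(x): two linear maps + ReLU.
W1 = rng.standard_normal((8, 8)) * 0.01
W2 = rng.standard_normal((8, 8)) * 0.01

def residual_block(x):
    # F(x): the weight layers learn the residual, not the full mapping.
    f = np.maximum(x @ W1, 0.0) @ W2
    # The shortcut adds the identity back: H(x) = F(x) + x.
    return f + x

y = residual_block(x)

# With near-zero weights, F(x) ~ 0 and the block is close to the identity,
# so extra layers can't easily make the network worse than a shallower one.
print(np.allclose(y, x, atol=0.01))
```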
Fine-tuning of pre-trained ResNet50 (Keras build from https://github.com/fchollet/keras/blob/master/keras/applications/resnet50.py). ResNet50 was trained on millions of images of objects, so it already detects basic features (e.g., edges, colors). By adding fully connected layers specific to the wildflower data, we essentially fine-tune ResNet50 to apply its understanding of basic objects to the features that distinguish our flower species/classes.
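A minimal sketch of this setup in the current tf.keras API (the head's layer sizes and the class count are assumptions; in practice `weights="imagenet"` would load the pretrained weights, which triggers a download, so `weights=None` is used here to keep the sketch self-contained):

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

NUM_CLASSES = 13  # illustrative; not the actual species count

# weights="imagenet" in practice; weights=None keeps this sketch offline.
base = ResNet50(weights=None, include_top=False,
                input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze the pretrained feature extractor

# New fully connected head, trained on the wildflower classes.
model = models.Sequential([
    base,
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Freezing the base means only the new head is trained at first; once it converges, some of the later ResNet blocks can be unfrozen for further fine-tuning at a lower learning rate.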
Image Preprocessing: This time, I wanted to try using image generation to reduce overfitting (see below). To do this, I first needed to resize (to 256x256) and center-crop (to 224x224) the images.
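The resize-then-center-crop step described above can be done with Pillow; a small sketch (the helper name is mine, and the synthetic green image just stands in for a photo):

```python
from PIL import Image

def resize_and_center_crop(img, resize_to=256, crop_to=224):
    """Resize to resize_to x resize_to, then center-crop to crop_to x crop_to."""
    img = img.resize((resize_to, resize_to))
    offset = (resize_to - crop_to) // 2  # 16 px on each side for 256 -> 224
    return img.crop((offset, offset, offset + crop_to, offset + crop_to))

# Example on a synthetic photo-shaped image.
photo = Image.new("RGB", (640, 480), "green")
out = resize_and_center_crop(photo)
print(out.size)  # (224, 224)
```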
Image Generation: To decrease the chance of overfitting, the image generator in Keras provided augmented images for each epoch; thus, the model never saw the same image twice. Random augmentations included horizontal flips, rotations (up to 30 degrees), and horizontal and vertical shifts.
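The augmentations above map directly onto Keras's ImageDataGenerator; a sketch with assumed shift fractions (the text doesn't state them), using a fake batch of random arrays in place of the wildflower photos:

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentations described above: horizontal flip, rotation up to 30 degrees,
# and horizontal/vertical shifts (the 10% shift fractions are assumptions).
datagen = ImageDataGenerator(
    horizontal_flip=True,
    rotation_range=30,
    width_shift_range=0.1,
    height_shift_range=0.1,
)

# A fake batch of 224x224 RGB images stands in for the wildflower photos.
images = np.random.rand(8, 224, 224, 3).astype("float32")
batch = next(datagen.flow(images, batch_size=8, shuffle=False, seed=0))
print(batch.shape)  # (8, 224, 224, 3)
```

Because the generator draws fresh random transforms each epoch, every pass over the training set sees a slightly different version of each photo.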
Goëau, H., Bonnet, P., & Joly, A. (2015). LifeCLEF Plant Identification Task 2015. (http://ceur-ws.org/Vol-1391/157-CR.pdf)
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. arXiv.org. (https://arxiv.org/abs/1512.03385)
Šulc, M., Mishkin, D., & Matas, J. (2016). Very deep residual networks with MaxOut for plant identification in the wild. (http://cmp.felk.cvut.cz/~mishkdmy/papers/CMP-CLEF-2016.pdf)
© Jennifer Waller 2017. All rights reserved.