Impy is a Python3 library for deep learning projects that work with image datasets. It provides tools that help with common computer vision tasks, such as reducing large images into regions of interest and augmenting object detection datasets with color and bounding-box transformations.
To install impy, run:
pip install impy-0.1-py3-none-any.whl
Impy has multiple features that allow you to solve several different problems with a few lines of code. In order to showcase the features of impy we are going to solve common problems that involve both Computer Vision and Deep Learning.
We are going to work with a mini-dataset of cars and pedestrians (available here). This dataset has object annotations that make it suitable to solve a localization problem.
In this section we are going to solve problems related to object localization.
One common problem in Computer Vision and CNNs is dealing with big images. Let's sample one of the images from our mini-dataset:
The size of this image is 3840x2160 pixels. It is too big for training; most likely, your computer would run out of memory. To work around this, we could reduce the mini-batch size hyperparameter, but if the image is too big that still would not be enough. We could also downscale the image, but then it loses quality and you would need to label the smaller image again.
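To see why memory becomes the bottleneck, here is a quick back-of-the-envelope calculation (the float32 representation and the batch size of 32 are illustrative assumptions, not impy defaults):

```python
# Back-of-the-envelope memory footprint of full-resolution inputs. The
# float32 tensors and batch size of 32 are illustrative assumptions.
width, height, channels, bytes_per_value = 3840, 2160, 3, 4

bytes_per_image = width * height * channels * bytes_per_value
bytes_per_batch = 32 * bytes_per_image

print(bytes_per_image / 2**20)  # roughly 95 MiB for a single image
print(bytes_per_batch / 2**30)  # roughly 3 GiB per mini-batch, before activations
```

And that is just the input tensors; intermediate activations in a CNN multiply this figure several times over.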
Instead of hacking a solution, we are going to solve the problem efficiently. The best solution is to sample crops of a specific size that contain the maximum amount of bounding boxes possible. Crops of 1032x1032 pixels are usually small enough.
Let's see how to do this with impy:
mkdir -p $PWD/testing_cars
cd testing_cars
git clone https://github.com/lozuwa/cars_dataset
import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
    # Define the path to images and annotations
    images_path:str = os.path.join(os.getcwd(), "cars_dataset", "images")
    annotations_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
    # Define the name of the dataset
    dbName:str = "CarsDataset"
    # Create an ObjectDetectionDataset object
    obda:any = ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
    # Reduce the dataset to smaller ROIs of shape 1032x1032
    offset:list = [1032, 1032]
    images_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "images_reduced")
    annotations_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations_reduced", "xmls")
    obda.reduceDatasetByRois(offset = offset, outputImageDirectory = images_output_path, outputAnnotationDirectory = annotations_output_path)

if __name__ == "__main__":
    main()
mkdir -p $PWD/cars_dataset/images_reduced/
mkdir -p $PWD/cars_dataset/annotations_reduced/xmls/
python reducing_big_images.py
Impy will create a new set of images and annotations with the size specified by offset and will include the maximum number of annotations possible so you will end up with an optimal number of data points. Let's see the results of the example:
As you can see, the bounding boxes have been preserved and small crops of the big image are now available. We can use these images for training, and our problem is solved.
Note that in some cases you are going to end up with an inefficient amount of crops due to overlapping crops in the clustering algorithm. I am working on this and a better solution will be released soon. Nonetheless, these results are still far more efficient than the usual approach of cropping each bounding box one by one (which leads to inefficient memory usage, repeated data points, loss of context and simpler representations).
Another common problem in Computer Vision and CNNs for object localization is data augmentation, specifically spatial augmentations (e.g., scaling, cropping, rotation). For these you would usually write a custom script, but impy makes this easier.
touch augmentation_configuration.json
{
  "multiple_image_augmentations": {
    "Sequential": [
      {
        "image_color_augmenters": {
          "Sequential": [
            {
              "sharpening": {
                "weight": 2.0,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            }
          ]
        }
      },
      {
        "bounding_box_augmenters": {
          "Sequential": [
            {
              "scale": {
                "size": [1.2, 1.2],
                "zoom": true,
                "interpolationMethod": 1,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            },
            {
              "verticalFlip": {
                "save": true,
                "restartFrame": false,
                "randomEvent": true
              }
            }
          ]
        }
      },
      {
        "image_color_augmenters": {
          "Sequential": [
            {
              "histogramEqualization": {
                "equalizationType": 1,
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            }
          ]
        }
      },
      {
        "bounding_box_augmenters": {
          "Sequential": [
            {
              "horizontalFlip": {
                "save": true,
                "restartFrame": false,
                "randomEvent": false
              }
            },
            {
              "crop": {
                "save": true,
                "restartFrame": true,
                "randomEvent": false
              }
            }
          ]
        }
      }
    ]
  }
}
Let's analyze the configuration file step by step. Currently, this is the most complex type of data augmentation you can achieve with the library.
Note the file starts with "multiple_image_augmentations", then a "Sequential" key follows. Inside "Sequential" we define an array. This is important: each element of the array is a type of augmenter.
The first augmenter we are going to define is an "image_color_augmenters", which is going to execute a sequence of color augmentations. In this case, we have defined only one type of color augmentation, which is sharpening with a weight of 2.0.
After the color augmentation, we have defined a "bounding_box_augmenters" which is going to execute a "scale" augmentation with zoom followed by a "verticalFlip".
We want to keep going, so we define two more augmenters: another "image_color_augmenters" which applies "histogramEqualization" to the image, and another "bounding_box_augmenters" which applies a "horizontalFlip" and a "crop" augmentation.
Note there are three parameters common to each augmenter. These are optional, but I recommend specifying them in order to fully understand your pipeline. These parameters are: "save" (whether to write the augmented result to disk), "restartFrame" (whether to restore the frame to its original state before the next augmentation) and "randomEvent" (whether to apply the augmentation randomly).
As you have seen we can define any type of crazy configuration and augment our images with the available methods while choosing whether to save each augmentation, restart the frame to its original space or randomize the event so we make things crazier. Get creative and define your own data augmentation pipelines.
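Pipelines do not have to be this elaborate. As a minimal sketch using the same keys as the example above, a configuration with a single randomized horizontal flip might look like:

```json
{
  "multiple_image_augmentations": {
    "Sequential": [
      {
        "bounding_box_augmenters": {
          "Sequential": [
            {
              "horizontalFlip": {
                "save": true,
                "restartFrame": false,
                "randomEvent": true
              }
            }
          ]
        }
      }
    ]
  }
}
```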
Once the configuration file is created, we can apply the data augmentation pipeline with the following code.
import os
from impy.ObjectDetectionDataset import ObjectDetectionDataset

def main():
    # Define the path to images and annotations
    images_path:str = os.path.join(os.getcwd(), "cars_dataset", "images")
    annotations_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations", "xmls")
    # Define the name of the dataset
    dbName:str = "CarsDataset"
    # Create an ObjectDetectionDataset object
    obda:any = ObjectDetectionDataset(imagesDirectory=images_path, annotationsDirectory=annotations_path, databaseName=dbName)
    # Apply data augmentation by using the following method of the ObjectDetectionDataset class.
    configuration_file:str = os.path.join(os.getcwd(), "augmentation_configuration.json")
    images_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "images_augmented")
    annotations_output_path:str = os.path.join(os.getcwd(), "cars_dataset", "annotations_augmented", "xmls")
    obda.applyDataAugmentation(configurationFile=configuration_file, outputImageDirectory=images_output_path, outputAnnotationDirectory=annotations_output_path)

if __name__ == "__main__":
    main()
python apply_bounding_box_augmentations.py
Next, I present the results of the augmentations. Note the transformations do not alter the bounding boxes of the image, which saves you a lot of time if you want to increase the representational complexity of your data.
A class that holds a detection dataset. Parameters:
Class methods:
Checks the consistency of the image files with the annotation files.
Saves the bounding boxes of the data set as images. Parameters:
Iterates over images and annotations and executes reduceImageDataPointByRoi for each pair.
All of the augmentations support the following optional parameters: "save", "restartFrame" and "randomEvent".
Apply a bitwise_not operation to the pixels in the image. Code example:
{
  "invertColor": {
    "Cspace": [true, true, true]
  }
}
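To make the operation concrete, here is a minimal NumPy sketch of what a bitwise-not inversion does to uint8 pixels (illustrative only, not impy's actual implementation):

```python
import numpy as np

def invert_color(img: np.ndarray) -> np.ndarray:
    # Bitwise NOT on uint8 pixels is equivalent to 255 - value, per channel.
    return np.bitwise_not(img)
```

A pixel value of 0 becomes 255, 100 becomes 155, and so on.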
Equalize the color space of the image. Code example:
{
  "histogramEqualization": {
    "equalizationType": 1
  }
}
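For intuition, here is a NumPy sketch of plain global histogram equalization on a grayscale image (the library's equalizationType presumably selects between variants; this sketch is illustrative only and assumes the image is not a single flat intensity):

```python
import numpy as np

def equalize_histogram(img: np.ndarray) -> np.ndarray:
    # Global histogram equalization on a grayscale uint8 image: build a
    # lookup table that maps each intensity to its rescaled cumulative
    # frequency, flattening the intensity distribution.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-zero cumulative count
    lut = np.clip(np.round((cdf - cdf_min) / (img.size - cdf_min) * 255.0), 0, 255).astype(np.uint8)
    return lut[img]
```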
Multiply the pixel distribution with a scalar. Code example:
{
  "changeBrightness": {
    "coefficient": 1.2
  }
}
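The operation above amounts to a per-pixel multiply followed by clipping; a minimal sketch of that idea (not necessarily impy's exact arithmetic):

```python
import numpy as np

def change_brightness(img: np.ndarray, coefficient: float) -> np.ndarray:
    # Multiply every pixel by the coefficient, then clip back into uint8 range.
    return np.clip(img.astype(np.float64) * coefficient, 0, 255).astype(np.uint8)
```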
Apply a sharpening system to the image. Code example:
{
  "sharpening": {
    "weight": 0.8
  }
}
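One common formulation of weighted sharpening is unsharp masking; here is a grayscale NumPy sketch of it (the 3x3 box blur is an assumption standing in for whatever smoothing kernel the library actually uses):

```python
import numpy as np

def sharpen(img: np.ndarray, weight: float) -> np.ndarray:
    # Unsharp masking: blur the image, then push the original away from
    # the blur by `weight`, amplifying edges. Grayscale 2-D input assumed.
    h, w = img.shape
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    blurred = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    out = img + weight * (img - blurred)
    return np.clip(out, 0, 255).astype(np.uint8)
```

Flat regions, where the image equals its blur, are left untouched; only edges get boosted.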
Add gaussian noise to the image. Code example:
{
  "addGaussianNoise": {
    "coefficient": 0.5
  }
}
Apply a Gaussian low pass filter to the image. Code example:
{
  "gaussianBlur": {
    "sigma": 2
  }
}
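The sigma parameter controls the spread of the Gaussian kernel; as a sketch of what that means, here is how such a 1-D kernel can be built (the kernel radius is an illustrative choice, and a 2-D blur applies this along each axis):

```python
import numpy as np

def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    # Sample exp(-x^2 / (2 sigma^2)) at integer offsets and normalise so
    # the weights sum to 1. Larger sigma spreads weight further out,
    # i.e. a stronger blur.
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()
```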
Shift the colors of the image. Code example:
{
  "shiftBlur": {}
}
All of the augmentations support the following optional parameters: "save", "restartFrame" and "randomEvent".
Scales the size of an image and maintains the location of its bounding boxes. Code example:
{
  "scale": {
    "size": [1.2, 1.2],
    "zoom": true,
    "interpolationMethod": 1
  }
}
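Resizing an image by factors (fx, fy) moves every box coordinate by the same factors, which is how the boxes keep tracking the objects. A sketch of that coordinate update (box format [x0, y0, x1, y1] is an assumption here):

```python
def scale_boxes(boxes, size):
    # size = [fx, fy]: multiply x coordinates by fx and y coordinates by fy,
    # mirroring the resize applied to the image itself.
    fx, fy = size
    return [[round(x0 * fx), round(y0 * fy), round(x1 * fx), round(y1 * fy)]
            for (x0, y0, x1, y1) in boxes]
```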
Crops the bounding boxes of an image. Specify the size of the crop in the size parameter. Code example:
{
  "crop": {
    "size": [50, 50]
  }
}
Pads the bounding boxes of an image, i.e. adds pixels from outside the bounding box. Specify the amount of pixels to be added in the size parameter. Code example:
{
  "pad": {
    "size": [20, 20]
  }
}
Flips the bounding boxes of an image in the x axis. Code example:
{
  "horizontalFlip": {}
}
Flips the bounding boxes of an image in the y axis. Code example:
{
  "verticalFlip": {}
}
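Both flips are simple coordinate reflections: a horizontal flip mirrors x around the image width, a vertical flip mirrors y around the image height. A sketch covering both cases (box format [x0, y0, x1, y1] is an assumption):

```python
def flip_boxes(boxes, width=None, height=None):
    # Pass width to mirror horizontally (x -> width - x) and/or height to
    # mirror vertically (y -> height - y); corners swap so x0 <= x1, y0 <= y1.
    out = []
    for x0, y0, x1, y1 in boxes:
        if width is not None:
            x0, x1 = width - x1, width - x0
        if height is not None:
            y0, y1 = height - y1, height - y0
        out.append([x0, y0, x1, y1])
    return out
```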
Rotates the bounding boxes of an image anti-clockwise. Code example:
{
  "rotation": {
    "theta": 0.5
  }
}
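Rotating a box by theta amounts to rotating its corner points around the rotation centre. A sketch of that point rotation in standard y-up coordinates (in image coordinates, where y grows downward, the visual direction is reversed):

```python
import math

def rotate_point(x, y, cx, cy, theta):
    # Anti-clockwise rotation of (x, y) around (cx, cy) by theta radians,
    # using the standard 2-D rotation matrix.
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(theta) - dy * math.sin(theta),
            cy + dx * math.sin(theta) + dy * math.cos(theta))
```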
Draws random squares of a specific color and size in the area of the bounding box. Code example:
{
  "jitterBoxes": {
    "size": [10, 10],
    "quantity": 5,
    "color": [255, 255, 255]
  }
}
Sets pixels inside the bounding box to a given color depending on a probability p drawn from a normal distribution. If p > threshold, the pixel is changed. Code example:
{
  "dropout": {
    "size": [5, 5],
    "threshold": 0.8,
    "color": [255, 255, 255]
  }
}
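A NumPy sketch of that thresholding rule on a grayscale region (the standard normal distribution and the per-pixel sampling are assumptions; impy's sampling may differ):

```python
import numpy as np

def dropout_region(region: np.ndarray, threshold: float, color, seed=None) -> np.ndarray:
    # Draw one value p per pixel from a standard normal distribution and
    # overwrite the pixel with `color` wherever p > threshold.
    rng = np.random.default_rng(seed)
    p = rng.normal(size=region.shape[:2])
    out = region.copy()
    out[p > threshold] = color
    return out
```

A high threshold leaves the region mostly untouched; a low one wipes most of it to the given color.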
If you want to contribute to this library, please follow these steps to set up a development environment:
conda create --name impy python=3.7
source activate impy
git clone https://github.com/lozuwa/impy
cd impy
python setup.py sdist bdist_wheel