Fast image augmentation library and an easy-to-use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
- ToTensorV2 (#963)
- Added position argument to PadIfNeeded (#933 by @yisaienkov). Possible values: center, top_left, top_right, bottom_left, bottom_right, with center being the default value.
One possible use case for this feature is object detection, where you need to pad an image to a square, but you want the predicted bounding boxes to be equal to the bounding boxes of the unpadded image.
- The imgaug dependency is now optional, and by default Albumentations won't install it. This change was necessary to prevent a simultaneous install of both opencv-python-headless and opencv-python (you can read more about the problem in this issue). If you still need imgaug as a dependency, you can use the pip install -U albumentations[imgaug] command to install Albumentations with imgaug.
- ToTensor, which converts NumPy arrays to PyTorch tensors, is completely removed from Albumentations. You will get a RuntimeError exception if you try to use it. Please switch to ToTensorV2 in your pipelines.
- A.RandomToneCurve. See a notebook for examples of this augmentation (#839 by @aaroswings).
- SafeRotate. Safely rotate images without cropping (#888 by @deleomike).
- SomeOf transform that applies N augmentations from a list. A generalization of OneOf (#889 by @henrique).

By default, Albumentations doesn't require imgaug as a dependency. But if you need imgaug, you can install it along with Albumentations by running pip install -U albumentations[imgaug].
Here is a table of deprecated imgaug
augmentations and respective augmentations from Albumentations that you should use instead:
| Old deprecated augmentation | New augmentation |
|---|---|
| IAACropAndPad | CropAndPad |
| IAAFliplr | HorizontalFlip |
| IAAFlipud | VerticalFlip |
| IAAEmboss | Emboss |
| IAASharpen | Sharpen |
| IAAAdditiveGaussianNoise | GaussNoise |
| IAAPerspective | Perspective |
| IAASuperpixels | Superpixels |
| IAAAffine | Affine |
| IAAPiecewiseAffine | PiecewiseAffine |
Serialization logic is updated. Previously, Albumentations used the full classpath to identify an augmentation (e.g. albumentations.augmentations.transforms.RandomCrop
). With the updated logic, Albumentations will use only the class name for augmentations defined in the library (e.g., RandomCrop
). For custom augmentations created by users and not distributed with Albumentations, the library will continue to use the full classpath to avoid name collisions (e.g., when a user creates a custom augmentation named RandomCrop and uses it in a pipeline).
This new logic will allow us to refactor the code without breaking serialized augmentation pipelines created using previous versions of Albumentations. This change will also reduce the size of YAML and JSON files with serialized data.
The new serialization logic is backward compatible. You can load serialized augmentation pipelines created in previous versions of Albumentations because Albumentations supports the old format.
- A.ReplayCompose is fixed to work with bounding boxes and keypoints correctly. (#748)
- A.GlassBlur now correctly works with float32 inputs. (#826)
- MultiplicativeNoise now correctly works with grayscale images with shape [h, w, 1]. (#793)
- Geometric transforms are moved to albumentations.augmentations.geometric. (#784)
- Crop transforms are moved to albumentations.augmentations.crops. (#791)
- Logic in setup.py that detects existing installations of OpenCV now also looks for opencv-contrib-python and opencv-contrib-python-headless. (#837 by @agchang-cgl)
- Images with the shape [H, W] are now automatically expanded to the shape [H, W, 1]. PR #604 by @Ingwar.
- Multiple masks can now be passed via the masks argument to the transform function. Previously only a single mask provided by the mask argument was supported. PR #761
- The API of A.FDA
is changed to resemble the API of A.HistogramMatching. Now both transformations expect to receive a list of reference images, a function to read those images, and additional augmentation parameters. (#734)
- A.HistogramMatching now uses read_rgb_image as the default read_fn. This function reads an image from disk as an RGB NumPy array. Previously, the default read_fn was cv2.imread, which reads an image as a BGR NumPy array. (#734)
- Added the A.Sequential
transform that can apply augmentations in a sequence. This transform is not intended to be a replacement for A.Compose. Instead, it should be used inside A.Compose in the same way as A.OneOf or A.OneOrOther. For instance, you can combine A.OneOf with A.Sequential to create an augmentation pipeline containing multiple sequences of augmentations and apply one randomly chosen sequence to input data. (#735)
- A.ShiftScaleRotate
now has two additional optional parameters: shift_limit_x and shift_limit_y. If either of those parameters (or both of them) is set, A.ShiftScaleRotate will use the set values to shift images along the respective axis. (#735)
- A.ToTensorV2 now supports an additional argument transpose_mask (False by default). If the argument is set to True and an input mask has 3 dimensions, A.ToTensorV2 will transpose dimensions of the mask tensor in addition to transposing dimensions of the image tensor. (#735)
- A.FDA
now correctly uses coordinates of the center of an image. (#730)
- Fixed a bug in A.HistogramMatching. (#734)
- Fixed an error that occurred when A.load() was called to deserialize a pipeline that contained A.ToTensor or A.ToTensorV2, but those transforms were not imported in the code before the call. (#735)
- A.FDA
transform for Fourier-based domain adaptation. (#685)
- A.HistogramMatching transform that applies histogram matching. (#708)
- A.ColorJitter transform that behaves similarly to ColorJitter from torchvision (though there are some minor differences due to different internal logic for working with HSV colorspace in Pillow, which is used in torchvision, and OpenCV, which is used in Albumentations). (#705)
- A.PadIfNeeded now accepts additional pad_width_divisor and pad_height_divisor arguments (None by default) to ensure that the image has a width and height that are divisible by the given values. (#700)
- A.CoarseDropout can now be applied to masks via mask_fill_value. (#699)
- A.GaussianBlur now supports the sigma parameter that sets the standard deviation for the Gaussian kernel. (#674, #673)
- Fixed A.HueSaturationValue for float dtype. (#696, #710)
- Fixed handling of bounding boxes in the YOLO format. (#688)
- ReplayCompose is now serializable. PR #623 by IlyaOvodov
- Previously, bounding boxes and keypoints that moved outside of the image were not removed after augmentations that pad images (e.g., PadIfNeeded). That happened because Albumentations checked which bounding boxes and keypoints lie outside the image only after applying all augmentations. Now Albumentations will check and remove keypoints and bounding boxes that lie outside of the image after each augmentation. If, for some reason, you need the old behavior, pass check_each_transform=False in your KeypointParams or BboxParams. Issue #565 and PR #566.
- Fixes for ImageCompression and GaussNoise. PR #569
- Support for label_fields in BboxParams. PR #504 by IlyaOvodov
- New transforms
- New features
- Improvements
- fill_value to Cutout
- fill_value for image and mask targets
- Bug Fixes
- Documentation Updated
https://github.com/albu/albumentations/commit/2e25667f8c39eba3e6be0e85719e5156422ee9a9 Target: image
This transform mimics the noise that images will have if the ISO parameter of the camera is high. Wiki
https://github.com/albu/albumentations/commit/e365b52df6c6535a1bf06733b607915231f2f9d4 Targets: image
Solarize inverts all pixels above some threshold. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.
https://github.com/albu/albumentations/commit/9f71038c95c4124bdaf3ee13a9823225bb8d85da Target: image
Equalizes image histogram. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.
https://github.com/albu/albumentations/commit/ad95fa005fd5325deb73461bfb6e543fca342f45 Target: image
Reduce the number of bits for each pixel. It is an essential part of the work AutoAugment: Learning Augmentation Policies from Data.
https://github.com/albu/albumentations/commit/b6127864d45cfa5b5299578d309680baa0ce7aa3 Target: image
Decreases Jpeg or WebP compression of the image.
https://github.com/albu/albumentations/commit/df831d6605140e7aa013deab6012d85af9854be3 Target: image
Decreases image quality by downscaling and upscaling back.
https://github.com/albu/albumentations/commit/4dbe41e8795c7b7d48e0cc4501efe8046e21765b Targets: image, mask, bboxes, keypoints
Crops the given image to a random size and aspect ratio. This transform is an essential part of many image classification pipelines and is very popular for ImageNet classification.
It has the same API as RandomResizedCrop in torchvision.
https://github.com/albu/albumentations/commit/4cf6c36bc2332729d91e44f58f18f44b66db3c6f Targets: image, mask
Partitions an image into tiles, shuffles them, and merges them back.
Targets: image, mask, bboxes, keypoints
Crops an area with a mask if the mask is non-empty; otherwise makes a random crop.
https://github.com/albu/albumentations/commit/a5026800d84c6c1998f224b86dedbf3f005ae994 Targets: image, mask
Converts image and mask to torch.Tensor.
https://github.com/albu/albumentations/commit/d05db9e9aae6b7607c33c4cdce69be011c2f8802
The YOLO format of a bounding box is [x, y, width, height], where the values are normalized to the size of the image. Example: [0.3, 0.1, 0.05, 0.07]
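Here x and y are the normalized coordinates of the box center, so the example box maps to pixel corners as follows (the image size is an arbitrary assumption):

```python
# YOLO box [x_center, y_center, width, height], all normalized to image size
x_c, y_c, w, h = 0.3, 0.1, 0.05, 0.07
img_w, img_h = 640, 480

# Convert to pixel corner coordinates (pascal_voc style)
x_min = (x_c - w / 2) * img_w   # 176.0
y_min = (y_c - h / 2) * img_h   # 31.2
x_max = (x_c + w / 2) * img_w   # 208.0
y_max = (y_c + h / 2) * img_h   # 64.8
```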
https://github.com/albu/albumentations/commit/9942689f9846c59006c80718ee8db38e02ee2104
An augmentation pipeline has a lot of randomness, which is hard to debug. We added Deterministic / Replay mode, in which you can track what parameters were applied to the input and use precisely the same transform for another input if necessary.
Jupyter notebook with an example.
Added fill_value to the Cutout transform. https://github.com/albu/albumentations/commit/d85bab59eb8ccb0a2fec86750f94173e18e86395
Added fill_value for images and masks. https://github.com/albu/albumentations/commit/2c1a1485f690b4e8ead50f5bb29d3838fbbc177d
One of the use cases is to use a mask_value that is equal to the ignore_index of your loss. This will decrease the level of noise and may improve convergence.
https://github.com/albu/albumentations/commit/c3cc277f37b172bebf7177c779a7cf3cdf7120d3
3.2 times faster for uint8 images.
https://github.com/albu/albumentations/commit/448761df9a008384cf914343f25e3cfb7c4d7551
2 times faster for uint8 images.
https://github.com/albu/albumentations/commit/4e12c6ec3e55cf79cf242a09c5cdc813bcfc6401
2.7 times faster for uint8 images.
https://github.com/albu/albumentations/commit/ac499d0365bfb2494cb535e82591fc3460d4595a
4 times faster for uint8 images.
https://github.com/albu/albumentations/commit/c028a9557cc960da11720a0a505a19cdd4fe0b24
https://github.com/albu/albumentations/commit/30a3f3024dc34597307c466a6307e2e6d27e9d3e Not all spatial transforms have keypoint support yet. In this release, we added it to Crop, CropNonEmptyMaskIfExists, LongestMaxSize, RandomCropNearBBox, Resize, SmallestMaxSize, and Transpose.
We are delighted that Albumentations is helpful to the academic community. We extended the documentation with a page that lists all papers and preprints that cite Albumentations in their work. This page is automatically generated by parsing Google Scholar. At this moment, this number is 24.
We are delighted that Albumentations helps people get top results in machine learning competitions at Kaggle and other platforms. We added a "Hall of Fame" where people can share their achievements. This page is created manually. We encourage people to add more information about their results with pull requests, following the contributing guide.
@albu @Dipet @creafz @BloodAxe @ternaus @vfdev-5 @arsenyinfo @qubvel @toshiks @Jae-Hyuck @BelBES @alekseynp @timeous @jveitchmichaelis @bfialkoff
Augmentation pipelines can now be serialized to json and yaml files, and they will be deserialized and used in the code.
Jupyter notebook with an example
Special thanks to @creafz
Special thanks to @vfdev-5 @ternaus @BloodAxe @kirillbobyrev
Added the fill_value parameter to CutOut.
Special thanks to @qubvel @ternaus @albu @BloodAxe