Releases: albumentations-team/albumentations
Albumentations 1.4.24 Release Notes
- Support Our Work
- Core
- Transforms
- Bugfixes
Support Our Work
- Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
- Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
- Join Our Community - Have suggestions or ran into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.
Core
- Added new keypoints format xyz for ImageOnly and Dual transforms (the z coordinate stays unchanged)
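A minimal sketch of how the xyz format can be wired up (the transform chosen here is illustrative; any ImageOnly or Dual transform works):
import numpy as np
import albumentations as A

# Keypoints in "xyz" format: 2D transforms move x and y, z passes through unchanged.
transform = A.Compose(
    [A.HorizontalFlip(p=1.0)],
    keypoint_params=A.KeypointParams(format="xyz"),
)
image = np.zeros((100, 100, 3), dtype=np.uint8)
keypoints = [(20.0, 30.0, 5.0)]  # (x, y, z)
result = transform(image=image, keypoints=keypoints)
print(result["keypoints"])  # x is flipped, z stays 5.0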
Transforms
New transform AtLeastOneBBoxRandomCrop
Crops an area from the image while ensuring that at least one bounding box is present in the crop.
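A hedged usage sketch; the height/width crop-size arguments are assumed to mirror the other crop transforms:
import numpy as np
import albumentations as A

transform = A.Compose(
    [A.AtLeastOneBBoxRandomCrop(height=256, width=256, p=1.0)],  # size args assumed
    bbox_params=A.BboxParams(format="pascal_voc", label_fields=["labels"]),
)
image = np.zeros((512, 512, 3), dtype=np.uint8)
# The 256x256 crop is guaranteed to contain at least one of the boxes.
result = transform(image=image, bboxes=[(50, 60, 120, 140)], labels=[1])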
Improvements
- SmallestMaxSize: Added option for separate max_size for height/width
- LongestMaxSize: Added option for separate max_size for height/width
- Added keypoints support to: CenterCrop3D, CoarseDropout3D, CubicSymmetry, Pad3D, PadIfNeeded3D, RandomCrop3D (by @ternaus)
Bugfixes
- Do not import eval-type-backport for python 3.10 and older, by @PerchunPak
- Bugfix in ToTensorV2 by @matejpekar
Albumentations 1.4.23 Release Notes
- Support Our Work
- Core
- Transforms
- Bugfixes
Support Our Work
- Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
- Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
- Join Our Community - Have suggestions or ran into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.
Core
Target images as numpy array
Compose now supports numpy arrays with shape (num_images, height, width, num_channels) or (num_images, height, width) as images.
- Ideal for video processing applications
- The same transform applies to all images in the array
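For example (a minimal sketch, assuming the images target can be passed on its own):
import numpy as np
import albumentations as A

transform = A.Compose([A.HorizontalFlip(p=1.0)])

# A stack of video frames: (num_images, height, width, num_channels)
frames = np.random.randint(0, 256, (8, 256, 256, 3), dtype=np.uint8)
result = transform(images=frames)  # the same flip is applied to every frame
transformed_frames = result["images"]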
New 3D Data Support
- volume: (depth, height, width) or (depth, height, width, num_channels)
- mask3d: (depth, height, width) or (depth, height, width, num_channels)
- volumes: (num_volumes, depth, height, width) for batch processing
- masks3d: (num_volumes, depth, height, width) for batch processing
import numpy as np
import albumentations as A

transform = A.Compose([A.RandomCrop3D(size=(64, 128, 128), p=1.0)])  # example 3D pipeline
volume = np.random.rand(96, 256, 256)  # Your 3D medical volume
mask = np.zeros((96, 256, 256))  # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']
Transforms
Added 3D transforms by @ternaus
Padding & Cropping
- Pad3D: Pad 3D volumes with flexible padding options
- PadIfNeeded3D: Conditional padding to meet minimum dimensions or divisibility requirements
- CenterCrop3D: Center cropping for 3D volumes
- RandomCrop3D: Random cropping of 3D volumes
import numpy as np
import albumentations as A

transform = A.Compose([
# Crop volume to a fixed size for memory efficiency
A.RandomCrop3D(size=(64, 128, 128), p=1.0),
# Randomly remove cubic regions to simulate occlusions
A.CoarseDropout3D(
num_holes_range=(2, 6),
hole_depth_range=(0.1, 0.3),
hole_height_range=(0.1, 0.3),
hole_width_range=(0.1, 0.3),
p=0.5
),
])
volume = np.random.rand(96, 256, 256) # Your 3D medical volume
mask = np.zeros((96, 256, 256)) # Your 3D segmentation mask
transformed = transform(volume=volume, mask3d=mask)
transformed_volume = transformed['volume']
transformed_mask = transformed['mask3d']
Augmentation
- CoarseDropout3D: Random cuboid dropout regions for occlusion simulation
- CubicSymmetry: 48 possible cube symmetry transformations (24 rotations + 24 rotoreflections)
Fixes
- Added flexible brightness in RandomSunFlare by @momincks
- Bugfix in CenterCrop, RandomCrop by @iRyoka
- Fix in Normalize docstring by @mennohofste
Albumentations 1.4.22 Release Notes
- Support Our Work
- Transforms
- Core
- Bugfixes
Support Our Work
- Help Us Grow - If you find value in Albumentations, consider becoming a sponsor. Every contribution, no matter the size, helps us maintain and improve the library for everyone.
- Show Your Support - If you enjoy using Albumentations, consider giving us a ⭐ on GitHub. It helps others discover the library and motivates our team.
- Join Our Community - Have suggestions or ran into issues? We welcome your input! Share your experience in our GitHub issues or connect with us on Discord.
Transforms
Elastic Transform
- Added argument noise_distribution that allows sampling displacement fields from gaussian or uniform distributions.
- Deprecated parameters border_mode, value, mask_value - you can still specify them, but they will have no effect.
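For example (the alpha/sigma values are illustrative):
import albumentations as A

# Sample the displacement field from a uniform distribution instead of a gaussian.
transform = A.ElasticTransform(alpha=50, sigma=5, noise_distribution="uniform", p=1.0)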
New transform ShotNoise
Apply shot noise to the image by modeling photon counting as a Poisson process.
Shot noise (also known as Poisson noise) occurs in imaging due to the quantum nature of light.
When photons hit an imaging sensor, they arrive at random times following Poisson statistics.
This transform simulates this physical process in linear light space by:
1. Converting to linear space (removing gamma)
2. Treating each pixel value as an expected photon count
3. Sampling actual photon counts from a Poisson distribution
4. Converting back to display space (reapplying gamma)
The noise characteristics follow real camera behavior:
- Noise variance equals signal mean in linear space (Poisson statistics)
- Brighter regions have more absolute noise but less relative noise
- Darker regions have less absolute noise but more relative noise
- Noise is generated independently for each pixel and color channel
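The four steps above can be sketched directly in numpy (a conceptual illustration of the process, not the library's implementation; the photons_per_unit scale and gamma value are assumptions):
import numpy as np

def shot_noise_sketch(image_uint8, photons_per_unit=60.0, gamma=2.2, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    img = image_uint8.astype(np.float32) / 255.0
    linear = img ** gamma                          # 1. remove gamma
    expected_photons = linear * photons_per_unit   # 2. expected photon count per pixel
    sampled = rng.poisson(expected_photons)        # 3. Poisson-distributed counts
    noisy_linear = np.clip(sampled / photons_per_unit, 0.0, 1.0)
    return ((noisy_linear ** (1.0 / gamma)) * 255).astype(np.uint8)  # 4. reapply gamma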
RandomGridShuffle
Added support for bounding boxes
CoarseDropout
Added an option to inpaint holes using inpaint_ns and inpaint_telea from OpenCV
GridDropout
Added an option to inpaint holes using inpaint_ns and inpaint_telea from OpenCV
MaskDropout
Added an option to inpaint holes using inpaint_ns and inpaint_telea from OpenCV
XYMasking
Added an option to inpaint holes using inpaint_ns and inpaint_telea from OpenCV
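The two OpenCV methods referenced above are Navier-Stokes (cv2.INPAINT_NS) and Telea (cv2.INPAINT_TELEA) inpainting. Conceptually, a dropped region is filled roughly like this (a sketch of the underlying OpenCV call, not the transforms' exact code):
import cv2
import numpy as np

image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
hole_mask = np.zeros((256, 256), dtype=np.uint8)
hole_mask[64:96, 64:96] = 255  # region that the dropout transform would remove

filled_ns = cv2.inpaint(image, hole_mask, inpaintRadius=3, flags=cv2.INPAINT_NS)
filled_telea = cv2.inpaint(image, hole_mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)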
New transform TimeReverse
Reverse the time axis of a spectrogram image, also known as time inversion.
Time inversion of a spectrogram is analogous to the random flip of an image,
an augmentation technique widely used in the visual domain. This can be relevant
in the context of audio classification tasks when working with spectrograms.
The technique was successfully applied in the AudioCLIP paper, which extended
CLIP to handle image, text, and audio inputs.
This transform is implemented as a subclass of HorizontalFlip since reversing
time in a spectrogram is equivalent to flipping the image horizontally.
New transform TimeMasking
Apply masking to a spectrogram in the time domain.
This transform masks random segments along the time axis of a spectrogram,
implementing the time masking technique proposed in the SpecAugment paper.
Time masking helps in training models to be robust against temporal variations
and missing information in audio signals.
This is a specialized version of XYMasking configured for time masking only.
For more advanced use cases (e.g., multiple masks, frequency masking, or custom
fill values), consider using XYMasking directly.
New transform FrequencyMasking
Apply masking to a spectrogram in the frequency domain.
This transform masks random segments along the frequency axis of a spectrogram,
implementing the frequency masking technique proposed in the SpecAugment paper.
Frequency masking helps in training models to be robust against frequency variations
and missing spectral information in audio signals.
This is a specialized version of XYMasking configured for frequency masking only.
For more advanced use cases (e.g., multiple masks, time masking, or custom
fill values), consider using XYMasking directly.
It is a specialized version of XYMasking with an API similar to FrequencyMasking from torchaudio.
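A hedged sketch combining the three spectrogram transforms; the time_mask_param and freq_mask_param argument names are assumptions based on the torchaudio-style API mentioned above:
import numpy as np
import albumentations as A

spec_augment = A.Compose([
    A.TimeReverse(p=0.5),                       # horizontal flip of the time axis
    A.TimeMasking(time_mask_param=30, p=0.5),   # argument name assumed (torchaudio-style)
    A.FrequencyMasking(freq_mask_param=20, p=0.5),
])

spectrogram = np.random.rand(128, 512).astype(np.float32)  # (freq bins, time steps)
augmented = spec_augment(image=spectrogram)["image"]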
New Transform Pad
Pad the sides of an image by a specified number of pixels.
Args:
padding (int, tuple[int, int] or tuple[int, int, int, int]): Padding values. Can be:
* int - pad all sides by this value
* tuple[int, int] - (pad_x, pad_y) to pad left/right by pad_x and top/bottom by pad_y
* tuple[int, int, int, int] - (left, top, right, bottom) specific padding per side
This is a generalization of the torchvision transform of the same name.
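For example (the padding values are illustrative):
import albumentations as A

pad_all = A.Pad(padding=10, p=1.0)                # 10 px on every side
pad_xy = A.Pad(padding=(16, 8), p=1.0)            # 16 px left/right, 8 px top/bottom
pad_each = A.Pad(padding=(5, 10, 15, 20), p=1.0)  # left, top, right, bottom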
New Transform Erasing
This is a generalization of the similar torchvision transform.
Randomly erases rectangular regions in an image, following the Random Erasing Data Augmentation technique.
This augmentation helps improve model robustness by randomly masking out rectangular regions in the image,
simulating occlusions and encouraging the model to learn from partial information. It's particularly
effective for image classification and person re-identification tasks.
New Transform AdditiveNoise
Apply random noise to image channels using various noise distributions.
This transform generates noise using different probability distributions and applies it
to image channels. The noise can be generated in three spatial modes and supports
multiple noise distributions, each with configurable parameters.
Args:
noise_type: Type of noise distribution to use. Options:
- "uniform": Uniform distribution, good for simple random perturbations
- "gaussian": Normal distribution, models natural random processes
- "laplace": Similar to Gaussian but with heavier tails, good for outliers
- "beta": Flexible bounded distribution, can be symmetric or skewed
spatial_mode: How to generate and apply the noise. Options:
- "constant": One noise value per channel, fastest
- "per_pixel": Independent noise value for each pixel and channel, slowest
- "shared": One noise map shared across all channels, medium speed
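A hedged sketch using only the two documented arguments; distribution-specific parameters are left at their defaults, which is an assumption:
import albumentations as A

noise = A.AdditiveNoise(noise_type="gaussian", spatial_mode="per_pixel", p=1.0)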
Sharpen
Added 'gaussian' method for image sharpening.
New transform SaltAndPepper
Apply salt and pepper noise to the input image.
Salt and pepper noise is a form of impulse noise that randomly sets pixels to either maximum value (salt)
or minimum value (pepper). The amount and proportion of salt vs pepper noise can be controlled.
New transform PlasmaBrightnessContrast
Apply plasma fractal pattern to modify image brightness and contrast.
This transform uses the Diamond-Square algorithm to generate organic-looking fractal patterns
that are then used to create spatially-varying brightness and contrast adjustments.
The result is a natural-looking, non-uniform modification of the image.
New Transform PlasmaShadow
Albumentations 1.4.21 Release Notes
- Support Our Work
- Transforms
- Core
- Benchmark
- Speedups
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
Transforms
Auto padding in crops
Added an option to pad the image if the requested crop size is larger than the image size
Old way
[
A.PadIfNeeded(min_height=1024, min_width=1024, p=1),
A.RandomCrop(height=1024, width=1024, p=1)
]
New way:
A.RandomCrop(height=1024, width=1024, p=1, pad_if_needed=True)
Works for:
You may also use it to pad an image to a desired size.
Core
Random state
Now the random state of the pipeline does not depend on the global random state
Before
random.seed(seed)
np.random.seed(seed)
transform = A.Compose(...)
Now
transform = A.Compose(seed=seed, ...)
or
transform = A.Compose(...)
transform.set_random_seed(seed)
Saving used parameters
Now you can get exact parameters that were used in the pipeline on a given sample with
transform = A.Compose(save_applied_params=True, ...)
result = transform(image=image, bboxes=bboxes, mask=mask, keypoints=keypoints)
print(result["applied_transforms"])
Benchmark
Moved benchmark to a separate repo
https://github.com/albumentations-team/benchmark/
Current result for uint8 images:
| Transform | albumentations 1.4.20 | augly 1.0.0 | imgaug 0.4.0 | kornia 0.7.3 | torchvision 0.20.0 |
|---|---|---|---|---|---|
| HorizontalFlip | 8325 ± 955 | 4807 ± 818 | 6042 ± 788 | 390 ± 106 | 914 ± 67 |
| VerticalFlip | 20493 ± 1134 | 9153 ± 1291 | 10931 ± 1844 | 1212 ± 402 | 3198 ± 200 |
| Rotate | 1272 ± 12 | 1119 ± 41 | 1136 ± 218 | 143 ± 11 | 181 ± 11 |
| Affine | 967 ± 3 | - | 774 ± 97 | 147 ± 9 | 130 ± 12 |
| Equalize | 961 ± 4 | - | 581 ± 54 | 152 ± 19 | 479 ± 12 |
| RandomCrop80 | 118946 ± 741 | 25272 ± 1822 | 11503 ± 441 | 1510 ± 230 | 32109 ± 1241 |
| ShiftRGB | 1873 ± 252 | - | 1582 ± 65 | - | - |
| Resize | 2365 ± 153 | 611 ± 78 | 1806 ± 63 | 232 ± 24 | 195 ± 4 |
| RandomGamma | 8608 ± 220 | - | 2318 ± 269 | 108 ± 13 | - |
| Grayscale | 3050 ± 597 | 2720 ± 932 | 1681 ± 156 | 289 ± 75 | 1838 ± 130 |
| RandomPerspective | 410 ± 20 | - | 554 ± 22 | 86 ± 11 | 96 ± 5 |
| GaussianBlur | 1734 ± 204 | 242 ± 4 | 1090 ± 65 | 176 ± 18 | 79 ± 3 |
| MedianBlur | 862 ± 30 | - | 813 ± 30 | 5 ± 0 | - |
| MotionBlur | 2975 ± 52 | - | 612 ± 18 | 73 ± 2 | - |
| Posterize | 5214 ± 101 | - | 2097 ± 68 | 430 ± 49 | 3196 ± 185 |
| JpegCompression | 845 ± 61 | 778 ± 5 | 459 ± 35 | 71 ± 3 | 625 ± 17 |
| GaussianNoise | 147 ± 10 | 67 ± 2 | 206 ± 11 | 75 ± 1 | - |
| Elastic | 171 ± 15 | - | 235 ± 20 | 1 ± 0 | 2 ± 0 |
| Clahe | 423 ± 10 | - | 335 ± 43 | 94 ± 9 | - |
| CoarseDropout | 11288 ± 609 | - | 671 ± 38 | 536 ± 87 | - |
| Blur | 4816 ± 59 | 246 ± 3 | 3807 ± 325 | - | - |
| ColorJitter | 536 ± 41 | 255 ± 13 | - | 55 ± 18 | 46 ± 2 |
| Brightness | 4443 ± 84 | 1163 ± 86 | - | 472 ± 101 | 429 ± 20 |
| Contrast | 4398 ± 143 | 736 ± 79 | - | 425 ± 52 | 335 ± 35 |
| RandomResizedCrop | 2952 ± 24 | - | - | 287 ± 58 | 511 ± 10 |
| Normalize | 1016 ± 84 | - | - | 626 ± 40 | 519 ± 12 |
| PlankianJitter | 1844 ± 208 | - | - | 813 ± 211 | - |
Speedups
- Speedup in PlankianJitter in uint8 mode
- Replaced cv2.addWeighted with wsum from the simsimd package
Albumentations 1.4.20 Release Notes
Hotfix version.
- Fix in check_version
- Fix in PiecewiseAffine
- Fix in RandomSizedCrop and RandomResizedCrop
- Fix in RandomOrder
Albumentations 1.4.19 Release Notes
- Support Our Work
- Transforms
- Core
- Bug Fixes
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
Transforms
Added mask_interpolation to all transforms that use mask interpolation (usage sketch after the list), including:
- RandomSizedCrop
- RandomResizedCrop
- RandomSizedBBoxSafeCrop
- CropAndPad
- Resize
- RandomScale
- LongestMaxSize
- SmallestMaxSize
- Rotate
- SafeRotate
- OpticalDistortion
- GridDistortion
- ElasticTransform
- Perspective
- PiecewiseAffine
by @ternaus
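For example, a minimal sketch of the per-transform parameter:
import cv2
import albumentations as A

# Bilinear interpolation for the image, nearest-neighbor for the mask labels.
resize = A.Resize(height=512, width=512,
                  interpolation=cv2.INTER_LINEAR,
                  mask_interpolation=cv2.INTER_NEAREST)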
Core
- Minimal supported Python version is 3.9
- Removed dependency on scikit-image
- Updated the random number generator from np.random.RandomState to np.random.Generator. The latter is 50% faster, giving speedups in all transforms that heavily use the random generator
- Where possible, moved from cv2.LUT to stringzilla lut
- Added parameter mask_interpolation to Compose that overrides the mask interpolation value in all transforms in that Compose. You can now use the more accurate cv2.INTER_NEAREST_EXACT for semantic segmentation, and work with depth and heatmap estimation using cubic, area, linear, etc. (see the sketch below)
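A minimal sketch of the Compose-level override described above:
import cv2
import albumentations as A

# Every transform in the pipeline interpolates masks with INTER_NEAREST_EXACT,
# which keeps segmentation label values intact.
transform = A.Compose(
    [A.RandomScale(scale_limit=0.2, p=1.0), A.Resize(height=512, width=512)],
    mask_interpolation=cv2.INTER_NEAREST_EXACT,
)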
BugFixes
- Bugfix in ISONoise
- Bugfix: Ensure that transforms masks are contiguous arrays, by @Callidior
- Bugfix in Solarize
- Bugfix in bounding box filtering
- Bugfix in OpticalDistortion
- Bugfix in balanced scale in Affine
Albumentations 1.4.18 Release Notes
- Support Our Work
- Transforms
- Core
- Deprecations
- Bugfixes
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
Transforms
GridDistortion
Added support for keypoints
GridDropout
Added support for keypoints and bounding boxes
GridElasticDeform
Added support for keypoints and bounding boxes
MaskDropout
Added support for keypoints and bounding boxes
Morphological
Added support for bounding boxes and keypoints
OpticalDistortion
Added support for keypoints
PixelDropout
Added support for keypoints and bounding boxes
XYMasking
Added support for bounding boxes and keypoints
Core
Added support for masks as numpy arrays of the shape (num_masks, height, width)
Now you can apply transforms to masks as:
masks = np.zeros((3, 256, 256), dtype=np.uint8)  # (num_masks, height, width)
transformed = transform(image=image, masks=masks)
Deprecations
Removed MixUp as it was doing almost exactly the same as TemplateTransform
Bugfixes
- Bugfix in RandomFog
- Bugfix in PlankianJitter
- Several people reported an issue with masks passed as a list of numpy arrays. I guess it was fixed as part of some other work, as I cannot reproduce it. Just in case, tests were added for that case.
Albumentations 1.4.17 Release Notes
- Support Our Work
- Transforms
- Core
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
Transforms
CoarseDropout
- Added bounding box support
- remove_invisible=False keeps keypoints
by @ternaus
ElasticTransform
Added support for keypoints by @ternaus
Core
Added RandomOrder Compose
Select N transforms to apply. Selected transforms will be called in random order with force_apply=True.
Transform probabilities will be normalized to sum to 1, so in this case they work as weights.
This transform is like SomeOf, but the transforms are called in random order.
It will not replay the random order in ReplayCompose.
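A hedged sketch; the n argument for how many transforms to select is assumed to mirror SomeOf:
import albumentations as A

random_order = A.RandomOrder(
    [A.HorizontalFlip(p=0.5), A.RandomBrightnessContrast(p=0.3), A.Blur(p=0.2)],
    n=2,    # how many transforms to select; assumed to follow the SomeOf API
    p=1.0,
)
# The per-transform probabilities (0.5, 0.3, 0.2) act as selection weights.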
Albumentations 1.4.16 Release Notes
- Support Our Work
- UI Tool
- Transforms
- Improvements and Bug Fixes
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
UI Tool
For visual debugging, we wrote a tool that lets you visually inspect the effects of augmentations on an image.
You can find it at https://explore.albumentations.ai/
- Works for all ImageOnly transforms
- Authorized users can upload their own images
It is a work in progress. It is not stable and polished yet, but if you have feedback or proposals, just write in the Discord server mentioned above.
Transforms
- Updated and extended docstrings in all ImageOnly transforms.
- All ImageOnly transforms support both uint8 and float32 inputs
RandomSnow
Added texture method to RandomSnow
RandomSunFlare
Added physics_based method to RandomSunFlare
Bugfixes and improvements
- Bugfix in the albucore dependency. Now every Albumentations version is tailored to a specific albucore version. Added a pre-commit hook to automatically check it on every commit.
- Bugfix in the TextImage transform: after rewriting bbox processing in a vectorized form, the transform was failing.
- As part of the work to remove the scikit-image dependency, @momincks rewrote bbox_affine in plain numpy
- Bugfix: unexpectedly, people use bounding boxes that are smaller than 1 pixel. Removed the constraint that a bounding box must be at least 1x1
- Bugfix in bounding box filtering. Now, if all bounding boxes were filtered out, an empty array of shape (0, 4) is returned instead of a generic empty array
Albumentations 1.4.15 Release Notes
- Support Our Work
- UI Tool
- Core
- Transforms
- Improvements and Bug Fixes
Support Our Work
- Love the library? You can contribute to its development by becoming a sponsor for the library. Your support is invaluable, and every contribution makes a difference.
- Haven't starred our repo yet? Show your support with a ⭐! It's just one click away.
- Got ideas or facing issues? We'd love to hear from you. Share your thoughts in our issues or join the conversation on our Discord server
UI Tool
For visual debugging, we wrote a tool that lets you visually inspect the effects of augmentations on an image.
You can find it at https://explore.albumentations.ai/
Right now it supports only a subset of the ImageOnly transforms.
It is a work in progress. It is not stable and polished yet, but if you have feedback or proposals, just write in the Discord server mentioned above.
Core
Bounding box and keypoint processing was vectorized
- You can pass a numpy array to Compose, not only a list of lists.
- Transforms will presumably work faster, but this was not benchmarked.
Transforms
Affine
- Reflection padding now correctly works in Affine and ShiftScaleRotate
CLAHE
- Added support for float32 images
Equalize
- Added support for float32 images
FancyPCA
- Added support for float32 images
- Added support for any number of channels
PixelDistributionAdaptation
- Added support for float32
- Added support for any number of channels
Flip
Still works, but deprecated. It was a very strange transform; I cannot find a use case where you would need it.
It was equivalent to:
OneOf([Transpose(p=1), VerticalFlip(p=1), HorizontalFlip(p=1)], p=1)
Most likely, if you need a transform that does not create artifacts, you should look at:
- Natural images => HorizontalFlip (symmetry group has 2 elements, meaning it will effectively increase your dataset 2x)
- Images that look natural when you vertically flip them => VerticalFlip (symmetry group has 2 elements, meaning it will effectively increase your dataset 2x)
- Images that need to preserve parity, for example texts, but where we may expect rotated documents => RandomRotate90 (symmetry group has 4 elements, meaning it will effectively increase your dataset 4x)
- Images that you can flip and rotate as you wish => D4 (symmetry group has 8 elements, meaning it will effectively increase your dataset 8x)
ToGray
Now you can define the number of output channels in the resulting gray image. All channels will be the same.
Extended the ways one can get a grayscale image. Most of them can work with any number of channels as input:
- weighted_average: Uses a weighted sum of RGB channels (0.299R + 0.587G + 0.114B). Works only with 3-channel images. Provides realistic results based on human perception.
- from_lab: Extracts the L channel from the LAB color space. Works only with 3-channel images. Gives perceptually uniform results.
- desaturation: Averages the maximum and minimum values across channels. Works with any number of channels. Fast but may not preserve perceived brightness well.
- average: Simple average of all channels. Works with any number of channels. Fast but may not give realistic results.
- max: Takes the maximum value across all channels. Works with any number of channels. Tends to produce brighter results.
- pca: Applies Principal Component Analysis to reduce channels. Works with any number of channels. Can preserve more information but is computationally intensive.
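A hedged usage sketch; the num_output_channels and method argument names are assumptions based on the description above:
import albumentations as A

to_gray = A.ToGray(num_output_channels=3, method="desaturation", p=1.0)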
SafeRotate
Now uses Affine under the hood.
Improvements and Bug Fixes
- Bugfix in GridElasticDeform by @4pygmalion
- Speedups in to_float and from_float
- Bugfix in PadIfNeeded. It did not work when empty bounding boxes were passed.