Commit 30d8919a authored by dhansmair, committed by GitHub

In the resize() function in image_transforms.py, line 267: (#20728)

`image = to_channel_dimension_format(image, ChannelDimension.LAST)`
is redundant as this same conversion is also applied in to_pil_image().

This redundant call actually makes the training fail in rare cases.
The problem can be reproduced with the following code snippet:
```
import torch
from transformers.models.clip import CLIPFeatureExtractor

vision_processor = CLIPFeatureExtractor.from_pretrained('openai/clip-vit-large-patch14')
images = [
    torch.rand(size=(3, 2, 10), dtype=torch.float),
    torch.rand(size=(3, 10, 1), dtype=torch.float),
    torch.rand(size=(3, 1, 10), dtype=torch.float)
]
for image in images:
    processed_image = vision_processor(images=image, return_tensors="pt")['pixel_values']
    print(processed_image.shape)
    assert processed_image.shape == torch.Size([1, 3, 224, 224])
```

The last image has a height of 1 pixel, so after the first conversion to
channels-last its shape is (1, 10, 3).
The second call to to_channel_dimension_format(), made inside to_pil_image(),
then transposes the image again, and the height dimension is wrongly treated
as the channels dimension afterwards.
Because of this, the following normalize() step raises an exception.
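
The double conversion can be illustrated on bare shape tuples, without the
library. This is only a sketch: `infer_channel_axis` and `to_channels_last`
are hypothetical helpers, and the heuristic they use (a dimension of size 1
or 3 is taken to be the channel axis, checking the first axis before the
last) is an assumption about how the channel axis is inferred, not the
library's actual code.
```python
# Assumed heuristic (not the library's code): a 3-D shape is "channels first"
# if its first dimension is 1 or 3, else "channels last" if its last one is.
def infer_channel_axis(shape):
    if shape[0] in (1, 3):
        return "first"
    if shape[-1] in (1, 3):
        return "last"
    raise ValueError(f"cannot infer channel axis for shape {shape}")


def to_channels_last(shape):
    """Return the shape after a channels-last conversion (hypothetical helper)."""
    if infer_channel_axis(shape) == "first":
        channels, height, width = shape
        return (height, width, channels)
    return shape  # already treated as channels-last


# Shapes of the three images from the reproduction snippet.
for shape in [(3, 2, 10), (3, 10, 1), (3, 1, 10)]:
    once = to_channels_last(shape)   # conversion done in resize()
    twice = to_channels_last(once)   # same conversion repeated in to_pil_image()
    print(shape, "->", once, "->", twice)
```
Under this assumption, only the image with a height of 1 changes shape on the
second conversion, which matches the failure described above.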
parent 4f1788b3
@@ -271,8 +271,6 @@ def resize(
     # To maintain backwards compatibility with the resizing done in previous image feature extractors, we use
     # the pillow library to resize the image and then convert back to numpy
     if not isinstance(image, PIL.Image.Image):
-        # PIL expects image to have channels last
-        image = to_channel_dimension_format(image, ChannelDimension.LAST)
         image = to_pil_image(image)
     height, width = size
     # PIL images are in the format (width, height)
...