This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

ImageDataGenerator.flow_from_dataframe() with class_mode='binary' can be affected by classname unexpectedly #289

Open
ecoopnet opened this issue May 5, 2020 · 2 comments
Labels
image Related to images

Comments


ecoopnet commented May 5, 2020

In binary class mode, the label values produced by ImageDataGenerator.flow_from_dataframe() depend on the class names, because the class names are sorted automatically, which is unexpected.

https://github.com/keras-team/keras-preprocessing/blob/0494094a3b/keras_preprocessing/image/dataframe_iterator.py#L252

Sorting is fine in categorical mode because the order is not important.

But it is not fine in binary mode, because DataFrameIterator requires 2 classes for binary class mode,
and the single value it generates for each sample depends on the index of that sample's class.

I could not determine whether this is a bug or not, but sorting seems unnecessary in binary mode.
I expected the classes not to be sorted when I pass them via the classes argument.
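
The effect described above can be illustrated without Keras. This is a minimal sketch of how sorting changes a name-to-index mapping; it is not the library's actual code, and `build_class_indices` is a hypothetical helper introduced only for illustration.

```python
def build_class_indices(classes, sort):
    """Map class names to integer indices (hypothetical helper, not Keras code)."""
    # With sort=True the names are ordered alphabetically before indexing,
    # which mirrors the behaviour reported in this issue.
    ordered = sorted(classes) if sort else list(classes)
    return {name: index for index, name in enumerate(ordered)}


classes = ["normal", "abnormal"]

sorted_indices = build_class_indices(classes, sort=True)
unsorted_indices = build_class_indices(classes, sort=False)

print(sorted_indices)    # {'abnormal': 0, 'normal': 1}
print(unsorted_indices)  # {'normal': 0, 'abnormal': 1}
```

In binary mode the index is the label itself, so the sorted mapping makes 'normal' the positive class (1) even though it was listed first.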

Actual

generator = ImageDataGenerator(...)
flow = generator.flow_from_dataframe(class_mode='binary', classes=['normal', 'abnormal'])

print(flow.class_indices)
# it prints: {'abnormal': 0, 'normal': 1}

Expected

generator = ImageDataGenerator(...)
flow = generator.flow_from_dataframe(class_mode='binary', classes=['normal', 'abnormal'])

print(flow.class_indices)
# expected: {'normal': 0, 'abnormal': 1} (equivalently {'abnormal': 1, 'normal': 0})

ecoopnet added the image label (Related to images) May 5, 2020
Author

ecoopnet commented May 5, 2020

My workaround:
Prepend an indexed prefix to each class name to control the class order.

# The first name always maps to 0 and the second to 1, whatever the names are.
classes = ['normal', 'abnormal']

# Add a zero-padded index prefix to each class name.
# You also need to add the same prefix to the values in the dataframe's y_col column.
indexed_classes = ["{:0>3}_{}".format(i, name) for i, name in enumerate(classes)]
flow = generator.flow_from_dataframe(class_mode='binary', classes=indexed_classes)

print(flow.class_indices)
# it prints: {'000_normal': 0, '001_abnormal': 1}
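
The workaround also requires prefixing the label values themselves. The following is a rough sketch of that step in plain Python; the `labels` list is a hypothetical stand-in for the values of the dataframe's y_col column, and `prefix_map` is a name introduced here for illustration.

```python
classes = ["normal", "abnormal"]

# Map each original class name to its prefixed form, e.g. "normal" -> "000_normal".
prefix_map = {name: "{:0>3}_{}".format(i, name) for i, name in enumerate(classes)}

# Hypothetical label values, standing in for the dataframe's y_col column.
labels = ["abnormal", "normal", "normal"]
prefixed_labels = [prefix_map[label] for label in labels]

# Alphabetical sorting of the prefixed names now preserves the intended order.
indexed_classes = sorted(prefix_map.values())

print(prefixed_labels)   # ['001_abnormal', '000_normal', '000_normal']
print(indexed_classes)   # ['000_normal', '001_abnormal']
```

With a pandas dataframe, the same mapping could presumably be applied with `Series.map(prefix_map)` on the label column before calling flow_from_dataframe.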

Contributor

Dref360 commented May 12, 2020

I think this is a bug, PRs are welcome.
