
Unexpected Behavior when Replacing Layers in CustomModel built with Subclass API #19268

Open
ariG23498 opened this issue Mar 8, 2024 · 2 comments

ariG23498 (Contributor) commented on Mar 8, 2024

I have created a CustomModel using the subclassing API:

import keras

class CustomModel(keras.Model):
    def __init__(self, num_classes=10):
        super().__init__()
        self.num_classes = num_classes
        self.stem = keras.layers.Conv2D(32, 3, strides=2, padding='same', name="stem")
        self.head = keras.layers.Dense(num_classes, name="head")

    def call(self, inputs):
        x = self.stem(inputs)
        x = keras.layers.Flatten()(x)
        x = self.head(x)
        return x

I wanted to replace (i.e. swap out) a layer of the custom model with a new layer:

model = CustomModel()
model.head = keras.layers.Dense(100, name="head")

Now when I call model.summary(), I see both layers attached to the model.

model.summary()

Model: "custom_model_3"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ stem (Conv2D)                        │ ?                           │     0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ head (Dense)                         │ ?                           │     0 (unbuilt) │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ head (Dense)                         │ ?                           │     0 (unbuilt) │
└──────────────────────────────────────┴─────────────────────────────┴─────────────────┘
 Total params: 0 (0.00 B)
 Trainable params: 0 (0.00 B)
 Non-trainable params: 0 (0.00 B)
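
For reference, the duplicate tracking is also visible without summary(). A minimal check (my addition, not part of the original report) using the public model.layers property:

model = CustomModel()
model.head = keras.layers.Dense(100, name="head")
# Both Dense layers are still tracked, even though only the most
# recently assigned one is reachable from call():
print([layer.name for layer in model.layers])  # expected: ['stem', 'head', 'head']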

When I take a similar approach with torch.nn.Module, the layer gets swapped out as expected.

import torch

class CustomModel(torch.nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.num_classes = num_classes
        self.stem = torch.nn.Conv2d(32, 32, 3, stride=2)
        self.head = torch.nn.Linear(32, num_classes)

    def forward(self, inputs):
        x = self.stem(inputs)
        x = torch.flatten(x, 1)  # flatten all dims except batch
        x = self.head(x)
        return x

model = CustomModel()
model.head = torch.nn.Linear(32, 100)
print(model)
CustomModel(
  (stem): Conv2d(32, 32, kernel_size=(3, 3), stride=(2, 2))
  (head): Linear(in_features=32, out_features=100, bias=True)
)
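
As a quick sanity check (my addition, not from the original report), PyTorch's replacement semantics can be confirmed by inspecting the registered submodules:

# torch.nn.Module keeps submodules in a dict keyed by attribute name,
# so re-assigning model.head replaces the old entry instead of appending:
print([name for name, _ in model.named_children()])  # expected: ['stem', 'head']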

I know that I can use setattr to set the keras.Model attribute, but I wanted to check whether this is a supported workflow.
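
In the meantime, a possible workaround sketch (my addition, assuming a full rebuild is acceptable): reconstruct the model with the desired head instead of mutating the attribute in place.

# Hypothetical rebuild-based workaround, not an official replacement API:
new_model = CustomModel(num_classes=100)
new_model.build(input_shape=(None, 28, 28, 1))  # example input shape, adjust as needed
# If the old model was already built, the untouched weights can be carried over:
# new_model.stem.set_weights(model.stem.get_weights())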

Any help on this would be greatly appreciated.

My opinion: we should warn the user when they set an attribute of a model (mainly layers). I feel that adding a layer to the model this way (which is not intuitive at all) introduces a silent bug.

SuryanarayanaY (Contributor) commented:

Hi @ariG23498,

Keras won't allow adding new layers once model.build() has been called. Before build() is called, though, it does accept the duplicate layers. This behaviour differs from PyTorch, which swaps out the duplicate layer instead of adding it as a new one. Attached gist for reference.

SuryanarayanaY added the keras-team-review-pending (Pending review by a Keras team member) and backend:tensorflow labels on Mar 11, 2024
ariG23498 (Contributor, Author) commented:

> Hi @ariG23498,
>
> Keras won't allow adding new layers once model.build() has been called. Before build() is called, though, it does accept the duplicate layers. This behaviour differs from PyTorch, which swaps out the duplicate layer instead of adding it as a new one. Attached gist for reference.

Hey @SuryanarayanaY, I did look into the gist. I did not really understand the reason for the following code with the torch backend:

model.build(input_shape=(12,4))
model.head_2 = keras.layers.Dense(10, name="head_2")
model.summary()

AFAIK the CustomModel there is a torch.nn.Module, which does not have a build() method in the first place.

A better side-by-side analysis of the situation would be to run the same code, as follows, with different backends:

class CustomModel(keras.Model):
    def __init__(self, num_classes=10):
        super().__init__()
        self.num_classes = num_classes
        self.stem = keras.layers.Conv2D(32, 3, strides=2, padding='same', name="stem")
        self.head = keras.layers.Dense(num_classes, name="head")

    def call(self, inputs):
        x = self.stem(inputs)
        x = keras.layers.Flatten()(x)
        x = self.head(x)
        return x

model = CustomModel()
model.head = keras.layers.Dense(100, name="head")
model.summary()

model.build(input_shape=(12, 4))
model.head = keras.layers.Dense(100, name="head")
model.summary()

I ran this experiment and saw that all the backends exhibit the same behaviour.
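
For anyone reproducing this, Keras 3 selects its backend from the KERAS_BACKEND environment variable, which must be set before keras is imported. A minimal sketch:

import os
os.environ["KERAS_BACKEND"] = "torch"  # or "tensorflow" / "jax"
import keras  # the backend is fixed at import time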

mattdangerw removed the keras-team-review-pending label on Mar 14, 2024
sachinprasadhs added the stat:awaiting keras-eng (Awaiting response from Keras engineer) label on May 15, 2024