Conversation

jlamypoirier
Collaborator

✨ Description

An attempt at integrating multimodal vision models into main. Still a lot of work to do.

@jlamypoirier jlamypoirier changed the base branch from main to jlp/mlp_block September 26, 2025 20:29
Base automatically changed from jlp/mlp_block to main October 3, 2025 23:18
hint=FieldHint.architecture,
)
# TODO: ====== Appropriate name?? ======
decoder: BlockSequenceConfig = Field(
Collaborator

Encoder

peft=self._peft,
)
# TODO: ====== Appropriate name?? ======
self.decoder = self._config.decoder.get_layer(
Collaborator

Encoder

peft=self._peft,
)
# TODO: ====== Hidden dim ======
self.adapter = self._config.adapter.get_layer(
Collaborator

I don't think we want to make the adapter part of the encoder, because the adapter's tensor shapes depend on the decoder. We also want to mix and match existing pretrained encoders and decoders.

Collaborator Author

It's basically the same with every module: their shapes all need to match. I'm organizing the modules so they manage their internal hidden shapes, while input and output shapes are managed by the parent modules (the `hidden_dim` argument), so in that case it makes sense to keep the adapter here.
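
To make the convention above concrete, here is a minimal sketch of "the config owns internal shapes, the parent supplies boundary shapes". It is not the code from this PR: `AdapterConfig`, `Adapter`, `VisionEncoderConfig`, `build_adapter`, and the `input_dim`/`output_dim` parameter names are all hypothetical, and only the `get_layer` and `hidden_dim` names echo the diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Adapter:
    # Hypothetical layer object; it only records shapes for illustration.
    input_dim: int
    intermediate_dim: int
    output_dim: int

    def weight_shapes(self) -> list[tuple[int, int]]:
        # Two projections: input -> intermediate -> output.
        return [
            (self.input_dim, self.intermediate_dim),
            (self.intermediate_dim, self.output_dim),
        ]


@dataclass
class AdapterConfig:
    # Hypothetical config: only the internal (intermediate) width is stored here.
    intermediate_dim: int

    def get_layer(self, input_dim: int, output_dim: int) -> Adapter:
        # Boundary shapes are passed in by the parent, never stored in the config.
        return Adapter(input_dim, self.intermediate_dim, output_dim)


@dataclass
class VisionEncoderConfig:
    # Hypothetical parent config: owns the encoder hidden size and the adapter config.
    hidden_dim: int
    adapter: AdapterConfig

    def build_adapter(self, decoder_hidden_dim: int) -> Adapter:
        # The parent connects the encoder width to whatever decoder it is paired with.
        return self.adapter.get_layer(self.hidden_dim, decoder_hidden_dim)
```

Under these assumptions the same `AdapterConfig` can be reused with decoders of different widths, since the decoder's hidden size only enters when the parent builds the layer.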

The TODO refers to the MLP assuming matching input and output dimensions. That's an easy fix, but I haven't gotten to it yet.
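
One way such a fix could look, as a rough pure-Python sketch rather than the repository's actual MLP: the class name `MLP`, the optional `output_dim` argument, the ReLU activation, and the initialization are all assumptions made for illustration. Defaulting `output_dim` to `input_dim` keeps the current square behaviour, while an adapter can pass the decoder width explicitly.

```python
import random


class MLP:
    """Toy two-layer MLP whose output width may differ from its input width."""

    def __init__(self, input_dim, intermediate_dim, output_dim=None, seed=0):
        # Assumption: falling back to input_dim preserves the existing square case.
        self.input_dim = input_dim
        self.output_dim = input_dim if output_dim is None else output_dim
        rng = random.Random(seed)
        # w1 has shape (input_dim, intermediate_dim).
        self.w1 = [
            [rng.gauss(0.0, 0.02) for _ in range(intermediate_dim)]
            for _ in range(input_dim)
        ]
        # w2 has shape (intermediate_dim, output_dim).
        self.w2 = [
            [rng.gauss(0.0, 0.02) for _ in range(self.output_dim)]
            for _ in range(intermediate_dim)
        ]

    def forward(self, x):
        if len(x) != self.input_dim:
            raise ValueError(f"expected {self.input_dim} inputs, got {len(x)}")
        # First projection followed by ReLU (the activation is an assumption).
        hidden = [
            max(0.0, sum(xi * w for xi, w in zip(x, column)))
            for column in zip(*self.w1)
        ]
        # Second projection maps to output_dim, which need not equal input_dim.
        return [sum(h * w for h, w in zip(hidden, column)) for column in zip(*self.w2)]
```

For example, `MLP(4, 8, output_dim=6).forward([1.0] * 4)` returns a list of length 6, while `MLP(4, 8)` keeps the output at length 4.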

@jlamypoirier jlamypoirier changed the base branch from main to jlp/language_model_block October 6, 2025 21:11
Base automatically changed from jlp/language_model_block to main October 6, 2025 22:38