Question about the `embed_positions.weight` in Whisper model #2287

bobqianic · 2024-07-08T10:21:50Z

bobqianic
Jul 8, 2024
Collaborator

In the encoder of Whisper models, `encoder.embed_positions.weight` is initialized using sine and cosine functions of different frequencies. In the decoder, however, `decoder.embed_positions.weight` consists of trainable parameters that are initialized with random noise. Why is this the case?


[Image: `encoder.embed_positions.weight`]


[Image: `decoder.embed_positions.weight`]
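
To make the contrast concrete, here is a minimal sketch in plain Python of the two kinds of table described above: a fixed one built from sines and cosines at geometrically spaced frequencies, and a random one that would serve as the starting point for training. The function names, the tiny sizes, the `max_timescale` default, and the `std=0.02` noise scale are assumptions for illustration only; they are not taken from the Whisper source or checkpoints.

```python
import math
import random


def sinusoidal_positions(length, channels, max_timescale=10000.0):
    """Fixed table: each row holds sines in its first half, cosines in its second."""
    assert channels % 2 == 0 and channels >= 4
    half = channels // 2
    # Frequencies form a geometric progression from 1 down to 1 / max_timescale.
    step = math.log(max_timescale) / (half - 1)
    inv_timescales = [math.exp(-step * i) for i in range(half)]
    table = []
    for pos in range(length):
        angles = [pos * w for w in inv_timescales]
        table.append([math.sin(a) for a in angles] + [math.cos(a) for a in angles])
    return table


def random_positions(length, channels, std=0.02, seed=0):
    """Starting point for a learned table: Gaussian noise (std is an assumed value)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, std) for _ in range(channels)] for _ in range(length)]


# Tiny sizes so the tables are easy to inspect by eye.
enc = sinusoidal_positions(length=6, channels=8)
dec = random_positions(length=6, channels=8)

for name, table in (("encoder-style (fixed)", enc), ("decoder-style (random init)", dec)):
    print(name)
    for row in table:
        print("  " + " ".join(f"{v:+.3f}" for v in row))
```

Plotting the first table gives smooth bands, while the second looks like noise until training shapes it, which is the visual difference the two images above illustrate.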

bobqianic · 2024-07-08T12:11:10Z

bobqianic
Jul 8, 2024
Collaborator Author

Maybe...


ggerganov Jul 8, 2024
Maintainer

Haha yes, it's voodoo magic
