Description
Thank you for open-sourcing the ft-transformer repository! I’ve been exploring the project and ran into a few difficulties that I’d like to bring to your attention.
The documentation mentions that ft-transformer uses an Electra-like approach to pretrain the model.transformer. However, the repository doesn’t include an implementation of the pretraining phase, and the description of this process is somewhat unclear, which makes it difficult for users to reproduce the pretraining stage.
Problems
Missing pretraining code: The repository currently doesn’t include any implementation of the pretraining phase, making it hard for users to perform end-to-end pretraining with ft-transformer.
Insufficient documentation: The description of how Electra-style training is applied to the model.transformer lacks clarity and detail. There’s no concrete example or explanation of the method.
Suggestions
It would be helpful to include the pretraining code for Electra-style training on ft-transformer (e.g., the generator and discriminator implementations).
Alternatively, a more detailed explanation in the documentation would be valuable, including:
The definition of the pretraining tasks.
The design of the loss function.
How the generator and discriminator interact.
Example code, pseudocode, or even a high-level flowchart (if the full implementation cannot be shared at this time).
This would help users better understand and reproduce the pretraining process, making the project more accessible and easier to use.
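To make the request more concrete, here is a minimal, hypothetical sketch of the kind of Electra-style "replaced feature detection" setup I have in mind for tabular data. Everything below is my own assumption for illustration, not the repository's implementation: the toy tokenizer and transformer stand in for the real feature tokenizer and model.transformer, and the "generator" is reduced to in-batch value shuffling to keep the example short.

```python
# Hypothetical sketch of Electra-style pretraining for an FT-Transformer-like
# backbone. NOT the repository's code; tokenizer, backbone, corruption scheme,
# and loss are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyFeatureTokenizer(nn.Module):
    """Maps each numerical feature to a d_token embedding
    (stand-in for the repository's feature tokenizer)."""
    def __init__(self, n_features: int, d_token: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_features, d_token))
        self.bias = nn.Parameter(torch.zeros(n_features, d_token))

    def forward(self, x):  # x: (batch, n_features)
        # Broadcast each scalar feature into its own token embedding.
        return x.unsqueeze(-1) * self.weight + self.bias  # (batch, n_features, d_token)


class ElectraStylePretrainer(nn.Module):
    """Discriminator: per-feature binary head predicting whether a feature
    value was replaced by the 'generator' (here: in-batch shuffling)."""
    def __init__(self, n_features: int, d_token: int = 64):
        super().__init__()
        self.tokenizer = ToyFeatureTokenizer(n_features, d_token)
        # Stand-in for model.transformer; the real backbone would go here.
        layer = nn.TransformerEncoderLayer(d_token, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_token, 1)  # one "replaced?" logit per feature

    def corrupt(self, x, p_replace: float = 0.15):
        # "Generator": replace a random subset of cells with values taken from
        # the same column in other rows (empirical marginal), a cheap
        # substitute for a learned Electra generator.
        mask = torch.rand_like(x) < p_replace        # True where a cell is replaced
        shuffled = x[torch.randperm(x.size(0))]      # in-batch row shuffle
        return torch.where(mask, shuffled, x), mask.float()

    def forward(self, x):
        x_corrupt, labels = self.corrupt(x)
        tokens = self.tokenizer(x_corrupt)           # (batch, n_features, d_token)
        hidden = self.backbone(tokens)               # (batch, n_features, d_token)
        logits = self.head(hidden).squeeze(-1)       # (batch, n_features)
        # Per-cell binary cross-entropy: was this feature value replaced?
        return F.binary_cross_entropy_with_logits(logits, labels)


if __name__ == "__main__":
    # Minimal usage example on random data.
    model = ElectraStylePretrainer(n_features=10)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(3):
        x = torch.randn(32, 10)                      # (batch, n_features)
        loss = model(x)
        opt.zero_grad()
        loss.backward()
        opt.step()
        print(f"pretraining loss: {loss.item():.4f}")
```

In this sketch the generator is not learned at all; a faithful Electra reproduction would presumably train a small generator jointly and combine its loss with the discriminator loss, which is exactly the part I’d love to see documented or released.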
Environment Details
ft-transformer version: (please specify the version you are using)
Python version: (please specify your Python version)
Other dependencies: (if relevant, include version details of key dependencies)
Thanks
Thank you for your hard work and contribution! Looking forward to future updates.