
Releases: macrocosm-os/pretraining

Release 3.2.2

15 Jul 12:54
52962cf

Announcing Release 3.2.2

This is a patch release with some minor changes:

Changes

  • Validators Logging: The logging for validators has been redirected to the pretraining-validators project within the Macrocosmos Wandb entity.
  • Repository Version Logging: The repository version is now logged in Wandb along with the validator version. The version entry in the overview tab on the validator logging page in Wandb now refers to the repository version instead of the validator version.

Release 3.2.1

25 Jun 15:53
e2ff6ae

Announcing Release 3.2.1

This bugfix release addresses a critical issue impacting validator performance and stability. We strongly recommend all users update to this version to ensure optimal functionality.

Bugfix

  • Description of the Bug: In the previous release (v3.2.0), all models submitted before block 3256604 were set to be evaluated on RefinedWeb instead of FineWeb-Edu Score-2. This was not the desired behavior: after the specified block, we want all models to be validated on FineWeb-Edu Score-2.
  • Resolution: We have fixed the issue, ensuring that all models are now correctly evaluated on the FineWeb-Edu Score-2 dataset starting from block 3307004, as originally intended, instead of block 3256604 (one week earlier).
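The cutoff described above amounts to a simple block-height check. As a minimal sketch (the function name and structure are hypothetical, not the subnet's actual code; only the block numbers and dataset names come from these notes):

```python
# Block height at which evaluation switches datasets (from this release).
DATASET_SWITCH_BLOCK = 3_307_004

def dataset_for_block(block: int) -> str:
    """Return the evaluation dataset for a given chain block.

    Hypothetical helper illustrating the fix: models are evaluated on
    RefinedWeb before the switch block and on FineWeb-Edu Score-2 from
    the switch block onward.
    """
    if block >= DATASET_SWITCH_BLOCK:
        return "FineWeb-Edu Score-2"
    return "RefinedWeb"
```

Under this logic, a model evaluated at the old cutoff (block 3256604) still lands on RefinedWeb, and the switch only takes effect one week later.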

Validator

  • Validators will now correctly evaluate all models on the FineWeb-Edu Score-2 dataset, as introduced in the previous release.
  • The validator version remains decoupled from the package version, maintaining flexibility for future updates.
  • We have updated package dependencies, pinning Bittensor to version v6.9.3 to enhance the validator setup experience and stability.

We urge everyone to pull this release to update their validators and benefit from the critical bug fixes included in this release.

Release 3.2.0

18 Jun 17:04
cda4dfe

Announcing Release 3.2.0

This release introduces several enhancements to improve the validator's performance and stability. The main contributions are the adoption of a new, higher-quality dataset and a streamlined validator setup process.

Validator

  • Validators will now default to evaluating miners on the FineWeb-Edu (Score-2) dataset. This dataset, recently released and open-sourced by Hugging Face, is larger and of higher quality.
  • Evaluation on this new dataset will take effect starting from block 3256604.
  • The validator version has been decoupled from the package version, allowing for more flexible and independent updates.
  • We have improved the validator setup experience by cleaning up package dependencies.

Other

  • We updated the leaderboard links to point to the new stable version hosted by Macrocosmos, ensuring accurate and up-to-date performance tracking.

We urge everyone to pull this release to update their validators and take advantage of these new features and improvements.

What's Changed

Full Changelog: v3.1.4...v3.2.0

Release 3.1.4

26 Apr 16:32
1285929

Announcing Release 3.1.4

This release contains improvements to help keep VTrust stable across validators and closes a potential exploit in which metadata could be copied directly from the chain within the same block. This is a proactive fix for subnet 9 after observing this attack on subnet 6.

Validators:

  • Validators now allow the metadata hash to match either the Hugging Face repo alone or the Hugging Face repo plus the hotkey that uploaded it.

  • Validators now default to evaluating on 18 pages of Falcon instead of 12.

  • Validators now use weights instead of incentive for shortcutting and retrying top models.

Miners:

  • Using the flag --use_hotkey_in_hash will upload metadata that combines the hash of the Hugging Face repo with the miner's hotkey.

    • We recommend waiting a few days for validators to update before using this new functionality.

    • To prevent an attacker from downloading the repo within 12 seconds and creating their own hash, you can also upload the repo as private initially and make it public after the block has been committed.
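The two accepted hash forms can be sketched as follows. This is a hedged illustration using SHA-256 over the repo id; the subnet may derive its metadata hash differently, and both function names are hypothetical:

```python
import hashlib

def metadata_hash(hf_repo: str, hotkey: str = "") -> str:
    """Hash the Hugging Face repo id, optionally mixed with a hotkey.

    Hypothetical sketch: with --use_hotkey_in_hash, a miner commits a
    hash that binds the repo to its own hotkey, so an attacker copying
    the chain metadata in the same block cannot reuse it for their key.
    """
    return hashlib.sha256((hf_repo + hotkey).encode()).hexdigest()

def validator_accepts(committed: str, hf_repo: str, hotkey: str) -> bool:
    """Accept either hash form, mirroring the validator change in 3.1.4."""
    return committed in (
        metadata_hash(hf_repo),          # repo-only hash (legacy form)
        metadata_hash(hf_repo, hotkey),  # repo + uploader hotkey
    )
```

Because the hotkey-bound form depends on the uploader's key, a committed hash copied by a different hotkey fails the check.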

Release 3.1.3

20 Apr 21:42
51e0294

Force redownload of top models on retry.

Release 3.1.2

19 Apr 14:41
2839383

Validator-only fix to resolve OOM issues.

Release 3.1.0

01 Apr 20:53
2c377cf

Announcing Release 3.1.0

This validator-only release includes several optimizations to the process of checking which models to evaluate next, which will help keep VTrust consistently high for validators.

Validators with auto-update should pick this up automatically within ~15 minutes, although we recommend all validators update when they get the chance.

Release 3.0

28 Mar 01:46
18f0056

This release contains the code to prepare for the move to 7B parameters, as well as concentrating rewards on fewer models and improving the speed at which validators pick up new best models.

In addition to the previously announced change from 8k to 4k sequence length, we have also adjusted the future tokenizer from gpt3_5 to gpt4. To compensate, the block at which these new changes will take effect has been moved out one week to April 15, 2024 ~8:00 AM at block 2,786,061.

To reiterate, the final set of changes that will take effect at that block is:

  • The parameter limit will be raised to 6.9 billion.

  • The size limit for the hugging face repo for the model will be raised to 15 gigabytes.

  • New: The tokenizer used for evaluation will become https://huggingface.co/Xenova/gpt-4.

  • New: The sequence length used for inference will be 4096.

  • When loading the pretrained model for inference the torch_dtype will be bfloat16 and the attn_implementation will be flash_attention_2.

  • New: The list of allowed model types has been adjusted to include new model types (Phi and Gemma) and to remove those that do not support flash attention.
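As a rough sketch, the evaluation settings above translate into constants like the following (illustrative only; the names are placeholders, and in practice the load kwargs would be passed to Hugging Face transformers' from_pretrained, which requires the flash-attn package to be installed):

```python
# Settings taking effect at block 2,786,061 (from this release).
TOKENIZER_REPO = "Xenova/gpt-4"  # tokenizer used for evaluation
SEQUENCE_LENGTH = 4096           # sequence length used for inference
MAX_PARAMS = 6_900_000_000       # raised parameter limit (6.9 billion)
MAX_REPO_BYTES = 15 * 1024**3    # 15 GB repo limit (exact accounting may differ)

# Keyword arguments for loading the pretrained model for inference,
# e.g. AutoModelForCausalLM.from_pretrained(repo, **MODEL_LOAD_KWARGS).
MODEL_LOAD_KWARGS = {
    "torch_dtype": "bfloat16",                   # transformers accepts the string form
    "attn_implementation": "flash_attention_2",  # requires flash-attn to be installed
}
```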

Validators: You should upgrade immediately to align your weight distributions with the new model. You will need to install the new flash-attn requirement; see step 3 of the instructions at: https://github.com/RaoFoundation/pretraining/blob/main/docs/validator.md#prerequisites.

Additionally, you may need to upgrade your machine by April 15, 2024 to support the following requirement changes:

Release 2.3.2

22 Mar 04:49
706f659

Add additional detection for reasonable model generation.

Release 2.3.1

22 Feb 03:26
d2faaec
  • Fixes an issue with the FalconForCausalLM model
  • Increases alpha to ensure validators converge on weights faster