Actions: OpenMOSS/Language-Model-SAEs

Showing runs from all workflows
256 workflow runs

Implement Tensor Parallel Training with Dtensor
Checks #56: Pull request #34 opened by Frankstein73
July 22, 2024 08:31 2m 43s dev
fix(sae): fix transform_to_unit_decoder_norm in tensor parallel
Checks #55: Commit 161954c pushed by Frankstein73
July 22, 2024 08:27 3m 19s dev
Merge branch 'main' into dev
Checks #54: Commit 4fd8c86 pushed by Frankstein73
July 21, 2024 16:24 2m 35s dev
dev
July 21, 2024 16:22 2m 43s
Merge pull request #33 from OpenMOSS/scaling_law
Checks #52: Commit d73d5e2 pushed by StarConnor
July 21, 2024 09:25 2m 36s main
feat(sae): Implement ckpt saving in tensor parallel environment.
Checks #51: Commit 556f6cd pushed by Frankstein73
July 20, 2024 05:02 2m 45s dev
dev
July 18, 2024 17:53 2m 43s
dev
July 18, 2024 13:37 2m 36s
feat(config): add decay ratio
Checks #48: Pull request #33 opened by Hzfinfdu
July 18, 2024 10:03 2m 40s scaling_law
Merge pull request #32 from OpenMOSS/AprilTricks
Checks #47: Commit 900b98c pushed by StarConnor
July 17, 2024 15:47 2m 39s main
April tricks
Checks #46: Pull request #32 synchronize by Hzfinfdu
July 17, 2024 15:35 2m 40s AprilTricks
fix(sae): fix merge bugs
Checks #45: Commit d555c98 pushed by Frankstein73
July 15, 2024 14:05 2m 53s dev
Add bfp16 support to inhibit transforming to fp32 when using llama3
Checks #44: Commit 4a14ae0 pushed by dest1n1s
July 15, 2024 04:47 2m 42s main
Add bfp16 support to inhibit transforming to fp32 when using llama3
Checks #43: Pull request #31 opened by StarConnor
July 14, 2024 13:01 2m 39s tl_dtype
feat: Implement tensor parallelism in SAE using device mesh
Checks #42: Commit ccac63a pushed by Frankstein73
July 14, 2024 11:03 2m 46s dev
Merge branch 'main' into dev
Checks #41: Commit 5bf9e6a pushed by dest1n1s
July 14, 2024 07:13 2m 52s dev
fix: fix bugs of prepend bos during eval and sampling (#30)
Checks #40: Commit 504fbb3 pushed by dest1n1s
July 6, 2024 12:57 2m 35s main
fix bugs of prepend bos during eval and sample
Checks #39: Pull request #30 opened by SmallMelon-L
July 6, 2024 12:41 2m 41s jxwang
fix bugs of prepend bos during eval and sample
Checks #38: Pull request #29 synchronize by SmallMelon-L
July 6, 2024 12:39 2m 49s jxwang
fix bugs of prepend bos during eval and sample
Checks #37: Pull request #29 reopened by SmallMelon-L
July 6, 2024 12:25 2m 40s jxwang
fix bugs of prepend bos during eval and sample
Checks #36: Pull request #29 opened by SmallMelon-L
July 6, 2024 12:19 2m 44s jxwang
feat(HookedTransformer) accelerate inference with flash attention
Checks #35: Commit 0e3d268 pushed by dest1n1s
July 3, 2024 10:23 2m 43s main
Checks
Checks #32: by dest1n1s
July 2, 2024 09:25 2m 36s main