From 9565f3892de28c373024ceda0b26864fe2db61a9 Mon Sep 17 00:00:00 2001
From: xrsrke
Date: Mon, 11 Dec 2023 13:42:13 +0700
Subject: [PATCH] [Readme] Add contributing guideline

---
 CONTRIBUTING.md |  7 +++++++
 README.md       | 10 +++-------
 2 files changed, 10 insertions(+), 7 deletions(-)
 create mode 100644 CONTRIBUTING.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..7f610e9
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,7 @@
+We're building an end-to-end multi-modal MoE that works with 3D parallelism, and we do pre-training in a decentralized way as proposed in the paper [DiLoCo](https://arxiv.org/abs/2311.08105).
+
+If you want to contribute, please check the following links:
+
+- High-priority tasks [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+label%3A%22High+Priority%22)
+- Beginner tasks [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22+label%3A%22good+first+issue%22)
+- All tasks that need help (including both beginner and high-priority tasks) [[link]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22help+wanted%22)

diff --git a/README.md b/README.md
index a897b05..32c3718 100644
--- a/README.md
+++ b/README.md
@@ -14,14 +14,9 @@ We're building a library for an end-to-end framework for **training multi-modal
 - Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism [[link]](https://arxiv.org/abs/1909.08053)
 
+**If you're interested in contributing, check out [[CONTRIBUTING.md]](./CONTRIBUTING.md) [[good first issue]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) [[roadmap]](https://github.com/users/xrsrke/projects/5). Come join us: [[discord link]](https://discord.gg/s9ZS9VXZ3p)**
-⚠️ **The project is actively under development, and we're actively seeking collaborators. Come join us: [[discord link]](https://discord.gg/s9ZS9VXZ3p) [[roadmap]](https://github.com/users/xrsrke/projects/5) [[good first issue]](https://github.com/xrsrke/pipegoose/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22)**
-
-⚠️ **The APIs is still a work in progress and could change at any time. None of the public APIs are set in stone until we hit version 0.6.9.**
-
-⚠️ **Currently only parallelize `bloom-560m` is supported. Support for hybrid 3D parallelism and distributed optimizer for 🤗 `transformers` will be available in the upcoming weeks (it's basically done, but it doesn't support 🤗 `transformers` yet)**
-
-⚠️ **This library is underperforming when compared to Megatron-LM and DeepSpeed (and not even achieving reasonable performance yet).**
+⚠️ **Currently, only parallelizing `transformers`'s `bloom` is supported.**
 
 ```diff
 from torch.utils.data import DataLoader
@@ -94,6 +89,7 @@ We did a small scale correctness test by comparing the validation losses between
 - ~~Tensor Parallelism [[link]](https://wandb.ai/xariusdrake/pipegoose/runs/iz17f50n)~~ (We've found a bug in convergence, and we are fixing it)
 - ~~Hybrid 2D Parallelism (TP+DP) [[link]](https://wandb.ai/xariusdrake/pipegoose/runs/us31p3q1)~~
 - Distributed Optimizer ZeRO-1 Convergence: [[sgd link]](https://wandb.ai/xariusdrake/pipegoose/runs/fn4t9as4?workspace) [[adam link]](https://wandb.ai/xariusdrake/pipegoose/runs/yn4m2sky)
+- Mixture of Experts [[link]](https://wandb.ai/xariusdrake/pipegoose/jobs/QXJ0aWZhY3RDb2xsZWN0aW9uOjExOTU2MTU5MA==/version_details/v20)
 
 **Features**
 - End-to-end multi-modal including in 3D parallelism including distributed CLIP..