Merge branch 'master' into zhejiang/fix_runtime_dataloader_shuffle
Showing 63 changed files with 1,067 additions and 207 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
name: hpu-gaudi2-nightly | ||
|
||
on: | ||
workflow_dispatch: | ||
schedule: | ||
- cron: "0 0 * * *" | ||
pull_request: | ||
paths: | ||
- ".github/workflows/hpu-gaudi2-nightly.yml" | ||
|
||
concurrency: | ||
group: ${{ github.workflow }}-${{ github.ref }} | ||
cancel-in-progress: true | ||
|
||
permissions: | ||
contents: read | ||
issues: write | ||
|
||
jobs: | ||
unit-tests: | ||
# The type of runner that the job will run on | ||
runs-on: [self-hosted, intel, gaudi2] | ||
container: | ||
image: vault.habana.ai/gaudi-docker/1.18.0/ubuntu22.04/habanalabs/pytorch-installer-2.4.0:latest | ||
ports: | ||
- 80 | ||
options: --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice | ||
|
||
env: | ||
PT_HPU_LAZY_MODE: 0 | ||
TORCHINDUCTOR_COMPILE_THREADS: 1 | ||
TEST_LIST: | | ||
test_adamw.py | ||
test_bf16.py | ||
test_ds_config_dict.py | ||
test_dynamic_loss_scale.py | ||
test_latest_checkpoint.py | ||
test_moe_checkpoint.py | ||
test_multi_output_model.py | ||
test_other_optimizer.py | ||
test_pipe.py | ||
test_pipeline.py | ||
test_universal_checkpoint.py | ||
test_zero_context_return.py | ||
test_zero_leaf_module.py | ||
test_zero_offloadpp.py | ||
test_zero_tiled.py | ||
# Steps represent a sequence of tasks that will be executed as part of the job | ||
steps: | ||
# Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it | ||
- uses: actions/checkout@v4 | ||
|
||
- name: Check container state | ||
run: | | ||
ldd --version | ||
hl-smi -L | ||
python -c "import torch; print('torch:', torch.__version__, torch)" | ||
python -c "import torch; print('CUDA available:', torch.cuda.is_available())" | ||
- name: Install transformers | ||
run: | | ||
git clone https://github.com/huggingface/transformers | ||
cd transformers | ||
git rev-parse --short HEAD | ||
pip install . | ||
- name: Install deepspeed | ||
run: | | ||
pip install .[dev,autotuning] | ||
ds_report | ||
- name: Python environment | ||
run: | | ||
pip list | ||
- name: Unit tests | ||
run: | | ||
unset TORCH_CUDA_ARCH_LIST # only jit compile for current arch | ||
cd tests | ||
export PT_HPU_LAZY_MODE=${PT_HPU_LAZY_MODE} | ||
export TORCHINDUCTOR_COMPILE_THREADS=${TORCHINDUCTOR_COMPILE_THREADS} | ||
TEST_LIST=$(echo "$TEST_LIST" | awk 'NF{printf "%s%s", (NR>1 ? " or " : ""), $0} END{if (NR>1) print ""}') | ||
echo "TEST_LIST ${TEST_LIST}" | ||
pytest --verbose unit/ -k "${TEST_LIST}" |
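The `Unit tests` step above turns the multi-line `TEST_LIST` into a single `pytest -k` expression by joining the file names with ` or `. A minimal Python sketch of that joining step follows; the helper name `build_k_expression` is hypothetical and not part of the workflow. Unlike the awk one-liner, which keys the separator on the record number, this version strips surrounding whitespace and tolerates blank lines anywhere in the list.

```python
def build_k_expression(test_list: str) -> str:
    """Join the non-blank lines of a test list with ' or ' for `pytest -k`.

    Hypothetical helper for illustration; the workflow itself uses awk.
    """
    names = [line.strip() for line in test_list.splitlines() if line.strip()]
    return " or ".join(names)


# A shortened list in the same shape as the workflow's TEST_LIST value.
TEST_LIST = """\
test_adamw.py
test_bf16.py

test_pipe.py
"""

if __name__ == "__main__":
    # The resulting string is what would be passed as: pytest unit/ -k "<expr>"
    print(build_k_expression(TEST_LIST))
```

Keeping the list one name per line makes additions easy to review in a diff, while the joined expression lets a single `pytest` invocation select all of the listed modules.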
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
|
||
# DeepSpeed Project Charter and Governance | ||
|
||
This charter sets forth the responsibilities and procedures for technical contribution to, and oversight of, the DeepSpeed open source project. All contributors (including committers, maintainers, and other technical positions) and other participants in the Project (collectively, "Collaborators") must comply with the terms of this Charter. | ||
|
||
## Mission and Scope of the Project | ||
|
||
The mission of the Project is to develop DeepSpeed, a deep learning optimization library that makes distributed training and inference easy, efficient, and effective. | ||
|
||
The scope of the Project includes collaborative development under the Project License (as defined herein) supporting the mission, including documentation, testing, integration, and the creation of other artifacts that aid the development, deployment, operation, or adoption of the open source project. | ||
|
||
## Technical Steering Committee | ||
|
||
1. The Technical Steering Committee (the "TSC") will be responsible for all technical oversight of the open source Project. | ||
|
||
2. The TSC voting members are initially the Project's Committers. At the inception of the project, the Committers of the Project will be as set forth within the "CONTRIBUTING" file within the Project's code repository. The TSC may choose an alternative approach for determining the voting members of the TSC, and any such alternative approach will be documented in the CONTRIBUTING file. Any meetings of the Technical Steering Committee are intended to be open to the public, and can be conducted electronically, via teleconference, or in person. | ||
|
||
3. TSC projects generally will involve Contributors and Committers. The TSC may adopt or modify roles so long as the roles are documented in the CONTRIBUTING file. Unless otherwise documented: | ||
|
||
- **Contributors** include anyone in the technical community that contributes code, documentation, or other technical artifacts to the Project. | ||
- **Committers** are Contributors who have earned the ability to modify ("commit") source code, documentation, or other technical artifacts in a project's repository. | ||
|
||
- A Contributor may become a Committer by a majority approval of the existing Committers. A Committer may be removed by a majority approval of the other existing Committers. | ||
|
||
4. Participation in the Project through becoming a Contributor and Committer is open to anyone so long as they abide by the terms of this Charter. | ||
|
||
5. The TSC may: | ||
- Establish workflow procedures for the submission, approval, and closure/archiving of projects. | ||
- Set requirements for the promotion of Contributors to Committer status, as applicable. | ||
- Amend, adjust, refine and/or eliminate the roles of Contributors and Committers, and create new roles, and publicly document any TSC roles, as it sees fit. | ||
|
||
6. The TSC may elect a TSC Chair, who will preside over meetings of the TSC and will serve until their resignation or replacement by the TSC. The TSC Chair, or any other TSC member so designated by the TSC, will serve as the primary communication contact between the Project and AI & Data, a directed fund of The Linux Foundation. | ||
|
||
7. Responsibilities: The TSC will be responsible for all aspects of oversight relating to the Project, which may include: | ||
|
||
- Coordinating the technical direction of the Project. | ||
- Approving project or system proposals (including, but not limited to, incubation, deprecation, and changes to a sub-project's scope). | ||
- Organizing sub-projects and removing sub-projects. | ||
- Creating sub-committees or working groups to focus on cross-project technical issues and requirements. | ||
- Appointing representatives to work with other open source or open standards communities. | ||
- Establishing community norms, workflows, issuing releases, and security issue reporting policies. | ||
- Approving and implementing policies and processes for contributing (to be published in the CONTRIBUTING file) and coordinating with the series manager of the Project (as provided for in the Series Agreement, the "Series Manager") to resolve matters or concerns that may arise as set forth in Section 7 of this Charter. | ||
- Holding discussions, seeking consensus, and, where necessary, voting on technical matters relating to the code base that affect multiple projects. | ||
- Coordinating any marketing, events, or communications regarding the Project. | ||
|
||
## TSC Voting | ||
|
||
1. While the Project aims to operate as a consensus-based community, if any TSC decision requires a vote to move the Project forward, the voting members of the TSC will vote on a one vote per voting member basis. | ||
|
||
2. Quorum for TSC meetings requires at least fifty percent of all voting members of the TSC to be present. The TSC may continue to meet if quorum is not met but will be prevented from making any decisions at the meeting. | ||
|
||
3. Except as provided in Section 7.c. and 8.a, decisions by vote at a meeting require a majority vote of those in attendance, provided quorum is met. Decisions made by electronic vote without a meeting require a majority vote of all voting members of the TSC. | ||
|
||
4. In the event a vote cannot be resolved by the TSC, any voting member of the TSC may refer the matter to the Series Manager for assistance in reaching a resolution. | ||
|
||
## Compliance with Policies | ||
|
||
1. This Charter is subject to the Series Agreement for the Project and the Operating Agreement of LF Projects. Contributors will comply with the policies of LF Projects as may be adopted and amended by LF Projects, including, without limitation, the policies listed at https://lfprojects.org/policies/. | ||
|
||
2. The TSC may adopt a code of conduct ("CoC") for the Project, which is subject to approval by the Series Manager. In the event that a Project-specific CoC has not been approved, the LF Projects Code of Conduct listed at https://lfprojects.org/policies will apply for all Collaborators in the Project. | ||
|
||
3. When amending or adopting any policy applicable to the Project, LF Projects will publish such policy, as to be amended or adopted, on its website at least 30 days prior to such policy taking effect; provided, however, that in the case of any amendment of the Trademark Policy or Terms of Use of LF Projects, any such amendment is effective upon publication on LF Projects' website. | ||
|
||
4. All Collaborators must allow open participation from any individual or organization meeting the requirements for contributing under this Charter and any policies adopted for all Collaborators by the TSC, regardless of competitive interests. Put another way, the Project community must not seek to exclude any participant based on any criteria, requirement, or reason other than those that are reasonable and applied on a non-discriminatory basis to all Collaborators in the Project community. | ||
|
||
5. The Project will operate in a transparent, open, collaborative, and ethical manner at all times. The output of all Project discussions, proposals, timelines, decisions, and status should be made open and easily visible to all. Any potential violations of this requirement should be reported immediately to the Series Manager. | ||
|
||
## Community Assets | ||
|
||
1. LF Projects will hold title to all trade or service marks used by the Project ("Project Trademarks"), whether based on common law or registered rights. Project Trademarks will be transferred and assigned to LF Projects to hold on behalf of the Project. Any use of any Project Trademarks by Collaborators in the Project will be in accordance with the license from LF Projects and inure to the benefit of LF Projects. | ||
|
||
2. The Project will, as permitted and in accordance with such license from LF Projects, develop and own all Project GitHub and social media accounts, and domain name registrations created by the Project community. | ||
|
||
3. Under no circumstances will LF Projects be expected or required to undertake any action on behalf of the Project that is inconsistent with the tax-exempt status or purpose, as applicable, of the Joint Development Foundation or LF Projects, LLC. | ||
|
||
## General Rules and Operations | ||
|
||
The Project will: | ||
|
||
1. Engage in the work of the Project in a professional manner consistent with maintaining a cohesive community, while also maintaining the goodwill and esteem of LF Projects, Joint Development Foundation, and other partner organizations in the open source community. | ||
2. Respect the rights of all trademark owners, including any branding and trademark usage guidelines. | ||
|
||
## Intellectual Property Policy | ||
|
||
1. Collaborators acknowledge that the copyright in all new contributions will be retained by the copyright holder as independent works of authorship and that no contributor or copyright holder will be required to assign copyrights to the Project. | ||
|
||
2. Except as described in Section 7.c., all contributions to the Project are subject to the following: | ||
|
||
- All new inbound code contributions to the Project must be made using Apache License, Version 2.0 available at http://www.apache.org/licenses/LICENSE-2.0 (the "Project License"). | ||
- All new inbound code contributions must also be accompanied by a Developer Certificate of Origin (http://developercertificate.org) sign-off in the source code system that is submitted through a TSC-approved contribution process which will bind the authorized contributor and, if not self-employed, their employer to the applicable license. | ||
- All outbound code will be made available under the Project License. | ||
- Documentation will be received and made available by the Project under the Creative Commons Attribution 4.0 International License (available at http://creativecommons.org/licenses/by/4.0/). | ||
- The Project may seek to integrate and contribute back to other open source projects ("Upstream Projects"). In such cases, the Project will conform to all license requirements of the Upstream Projects, including dependencies, leveraged by the Project. Upstream Project code contributions not stored within the Project's main code repository will comply with the contribution process and license terms for the applicable Upstream Project. | ||
|
||
3. The TSC may approve the use of an alternative license or licenses for inbound or outbound contributions on an exception basis. To request an exception, please describe the contribution, the alternative open source license(s), and the justification for using an alternative open source license for the Project. License exceptions must be approved by a two-thirds vote of the entire TSC. | ||
|
||
4. Contributed files should contain license information, such as SPDX short form identifiers, indicating the open source license or licenses pertaining to the file. | ||
|
||
## Amendments | ||
|
||
1. This charter may be amended by a two-thirds vote of the entire TSC and is subject to approval by LF Projects. |
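The charter's voting thresholds (quorum, in-meeting votes, electronic votes, and the two-thirds votes for license exceptions and amendments) can be sketched as simple arithmetic checks. This is an illustrative reading, not part of the charter: the function names are hypothetical, "majority" is assumed to mean strictly more than half, and "two-thirds" is assumed to mean at least two-thirds of all voting members.

```python
def has_quorum(present: int, voting_members: int) -> bool:
    """Quorum: at least fifty percent of all voting members are present."""
    return 2 * present >= voting_members


def meeting_vote_passes(yes: int, present: int, voting_members: int) -> bool:
    """In-meeting vote: majority of those in attendance, provided quorum is met.

    Assumption: 'majority' means strictly more than half.
    """
    return has_quorum(present, voting_members) and 2 * yes > present


def electronic_vote_passes(yes: int, voting_members: int) -> bool:
    """Electronic vote without a meeting: majority of ALL voting members."""
    return 2 * yes > voting_members


def two_thirds_vote_passes(yes: int, voting_members: int) -> bool:
    """License exceptions and charter amendments: two-thirds of the entire TSC.

    Assumption: 'two-thirds' means at least two-thirds.
    """
    return 3 * yes >= 2 * voting_members


if __name__ == "__main__":
    members = 7  # hypothetical TSC size
    print("quorum with 4 present:", has_quorum(4, members))
    print("meeting vote, 3 yes of 4 present:", meeting_vote_passes(3, 4, members))
    print("electronic vote, 3 yes:", electronic_vote_passes(3, members))
    print("two-thirds vote, 5 yes:", two_thirds_vote_passes(5, members))
```

Integer comparisons are used instead of floating-point fractions so that boundary cases, such as exactly half of the members being present, are evaluated exactly.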