[BUG] Error while loading falcon-40b model #3803

@agunapal

Description

**Describe the bug**
Getting the following error when I try to load the falcon-40b model. The same config works for opt-30b.

2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/ts/handler_utils/distributed/deepspeed.py", line 47, in get_ds_engine
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -     ds_engine = deepspeed.init_inference(
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/deepspeed/__init__.py", line 333, in init_inference
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -     engine = InferenceEngine(model, config=ds_inference_config)
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/deepspeed/inference/engine.py", line 192, in __init__
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -     self._apply_injection_policy(config)
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/deepspeed/inference/engine.py", line 426, in _apply_injection_policy
2023-06-24T00:46:33,240 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -     replace_transformer_layer(client_module, self.module, checkpoint, config, self.config)
2023-06-24T00:46:33,241 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -   File "/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/deepspeed/module_inject/replace_module.py", line 546, in replace_transformer_layer
2023-06-24T00:46:33,241 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG -     assert container_g.ckpt_load_enabled, \
2023-06-24T00:46:33,241 [INFO ] W-29500-opt_1.0-stdout MODEL_LOG - AttributeError: 'NoneType' object has no attribute 'ckpt_load_enabled'

**Config**

{
  "dtype": "torch.float16",
  "replace_with_kernel_inject": true,
  "tensor_parallel": {
    "tp_size": 4
  }
}
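
(Since JSON can't hold a torch dtype, `"dtype"` is a string above; by the time `deepspeed.init_inference` sees the config it is effectively the Python dict below. How the handler resolves the string to `torch.float16` is my assumption, shown only to make the config concrete.)

```python
import torch

# Hypothetical Python equivalent of the JSON config above, with the
# "torch.float16" string resolved to the actual torch dtype.
ds_config = {
    "dtype": torch.float16,
    "replace_with_kernel_inject": True,
    "tensor_parallel": {"tp_size": 4},
}
```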

The error happens when I call this function:

ds_engine = deepspeed.init_inference(
    model,
    config=ds_config,
    base_dir=model_path,
    checkpoint=checkpoint,
)
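
Looking at the trace, `container_g` is still `None` when the assertion runs, which I suspect means `replace_transformer_layer` never created a kernel-injection container for this model: DeepSpeed 0.9.4 does not seem to ship an injection policy for Falcon's custom `RWForCausalLM` layers, so checkpoint loading through kernel injection cannot proceed. Below is a minimal, untested sketch of a possible workaround that skips kernel injection and lets Hugging Face load the weights, relying on DeepSpeed's automatic tensor parallelism; the model id and kwargs here are assumptions, not a confirmed fix.

```python
import deepspeed
import torch
from transformers import AutoModelForCausalLM

# Possible workaround (untested): load weights via transformers and use
# automatic tensor parallelism instead of the failing injection path.
model = AutoModelForCausalLM.from_pretrained(
    "tiiuae/falcon-40b",
    torch_dtype=torch.float16,
    trust_remote_code=True,  # falcon-40b ships custom modeling code
)

ds_engine = deepspeed.init_inference(
    model,
    dtype=torch.float16,
    tensor_parallel={"tp_size": 4},
    replace_with_kernel_inject=False,  # avoid the unsupported injection path
)
```
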
**To Reproduce**
Steps to reproduce the behavior:
1. Run a simple inference script that loads falcon-40b through `deepspeed.init_inference` with the config above (a minimal sketch follows this list)
2. Packages: deepspeed 0.9.4, torch 2.0.1+cu117, Python 3.10 (full details in the `ds_report` output below)
3. Launch the script with 4-way tensor parallelism (`tp_size: 4`)
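
A minimal standalone sketch of the failing path; the paths and the checkpoint descriptor filename are placeholders, and it assumes 4 GPUs with the model instantiated on the meta device so DeepSpeed loads the weights:

```python
# Launch with: deepspeed --num_gpus 4 repro.py
import deepspeed
import torch
from transformers import AutoConfig, AutoModelForCausalLM

model_path = "/path/to/falcon-40b"  # local snapshot of tiiuae/falcon-40b

# Instantiate the model without weights; DeepSpeed is expected to load
# them from the checkpoint descriptor instead.
config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
with torch.device("meta"):
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

ds_config = {
    "dtype": torch.float16,
    "replace_with_kernel_inject": True,
    "tensor_parallel": {"tp_size": 4},
}

# Raises AttributeError: 'NoneType' object has no attribute
# 'ckpt_load_enabled' for falcon-40b; the same call succeeds for opt-30b.
ds_engine = deepspeed.init_inference(
    model,
    config=ds_config,
    base_dir=model_path,
    checkpoint="checkpoints.json",  # hypothetical checkpoint descriptor
)
```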

**Expected behavior**
`deepspeed.init_inference` returns an engine and falcon-40b loads, just as opt-30b does with the same config.

**ds_report output**

[2023-06-24 00:55:04,352] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)

DeepSpeed C++/CUDA extension op report

NOTE: Ops not installed will be just-in-time (JIT) compiled at
runtime if needed. Op compatibility means that your system
meet the required dependencies to JIT install the op.

JIT compiled ops requires ninja
ninja .................. [OKAY]

op name ................ installed .. compatible

[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
cpu_adagrad ............ [NO] ....... [OKAY]
cpu_adam ............... [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
random_ltd ............. [NO] ....... [OKAY]
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.0
[WARNING] using untested triton version (2.0.0), only 1.0.0 is known to be compatible
sparse_attn ............ [NO] ....... [NO]
spatial_inference ...... [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
utils .................. [NO] ....... [OKAY]

DeepSpeed general environment info:
torch install path ............... ['/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/torch']
torch version .................... 2.0.1+cu117
deepspeed install path ........... ['/home/ubuntu/anaconda3/envs/deepspeed/lib/python3.10/site-packages/deepspeed']
deepspeed info ................... 0.9.4, unknown, unknown
torch cuda version ............... 11.7
torch hip version ................ None
nvcc version ..................... 11.7
deepspeed wheel compiled w. ...... torch 2.0, cuda 11.7



**System info:**
 - OS: Ubuntu 20.04
 - GPU count and types: AWS g5.12xlarge (4x NVIDIA A10G)
 - DeepSpeed version: 0.9.4 (see `ds_report` output above)
 - Python version: 3.10

**Docker context**
Not using Docker; everything runs in a conda environment (`anaconda3/envs/deepspeed`), as the paths in the traceback show.

