AttributeError: module 'transformer_engine' has no attribute 'pytorch' #1014

Open
Lzhang-hub opened this issue Jul 15, 2024 · 3 comments
@Lzhang-hub

I reinstalled flash-attn (pip install flash-attn==2.6.1) in the NGC PyTorch Docker image 24.06.
When I run a training job, I get the following error:

Traceback (most recent call last):
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/pretrain_gpt.py", line 8, in <module>
    from megatron.training import get_args
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/training/__init__.py", line 16, in <module>
    from .initialize  import initialize_megatron
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/training/initialize.py", line 18, in <module>
    from megatron.training.arguments import parse_args, validate_args
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/training/arguments.py", line 13, in <module>
    from megatron.core.models.retro.utils import (
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/models/retro/__init__.py", line 12, in <module>
    from .decoder_spec import get_retro_decoder_block_spec
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/models/retro/decoder_spec.py", line 9, in <module>
    from megatron.core.models.gpt.gpt_layer_specs import (
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/models/gpt/__init__.py", line 1, in <module>
    from .gpt_model import GPTModel
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/models/gpt/gpt_model.py", line 17, in <module>
    from megatron.core.transformer.transformer_block import TransformerBlock
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/transformer/transformer_block.py", line 16, in <module>
    from megatron.core.transformer.custom_layers.transformer_engine import (
  File "/data1/nfs15/nfs/bigdata/zhanglei/ai-platform/hpc-test/multi-node-train/megatron-lm-train/Megatron-LM/20240411/Megatron-LM/megatron/core/transformer/custom_layers/transformer_engine.py", line 80, in <module>
    class TELinear(te.pytorch.Linear):
AttributeError: module 'transformer_engine' has no attribute 'pytorch'
@timmoon10
Collaborator

This looks like an import error, probably from Flash Attention. Our import logic has an unfortunate side effect of suppressing error messages (see #862 (review)), so can you try replacing import transformer_engine with import transformer_engine.pytorch?
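
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Transformer Engine's actual import logic: demo_pkg, backend, and missing_dependency_for_demo are hypothetical names for a throwaway package whose __init__.py swallows a failing submodule import. Accessing the submodule as an attribute then gives a misleading AttributeError, while importing the submodule directly surfaces the underlying error.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def build_demo_package(root):
    """Create a hypothetical package whose __init__ swallows a failing import."""
    pkg = root / "demo_pkg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text(textwrap.dedent("""
        try:
            from . import backend  # fails, but the error is suppressed
        except ImportError:
            pass
    """))
    # The submodule depends on something that is not installed.
    (pkg / "backend.py").write_text("import missing_dependency_for_demo\n")


def diagnose(root):
    """Return (misleading attribute error, underlying import error) as strings."""
    build_demo_package(root)
    sys.path.insert(0, str(root))
    try:
        pkg = importlib.import_module("demo_pkg")  # succeeds silently
        try:
            pkg.backend
            attr_error = ""
        except AttributeError as exc:
            attr_error = str(exc)
        try:
            # Importing the submodule directly re-raises the real failure.
            importlib.import_module("demo_pkg.backend")
            real_error = ""
        except ImportError as exc:
            real_error = str(exc)
        return attr_error, real_error
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    attr_error, real_error = diagnose(Path(tmp))

print("attribute access:", attr_error)
print("direct import:   ", real_error)
```

Under the same assumption, running python -c "import transformer_engine.pytorch" in the affected environment should show the underlying traceback instead of the AttributeError.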

@arelkeselbri

I'm having the same error. Replacing it with import transformer_engine.pytorch changes the error to the traceback below. Can you give me any hints on how to solve this?

Traceback (most recent call last):
  File "/NeMo-Aligner/examples/nlp/gpt/train_gpt_sft.py", line 19, in <module>
    from nemo.collections.nlp.data.language_modeling.megatron.gpt_sft_chat_dataset import get_prompt_template_example
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/__init__.py", line 15, in <module>
    from nemo.collections.nlp import data, losses, models, modules
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/models/__init__.py", line 28, in <module>
    from nemo.collections.nlp.models.language_modeling import MegatronGPTPromptLearningModel
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/models/language_modeling/__init__.py", line 16, in <module>
    from nemo.collections.nlp.models.language_modeling.megatron_gpt_prompt_learning_model import (
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/models/language_modeling/megatron_gpt_prompt_learning_model.py", line 31, in <module>
    from nemo.collections.nlp.models.language_modeling.megatron_gpt_model import MegatronGPTModel
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/models/language_modeling/megatron_gpt_model.py", line 41, in <module>
    from nemo.collections.nlp.models.language_modeling.megatron.falcon.falcon_spec import get_falcon_layer_spec
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/nemo/collections/nlp/models/language_modeling/megatron/falcon/falcon_spec.py", line 19, in <module>
    from megatron.core.transformer.attention import SelfAttention, SelfAttentionSubmodules
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/megatron/core/transformer/attention.py", line 12, in <module>
    from megatron.core.transformer.custom_layers.transformer_engine import SplitAlongDim
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/megatron/core/transformer/custom_layers/transformer_engine.py", line 7, in <module>
    import transformer_engine.pytorch as te
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/transformer_engine/pytorch/__init__.py", line 34, in <module>
    _load_library()
  File "/NeMo-Aligner/venv/lib/python3.10/site-packages/transformer_engine/pytorch/__init__.py", line 25, in _load_library
    so_path = next(so_dir.glob(f"transformer_engine_torch.*.{extension}"))
StopIteration
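
The StopIteration in this traceback is raised by next() on an empty glob: no file matching transformer_engine_torch.*.{extension} exists in the directory that so_dir points to. Below is a minimal sketch for checking a directory by hand. find_compiled_library is a hypothetical helper, not part of Transformer Engine, the default extension "so" is an assumption for Linux, and the file name created in the demo is made up.

```python
import tempfile
from pathlib import Path


def find_compiled_library(so_dir, extension="so"):
    """Return the first file matching the pattern from the traceback, or None."""
    pattern = f"transformer_engine_torch.*.{extension}"
    # Passing a default to next() avoids the bare StopIteration seen above.
    return next(Path(so_dir).glob(pattern), None)


with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: nothing matches, which mirrors the failing case.
    missing = find_compiled_library(tmp)

    # Simulate an installed extension with a made-up file name.
    fake = Path(tmp) / "transformer_engine_torch.cpython-310-x86_64-linux-gnu.so"
    fake.touch()
    found_name = find_compiled_library(tmp).name

print("empty directory:", missing)
print("with library:   ", found_name)
```

If such a check returns None for the directory Transformer Engine searches, the compiled PyTorch extension was likely never built or installed in that environment, which would point at the installation itself rather than at Flash Attention.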

@pizts

pizts commented Sep 3, 2024

same error
