
Potential bugs for the llama inference example #30

Open
yinsong1986 opened this issue Sep 18, 2024 · 1 comment
Labels
bug Something isn't working

Comments


yinsong1986 commented Sep 18, 2024

Hi team,

Sharing some observations of potential bugs below:

  • In the quoted source:

    ```python
    class NeuronLlamaModel(NeuronBaseModel, LlamaPreTrainedModel):
        """
        The neuron version of the LlamaModel
        """

        def setup_attr_for_model(self, config: NeuronLlamaConfig):
            # Needed for init_inference_optimization()
            self.on_device_sampling = config.on_device_sampling
            self.tp_degree = config.tp_degree
            self.hidden_size = config.hidden_size
            self.num_attention_heads = config.num_attention_heads
            self.num_key_value_heads = config.num_key_value_heads
            self.max_batch_size = config.max_batch_size
            self.buckets = config.buckets

        def init_model(self, config: NeuronLlamaConfig):

        def forward(self, x):
            """
            Forward pass of the ResBlock.

            Args:
                x (torch.Tensor): Input tensor.

            Returns:
                torch.Tensor: Output after the residual connection and activation.
            """
            return x + self.act(self.linear(x))


    class NeuronLlamaModel(NeuronBaseModel, LlamaPreTrainedModel):
    ```

    The class definition seems to be duplicated. The first class definition should be changed to `ResBlock`.
  • https://github.com/aws-neuron/neuronx-distributed/blob/main/examples/inference/modules/model_base.py#L453-L486 — these lines seem irrelevant; they should be deleted and replaced with `return [res] + updated_kv_cache`.
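For reference, the first bullet suggests the quoted `forward` belongs to a standalone residual block rather than `NeuronLlamaModel`. A minimal sketch of what such a `ResBlock` might look like follows; since the `__init__` is not shown in the quoted code, the `hidden_size` parameter and the SiLU activation here are assumptions for illustration only:

```python
import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """Residual block: a linear projection plus activation, added back to the input.

    Only forward() appears in the quoted snippet; this __init__ is a guess
    for illustration (hidden_size and SiLU are assumptions).
    """

    def __init__(self, hidden_size: int):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)
        self.act = nn.SiLU()  # assumed activation; not shown in the issue

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection: output = x + act(linear(x)),
        # matching the forward() quoted in the issue.
        return x + self.act(self.linear(x))
```

With this rename, the second `class NeuronLlamaModel(...)` definition in the file would stand alone as the actual model class.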

Thank you!

@yinsong1986 yinsong1986 changed the title Potentail bugs for the llama inference example Potential bugs for the llama inference example Sep 18, 2024
@jyang-aws

@yinsong1986 thanks for reporting! Our inference team will take a look.

@aws-taylor aws-taylor added the bug Something isn't working label Nov 8, 2024