A few questions & suggestions on HF_LLM.py #39

Open · 5 tasks
sadra-barikbin opened this issue Feb 27, 2024 · 1 comment

@sadra-barikbin (Contributor) commented Feb 27, 2024

Hi there!
I had a few questions & suggestions regarding HF_LLM.py. Apologies in advance if some of them are naive.

  • For the decoder-only, pre_encode_input=True case, the tokenized_contexts=batch_inputs argument passed to custom module calls
    lacks the last token of the context, which has been prepended to the output. Is this really intended? Couldn't it be confusing for the user?

        if _key == "__score":
            results = _fn(_outputs, minibatch, batch_inputs[step_idx:step_idx + current_minibatch_size])
        else:
            results = _fn(_outputs,
                          minibatch=minibatch,
                          tokenized_contexts=batch_inputs[step_idx:step_idx + current_minibatch_size],
                          **kwargs)
  • Why don't we use the bos_token here when creating the decoder input in the encoder-decoder setup?

        {"input_ids": [self.pad_token], "attention_mask": [1]},
  • To batchify generation (the current implementation loops over contexts one by one):

        def generate(self, contexts, return_logprobs=False, **kwargs):
            generations = []
            for text_input in contexts:
  • We could pass output_hidden_states as a plain boolean instead of converting it to a tensor:

        if not module_function_keys == ['__score']:
            minibatch_inputs["output_hidden_states"] = True

        # Transform it to tensors on the right device
        lamorel_logger.debug(f"Putting it on device {self.device}")
        minibatch = {}
        for key, value in minibatch_inputs.items():
            if key in ["encoder_outputs", "past_key_values"]:
                minibatch[key] = value
            else:
                minibatch[key] = torch.tensor(value, device=self.device)
  • When pretrained=False and use_cpu=False, HF_LLM.__init__ raises an error in:

        self._LLM_model = _model_constructor(**constructor_kwargs)

    since it seems we cannot pass ad-hoc model config parameters to model_cls.from_config, only to model_cls.from_pretrained. We could instead do config = config_cls.from_pretrained(path, **adhoc) followed by model_cls.from_config(config) (a sketch of this workaround follows the reproduction below). To reproduce:
        from transformers import AutoModelForCausalLM, AutoConfig
        config = AutoConfig.from_pretrained("hf-internal-testing/tiny-random-LlamaForCausalLM")
        model = AutoModelForCausalLM.from_config(config, device_map='auto')  # raises an error
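
    For reference, a minimal sketch of the suggested workaround (the num_hidden_layers override is just an illustrative ad-hoc parameter; device_map would still need to be handled separately, since it is a from_pretrained feature):

        from transformers import AutoModelForCausalLM, AutoConfig

        path = "hf-internal-testing/tiny-random-LlamaForCausalLM"
        adhoc = {"num_hidden_layers": 1}  # hypothetical ad-hoc config override

        # Put the ad-hoc parameters on the config instead of the model constructor
        config = AutoConfig.from_pretrained(path, **adhoc)
        model = AutoModelForCausalLM.from_config(config)  # no device_map here

        # Device placement is then done explicitly
        model = model.to("cpu")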
@ClementRomac (Collaborator) commented Mar 7, 2024

Hey,

For the decoder-only, pre_encode_input=True case, the tokenized_contexts=batch_inputs argument passed to custom module calls
lacks the last token of the context, which has been prepended to the output. Is this really intended? Couldn't it be confusing for the user?

Yes, when using pre_encode_input=True with decoder-only models, the input is first given to the LLM and the past_key_values are obtained. However, transformers only returns hidden_states for the inputs of a forward pass, not for positions already stored in the past_key_values. So to get the hidden states of the input's last token, this token has to be removed from the inputs used when pre-encoding and prepended to the output instead.
I agree that this may be confusing. This is among a long list of things that aren't documented. I unfortunately have very limited bandwidth to work on improving the documentation :/
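
For illustration, a minimal sketch of this pre-encoding pattern in plain transformers (using gpt2 as a stand-in model, not the actual HF_LLM.py code):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    context_ids = tokenizer("The quick brown fox", return_tensors="pt").input_ids
    output_ids = tokenizer(" jumps over the lazy dog", return_tensors="pt").input_ids

    with torch.no_grad():
        # Pre-encode the context WITHOUT its last token and cache the keys/values
        pre = model(input_ids=context_ids[:, :-1], use_cache=True)

        # Feed the last context token together with the output tokens, so the
        # hidden state at the last context position is among the returned
        # hidden_states (no hidden states are returned for the cached positions)
        out = model(
            input_ids=torch.cat([context_ids[:, -1:], output_ids], dim=1),
            past_key_values=pre.past_key_values,
            output_hidden_states=True,
        )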

Why don't we use the bos_token here when creating the decoder input in the encoder-decoder setup?
The bos_token is not defined for all tokenizers. When I first implemented the hf_llm.py, I was mostly using T5 models, which do not have any bos_token. That said, I also had to force the pad token to 0, as a pad_token is not defined for all models either...
I don't know if there is any clean universal solution. Nevertheless, token 0 is often used for padding. A fallback chain one could imagine is sketched below.
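
As an illustration only (not lamorel's current behaviour; the helper name is hypothetical), such a fallback could look like:

    # Hypothetical fallback: prefer the model's decoder_start_token_id, then the
    # tokenizer's bos/pad tokens, and finally 0 as a last resort
    def pick_decoder_start_token(model, tokenizer):
        candidates = [
            getattr(model.config, "decoder_start_token_id", None),
            tokenizer.bos_token_id,
            tokenizer.pad_token_id,
        ]
        return next((token for token in candidates if token is not None), 0)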

  • To batchify generation:

Yes, this needs to be done :)
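
A rough sketch of what batched generation could look like (assuming a decoder-only model and a tokenizer with a pad token and left padding; the self._LLM_tokenizer attribute name is illustrative):

    import torch

    def generate(self, contexts, return_logprobs=False, **kwargs):
        # Tokenize the whole batch at once; left padding keeps generation
        # starting right after each context
        self._LLM_tokenizer.padding_side = "left"
        inputs = self._LLM_tokenizer(contexts, return_tensors="pt", padding=True).to(self.device)
        with torch.no_grad():
            outputs = self._LLM_model.generate(**inputs, **kwargs)
        # Drop the prompt tokens and decode only the generated continuations
        generated = outputs[:, inputs["input_ids"].shape[1]:]
        return self._LLM_tokenizer.batch_decode(generated, skip_special_tokens=True)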

  • We could add output_hidden_states not as a tensor:

Right. We would have to check that the transformers API accepts it as a plain boolean.
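
For what it's worth, a sketch of the suggestion (transformers' forward does accept output_hidden_states as a plain Python bool), simply extending the existing exclusion list:

    # Skip tensor conversion for flags and cached values; keep them as-is
    for key, value in minibatch_inputs.items():
        if key in ["encoder_outputs", "past_key_values", "output_hidden_states"]:
            minibatch[key] = value
        else:
            minibatch[key] = torch.tensor(value, device=self.device)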

  • When pretrained=False and use_cpu=False, HF_LLM.__init__ raises an error in:

I need to test this. I am not sure when I can do it though.
