
[Proposal] Ensure TransformerLens does not load from hugging face when config is passed in #754

Open

hamind opened this issue Oct 11, 2024 · 2 comments
Labels: complexity-moderate (Moderately complicated issues for people who have intermediate experience with the code)

Comments


hamind commented Oct 11, 2024

Proposal

Modify the loading code so that models can be loaded locally.

Motivation

Today I wanted to load a GPT-2 model that I had already downloaded from the Hugging Face website, the same way I load Llama locally, but TransformerLens kept trying to connect to Hugging Face to download it.
I then checked the code and found that:

  1. There is no way to load a model from a local path.
  2. If the Hugging Face model object already exists, there is no need to download the model config from Hugging Face; it can be taken directly from the model object.

Pitch

For models already downloaded from Hugging Face (or not yet cached), provide a way to load the model locally.

Alternatives

  1. For "HookedTransformer.from_pretrained", consider adding a parameter for passing a local model path (a rough sketch follows this list).
  2. If the Hugging Face model is already loaded, get the config directly from the Hugging Face model object.
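
A rough sketch of what alternative 1 could look like. The wrapper and its local_path parameter are hypothetical, not existing TransformerLens API; and with only this wrapper, the config would still be fetched from Hugging Face, which is what the code changes below address:

# Hypothetical sketch of alternative 1; from_pretrained_local and local_path
# are invented names, not existing TransformerLens API.
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

def from_pretrained_local(model_name: str, local_path: str, **kwargs):
    """Load weights and tokenizer from local_path, then hand the already-loaded
    model to TransformerLens so the weights are not downloaded again."""
    hf_model = AutoModelForCausalLM.from_pretrained(local_path)
    tokenizer = AutoTokenizer.from_pretrained(local_path)
    return HookedTransformer.from_pretrained(
        model_name, hf_model=hf_model, tokenizer=tokenizer, **kwargs
    )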

Checklist

  • I have checked that there is no similar issue in the repo (required)
hamind closed this as completed Oct 11, 2024
hamind reopened this Oct 11, 2024
bryce13950 (Collaborator) commented

Could you share the code you are using to load TransformerLens? You should be able to pass in your local version of the model with the hf_model param.
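
For example, something like this (the local path is a placeholder for wherever you saved the model):

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

# "/path/to/local/gpt2" is a placeholder for your local download
hf_model = AutoModelForCausalLM.from_pretrained("/path/to/local/gpt2")
tokenizer = AutoTokenizer.from_pretrained("/path/to/local/gpt2")
model = HookedTransformer.from_pretrained("gpt2", hf_model=hf_model, tokenizer=tokenizer)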

bryce13950 added the needs-information label (More information is needed from the issue creator before moving forward) Oct 15, 2024
hamind (Author) commented Oct 16, 2024

I've only modified a small amount of code, so I've pasted the relevant pieces directly here. For each change I've noted the Python file and line number; the original code appears as a comment, with the new code shown below it for easy comparison.

In transformer_lens/HookedTransformer.py, line 1257:

cfg = loading.get_pretrained_model_config(
    official_model_name,
    # hf_cfg=hf_cfg
    hf_cfg=hf_model.config if hf_model is not None else None,  # hf_model may be None
    checkpoint_index=checkpoint_index,
    checkpoint_value=checkpoint_value,
    fold_ln=fold_ln,
    device=device,
    n_devices=n_devices,
    default_prepend_bos=default_prepend_bos,
    dtype=dtype,
    first_n_layers=first_n_layers,
    **from_pretrained_kwargs,
)

In transformer_lens/loading_from_pretrained.py, line 1583:

# if hf_cfg is not None:
#     cfg_dict["load_in_4bit"] = hf_cfg.get("quantization_config", {}).get("load_in_4bit", False)

if hf_cfg is not None:
    cfg_dict["load_in_4bit"] = hf_cfg.to_dict().get("quantization_config", {}).get("load_in_4bit", False)

In transformer_lens/loading_from_pretrained.py, line 708:

# def convert_hf_model_config(model_name: str, **kwargs):
def convert_hf_model_config(model_name: str, hf_config=None, **kwargs):
    """
    Returns the model config for a HuggingFace model, converted to a dictionary
    in the HookedTransformerConfig format.

    Takes the official_model_name as an input.
    """
    if (Path(model_name) / "config.json").exists():
        logging.info("Loading model config from local directory")
        official_model_name = model_name
    else:
        official_model_name = get_official_model_name(model_name)

    # Load HuggingFace model config
    if "llama" in official_model_name.lower():
        architecture = "LlamaForCausalLM"
    elif "gemma-2" in official_model_name.lower():
        architecture = "Gemma2ForCausalLM"
    elif "gemma" in official_model_name.lower():
        architecture = "GemmaForCausalLM"
    else:
        # huggingface_token = os.environ.get("HF_TOKEN", None)
        # hf_config = AutoConfig.from_pretrained(
        #     official_model_name,
        #     token=huggingface_token,
        #     **kwargs,
        # )
        if hf_config is None:
            huggingface_token = os.environ.get("HF_TOKEN", None)
            hf_config = AutoConfig.from_pretrained(
                official_model_name,
                token=huggingface_token,
                **kwargs,
            )
        architecture = hf_config.architectures[0]
    ...
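
If this patch is applied, the helper can be called with a config that was already loaded from disk, so no network request is needed (the path below is a placeholder):

from transformers import AutoConfig
import transformer_lens.loading_from_pretrained as loading

local_cfg = AutoConfig.from_pretrained("/path/to/local/gpt2")  # placeholder path
cfg_dict = loading.convert_hf_model_config("gpt2", hf_config=local_cfg)
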
In transformer_lens/loading_from_pretrained.py, lines 1525 and 1543:

    if Path(model_name).exists():
        # If the model_name is a path, it's a local model
        # cfg_dict = convert_hf_model_config(official_model_name, **kwargs)
        cfg_dict = convert_hf_model_config(model_name, hf_config=hf_cfg, **kwargs)
        official_model_name = model_name
    else:
        official_model_name = get_official_model_name(model_name)
    if (
        official_model_name.startswith("NeelNanda")
        or official_model_name.startswith("ArthurConmy")
        or official_model_name.startswith("Baidicoot")
    ):
        cfg_dict = convert_neel_model_config(official_model_name, **kwargs)
    else:
        if official_model_name.startswith(NEED_REMOTE_CODE_MODELS) and not kwargs.get(
            "trust_remote_code", False
        ):
            logging.warning(
                f"Loading model {official_model_name} requires setting trust_remote_code=True"
            )
            kwargs["trust_remote_code"] = True
        # cfg_dict = convert_hf_model_config(official_model_name, **kwargs)
        cfg_dict = convert_hf_model_config(official_model_name, hf_config=hf_cfg, **kwargs)
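
Putting it all together, the workflow I'm aiming for is roughly the following (paths are placeholders; setting HF_HUB_OFFLINE before the imports makes any remaining download attempt fail loudly, which is a convenient way to verify the patches):

import os
os.environ["HF_HUB_OFFLINE"] = "1"  # any attempted Hub download now raises

from transformers import AutoModelForCausalLM, AutoTokenizer
from transformer_lens import HookedTransformer

local_path = "/path/to/local/gpt2"  # placeholder
hf_model = AutoModelForCausalLM.from_pretrained(local_path)
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = HookedTransformer.from_pretrained("gpt2", hf_model=hf_model, tokenizer=tokenizer)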

bryce13950 added the complexity-moderate label and removed the needs-information label Nov 3, 2024
bryce13950 changed the title from "[Proposal] Add function" to "[Proposal] Ensure TransformerLens does not load from hugging face when config is passed in" Nov 3, 2024