
AnyModelPatcher

This project is a framework for patching Transformers models. It lets users modify or extend a model's behavior without changing the upstream library.

The patching mechanism is inspired by PEP 369 (post-import hooks), which proposed a standard way to modify a module as soon as it is imported. This gives fine-grained control over model behavior without forking the library.
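To illustrate the idea (a minimal sketch, not AMP's actual internals): a patch module rebinds an attribute on an already-importable Transformers class, so importing the patch module is all it takes to change model behavior.

```python
# Sketch of a patch module: importing it rebinds a Transformers method.
# LlamaMLP is a real Transformers class; the "optimized" body here is a
# placeholder for illustration only.
from transformers.models.llama import modeling_llama

_original_forward = modeling_llama.LlamaMLP.forward


def _patched_forward(self, x):
    # A real patch would substitute an optimized implementation
    # (e.g., a fused Ascend NPU kernel); this one just delegates.
    return _original_forward(self, x)


# Rebinding the class attribute affects every LlamaMLP instance,
# whether the model is constructed before or after this import.
modeling_llama.LlamaMLP.forward = _patched_forward
```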

For more information on modifying Transformers models, refer to the guide How to Hack Any Transformers Model.

Installation

Install with Ascend

pip install "git+https://github.com/starmountain1997/AMP.git#egg=AMP[ascend]"

Local Install

# bash
pip install -e .[ascend]
# zsh (brackets must be escaped or quoted)
pip install -e .\[ascend\]

Usage

Inference Script Example

import time

from tqdm import tqdm
from transformers import pipeline


def main(use_amp=False):
    if use_amp:
        # Importing the patch module applies the LLaMA patch as a side
        # effect (PEP 369-style); no explicit call is needed.
        from amp.models.llama import patch_llama  # noqa: F401

    model_id = "meta-llama/Llama-3.2-1B"
    warmup_times = 10
    repeat_times = 10

    pipe = pipeline(
        "text-generation",
        model=model_id,
        device_map="auto"
    )

    # Warm-up runs let one-time costs (weight loading, caching) settle.
    for _ in tqdm(range(warmup_times), desc="Warming up"):
        pipe("The key to life is")

    start_time = time.time()
    for _ in tqdm(range(repeat_times), desc="Generating text"):
        pipe("The key to life is")
    end_time = time.time()

    print(
        f"Average time per generation: "
        f"{(end_time - start_time) / repeat_times:.3f} s, use_amp: {use_amp}")


if __name__ == "__main__":
    # Run once with use_amp=False and once with use_amp=True to compare.
    main(use_amp=False)

LLaMA-Factory Training

Train

To apply the patch during training, add the following line to src/llamafactory/train/tuner.py:

from amp.models.llama import patch_llama  # noqa: F401 (importing applies the patch)

How to Patch?

For models that are integrated into the Transformers library, refer to llama.py for the patching approach. For models whose code is loaded dynamically, such as cogagent-9b, follow the approach demonstrated in cogagent2_9b.py. The right method depends on how the model's code is loaded.
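For the dynamic case, the class to patch does not exist until the remote code has been imported. One simple option (a sketch under the assumption that the model is loaded via trust_remote_code, not necessarily how cogagent2_9b.py does it) is to patch after from_pretrained() returns:

```python
from transformers import AutoModelForCausalLM

# The remote module is downloaded and imported by from_pretrained(),
# so the class only becomes available afterwards. The model path below
# is an illustrative placeholder.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/cogagent-9b", trust_remote_code=True
)

_orig_forward = type(model).forward


def _patched_forward(self, *args, **kwargs):
    # Placeholder: a real patch would substitute optimized logic here.
    return _orig_forward(self, *args, **kwargs)


# Rebind on the dynamically loaded class; existing instances see it too.
type(model).forward = _patched_forward
```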
