Intel ARC/IPEX: Implement new LLM optimisations & Consolidate CPU & XPU IPEX optimisation branches #3528

Steve-Tech · 2024-09-19T11:23:16Z

Why are these changes needed?

Intel Extension for PyTorch (IPEX) versions 2.3.110+xpu and 2.4.0+cpu implement LLM-specific optimisations; this PR makes use of them in FastChat. This seems to make inference slightly faster when using IPEX on a CPU or GPU.

While adding this, I also consolidated the separate IPEX-specific CPU and XPU optimisation if statements into a single branch.
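
A minimal sketch of what a consolidated branch could look like, not the actual diff in this PR. The helper name `apply_ipex` and the `IPEX_DEVICES` tuple are hypothetical, and the `ipex` module is passed in as an argument so the example runs without IPEX installed. It assumes that newer IPEX builds expose `ipex.llm.optimize` and that older ones only provide `ipex.optimize`.

```python
# Hypothetical names: apply_ipex and IPEX_DEVICES are illustrative only.
IPEX_DEVICES = ("cpu", "xpu")


def apply_ipex(model, device, dtype, ipex):
    """Apply IPEX optimisations through one branch shared by CPU and XPU.

    `ipex` is the imported intel_extension_for_pytorch module, or None
    when IPEX is not in use.
    """
    if ipex is None or device not in IPEX_DEVICES:
        # Not an IPEX path: return the model unchanged.
        return model

    llm = getattr(ipex, "llm", None)
    if llm is not None and hasattr(llm, "optimize"):
        # Assumed: newer IPEX releases provide the LLM-specific entry point.
        return llm.optimize(model, dtype=dtype, device=device, inplace=True)

    # Fall back to the generic optimiser on older IPEX releases.
    return ipex.optimize(model, dtype=dtype)
```

In real use the caller would do `import intel_extension_for_pytorch as ipex` and pass that module in, together with the dtype already chosen for the model (for example `torch.bfloat16`).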

Related issue number (if applicable)

N/A

Checks

I've run format.sh to lint the changes in this PR.
I've included any doc changes needed.
I've made sure the relevant tests are passing (if applicable).

Consolidate CPU & XPU IPEX optimisations, & implement LLM optimisations

a62344a
