This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

[NeuralChat] Add Multi-Socket LLM Inference Example #1073

Open · wants to merge 3 commits into main
Conversation

@letonghan (Contributor) commented Dec 25, 2023

Type of Change

Add NeuralChat example
API not changed

Description

Add a multi-socket LLM inference example for NeuralChat.
Related DeepSpeed PR: microsoft/DeepSpeed#4750 (not merged yet)

Expected Behavior & Potential Risk

Customers are able to run multi-socket LLM inference with DeepSpeed by following this example.
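
For reference, below is a minimal sketch of what multi-socket inference with DeepSpeed tensor parallelism (AutoTP) can look like; it is not the code added by this PR. The model id, the prompt, and the assumption that the launcher sets WORLD_SIZE (one rank per CPU socket) are illustrative only, and the script would be started with a multi-process launcher such as mpirun or the deepspeed launcher.

```python
import os

import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # placeholder; any Hugging Face causal LM

# The launcher (deepspeed / mpirun, one rank per socket) is expected to set
# WORLD_SIZE; default to 2 ranks for a two-socket server.
world_size = int(os.getenv("WORLD_SIZE", "2"))

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Shard the model across the ranks with DeepSpeed tensor parallelism (AutoTP).
engine = deepspeed.init_inference(model, mp_size=world_size, dtype=torch.bfloat16)
model = engine.module

inputs = tokenizer("DeepSpeed multi-socket inference test:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```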

How has this PR been tested?

Tested locally on an SPR (Sapphire Rapids) server.

Dependency Change?

No.

mengfei25 pushed a commit to mengfei25/intel-extension-for-transformers that referenced this pull request Dec 27, 2023