
Adaptive output and contextual dialogue capabilities of text-generation-inference #424

Open

MLikeWater opened this issue Sep 26, 2023 · 1 comment
Labels
bug Something isn't working

Comments

MLikeWater commented Sep 26, 2023

System Info

HL-SMI Version: hl-1.11.0-fw-45.1.1.1
Driver Version: 1.11.0-e6eb0fd

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I deployed the Llama-2-7b-chat-hf model through text-generation-inference, but there is no adaptive output when using the following command; instead, the output length is always exactly max_new_tokens.

curl 127.0.0.1:8080/generate_stream -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":200}}' -H 'Content-Type: application/json'
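
For reference: text-generation-inference treats max_new_tokens as an upper bound and stops earlier if the model emits its end-of-sequence token. Chat-tuned Llama-2 models are much more likely to emit EOS when the prompt follows the Llama-2 chat template, so a sketch like the following (the [INST] wrapping comes from the Llama-2 model card, not from this issue) may already give output of an adaptive length:

# Same request, but with the prompt wrapped in the Llama-2 chat template.
# The model should then emit </s> when its answer is complete, ending the
# stream before the max_new_tokens cap is reached.
curl 127.0.0.1:8080/generate_stream \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs":"<s>[INST] What is Deep Learning? [/INST]","parameters":{"max_new_tokens":200}}'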

Also, how can I implement chat functionality with context? Similar to GPT-4, the model should adaptively output content of an appropriate length and be able to carry on a dialogue with context.
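
Since the /generate endpoint is stateless, context is usually carried by resending the whole conversation in each request, concatenated with the model's chat template. A minimal sketch, again assuming the Llama-2 format (the assistant turn shown is placeholder text, not real model output):

# Each request contains the full history: earlier [INST] ... [/INST] pairs
# followed by the model's earlier answers, then the new user turn.
curl 127.0.0.1:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{"inputs":"<s>[INST] What is Deep Learning? [/INST] Deep Learning is a branch of machine learning based on deep neural networks. </s><s>[INST] Can you give a concrete application? [/INST]","parameters":{"max_new_tokens":200}}'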

Expected behavior

  1. adaptive output
  2. dialogue with context
@MLikeWater MLikeWater added the bug Something isn't working label Sep 26, 2023
regisss (Collaborator) commented Oct 13, 2023

@MLikeWater What do you mean exactly by adaptive output?
