Add a choice of how to end streaming from callback: STOP or CANCEL #1476
TODO: add CANCEL for ContinuousBatching
done
Please add tests for the new functionality.
samples/python/prompt_lookup_decoding_lm/prompt_lookup_decoding_lm.py
@ilya-lavrenov could you please take a look?
@@ -23,7 +23,7 @@ def streamer(subword: str) -> bool:
  print(subword, end='', flush=True)

  # No value is returned as in this example we don't want to stop the generation in this method.
- # "return None" will be treated the same as "return False".
+ # "return None" will be treated the same as "return ov::genai::StreamerRunningStatus::RUNNING;".
Suggested change:
- # "return None" will be treated the same as "return ov::genai::StreamerRunningStatus::RUNNING;".
+ # "return None" will be treated the same as "return openvino_genai.StreamerRunningStatus.RUNNING".
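A minimal sketch of the behaviour this comment describes, for illustration only. The `StreamerRunningStatus` enum below is a local stand-in, not the real `openvino_genai` class, and `normalize_status` and `run_stream` are hypothetical helpers. Mapping `True` to `STOP` is an assumption about backward compatibility with the old bool-returning callbacks.

```python
import enum


class StreamerRunningStatus(enum.Enum):
    """Local stand-in for openvino_genai.StreamerRunningStatus (assumption)."""
    RUNNING = 0
    STOP = 1
    CANCEL = 2


def normalize_status(result):
    """Map a callback's return value to a status (hypothetical helper)."""
    # None and False both mean "keep generating".
    if result is None or result is False:
        return StreamerRunningStatus.RUNNING
    # Assumption: a legacy "True" keeps its old meaning and maps to STOP.
    if result is True:
        return StreamerRunningStatus.STOP
    return result


def run_stream(subwords, callback):
    """Feed subwords to the callback until it asks to end streaming."""
    consumed = []
    for subword in subwords:
        consumed.append(subword)
        status = normalize_status(callback(subword))
        if status is not StreamerRunningStatus.RUNNING:
            return consumed, status
    return consumed, StreamerRunningStatus.RUNNING


def cancel_on_b(subword):
    # Returns nothing (None) for most subwords, CANCEL for "b".
    if subword == "b":
        return StreamerRunningStatus.CANCEL
```

With this sketch, a callback that returns nothing consumes every subword, while `cancel_on_b` ends the stream at `"b"` with a `CANCEL` status.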
@@ -15,8 +15,8 @@ enum class GenerationStatus {
RUNNING = 0, // Default status for ongoing generation
FINISHED = 1, // Status set when generation has been finished
IGNORED = 2, // Status set when generation ran into an out-of-memory condition and could not be continued
DROPPED_BY_PIPELINE = 3, // Currently not used, TODO: implement abort functionality
DROPPED_BY_HANDLE = 4 // Status set when generation handle is dropped
Let's deprecate DROPPED_BY_HANDLE via OPENVINO_ENUM_DEPRECATED and assign DROPPED_BY_HANDLE = STOP.
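The aliasing part of this suggestion can be sketched in Python, purely as an illustration. This is not the OpenVINO GenAI enum, and OPENVINO_ENUM_DEPRECATED is a C++ macro with no counterpart here. The values `CANCEL = 3` and `STOP = 4` are assumptions that mirror the old `DROPPED_BY_PIPELINE = 3` and `DROPPED_BY_HANDLE = 4`; `is_stopped` is a hypothetical helper.

```python
import enum


class GenerationStatus(enum.IntEnum):
    """Stand-in for the C++ enum; CANCEL = 3 and STOP = 4 are assumed values."""
    RUNNING = 0
    FINISHED = 1
    IGNORED = 2
    CANCEL = 3
    STOP = 4
    # Same value as STOP, so Python's enum treats this name as an alias of STOP.
    DROPPED_BY_HANDLE = 4


def is_stopped(status: GenerationStatus) -> bool:
    """Old code passing DROPPED_BY_HANDLE still lands in the STOP branch."""
    return status is GenerationStatus.STOP
```

The point of the alias is that existing callers comparing against the old name keep working while new code uses `STOP`.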

bool is_stopped();

bool is_canceled();
@@ -4,16 +4,29 @@
#pragma once

#include "openvino/genai/tokenizer.hpp"
#include "openvino/genai/generation_handle.hpp"
Looks like this header file is no longer required here.
@@ -22,6 +35,12 @@ class OPENVINO_GENAI_EXPORTS StreamerBase {
/// @brief end is called at the end of generation. It can be used to flush cache if your own streamer has one
virtual void end() = 0;

/// @brief get_streaming_status() is called by the pipeline to get more detailed information about the streaming status. m_streaming_finish_status, which contains streaming status info, could be set in put().
/// @return ov::genai::StreamerRunningStatus to determine the streaming status of generation, whether generation is running, stopped or cancelled
virtual StreamerRunningStatus get_streaming_status() {
Suggested change:
- virtual StreamerRunningStatus get_streaming_status() {
+ virtual StreamerRunningStatus get_streaming_status() const {
@@ -171,7 +171,7 @@ std::pair<ov::genai::EncodedResults, bool> decode(std::shared_ptr<ov::genai::Whi

  sampler.clear_request_info(sequence_group->get_request_id());

- return {results, sequence_group->handle_dropped()};
+ return {results, sequence_group->handle_stopped()};
I think we need to handle cancel() as well. As Whisper does not have a chat scenario, cancel() and stop() work with no difference.
@@ -217,6 +222,106 @@ def test_callback_kwargs_batch_throws(callback):
pipe.generate(['1', '2'], max_new_tokens=10, streamer=callback)


@pytest.mark.precommit
@pytest.mark.nightly
def test_callback_terminate_by_bool_sampler():
Suggested change:
- def test_callback_terminate_by_bool_sampler():
+ def test_callback_terminate_by_bool():
Why do we need "sampler" in the test name? IMO, we don't need such an implementation detail here.
If you are OK with that, we need to drop the _sampler postfix from the other tests as well.
current_iter += 1
return current_iter == num_iters

ov_generation_config = GenerationConfig(max_new_tokens=100)
I think we need to use ignore_eos=True, as generation can in theory finish by itself on the num_iters-th iteration. The same applies to the other tests.
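A small sketch of why ignore_eos matters for such a test. `fake_generate` and `make_counting_callback` are hypothetical stand-ins written for this illustration; they do not call the real pipeline, and the `"</s>"` end-of-sequence token is an assumption.

```python
def make_counting_callback(num_iters):
    """Build a callback that asks to terminate on its num_iters-th call."""
    state = {"current_iter": 0}

    def callback(subword):
        state["current_iter"] += 1
        return state["current_iter"] == num_iters

    return callback, state


def fake_generate(tokens, callback, max_new_tokens, ignore_eos, eos="</s>"):
    """Toy generation loop; returns the produced tokens and why it ended."""
    produced = []
    for token in tokens[:max_new_tokens]:
        if token == eos and not ignore_eos:
            # Generation finished on its own; the callback never fired.
            return produced, "eos"
        produced.append(token)
        if callback(token):
            return produced, "callback"
    return produced, "length"


# The model happens to emit EOS exactly at iteration num_iters = 3.
tokens = ["a", "b", "</s>", "c", "d"]

cb, state = make_counting_callback(3)
natural = fake_generate(tokens, cb, max_new_tokens=100, ignore_eos=False)

cb, state = make_counting_callback(3)
forced = fake_generate(tokens, cb, max_new_tokens=100, ignore_eos=True)
```

Without ignore_eos the run ends on EOS before the callback can terminate it, so a test that only checks the output length could pass even if callback termination is broken. With ignore_eos=True the only way to end at that iteration is through the callback.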
@@ -114,7 +114,7 @@ int main(int argc, char* argv[]) try {
  print_generation_result(generation_result);
  }
  break;
- case ov::genai::GenerationStatus::DROPPED_BY_PIPELINE:
+ case ov::genai::GenerationStatus::CANCEL:
Suggested change:
- case ov::genai::GenerationStatus::CANCEL:
+ case ov::genai::GenerationStatus::CANCEL:
+ case ov::genai::GenerationStatus::STOP:
@@ -124,7 +124,7 @@ int main(int argc, char* argv[]) try {
  print_cb_generation_result(generation_result);
  }
  break;
- case ov::genai::GenerationStatus::DROPPED_BY_PIPELINE:
+ case ov::genai::GenerationStatus::CANCEL:
Suggested change:
- case ov::genai::GenerationStatus::CANCEL:
+ case ov::genai::GenerationStatus::CANCEL:
+ case ov::genai::GenerationStatus::STOP: