temp #3 (Draft)

Wovchena wants to merge 955 commits into base: ref.py from temp

Conversation

@Wovchena (Owner) commented Dec 6, 2023

No description provided.

@Wovchena Wovchena force-pushed the temp branch 2 times, most recently from d168f1b to f08cf61 Compare December 8, 2023 10:41
@Wovchena Wovchena force-pushed the temp branch 2 times, most recently from f978336 to 23b3427 Compare January 3, 2024 13:25
Wovchena pushed a commit that referenced this pull request May 10, 2024
Added cmake build type before project clause
Wovchena pushed a commit that referenced this pull request Jun 19, 2024
Preemption algorithm finalization
olpipi and others added 21 commits July 30, 2024 18:20
Compression currently fails with the latest `optimum-intel` version

Changes:
- Update usage of `_check_default_4bit_configs` after
huggingface/optimum-intel#843
- Update optimum-intel version

---------

Co-authored-by: Ekaterina Aidova <[email protected]>
…lkit#716)

Bumps [optimum[openvino]](https://github.com/huggingface/optimum) from
1.20.0 to 1.21.2.
Release notes, sourced from [optimum[openvino]'s releases](https://github.com/huggingface/optimum/releases):

v1.21.2: Patch release
- Remove inplace op in mistral patcher by @IlyasMoutawwakil in huggingface/optimum#1938
- Fix ORTModelForFeatureExtraction modeling by @moria97 in huggingface/optimum#1941

Full Changelog: https://github.com/huggingface/optimum/compare/v1.21.1...v1.21.2

v1.21.1: Patch release
- Fix sentence transformers model patching by @echarlaix in huggingface/optimum#1936
- Update Intel extra by @echarlaix in huggingface/optimum#1935
- Update Habana extra by @regisss in huggingface/optimum#1937

Full Changelog: https://github.com/huggingface/optimum/compare/v1.21.0...v1.21.1

Commits:
- [`4237e1d`](https://github.com/huggingface/optimum/commit/4237e1d8cebb1b9b33fd3b1f75f71e8c97bbace8) Release: v1.21.2
- [`5c803db`](https://github.com/huggingface/optimum/commit/5c803db8cef21b22d0bdbf8a69653b74656e193e) Fix forward bug in ORTModelForFeatureExtraction (huggingface/optimum#1941)
- [`f755a58`](https://github.com/huggingface/optimum/commit/f755a58e56597f690be4a0c4bdb549ce0ffd4e03) Remove inplace op in mistral patcher (huggingface/optimum#1938)
- [`f7912d6`](https://github.com/huggingface/optimum/commit/f7912d64ec23a986355e9bcdf23a947e8a91acd8) Update Habana extra (huggingface/optimum#1937)
- [`4e01a4a`](https://github.com/huggingface/optimum/commit/4e01a4a948cf48a9152f86349e82ea6cc72a0d03) Update optimum intel extra (huggingface/optimum#1935)
- [`ae591be`](https://github.com/huggingface/optimum/commit/ae591be7632b1148430b884aaeb49e78ce561b8d) Fix sentence transformers model patching (huggingface/optimum#1936)
- [`16d4d72`](https://github.com/huggingface/optimum/commit/16d4d7298ba721438e2bed58a6a8e586eb50519c) Update dev version (huggingface/optimum#1934)
- [`86adc3e`](https://github.com/huggingface/optimum/commit/86adc3e50a2bed04c8ecf86e1eba170b451e4afd) Support transformers 4.42 (huggingface/optimum#1929)
- [`a5500c7`](https://github.com/huggingface/optimum/commit/a5500c7e5047ec43e73925a01a1e98b72e64b0d3) Fixed bug: key error "last_hidden_state" (huggingface/optimum#1674)
- [`d82d4c6`](https://github.com/huggingface/optimum/commit/d82d4c656ed80da6684cd4d3766edfda8e7a1705) Fix incorrect names for usage of blenderbot for causallm (huggingface/optimum#1887)
- Additional commits viewable in the [compare view](https://github.com/huggingface/optimum/compare/v1.20.0...v1.21.2)


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=optimum[openvino]&package-manager=pip&previous-version=1.20.0&new-version=1.21.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

Dependabot will merge this PR once CI passes on it, as requested by
@Wovchena.


---

Dependabot commands and options. You can trigger Dependabot actions by
commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)



Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alina Kladieva <[email protected]>
Co-authored-by: Anastasiia Pnevskaia <[email protected]>
Co-authored-by: Nikita Malinin <[email protected]>
Co-authored-by: Yaroslav Tarkan <[email protected]>
Co-authored-by: Anatoliy Talamanov <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Miłosz Żeglarski <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Alexander Suvorov <[email protected]>
Co-authored-by: Xiake Sun <[email protected]>
Co-authored-by: Damian Kalinowski <[email protected]>
Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
- Simplified the partial preemption algorithm for groups with multiple
sequences (a unified path is sketched below).
- Removed the split into separate single-sequence and multiple-sequence
paths.
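Purely as an illustration of a unified partial-preemption path, a minimal sketch: `SequenceGroup`, `blocks_per_sequence`, and `preempt_partially` are hypothetical names, not the scheduler's real API. The point is that single-sequence groups are just the degenerate case of the same loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a group of sequences sharing one request.
struct SequenceGroup {
    std::vector<size_t> blocks_per_sequence;  // KV-cache blocks held by each sequence
};

// Preempt `blocks_needed` KV-cache blocks from `group`, treating single- and
// multi-sequence groups uniformly: repeatedly release the last block of the
// sequence that currently holds the most blocks.
size_t preempt_partially(SequenceGroup& group, size_t blocks_needed) {
    size_t freed = 0;
    while (freed < blocks_needed) {
        auto longest = std::max_element(group.blocks_per_sequence.begin(),
                                        group.blocks_per_sequence.end());
        if (longest == group.blocks_per_sequence.end() || *longest == 0)
            break;   // nothing left to release
        --*longest;  // release that sequence's last KV-cache block
        ++freed;
    }
    return freed;  // may be less than requested if the group ran out of blocks
}
```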
…penvinotoolkit#649)

Changes:
- Further split the greedy and multinomial paths: the original logits
buffer is used in greedy sampling, and in multinomial sampling whenever
possible. A sorted vector is created only when the top_p or top_k
filter needs to be applied.
- Fixed an issue where the top_k filter was always applied during
multinomial sampling unless it was explicitly set to 0. Now the default
value (the maximum of size_t) does not trigger the top_k filter, and
the filter is also skipped when top_k is larger than the logits vector
size (see the sketch after this list).
- Skipped multinomial tests.
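As an illustration only, a minimal sketch of the skip condition described in the second bullet; `apply_top_k` and the free-function shape are assumptions, not the sampler's actual API:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

// Keep only the `top_k` largest logits. The filter is skipped when top_k is
// the size_t maximum (the unset default) or not smaller than the vector size.
std::vector<float> apply_top_k(std::vector<float> logits, size_t top_k) {
    if (top_k == std::numeric_limits<size_t>::max() || top_k >= logits.size())
        return logits;  // default or oversized k: no filtering
    // Partially sort so the k largest logits come first, then drop the rest.
    std::partial_sort(logits.begin(), logits.begin() + top_k, logits.end(),
                      std::greater<float>());
    logits.resize(top_k);
    return logits;
}
```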
Co-authored-by: Alina Kladieva <[email protected]>
Co-authored-by: Anastasiia Pnevskaia <[email protected]>
Co-authored-by: Nikita Malinin <[email protected]>
Co-authored-by: Yaroslav Tarkan <[email protected]>
Co-authored-by: Anatoliy Talamanov <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Miłosz Żeglarski <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Alexander Suvorov <[email protected]>
Co-authored-by: Xiake Sun <[email protected]>
Co-authored-by: Damian Kalinowski <[email protected]>
Co-authored-by: Andrei Kochin <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: guozhong wang <[email protected]>
…envinotoolkit#690)

When the user sets `INFERENCE_PRECISION_HINT`, change the KV-cache type
accordingly.

Ticket:
[145861](https://jira.devtools.intel.com/browse/CVS-145861)
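A minimal sketch of the idea, assuming the user's properties arrive as an `ov::AnyMap`; the helper function and the f16 fallback are illustrative, not the pipeline's actual logic:

```cpp
#include <openvino/openvino.hpp>

// Pick the KV-cache element type from the user's inference precision hint,
// falling back to f16 when no hint is provided (the fallback is an assumption).
ov::element::Type kv_cache_type_from_config(const ov::AnyMap& properties) {
    auto it = properties.find(ov::hint::inference_precision.name());
    if (it == properties.end())
        return ov::element::f16;  // assumed default, not necessarily the real one
    return it->second.as<ov::element::Type>();
}
```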

---------

Co-authored-by: Dariusz Trawinski <[email protected]>
* Use sequence length axis in `trimm_tensor`
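As a hedged illustration of trimming along a sequence-length axis (the helper name and the tensor layout are assumptions; `trimm_tensor`'s real signature may differ), a sketch using the `ov::Tensor` region-of-interest constructor:

```cpp
#include <cstddef>
#include <vector>

#include <openvino/openvino.hpp>

// Return a zero-copy view of `kv_cache` keeping only the first `new_seq_len`
// positions along `seq_len_axis` (e.g. axis 2 for a
// [batch, heads, seq_len, head_size] layout).
ov::Tensor trim_along_seq_len(const ov::Tensor& kv_cache,
                              size_t seq_len_axis,
                              size_t new_seq_len) {
    const ov::Shape shape = kv_cache.get_shape();
    ov::Coordinate begin(std::vector<size_t>(shape.size(), 0));  // origin
    ov::Coordinate end(shape);            // full extent...
    end[seq_len_axis] = new_seq_len;      // ...except the trimmed axis
    return ov::Tensor(kv_cache, begin, end);  // ROI view over the same memory
}
```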
Wovchena and others added 27 commits October 12, 2024 15:56
**TODO:**
- [ ] Python API and sample
- [ ] Update doc strings
- [x] Update main README.md (PR
openvinotoolkit#930)
- [ ] Add sample with custom device mapping
- [ ] Experiment with reshape + compile as part of Ctor
- [x] Add LoRA (PR
openvinotoolkit#911)
- [x] Use std::optional for prompt2, prompt3 and maybe negative prompts
as well
- [x] Update
https://github.com/openvinotoolkit/openvino.genai/blob/master/src/docs/SUPPORTED_MODELS.md
with text-to-image generation models
Draft VLM pipeline test
Ticket: CVS-153186

---------

Co-authored-by: wenyi5608 <[email protected]>
Co-authored-by: Wovchena <[email protected]>
Co-authored-by: Yaroslav Tarkan <[email protected]>
Co-authored-by: Alina Kladieva <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Pavel Esir <[email protected]>
Co-authored-by: Artur Paniukov <[email protected]>
Co-authored-by: Ekaterina Aidova <[email protected]>
Co-authored-by: Ilya Lavrenov <[email protected]>
Co-authored-by: Mikhail Ryzhov <[email protected]>
Co-authored-by: Andrei Kochin <[email protected]>
Chat for continuous batching and for the static pipeline should match
the stateful pipeline and HF:

https://github.com/huggingface/transformers/blob/main/src/transformers/tokenization_utils_base.py#L1884-L1893
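As a hedged illustration of what "match" requires (all pipeline variants rendering the same chat history the same way), a minimal sketch; the message struct and the hard-coded template are placeholders, since the real pipelines apply the model's own chat template:

```cpp
#include <string>
#include <vector>

// Placeholder chat message; real pipelines take roles and contents from the
// model's chat template, not this hard-coded format.
struct ChatMessage {
    std::string role;  // "user" or "assistant"
    std::string content;
};

// Render the accumulated history into a single prompt. Every pipeline variant
// (stateful, continuous batching, static) must render the same history to the
// same string for their outputs to match each other and HF transformers.
std::string render_history(const std::vector<ChatMessage>& history) {
    std::string prompt;
    for (const ChatMessage& message : history)
        prompt += "<|" + message.role + "|>\n" + message.content + "\n";
    return prompt;
}
```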

---------

Co-authored-by: Vladimir Zlobin <[email protected]>
Use the new `Constant` constructor to create it from a memory pointer.
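A minimal sketch of the pattern using the pointer-taking `ov::op::v0::Constant` constructor; whether a given overload copies or shares the buffer is not shown here, so treat the call as illustrative:

```cpp
#include <memory>
#include <vector>

#include <openvino/op/constant.hpp>

int main() {
    // A host buffer that already holds the weight values.
    std::vector<float> weights(2 * 3, 0.5f);

    // Build the Constant node directly from the memory pointer instead of
    // going through an intermediate container first.
    auto constant = std::make_shared<ov::op::v0::Constant>(
        ov::element::f32, ov::Shape{2, 3},
        static_cast<const void*>(weights.data()));
    return 0;
}
```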

---------

Co-authored-by: Ilya Lavrenov <[email protected]>
This PR adds:
- [x] Long-form audio support with sequential chunking (a sketch of the
chunking idea follows this message).

Common Todos for Whisper support:
- [ ] Long-form audio support with [parallel
chunking](https://huggingface.co/blog/asr-chunking).
- [ ] add perf metrics
- [ ] update documentation
- [ ] add cpp, python samples tests
- [ ] support timestamps streaming
- [ ] expose only meaningful parameters in `GenerationConfig` (`task`,
`language`, `return_timestamps`, etc.)
- [ ] Move all whisper pipeline files to a dedicated subfolder
- [ ] Whisper pipeline doesn't need a tokenizer, it uses the detokenizer
only. Implement detokenizer-only initialization for
`ov::genai::Tokenizer`
- [ ] Check discrete GPU. Integrated GPU works as expected.
- [ ] Investigate use of `RemoteTensor` for GPU
- [ ] Add batch
- [ ] Add sampler, inherit WhisperGenerationConfig from GenerationConfig
- [ ] Investigate language autodetection with a single decoder (without
past) call
- [ ] Update the Python bindings cmake to include the whole directory
instead of an explicit list of files
- [ ] Add samples with audio preparation examples
- [ ] Add links to audio files so users can download them in samples
- [ ] Move the supported models list from the samples README to the
common supported models section
- [ ] Avoid building GenAI in each test job, as it takes a lot of time
- [ ] Double-check FP32 support
- [ ] Fix sporadic test failures: sometimes the whisper model cannot be
downloaded from HF due to network issues
- [ ] Fix the stop criteria. The current approach stops on eos_token,
which is the no-speech token, but there could be more speech tokens
afterwards that are wrongly skipped now.

Completed:
- [x] support different languages, language autodetection
- [x] support translation
- [x] support timestamps

Current limitations:
- No resampling during preprocessing. Input raw speech should have a
16 kHz sampling rate
- No normalization during preprocessing. Input raw speech should be
normalized to approximately the [-1, 1] range

Tickets: CVS-147994, CVS-146010, CVS-152542

---------

Co-authored-by: Ilya Lavrenov <[email protected]>
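Purely as an illustration of sequential chunking, a sketch that splits 16 kHz mono audio into fixed 30-second windows and transcribes them one after another; real sequential chunking may continue decoding from timestamp offsets rather than cutting blindly, and `transcribe` stands in for the actual pipeline call:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Sequential chunking: split 16 kHz mono audio into fixed 30 s windows and
// transcribe them one after another, concatenating the partial texts.
std::string transcribe_long_form(
        const std::vector<float>& samples,
        const std::function<std::string(const std::vector<float>&)>& transcribe) {
    constexpr std::size_t kSampleRate = 16000;               // expected input rate
    constexpr std::size_t kChunkSamples = 30 * kSampleRate;  // 30 s per window
    std::string text;
    for (std::size_t offset = 0; offset < samples.size(); offset += kChunkSamples) {
        const std::size_t length = std::min(kChunkSamples, samples.size() - offset);
        std::vector<float> chunk(samples.begin() + offset,
                                 samples.begin() + offset + length);
        text += transcribe(chunk);  // each window is decoded independently
    }
    return text;
}
```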