Merge branch 'master' into ov_file_path_definition
barnasm1 authored Jan 6, 2025
2 parents 844c811 + 1d955cd commit 95e2f55
Showing 69 changed files with 1,262 additions and 499 deletions.
54 changes: 40 additions & 14 deletions README.md
@@ -1,6 +1,14 @@
<div align="center">
<img src="docs/dev/assets/openvino-logo-purple-black.svg" width="400px">

<h3 align="center">
Open-source software toolkit for optimizing and deploying deep learning models.
</h3>

<p align="center">
<a href="https://docs.openvino.ai/2024/index.html"><b>Documentation</b></a> • <a href="https://blog.openvino.ai"><b>Blog</b></a> • <a href="https://docs.openvino.ai/2024/about-openvino/key-features.html"><b>Key Features</b></a> • <a href="https://docs.openvino.ai/2024/learn-openvino.html"><b>Tutorials</b></a> • <a href="https://docs.openvino.ai/2024/documentation/openvino-ecosystem.html"><b>Integrations</b></a> • <a href="https://docs.openvino.ai/2024/about-openvino/performance-benchmarks.html"><b>Benchmarks</b></a> • <a href="https://github.com/openvinotoolkit/openvino.genai"><b>Generative AI</b></a>
</p>

[![PyPI Status](https://badge.fury.io/py/openvino.svg)](https://badge.fury.io/py/openvino)
[![Anaconda Status](https://anaconda.org/conda-forge/openvino/badges/version.svg)](https://anaconda.org/conda-forge/openvino)
[![brew Status](https://img.shields.io/homebrew/v/openvino)](https://formulae.brew.sh/formula/openvino)
@@ -10,14 +18,14 @@
[![brew Downloads](https://img.shields.io/homebrew/installs/dy/openvino)](https://formulae.brew.sh/formula/openvino)
</div>

Welcome to OpenVINO™, an open-source software toolkit for optimizing and deploying deep learning models.

- **Inference Optimization**: Boost deep learning performance in computer vision, automatic speech recognition, generative AI, natural language processing with large and small language models, and many other common tasks.
-- **Flexible Model Support**: Use models trained with popular frameworks such as TensorFlow, PyTorch, ONNX, Keras, and PaddlePaddle. Convert and deploy models without original frameworks.
- **Flexible Model Support**: Use models trained with popular frameworks such as PyTorch, TensorFlow, ONNX, Keras, PaddlePaddle, and JAX/Flax. Directly integrate models built with transformers and diffusers from the Hugging Face Hub using Optimum Intel. Convert and deploy models without original frameworks.
- **Broad Platform Compatibility**: Reduce resource demands and efficiently deploy on a range of platforms from edge to cloud. OpenVINO™ supports inference on CPU (x86, ARM), GPU (OpenCL capable, integrated and discrete) and AI accelerators (Intel NPU).
- **Community and Ecosystem**: Join an active community contributing to the enhancement of deep learning performance across various domains.

-Check out the [OpenVINO Cheat Sheet](https://docs.openvino.ai/2024/_static/download/OpenVINO_Quick_Start_Guide.pdf) for a quick reference.
Check out the [OpenVINO Cheat Sheet](https://docs.openvino.ai/2024/_static/download/OpenVINO_Quick_Start_Guide.pdf) and [Key Features](https://docs.openvino.ai/2024/about-openvino/key-features.html) for a quick reference.


## Installation

@@ -40,6 +48,8 @@ Learn how to optimize and deploy popular models with the [OpenVINO Notebooks](ht
- [Multimodal assistant with LLaVa and OpenVINO](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llava-multimodal-chatbot/llava-multimodal-chatbot-genai.ipynb)
- [Automatic speech recognition using Whisper and OpenVINO](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/whisper-asr-genai/whisper-asr-genai.ipynb)

Discover more examples in the [OpenVINO Samples (Python & C++)](https://docs.openvino.ai/2024/learn-openvino/openvino-samples.html) and [Notebooks (Python)](https://docs.openvino.ai/2024/learn-openvino/interactive-tutorials-python.html).

Here are easy-to-follow code examples demonstrating how to run PyTorch and TensorFlow model inference using OpenVINO:

**PyTorch Model**
@@ -86,25 +96,43 @@ data = np.random.rand(1, 224, 224, 3)
output = compiled_model({0: data})
```
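(The README's example snippets are partially collapsed in this diff view; only the tail of the TensorFlow example is visible above. For reference, here is a minimal sketch of the PyTorch path; it assumes torchvision is installed, and the model choice and input shape are illustrative.)

```python
import numpy as np
import openvino as ov
import torch
from torchvision.models import resnet50

# Convert an in-memory PyTorch module to an OpenVINO model
model = resnet50(weights="DEFAULT").eval()
ov_model = ov.convert_model(model, example_input=torch.rand(1, 3, 224, 224))

# Compile for an available device and run inference
compiled_model = ov.compile_model(ov_model, "AUTO")
data = np.random.rand(1, 3, 224, 224).astype(np.float32)
output = compiled_model({0: data})
```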

-OpenVINO also supports CPU, GPU, and NPU devices and works with models in TensorFlow, PyTorch, ONNX, TensorFlow Lite, PaddlePaddle model formats.
-With OpenVINO you can do automatic performance enhancements at runtime customized to your hardware (preserving model accuracy), including:
-asynchronous execution, batch processing, tensor fusion, load balancing, dynamic inference parallelism, automatic BF16 conversion, and more.
OpenVINO supports the CPU, GPU, and NPU [devices](https://docs.openvino.ai/2024/openvino-workflow/running-inference/inference-devices-and-modes.html) and works with models from PyTorch, TensorFlow, ONNX, TensorFlow Lite, PaddlePaddle, and JAX/Flax [frameworks](https://docs.openvino.ai/2024/openvino-workflow/model-preparation.html). It includes [APIs](https://docs.openvino.ai/2024/api/api_reference.html) in C++, Python, C, and NodeJS, and offers the GenAI API for optimized model pipelines and performance.

## Generative AI with OpenVINO

Get started with the OpenVINO GenAI [installation](https://docs.openvino.ai/2024/get-started/install-openvino/install-openvino-genai.html) and refer to the [detailed guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide/genai-guide.html) to explore the capabilities of Generative AI using OpenVINO.

Learn how to run LLMs and GenAI with [Samples](https://github.com/openvinotoolkit/openvino.genai/tree/master/samples) in the [OpenVINO™ GenAI repo](https://github.com/openvinotoolkit/openvino.genai). See GenAI in action with Jupyter notebooks: [LLM-powered Chatbot](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/README.md) and [LLM Instruction-following pipeline](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-question-answering/README.md).
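As a quick taste, here is a minimal hedged sketch of the GenAI API; it assumes `openvino-genai` is installed and an LLM has already been exported to OpenVINO IR in `./model_dir` (the path is illustrative):

```python
import openvino_genai

# LLMPipeline bundles the tokenizer, generation loop, and detokenizer
pipe = openvino_genai.LLMPipeline("./model_dir", "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```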

## Documentation

[User documentation](https://docs.openvino.ai/) contains detailed information about OpenVINO and guides you from installation through optimizing and deploying models for your AI applications.

[Developer documentation](./docs/dev/index.md) focuses on the OpenVINO architecture and describes [building](./docs/dev/build.md) and [contributing](./CONTRIBUTING.md) processes.

## OpenVINO Ecosystem

-- [🤗Optimum Intel](https://github.com/huggingface/optimum-intel) - a simple interface to optimize Transformers and Diffusers models.
### OpenVINO Tools

- [Neural Network Compression Framework (NNCF)](https://github.com/openvinotoolkit/nncf) - advanced model optimization techniques including quantization, filter pruning, binarization, and sparsity (a quantization sketch follows this list).
- [GenAI Repository](https://github.com/openvinotoolkit/openvino.genai) and [OpenVINO Tokenizers](https://github.com/openvinotoolkit/openvino_tokenizers) - resources and tools for developing and optimizing Generative AI applications.
- [OpenVINO™ Model Server (OVMS)](https://github.com/openvinotoolkit/model_server) - a scalable, high-performance solution for serving models optimized for Intel architectures.
- [Intel® Geti™](https://geti.intel.com/) - an interactive video and image annotation tool for computer vision use cases.

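To ground the NNCF entry above, here is a minimal sketch of post-training quantization; the IR path and calibration data are illustrative, and real use needs a representative dataset:

```python
import nncf
import openvino as ov
import torch

# Illustrative calibration source; use representative samples in practice
calibration_data = [torch.rand(1, 3, 224, 224) for _ in range(10)]
calibration_dataset = nncf.Dataset(calibration_data)

model = ov.Core().read_model("model.xml")  # hypothetical IR path
quantized_model = nncf.quantize(model, calibration_dataset)
```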
-Check out the [Awesome OpenVINO](https://github.com/openvinotoolkit/awesome-openvino) repository to discover a collection of community-made AI projects based on OpenVINO!
### Integrations

-## Documentation
- [🤗Optimum Intel](https://github.com/huggingface/optimum-intel) - grab and use models leveraging OpenVINO within the Hugging Face API (a loading sketch follows this list).
- [Torch.compile](https://docs.openvino.ai/2024/openvino-workflow/torch-compile.html) - use OpenVINO for Python-native applications by JIT-compiling code into optimized kernels.
- [OpenVINO LLMs inference and serving with vLLM](https://docs.vllm.ai/en/stable/getting_started/openvino-installation.html) - enhance vLLM's fast and easy model serving with the OpenVINO backend.
- [OpenVINO Execution Provider for ONNX Runtime](https://onnxruntime.ai/docs/execution-providers/OpenVINO-ExecutionProvider.html) - use OpenVINO as a backend with your existing ONNX Runtime code.
- [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/openvino/) - build context-augmented GenAI applications with the LlamaIndex framework and enhance runtime performance with OpenVINO.
- [LangChain](https://python.langchain.com/docs/integrations/llms/openvino/) - integrate OpenVINO with the LangChain framework to enhance runtime performance for GenAI applications.
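To make the Optimum Intel integration concrete, here is a minimal hedged sketch of loading a Hugging Face model as an OpenVINO model; the model id is illustrative, and `optimum[openvino]` is assumed to be installed:

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # illustrative model id
# export=True converts the checkpoint to OpenVINO IR on the fly
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```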

-[User documentation](https://docs.openvino.ai/) contains detailed information about OpenVINO and guides you from installation through optimizing and deploying models for your AI applications.
Check out the [Awesome OpenVINO](https://github.com/openvinotoolkit/awesome-openvino) repository to discover a collection of community-made AI projects based on OpenVINO!

-[Developer documentation](./docs/dev/index.md) focuses on how OpenVINO [components](./docs/dev/index.md#openvino-components) work and describes [building](./docs/dev/build.md) and [contributing](./CONTRIBUTING.md) processes.
## Performance

Explore [OpenVINO Performance Benchmarks](https://docs.openvino.ai/2024/about-openvino/performance-benchmarks.html) to discover the optimal hardware configurations and plan your AI deployment based on verified data.

## Contribution and Support

@@ -118,9 +146,8 @@ You can ask questions and get support on:
* The [`openvino`](https://stackoverflow.com/questions/tagged/openvino) tag on Stack Overflow\*.


-## Additional Resources
## Resources

-* [Product Page](https://software.intel.com/content/www/us/en/develop/tools/openvino-toolkit.html)
* [Release Notes](https://docs.openvino.ai/2024/about-openvino/release-notes-openvino.html)
* [OpenVINO Blog](https://blog.openvino.ai/)
* [OpenVINO™ toolkit on Medium](https://medium.com/@openvino)
@@ -145,4 +172,3 @@ By contributing to the project, you agree to the license and copyright terms the

---
\* Other names and brands may be claimed as the property of others.

@@ -11,7 +11,7 @@ set(_ACCEPTED_ARCHS_AVX "^(ANY|SSE42|AVX)$")
set(_ACCEPTED_ARCHS_AVX2 "^(ANY|SSE42|AVX|AVX2)$")
set(_ACCEPTED_ARCHS_AVX512F "^(ANY|SSE42|AVX|AVX2|AVX512F)$")
set(_ACCEPTED_ARCHS_NEON_FP16 "^(ANY|NEON_FP16)$")
-set(_ACCEPTED_ARCHS_SVE "^(ANY|SVE)$")
set(_ACCEPTED_ARCHS_SVE "^(ANY|NEON_FP16|SVE)$")

## Arch specific definitions
set(_DEFINE_ANY "")
@@ -186,10 +186,10 @@ endfunction()
# Return currently requested ARCH id
#
function(_currently_requested_top_arch VAR)
-if(ENABLE_NEON_FP16)
-set(RES NEON_FP16)
-elseif(ENABLE_SVE)
if(ENABLE_SVE)
set(RES SVE)
elseif(ENABLE_NEON_FP16)
set(RES NEON_FP16)
elseif(ENABLE_AVX512F)
set(RES AVX512F)
elseif(ENABLE_AVX2)
13 changes: 11 additions & 2 deletions cmake/developer_package/features.cmake
@@ -4,6 +4,7 @@

include(options)
include(target_flags)
include(compile_flags/os_flags)

if(WIN32)
set (CPACK_GENERATOR "ZIP" CACHE STRING "Cpack generator for OpenVINO")
@@ -49,9 +50,9 @@ ov_dependent_option (ENABLE_AVX2 "Enable AVX2 optimizations" ON "X86_64 OR (X86

ov_dependent_option (ENABLE_AVX512F "Enable AVX512 optimizations" ON "X86_64 OR (X86 AND NOT EMSCRIPTEN)" OFF)

-ov_dependent_option(ENABLE_NEON_FP16 "Enable ARM FP16 optimizations" ON "AARCH64" OFF)
ov_dependent_option (ENABLE_NEON_FP16 "Enable ARM FP16 optimizations" ON "AARCH64" OFF)

-ov_dependent_option(ENABLE_SVE "Enable SVE optimizations" ON "AARCH64" OFF)
ov_dependent_option (ENABLE_SVE "Enable SVE optimizations" ON "AARCH64" OFF)

# Type of build, we add this as an explicit option to default it to ON
get_property(BUILD_SHARED_LIBS_DEFAULT GLOBAL PROPERTY TARGET_SUPPORTS_SHARED_LIBS)
@@ -106,3 +107,11 @@ if(ENABLE_AVX512F)
set(ENABLE_AVX512F OFF CACHE BOOL "" FORCE)
endif()
endif()

if(ENABLE_SVE)
ov_check_compiler_supports_sve("-march=armv8-a+sve")

if(NOT CXX_HAS_SVE)
set(ENABLE_SVE OFF CACHE BOOL "" FORCE)
endif()
endif()
3 changes: 1 addition & 2 deletions docs/articles_en/about-openvino/performance-benchmarks.rst
@@ -56,8 +56,7 @@ implemented in your solutions. Click the buttons below to see the chosen benchma

:material-regular:`table_view;1.4em` LLM performance for AI PC

-.. uncomment under
-.. .. grid-item::
.. grid-item::

.. button-link:: #
:class: ovms-toolkit-benchmark-llm-result
2 changes: 1 addition & 1 deletion docs/articles_en/about-openvino/release-notes-openvino.rst
@@ -641,7 +641,7 @@ Previous 2024 releases
* New samples and pipelines are now available:

* An example IterableStreamer implementation in
-`multinomial_causal_lm/python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/multinomial_causal_lm>`__
`multinomial_causal_lm/python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation/multinomial_causal_lm>`__

* GenAI compilation is now available as part of OpenVINO via the -DOPENVINO_EXTRA_MODULES CMake
option.
@@ -367,7 +367,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/chat_sample/>`__.
`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation/chat_sample/>`__.

.. tab-item:: C++
:sync: cpp
@@ -415,7 +415,7 @@ make sure to :doc:`install OpenVINO with GenAI <../../get-started/install-openvi
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/chat_sample/>`__
`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/text_generation/chat_sample/>`__


.. dropdown:: Using GenAI with Vision Language Models
@@ -803,7 +803,7 @@ runs prediction of the next K tokens, thus repeating the cycle.
For more information, refer to the
-`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/speculative_decoding_lm/>`__.
`Python sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/python/text_generation/speculative_decoding_lm/>`__.


.. tab-item:: C++
@@ -859,7 +859,7 @@ runs prediction of the next K tokens, thus repeating the cycle.
For more information, refer to the
-`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/speculative_decoding_lm/>`__
`C++ sample <https://github.com/openvinotoolkit/openvino.genai/tree/master/samples/cpp/text_generation/speculative_decoding_lm/>`__



@@ -206,14 +206,16 @@ Here is an example of how to convert a model obtained with ``torch.export``:
Converting a PyTorch Model from Disk
####################################

-PyTorch provides the capability to save models in two distinct formats: ``torch.jit.ScriptModule`` and ``torch.export.ExportedProgram``.
-Both formats can be saved to disk as standalone files, enabling them to be reloaded independently of the original Python code.
PyTorch can save models in two formats: ``torch.jit.ScriptModule`` and ``torch.export.ExportedProgram``.
Both formats may be saved to disk as standalone files and reloaded later, independently of the
original Python code.

ExportedProgram Format
++++++++++++++++++++++

-The ``ExportedProgram`` format is saved on disk using `torch.export.save() <https://pytorch.org/docs/stable/export.html#serialization>`__.
-Below is an example of how to convert an ``ExportedProgram`` from disk:
You can save the ``ExportedProgram`` format using
`torch.export.save() <https://pytorch.org/docs/stable/export.html#serialization>`__.
Here is an example of how to convert it:

.. tab-set::

@@ -236,8 +238,9 @@ Below is an example of how to convert an ``ExportedProgram`` from disk:
ScriptModule Format
+++++++++++++++++++

-`torch.jit.save() <https://pytorch.org/docs/stable/generated/torch.jit.save.html>`__ serializes ``ScriptModule`` object on disk.
-To convert the serialized ``ScriptModule`` format, run ``convert_model`` function with ``example_input`` parameter as follows:
`torch.jit.save() <https://pytorch.org/docs/stable/generated/torch.jit.save.html>`__ serializes
the ``ScriptModule`` object on disk. To convert the serialized ``ScriptModule`` format, run
the ``convert_model`` function with the ``example_input`` parameter as follows:

.. code-block:: py
:force:
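   # Sketch of the collapsed example: convert a serialized ScriptModule.
   # Assumes a file "scripted_model.pt" produced by torch.jit.save;
   # the filename and input shape are illustrative.
   import torch
   import openvino as ov

   ov_model = ov.convert_model(
       "scripted_model.pt", example_input=torch.rand(1, 3, 224, 224)
   )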
@@ -252,15 +255,15 @@ To convert the serialized ``ScriptModule`` format, run ``convert_model`` functio
Exporting a PyTorch Model to ONNX Format
########################################

-An alternative method of converting PyTorch models is exporting a PyTorch model to ONNX with
-``torch.onnx.export`` first and then converting the resulting ``.onnx`` file to OpenVINO Model
-with ``openvino.convert_model``. It can be considered as a backup solution if a model cannot be
-converted directly from PyTorch to OpenVINO as described in the above chapters. Converting through
-ONNX can be more expensive in terms of code, conversion time, and allocated memory.
An alternative method of converting a PyTorch model is to export it to ONNX first
(with ``torch.onnx.export``) and then convert the resulting ``.onnx`` file to the OpenVINO IR
model (with ``openvino.convert_model``). It should be considered a backup solution if a model
cannot be converted directly, as described previously. Converting through ONNX can be more
expensive in terms of code overhead, conversion time, and allocated memory.

1. Refer to the `Exporting PyTorch models to ONNX format <https://pytorch.org/docs/stable/onnx.html>`__
guide to learn how to export models from PyTorch to ONNX.
-2. Follow :doc:`Convert an ONNX model <convert-model-onnx>` chapter to produce OpenVINO model.
2. Follow the :doc:`Convert an ONNX model <convert-model-onnx>` guide to produce OpenVINO IR.

Here is an illustration of using these two steps together:

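(The illustration code is collapsed in this view. A minimal sketch under the assumption of a small example module:)

.. code-block:: py

   import torch
   import openvino as ov

   model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.ReLU())

   # Step 1: export the PyTorch model to an ONNX file
   torch.onnx.export(model, torch.rand(1, 3, 224, 224), "model.onnx")

   # Step 2: convert the ONNX file to an OpenVINO model
   ov_model = ov.convert_model("model.onnx")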
3 changes: 2 additions & 1 deletion docs/articles_en/openvino-workflow/torch-compile.rst
@@ -5,7 +5,8 @@ PyTorch Deployment via "torch.compile"

The ``torch.compile`` feature enables you to use OpenVINO for PyTorch-native applications.
It speeds up PyTorch code by JIT-compiling it into optimized kernels.
-By default, Torch code runs in eager-mode, but with the use of ``torch.compile`` it goes through the following steps:
By default, Torch code runs in eager-mode, but with the use of ``torch.compile`` it goes
through the following steps:

1. **Graph acquisition** - the model is rewritten as blocks of subgraphs that are either:

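A minimal hedged sketch of the OpenVINO backend in use; it assumes torchvision is installed, and the model choice is illustrative:

.. code-block:: py

   import torch
   import torchvision.models as models

   model = models.resnet50(weights="DEFAULT").eval()

   # JIT-compile through the OpenVINO backend instead of running eagerly
   compiled_model = torch.compile(model, backend="openvino")
   output = compiled_model(torch.rand(1, 3, 224, 224))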
