Releases: opea-project/GenAIComps
Generative AI Components v1.1 Release Notes
OPEA Release Notes v1.1
We are pleased to announce the release of OPEA version 1.1, which includes significant contributions from the open-source community. This release addresses over 470 pull requests.
More information about how to get started with OPEA v1.1 can be found at Getting Started page. All project source code is maintained in the repository. To pull Docker images, please access the Docker Hub. For instructions on deploying Helm Charts, please refer to the guide.
What's New in OPEA v1.1
This release introduces more scenarios with general availability, including:
- Newly supported Generative AI capabilities: Image-to-Video, Text-to-Image, Text-to-SQL and Avatar Animation.
- Generative AI Studio that offers a no-code alternative to create enterprise Generative AI applications.
- Expands the portfolio of supported hardware to include Intel® Arc™ GPUs and AMD® GPUs.
- Enhanced monitoring support, providing real-time insights into runtime status and system resource utilization for CPU and Intel® Gaudi® AI Accelerator, as well as Horizontal Pod Autoscaling (HPA).
- Helm Chart support for 7 new GenAIExamples and their microservices.
- Benchmark tools for long-context language models (LCLMs) such as LongBench and HELMET.
Highlights
New GenAI Examples
- AvatarChatbot: a chatbot that combines a virtual "avatar" that can run on either Intel Gaudi 2 AI Accelerator or Intel Xeon Scalable Processors.
- DBQnA: for seamless translation of natural language queries into SQL and deliver real-time database results.
- EdgeCraftRAG: a customizable and tunable RAG example for edge solutions on Intel® Arc™ GPUs.
- GraphRAG: a Graph RAG-based approach to summarization.
- Text2Image: an application that generates images based on text prompts.
- WorkflowExecAgent: a workflow executor example to handle data/AI workflow operations via LangChain agents to execute custom-defined workflow-based tools.
Enhanced GenAI Examples
- Multi-media support: DocSum, MultimodalQnA
- Multi-language support: AudioQnA, DocSum
New GenAI Components
- Text-to-Image: add Stable Diffusion microservice
- Image-to-Video: add Stable Video Diffusion microservice
- Text-to-SQL: add Text-to-SQL microservice
- Text-to-Speech: add GPT-SoVITS microservice
- Avatar Animation: add Animation microservice
- RAG: add GraphRAG with llama-index microservice
Enhanced GenAI Components
- Asynchronous support for microservices (28672956, 9df4b3c0, f3746dc8)
- Add vLLM backends for summarization, FAQ generation, code generation, and Agents
- Multimedia support (29ef6426, baafa402)
GenAIStudio
GenAI Studio, a new project of OPEA, streamlines the creation of enterprise Generative AI applications by providing an alternative UI-based processes to create end-to-end solutions. It supports GenAI application definition, evaluation, performance benchmarking, and deployment. The GenAI Studio empowers developers to effortlessly build, test, optimize their LLM solutions, and create a deployment package. Its intuitive no-code/low-code interface accelerates innovation, enabling rapid development and deployment of cutting-edge AI applications with unparalleled efficiency and precision.
Enhanced Observability
Observability offers real-time insights into component performance and system resource utilization. We enhanced this capability by monitoring key system metrics, including CPU, host memory, storage, network, and accelerators (such as Intel Gaudi), as well as tracking OPEA application scaling.
Helm Charts Support
OPEA examples and microservices support Helm Charts as the packaging format on Kubernetes (k8s). The newly supported examples include AgentQnA, AudioQnA, FaqGen, VisualQnA. The newly supported microservices include chathistory, mongodb, prompt, and Milvus for data-prep and retriever. Helm Charts have now option to get Prometheus metrics from the applications.
Long-context Benchmark Support
We added the following two benchmark kits to response to the community's requirements of long-context language models.
- HELMET: a comprehensive benchmark for long-context language models covering seven diverse categories of tasks. The datasets are application-centric and are designed to evaluate models at different lengths and levels of complexity.
- LongBench: a benchmark tool for bilingual, multitask, and comprehensive assessment of long context understanding capabilities of large language models.
Newly Supported Models
- llama-3.2 (1B/3B/11B/90B)
- glm-4-9b-chat
- Qwen2/2.5 (7B/32B/72B)
Newly Supported Hardware
- Intel® Arc™ GPU: vLLM powered by OpenVINO can perform optimal model serving on Intel® Arc™ GPU.
- AMD® GPU: deploy GenAI examples on AMD® GPUs using AMD® ROCm™: CodeTrans, CodeGen, FaqGen, DocSum, ChatQnA.
Notable Changes
GenAIExamples
- Functionalities
- New GenAI Examples
- [AvatarChatbot] Initiate "AvatarChatbot" (audio) example (cfffb4c, 960805a)
- [DBQnA] Adding DBQnA example in GenAIExamples (c0643b7, 6b9a27d)
- [EdgeCraftRag] Add EdgeCraftRag as a GenAIExample (c9088eb, 7949045, 096a37a)
- [GraphRAG] Add GraphRAG example a65640b
- [Text2Image]: Add example for text2image 085d859
- [WorkflowExecAgent] Add Workflow Executor Example bf5c391
- Enhanced GenAI Examples
- [AudioQnA] Add multi-language AudioQnA on Xeon 658867f
- [AgentQnA] Update AgentQnA example for v1.1 release 5eb3d28
- [ChatQnA] Enable vLLM Profiling for ChatQnA ([00d9bb6](https://github.com/opea-project...
- New GenAI Examples
Generative AI Components v1.0 Release Notes
OPEA Release Notes v1.0
What’s New in OPEA v1.0
-
Highlights
- Improve the RAG performance through microservice optimizations (e.g., Hugging Face TGI, vLLM) and megaservice tuning
- Provide the experimental LLM model training support, includes full fine-tuning and parameter-efficient fine-tuning (PEFT)
- Improve RAG with Knowledge Graph based on Neo4j
- Improve VisualQnA and provide multi-modality RAG support
- Faster microservice launch through removal of some dispatch overhead
- Enable Gateway with guardrail, and integrate nginx with CORS protection and data preparation
- Enable HorizontalPodAutoscaler (HPA) for better resource management
- Define the metrics of RAG performance and enable accuracy evaluation for more GenAI examples
- Further improvement on documentation and developer experience
-
Other features
- Enable OpenAI compatible format on applicable microservices
- Support microservice launch from ModelScope to address China ecosystem need
- Support Red Hat OpenShift Container Platform (RHOCP)
- Refactor the code and CI/CD pipeline to provide better support for contributors
- Improve Docker versioning to avoid the potential conflict
- Enhance GenAI Microservice Connector (GMC), including improvements such as router performance optimizations and other updates
- Introduce Memory Bandwidth Exporter that integrates with Kubernetes Node Resource Interface
-
Learn more about OPEA at
- Getting Started: https://opea-project.github.io/latest/index.html
- Github: https://github.com/opea-project
- Docker Hub: https://hub.docker.com/u/opea
-
Release Documentation:
- Landing Page: https://opea.dev/
- Release Notes: https://github.com/opea-project/docs/tree/main/release_notes
Details
GenAIExamples
-
Deployment
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Update mount path in xeon k8s(2a6af64)
- Add Nginx - k8s manifest in CodeTrans(6a679ba)
- Add Nginx - docker in CodeTrans(cc84847)
- watch more docker compose files changes(4b0bc26)
- Add chatQnA UI manifest(758d236)
- Revert the LLM model for kubernetes GMS(f5f1e32)
- [ChatQnA] Update retrieval & dataprep manifests(6730b24)
- [ChatQnA]Update manifests(3563f5d)
- [ChatQnA] Update benchmarking manifests(36fb9a9)
- [ChatQnA] udate OOB & Tuned manifests(ac34860)
- Add nginx and UI to the ChatQnA manifest(05f9828)
- [ChatQnA] Update OOB with wrapper manifests.(933c3d3)
- [Translation] Support manifests and nginx(1e13031)
- update V1.0 benchmark manifest (e5affb9)
- update image name(e2a74f7)
- K8S manifest: Update ChatQnA/CodeGen/CodeTrans/DocSum(0629696)
- Change megaservice path in line with new file structure(5ab27b6)
- Add ui/nginx support in K8S manifest for ChatQnA/CodeGen/CodeTrans/Docsum(ba94e01)
- Add chatQnA UI manifest(758d236)
- Yaml: add comments to specify gaudi device ids.(63406dc)
- add tgi bf16 setup on CPU k8s.(ba17031)
-
Documentation
- [ChatQnA] Update README for ModelScope(aebc23f)
- Update README.md(4bd7841)
- [ChatQnA] Update README for without Rerank Pipeline(6b617d6)
- [ChatQnA] Update Benchmark README for w/o rerank(4a51874)
- Fix readme for nv gpu(43b2ae5)
- [ChatQnA] Update Benchmark README to Fix Input Length(55d287d)
- Refine ChatQnA README for TGI(afc3341)
- Add default model for VisualQnA README(07baa8f)
- Update readme for manifests of some examples(adb157f)
- doc: use markdown table in supported_examples(9cf1d88)
- doc: remove invalid code block language(c6d811a)
- add AudioQnA readme with supported model(f4f4da2)
- add more code owners(7f89797)
- doc: fix headings(7a0fca7)
- [Codegen] Refine readme to prompt users on how to change the model.(814164d)
- Update README.md and remove some open-source details(2ef83fc)
- Add issue template(84a781a)
- doc: fix headings and indenting(67394b8)
- Add default model in readme for FaqGen and DocSum(d487093)
- Change docs of kubernetes for curl commands in README(4133757)
- Update v0.9 RAG release data(947936e)
- Explain Default Model in ChatQnA and CodeTrans READMEs(2a2ff45)
- Update docker images list.(a8244c4)
- refactor the network port setting for AWS(bc81770)
- Add validate microservice details link(bd811bd)
- [ChatQnA] Add Nginx in Docker Compose and README(6c36448
- [Doc] Update CodeGen and Translation READMEs(a09395e)
- [Doc] Refine READMEs(372d78c)
- Remove marketing materials(d85ec09)
- doc PR to main instead of of v1.0r(dc94026)
- Update README.md for Multiplatforms(b205dc7)
- Refine the quick start of ChatQnA(3b70fb0)
- Update supported_examples(96d5cd9)
- [Doc] doc improvement(e0b3b57)
- Fix README issues(bceacdc)
- doc: fix broken image reference and markdown(d422929)
- doc: give document meaningful title(a3fa0d6)
- doc: fix incorrefine readme for reorg(d2bab99)
- doc: fix incorrect path to png image files (d97882e)
- update doc according to comments(f990f79)
- doc: fix headings and indenting(67394b8)
- Update README.md(4bd7841)
- refine readme for reorg(d2bab99)
- Update README with new examples(2d28beb)
- README: fix broken links(ff6f841)
- Update v0.9 RAG release data([947936e](https://github....
Generative AI Components v0.9 Release Notes
OPEA Release Notes v0.9
What’s New in OPEA v0.9
-
Broaden functionality
- Provide telemetry functionalities for metrics and tracing using Prometheus, Grafana, and Jaeger
- Initialize two Agent examples: AgentQnA and DocIndexRetriever
- Support for authentication and authorization
- Add Nginx Component to strengthen backend security
- Provide Toxicity Detection Microservice
- Support the experimental Fine-tuning microservice
-
Enhancement
- Align the Microservice format with the standards of OpenAI (Chat Completions, Fine-tuning... etc)
- Enhance the performance benchmarking and evaluation for GenAI Examples, ex: TGI, resource allocation, ...etc
- Enable support for launching container images as a non-root user
- Use Llama-Guard-2-8B as default Guardrails model and bge-large-zh-v1.5 as default embedding model, mistral-7b-grok as default CodeTrans model
- Add ProductivitySuite to provide access management and maintains user context
-
Deployment
- Support Red Hat OpenShift Container Platform (RHOCP)
- GenAI Microservices Connector (GMC) successfully tested on Nvidia GPUs
- Add Kubernetes support for AudioQnA and VisualQnA examples
-
OPEA Docker Hub: https://hub.docker.com/u/opea
-
Thanks for the external contribution from Sharan Shirodkar, Aishwarya Ramasethu
, Michal Nicpon and Jacob Mansdorfer
Details
GenAIExamples
-
ChatQnA
- Update port in set_env.sh(040d2b7)
- Fix minor issue in ChatQnA Gaudi docker README(a5ed223)
- update chatqna dataprep-redis port(02a1536)
- Add support for .md file in file upload in the chatqna-ui(7a67298)
- Added the ChatQnA delete feature, and updated the corresponding README(09a3196)
- fixed ISSUE-528(45cf553)
- Fix vLLM and vLLM-on-Ray UT bug(cfcac3f)
- set OLLAMA_MODEL env to docker container(c297155)
- Update guardrail docker file path(06c4484)
- remove ray serve(c71bc68)
- Refine docker_compose for dataprep param settings(3913c7b)
- fix chatqna guardrails(db2d2bd)
- Support ChatQnA pipeline without rerank microservice(a54ffd2)
- Update the number of microservice replicas for OPEA v0.9(e6b4fff)
- Update set_env.sh(9657f7b)
- add env for chatqna vllm(f78aa9e)
-
Deployment
- update manifests for v0.9(ba78b4c)
- Update K8S manifest for ChatQnA/CodeGen/CodeTrans/DocSum(01c1b75)
- Update benchmark manifest to fix errors(4fd3517)
- Update env for manifest(4fa37e7)
- update manifests for v0.9(08f57fa)
- Add AudioQnA example via GMC(c86cf85)
- add k8s support for audioqna(0a6bad0)
- Update mainifest for FaqGen(80e3e2a)
- Add kubernetes support for VisualQnA(4f7fc39)
- Add dataprep microservice to chatQnA example and the e2e test(1c23d87)
-
Documentation
- [doc] Update README.md(c73e4e0)
- doc fix: Update README.md to remove specific dicscription of paragraph-1(5a9c109)
- doc: fix markdown in docker_image_list.md(9277fe6)
- doc: fix markdown in Translation/README.md(d645305)
- doc: fix markdown in SearchQnA/README.md(c461b60)
- doc: fix FaqGen/README.md markdown(704ec92)
- doc: fix markdown in DocSum/README.md(83712b9)
- doc: fix markdown in CodeTrans/README.md(076bca3)
- doc: fix CodeGen/README.md markdown(33f8329)
- doc: fix markdown in ChatQnA/README.md(015a2b1)
- doc: fix headings in markdown files(21fab71)
- doc: missed an H1 in the middle of a doc(4259240)
- doc: remove use of HTML for table in README(e81e0e5)
- Update ChatQnA readme with OpenShift instructions(ed48371)
- Convert HTML to markdown format.(14621f8)
- Fix typo {your_ip} to {host_ip}(ad8ca88)
- README fix typo(abc02e1)
- fix script issues in MD file(acdd712)
- Minor documentation improvements in the CodeGen README(17b9676)
- Refine Main README(08eb269)
- [Doc]Add a micro/mega service WorkFlow for DocSum(343d614)
- Update README for k8s deployment(fbb81b6)
-
Other examples
- Clean deprecated VisualQnA code(87617e7)
- Using TGI official release docker image for intel cpu(b2771ad)
- Add VisualQnA UI(923cf69)
- fix container name(5ac77f7)
- Add VisualQnA docker for both Gaudi and Xeon using TGI serving(2390920)
- Remove LangSmith from Examples(88eeb0d)
- Modify the language variable to match language highlight.(f08d411)
- Remove deprecated folder.(7dd9952)
- update env for manifest(4fa37e7)
- AgentQnA example(67df280)
- fix tgi xeon tag(6674832)
- Add new DocIndexRetriever example(566cf93)
- Add env params for chatqna xeon test(5d3950)
- ProductivitySuite Combo Application with REACT UI and Keycloak Authen(947cbe3)
- change codegen tgi model(06cb308)
- change searchqna prompt(acbaaf8)
- minor fix mismatched hf token(ac324a9)
- fix translation gaudi env(4f3be23)
- Minor fixes for CodeGen Xeon and Gaudi Kubernetes codegen.yaml (c25063f)
-
CI/CD/UT
Generative AI Components v0.8 Release Notes
OPEA Release Notes v0.8
What’s New in OPEA v0.8
-
Broaden functionality
- Support frequently asked questions (FAQs) generation GenAI example
- Expand the support of LLMs such as Llama3.1 and Qwen2 and support LVMs such as llava
- Enable end-to-end performance and accuracy benchmarking
- Support the experimental Agent microservice
- Support LLM serving on Ray
-
Multi-platform support
- Release the Docker images of GenAI components under OPEA dockerhub and support the deployment with Docker
- Support cloud-native deployment through Kubernetes manifests and GenAI Microservices Connector (GMC)
- Enable the experimental authentication and authorization support using JWT tokens
- Validate ChatQnA on multiple platforms such as Xeon, Gaudi, AIPC, Nvidia, and AWS
-
OPEA Docker Hub: https://hub.docker.com/u/opea
Details
GenAIExamples
-
ChatQnA
- Add ChatQnA instructions for AIPC(26d4ff)
- Adapt Vllm response format (034541)
- Update tgi version(5f52a1)
- Update README.md(f9312b)
- Udpate ChatQnA docker compose for Dataprep Update(335362)
- [Doc] Add valid micro-service details(e878dc)
- Updates for running ChatQnA + Conversational UI on Gaudi(89ddec)
- Fix win PC issues(ba6541)
- [Doc]Add ChatQnA Flow Chart(97da49)
- Add guardrails in the ChatQnA pipeline(955159)
- Fix a minor bug for chatqna in docker-compose(b46ae8)
- Support vLLM/vLLM-on-Ray/Ray Serve for ChatQnA(631d84)
- Added ChatQnA example using Qdrant retriever(c74564)
- Update TEI version v1.5 for better performance(f4b4ac)
- Update ChatQnA upload feature(598484)
- Add auto truncate for embedding and rerank(8b6094)
-
Deployment
- Add Kubernetes manifest files for deploying DocSum(831463)
- Update Kubernetes manifest files for CodeGen(2f9397)
- Add Kubernetes manifest files for deploying CodeTrans(c9548d)
- Updated READMEs for kubernetes example pipelines(c37d9c)
- Update all examples yaml files of GMC in GenAIExample(290a74)
- Doc: fix minor issue in GMC doc(d99461)
- README for installing 4 worklods using helm chart(6e797f)
- Update Kubernetes manifest files for deploying ChatQnA(665c46)
- Add new example of SearchQnA for GenAIExample(21b7d1)
- Add new example of Translation for GenAIExample(d0b028)
-
Other examples
- Update reranking microservice dockerfile path (d7a5b7)
- Update tgi-gaudi version(3505bd)
- Refine README of Examples(f73267)
- Update READMEs(8ad7f3)
- [CodeGen] Add codegen flowchart(377dd2)
- Update audioqna image name(615f0d)
- Add auto-truncate to gaudi tei (8d4209)
- Update visualQnA chinese version(497895)
- Fix Typo for Translation Example(95c13d)
- FAQGen Megaservice(8c4a25)
- Code-gen-react-ui(1b48e5)
- Added doc sum react-ui(edf0d1)
-
CI/UT
- Frontend failed with unknown timeout issue (7ebe78)
- Adding Chatqna Benchmark Test(11a56e)
- Expand tgi connect timeout(ee0dcb)
- Optimize gmc manifest e2e tests(15fc6f)
- Add docker compose yaml print for test(bb4230)
- Refactor translation ci test (b7975e)
- Refactor searchqna ci test(ecf333)
- Translate UT for UI(284d85)
- Enhancement the codetrans e2e test(450efc)
- Allow gmc e2e workflow to get secrets(f45f50)
- Add checkout ref in gmc e2e workflow(62ae64)
- SearchQnA UT(268d58)
GenAIComps
-
Cores
-
LLM
- Optional vllm microservice container build(963755)
- Refine vllm instruction(6e2c28)
- Introduce 'entrypoint.sh' for some Containers(9ecc5c)
- Support llamaindex for retrieval microservice and remove langchain(61795f)
- Update tgi with text-generation-inference:2.1.0(f23694)
- Fix requirements(f4b029)
- Add vLLM on Ray microservice(ec3b2e)
- Update code/readme/UT for Ray Serve and VLLM([dd939c](https://gith...
Generative AI Components v0.7 Release Notes
GenAIComps
-
Cores
-
LLM
- Support Qwen2 in LLM Microservice(3f5cde)
- Fix the vLLM docker compose issues(3d134d)
- Enable vLLM Gaudi support for LLM service based on officially habana vllm release(0dedc2)
- Openvino support in vllm(7dbad0)
- Support Ollama microservice(a00e36)
- Support vLLM XFT LLM microservice(2a6a29, 309c2d, fe5f39)
- Add e2e test for llm summarization tgi(e8ebd9)
-
DataPrep
- Support Dataprep(f7443f), embedding(f37ce2) microservice with Llama Index
- Fix dataprep microservice path issue(e20acc)
- Add milvus microservice(e85033)
- Add Ray version for multi file process(40c1aa)
- Fix dataprep timeout issue(61ead4)
- Add e2e test for dataprep redis langchain(6b7bec)
- Supported image summarization with LVM in dataprep microservice(86412c)
- Enable conditional splitting for html files(e1dad1)
- Added support for pyspark in dataprep microservice(a5eb14)
- DataPrep extract info from table in the docs(953e78)
- Added support for extracting info from image in the docs(e23745)
-
Other Components
- Add PGvector support in Vectorstores(1b7001) and Retriever(75eff6), Dataprep(9de3c7)
- Add Mosec embedding(f76685) and reranking(a58ca4)
- Add knowledge graph components(4c0afd)
- Add LVMs LLaVA component(bd385b)
- Add asr/tts components for xeon and hpu(cef6ea)
- Add WebSearch Retriever Microservice(900178)
- Add initial pii detection microservice(e38041)
- Pinecone support for dataprep and retrieval microservice(8b6486)
- Support prometheus metrics for opea microservices(758914), (900178)
- Add no_proxy env for micro services(df0c11)
- Enable RAGAS(8a670e)
- Fix RAG performance issues(70c23d)
- Support rerank and retrieval of RAG OPT(b51675)
- Reranking using an optimized bi-encoder(574847)
- Use parameter for retriever(358dbd), reranker(dfdd08)
-
CI
Others
Generative AI Components v0.6 Release Notes
GenAIComps
- Activate a suite of microservices including ASR, LLMS, Rerank, Embedding, Guardrails, TTS, Telemetry, DataPrep, Retrieval, and VectorDB.
- ASR functionality is fully operational on Xeon architecture, pending readiness on Gaudi.
- Retrieval capabilities are functional on LangChain, awaiting readiness on LlamaIndex.
- VectorDB functionality is supported on Redis, Chroma, and Qdrant, with readiness pending on SVS.
- Added 14 file formats support in data preparation microservices and enabled Safeguard of conversation in guardrails.
- Added the Ray Gaudi Supported for LLM Service.