This repository contains multiple examples demonstrating different aspects of Ray Serve, including a translation service and a composed application example.
- Create a new conda environment:
conda create -n ray_examples_env python=3.10
conda activate ray_examples_env
- Install the required packages:
pip install -r requirements.txt

The translation service example demonstrates how to serve a Hugging Face transformer model using Ray Serve.
- Start the Ray Serve application in one terminal:
cd <PATH>/RayServeExamples/RayServeDemo/TranslationApp
serve run serve_quickstart:translator_app
- Test the translation service using a separate terminal:
# Using Python client
python TranslationApp/model_client.py

The Composing App example demonstrates how to create complex applications by combining multiple Ray Serve deployments. This example showcases:
- Multiple Deployments: How to create and chain multiple deployments together
- Deployment Composition: Techniques for combining different services into a single application
- Routing and Load Balancing: How Ray Serve handles requests between different deployments
- State Management: Best practices for managing state across deployments
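
The chaining idea in the list above can be sketched without Ray at all. This is a plain-asyncio illustration under stated assumptions, not the code in `serve_quickstart_composed`: `Translator`, `Pipeline`, and `build_app` are hypothetical names, the translation is a canned string, and ordinary Python objects stand in for Ray Serve deployments and handles.

```python
import asyncio


class Translator:
    """Stand-in for an upstream deployment (hypothetical; returns canned text)."""

    async def translate(self, text: str) -> str:
        # A real deployment would call a model here.
        return f"[fr] {text}"


class Pipeline:
    """Stand-in for an entry-point deployment that calls another one."""

    def __init__(self, translator: Translator) -> None:
        # In Ray Serve this argument would be a handle to the bound deployment.
        self.translator = translator

    async def __call__(self, text: str) -> dict:
        # Await the upstream call, then shape the combined response.
        translated = await self.translator.translate(text.strip())
        return {"input": text, "translation": translated}


def build_app() -> Pipeline:
    # Wire the graph once: the downstream object receives the upstream one.
    return Pipeline(Translator())
```

Running `asyncio.run(build_app()("hello"))` exercises the whole chain in one process. In Ray Serve the same wiring is expressed with `@serve.deployment` and `.bind()`, and calls on the handle go through `.remote()`.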
- Start the Ray Serve application:
cd <PATH>/RayServeExamples/RayServeDemo/CompositionsApp
serve run serve_quickstart_composed:app
- Test the composed application using the provided client:
python ComposiningApp/composed_client.py

- If ray[serve] is not installed, run:
pip install "ray[serve]"
- If ray is not installed, run:
pip install ray
- Always run the client from a separate terminal.
- When running Ray on your local machine, start a local Ray cluster before executing commands against it:
ray start --head
- To stop the Ray cluster, run:
ray stop
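
For the client side of the notes above, here is a minimal standard-library sketch of an HTTP client. It assumes the application listens on Ray Serve's default HTTP address (`http://127.0.0.1:8000/`) and accepts a JSON-encoded string in the request body; neither assumption is confirmed by this repository, and `build_request` and `send` are hypothetical helpers that may not match the provided client scripts.

```python
import json
import urllib.request

# Assumption: the app is served at Ray Serve's default HTTP address.
DEFAULT_URL = "http://127.0.0.1:8000/"


def build_request(text: str, url: str = DEFAULT_URL) -> urllib.request.Request:
    """Build a POST request carrying `text` as a JSON string body."""
    body = json.dumps(text).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With `serve run` active in one terminal, calling `send(build_request("Hello world!"))` from a second terminal would return whatever the application responds with. Separating request construction from sending keeps the first half checkable without a running server.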

- TranslationApp/: Contains the translation service implementation
- ComposiningApp/: Contains the composed application code with multiple deployments
- requirements.txt: Lists all required Python packages
- .gitignore: Specifies files to be ignored by git

The main dependencies are:
- Ray Serve for serving the models and managing deployments
- Transformers for the translation model
- FastAPI for the web server
- Various utility packages for HTTP handling and async operations
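
To see which of these packages are importable in the active environment before starting an example, a small check along these lines may help. It is a sketch only: `missing_packages` is a hypothetical helper, and the names passed to it are the top-level import names for Ray, Transformers, and FastAPI rather than the contents of requirements.txt.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names for Ray, Transformers, and FastAPI.
    missing = missing_packages(["ray", "transformers", "fastapi"])
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries.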

Recommended practices when building deployments:
- Use clear naming conventions for deployments
- Implement proper error handling
- Use async/await for better performance
- Consider deployment scaling based on load
- Use proper logging for debugging
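
Three of the practices above (error handling, async/await, and logging) can be illustrated together in a short sketch. It uses a plain Python class as a stand-in for a deployment so that it runs without Ray; `TranslationHandler`, the `backend` callable, the logger name, and the response shape are all hypothetical and not taken from this repository.

```python
import asyncio
import logging

logger = logging.getLogger("translation_demo")  # hypothetical logger name


class TranslationHandler:
    """Stand-in for a deployment class; `backend` is any async callable."""

    def __init__(self, backend):
        self.backend = backend

    async def __call__(self, text: str) -> dict:
        if not text or not text.strip():
            # Reject bad input early with a structured error.
            logger.warning("rejected empty input")
            return {"ok": False, "error": "text must be non-empty"}
        try:
            # Awaiting lets other requests proceed while the backend works.
            result = await self.backend(text)
        except Exception:
            # Log the traceback; return a generic message to the caller.
            logger.exception("backend call failed")
            return {"ok": False, "error": "internal error"}
        return {"ok": True, "result": result}
```

Returning a structured error instead of letting the exception escape keeps the failure visible in the logs while giving the client a predictable response shape.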
Feel free to contribute additional examples or improvements to the existing examples. Please follow the existing code style and include proper documentation for any new features.