@KubedAI

Kube-dAI

Building Smarter, Faster Data and AI Platforms on Kubernetes

Kube-dAI 🚀

Kube (Kubernetes) and dAI (Data and AI)

Pronounced: "Cubed AI"

Kube-dAI: Empowering Scalable Data and AI Solutions on Kubernetes

Welcome to Kube-dAI, an open-source initiative dedicated to providing highly scalable deployment patterns, infrastructure-as-code templates, and best practices for data processing and AI workloads on Kubernetes.

🚀 About Kube-dAI

Kube-dAI aims to empower organizations with battle-tested, scalable architectures for running data and AI workloads on Kubernetes, primarily on AWS Cloud. Our project offers:

  • End-to-End Deployment Blueprints: Automated scripts for deploying VPCs and EKS clusters with essential Kubernetes add-ons on AWS.
  • Best Practices: Compute, storage, and networking optimization techniques specific to AWS infrastructure.
  • Comprehensive Benchmarks: Performance evaluation and cost-efficiency analyses on AWS services.
  • Infrastructure-as-Code Templates: Rapid deployment using Terraform and Helm tailored for AWS.
  • Observability Integration: Tools for monitoring and logging, including Spark History Server and Fluent Bit.

Whether you're orchestrating large-scale data processing pipelines or deploying AI models, Kube-dAI provides the insights and tools you need to get the most out of your Kubernetes infrastructure on AWS.

🔍 Project Focus

Each deployment comes with compute and storage best practices across the following focus areas:

  • Data Processing Pipelines: Optimized for Spark, Flink, Trino, and more.
  • Machine Learning Inference & Training: Using KServe, RayServe, and NVIDIA Triton.
  • Scalable Infrastructure: Automated deployment with Terraform and Crossplane.
  • GitOps: Continuous delivery and infrastructure management with ArgoCD.

🔧 How It Works

Our repositories provide everything you need to get started:

  • Infrastructure as Code (IaC): Easily deploy VPCs, EKS clusters, and essential Kubernetes add-ons on AWS.
  • Data & AI Workloads: Deploy Spark, Ray, Trino, and ML models with scalable architecture.
  • Monitoring & Observability: Pre-integrated tools like Prometheus, Grafana, and the Spark History Server.
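
As a rough sketch of the workflow these bullets describe, the script below chains Terraform, the AWS CLI, kubectl, and Helm. The Terraform directory, cluster name, and region are hypothetical placeholders, not values taken from any Kube-dAI repository, and the `run` and `deploy` helpers are illustrative names. Each repository's README remains the authority on the actual steps. By default the script runs in dry-run mode and only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# All values below are hypothetical placeholders; adjust them to your setup.
TF_DIR="${TF_DIR:-./terraform}"
CLUSTER_NAME="${CLUSTER_NAME:-example-eks}"
AWS_REGION="${AWS_REGION:-us-west-2}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy() {
  # 1. Provision the VPC, EKS cluster, and add-ons defined in Terraform.
  run terraform -chdir="$TF_DIR" init
  run terraform -chdir="$TF_DIR" apply
  # 2. Point kubectl at the new cluster and check that nodes have joined.
  run aws eks update-kubeconfig --name "$CLUSTER_NAME" --region "$AWS_REGION"
  run kubectl get nodes
  # 3. List the Helm releases installed on the cluster.
  run helm list --all-namespaces
}

deploy
```

Setting `DRY_RUN=0` executes the commands for real, which creates billable AWS resources.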

📊 Benchmarks and Best Practices

Each repository includes detailed benchmarks and best practices, covering:

  • Scalability Tests: From gigabytes to petabytes on AWS infrastructure.
  • Throughput and Latency Measurements
  • Resource Utilization Optimizations
  • Cost-Efficiency Analyses
  • Comparative Studies of AWS Storage Solutions: EBS, EFS, S3, etc.
  • Scheduler Performance: Evaluations of YuniKorn, Volcano, and KubeQueue.
  • Best Practices for Specific Workload Types
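
To make the throughput and cost-efficiency items concrete, here is one common way to express them. This is a sketch under stated assumptions, not the formulas used in the Kube-dAI benchmarks: it uses decimal units (1 TB = 1000 GB), counts only node-hour cost, and every input value is made up for illustration. The function names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throughput in GB/s: data volume divided by wall-clock runtime.
# Usage: throughput_gb_per_s DATA_GB RUNTIME_S
throughput_gb_per_s() {
  awk -v gb="$1" -v s="$2" 'BEGIN { printf "%.2f\n", gb / s }'
}

# Cost per TB processed: total node-hour cost divided by data volume in TB.
# Usage: cost_per_tb NODES PRICE_PER_NODE_HOUR RUNTIME_S DATA_GB
cost_per_tb() {
  awk -v n="$1" -v p="$2" -v s="$3" -v gb="$4" \
    'BEGIN { printf "%.2f\n", (n * p * s / 3600) / (gb / 1000) }'
}

# Made-up inputs: 1000 GB processed in 1800 s on 10 nodes at 2.00 per node-hour.
throughput_gb_per_s 1000 1800   # prints 0.56
cost_per_tb 10 2.00 1800 1000   # prints 10.00
```

A fuller analysis would also account for storage, data transfer, and idle capacity, which this sketch ignores.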

📂 Repositories

Data Processing:

  • 🚀 spark-rapids-on-kubernetes: The first live repository! Accelerate Apache Spark workloads using GPUs with Spark RAPIDS on AWS. Fully launched and ready to use!
  • spark-on-kubernetes: Apache Spark deployment patterns and benchmarks on AWS. (Launching soon)
  • flink-on-kubernetes: Apache Flink deployment patterns and benchmarks on AWS. (Launching soon)
  • raydata-on-kubernetes: RayData deployment patterns and benchmarks on AWS. (Launching soon)
  • trino-on-kubernetes: Trino deployment patterns and benchmarks on AWS. (Launching soon)
  • druid-on-kubernetes: Apache Druid deployment patterns and benchmarks on AWS. (Launching soon)

Artificial Intelligence:

  • rayserve-on-kubernetes: RayServe deployment patterns and benchmarks on AWS. (Launching soon)
  • triton-on-kubernetes: NVIDIA Triton Inference Server deployment patterns and benchmarks on AWS. (Launching soon)
  • kserve-on-kubernetes: KServe deployment patterns and benchmarks on AWS. (Launching soon)
  • lws-on-kubernetes: LeaderWorkerSet (LWS) deployment patterns and benchmarks on AWS. (Launching soon)

💻 Getting Started

To leverage Kube-dAI for your projects:

  1. Explore Our Repositories: Find relevant technologies and patterns.

  2. Clone the Repository of Interest:

    git clone https://github.com/KubedAI/<repository-name>.git

  3. Follow the detailed README in each repository for setup, benchmarking, and best practices.
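
The clone step can be wrapped in a small helper. This is a minimal sketch, assuming the GitHub organization handle is `KubedAI`; the function names `kubedai_clone_url` and `kubedai_clone` are hypothetical. Of the repositories listed under Data Processing and Artificial Intelligence, only spark-rapids-on-kubernetes is described as launched.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumption: the GitHub organization handle is KubedAI.
KUBEDAI_ORG="${KUBEDAI_ORG:-KubedAI}"

# Print the HTTPS clone URL for a repository name.
kubedai_clone_url() {
  local repo="$1"
  printf 'https://github.com/%s/%s.git\n' "$KUBEDAI_ORG" "$repo"
}

# Clone a repository and list its top-level files, where the README lives.
kubedai_clone() {
  local repo="$1"
  git clone "$(kubedai_clone_url "$repo")"
  ls "$repo"
}

# Example: kubedai_clone spark-rapids-on-kubernetes
```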

🤝 Contributing

We welcome contributions from the community! To contribute:

  1. Fork the relevant repository.
  2. Create a new branch for your feature, benchmark, or optimization.
  3. Submit a pull request with your improvements.

Please review our Contribution Guidelines for more details.
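
The fork-and-branch steps above map onto a short Git command sequence. The sketch below only prints that sequence; the username, repository, and branch name are illustrative, and `contribution_plan` is a hypothetical helper. Forking and opening the pull request happen in the GitHub web interface before and after these commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the Git commands for a fork-and-branch contribution.
# Arguments (all illustrative): your GitHub username, repository name, branch name.
contribution_plan() {
  local user="$1" repo="$2" branch="$3"
  cat <<EOF
git clone https://github.com/${user}/${repo}.git
cd ${repo}
git checkout -b ${branch}
git add -A
git commit -m "Describe your change"
git push -u origin ${branch}
EOF
}

contribution_plan your-username spark-rapids-on-kubernetes feature/my-benchmark
```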

🔗 Blogs

📜 License

This project is open-source and available under the Apache License 2.0.

📞 Support

For assistance or to discuss advanced use cases:

  • Open an issue in the relevant repository

📝 Disclaimer

Kube-dAI is a community-driven project and is not affiliated with AWS or any other company. All trademarks and registered trademarks are the property of their respective owners.

Repositories

  • spark-rapids-on-kubernetes (public template): Accelerating data processing workloads on GPUs with Spark-RAPIDS. HCL, Apache-2.0, updated Oct 16, 2024.
  • .github (public): Updated Oct 14, 2024.
  • spark-history-server (public): Helm chart for deploying Spark History Server on Amazon EKS for Spark event logs stored in S3. Shell, Apache-2.0, updated Sep 19, 2024.
  • airflow-dags (public): Sample DAGs repository for use with the Apache Airflow GitSync feature. Python, updated Jun 28, 2023.
