Model Registry #125

Open
9 tasks
ChakshuGautam opened this issue Jul 17, 2024 · 3 comments

ChakshuGautam commented Jul 17, 2024

  • Model Version Management (commit hash, semantic version) - should happen while training
  • Provide model files (onnx, pt, bin) through a CDN
  • Rollback to an older version
  • Deployment by a version number
  • Track costs during training
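The version management, rollback, and deploy-by-version items above could be modelled roughly as follows. This is only a minimal in-memory sketch: the class names, the `artifact_url` field, and the example CDN URLs are assumptions for illustration, not part of any existing service.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ModelVersion:
    version: str       # semantic version, e.g. "1.1.0"
    commit_hash: str   # commit recorded at training time
    artifact_url: str  # hypothetical CDN location of the model file


@dataclass
class VersionHistory:
    """All registered versions of one model, oldest first."""
    versions: List[ModelVersion] = field(default_factory=list)
    deployed: Optional[str] = None

    def _index(self, version: str) -> int:
        for i, v in enumerate(self.versions):
            if v.version == version:
                return i
        raise KeyError(f"unknown version: {version}")

    def register(self, mv: ModelVersion) -> None:
        # Refuse to overwrite an existing version number.
        if any(v.version == mv.version for v in self.versions):
            raise ValueError(f"version already registered: {mv.version}")
        self.versions.append(mv)

    def deploy(self, version: str) -> None:
        self._index(version)  # raises KeyError if the version is unknown
        self.deployed = version

    def rollback(self) -> str:
        """Deploy the version registered just before the deployed one."""
        if self.deployed is None:
            raise RuntimeError("nothing is deployed")
        idx = self._index(self.deployed)
        if idx == 0:
            raise RuntimeError("no older version to roll back to")
        self.deployed = self.versions[idx - 1].version
        return self.deployed


history = VersionHistory()
history.register(ModelVersion("1.0.0", "a1b2c3d", "https://cdn.example.com/m/1.0.0/model.onnx"))
history.register(ModelVersion("1.1.0", "e4f5a6b", "https://cdn.example.com/m/1.1.0/model.onnx"))
history.deploy("1.1.0")
rolled_back_to = history.rollback()
```

A real registry would persist this history and probably compare semantic versions rather than rely on registration order; the sketch only shows the shape of the operations.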

Clicking the train button on the Admin Panel

ML Pod:

Admin Panel:

  • When the train button is clicked, it will hit the model registry API to get the following:

    • Base Model Branch on HF: the base model that will be fine-tuned on the dataset
    • task_type: classification/NER etc.
    • model_format: onnx/pytorch - safetensors
    • model_name: the purpose for which the model is being trained, like agri_classification in AKAI/KMAI (can be the same as service_name)
    • epochs: the number of epochs the model is trained for
    • args: training arguments used to fine-tune the model
    • quantization: mostly None unless specified
  • The Admin Panel will hit the dataset registry to get the dataset ID for the given model-botid

  • The Admin Panel will hit the /train API with the following parameters:

{
    "model": "<Base Model Branch on HF (from model registry)>",
    "epochs": "<from model registry>",
    "task_type": "<from model registry>",
    "dataset": "<from dataset registry>",
    "versioning": {
        "owner": "<botid>",
        "environment": "<bot environment>",
        "model_name": "<from model registry>"
    },
    "args": "<from model registry>"
}
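
The request body above could be assembled along these lines. This is a sketch under assumptions: the function name, the keys of `registry_entry` (such as `base_model`), and all example values are hypothetical, since the thread does not define the registry response format.

```python
import json


def build_train_payload(registry_entry: dict, dataset_id: str,
                        bot_id: str, environment: str) -> dict:
    """Combine model registry and dataset registry results into a /train body."""
    return {
        "model": registry_entry["base_model"],      # Base Model Branch on HF
        "epochs": registry_entry["epochs"],
        "task_type": registry_entry["task_type"],
        "dataset": dataset_id,                      # from the dataset registry
        "versioning": {
            "owner": bot_id,
            "environment": environment,
            "model_name": registry_entry["model_name"],
        },
        "args": registry_entry.get("args", {}),     # training arguments, if any
    }


# Hypothetical model registry response; the field names are assumptions.
entry = {
    "base_model": "distilbert-base-uncased",
    "epochs": 3,
    "task_type": "classification",
    "model_name": "agri_classification",
    "args": {"learning_rate": 2e-5},
}
payload = build_train_payload(entry, "dataset-123", "bot-42", "staging")
print(json.dumps(payload, indent=2))
```

Keeping the versioning block separate from the training parameters means the ML Pod can tag the resulting artifact without inspecting the rest of the request.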



Dataset service:

  • Create a dataset for models with at least the following for each model-botid:

    • Base Model Branch on HF: the base model that will be fine-tuned on the dataset
    • task_type: classification/NER etc.
    • model_format: onnx/pytorch - safetensors
    • model_name: the purpose for which the model is being trained, like agri_classification in AKAI/KMAI (can be the same as service_name)
    • epochs: the number of epochs the model is trained for
    • args: training arguments used to fine-tune the model
    • quantization: mostly None unless specified
  • Create a dataset for datasets with:
    a dataset ID for each model for each bot
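
The two datasets described above might be shaped as in the sketch below. The field names follow the bullets; everything else is an assumption made for illustration, including keying both tables by `(model_name, bot_id)`, the `lookup` helper, and the example values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ModelRecord:
    base_model: str                      # Base Model Branch on HF
    task_type: str                       # classification, NER, ...
    model_format: str                    # onnx or pytorch (safetensors)
    model_name: str                      # e.g. agri_classification
    epochs: int
    args: dict = field(default_factory=dict)
    quantization: Optional[str] = None   # mostly None unless specified


# Both tables are keyed by (model_name, bot_id); this keying is an assumption.
Key = Tuple[str, str]
model_table: Dict[Key, ModelRecord] = {}
dataset_table: Dict[Key, str] = {}


def lookup(model_name: str, bot_id: str) -> Tuple[ModelRecord, str]:
    """Return the model record and dataset ID for one model-bot pair."""
    key = (model_name, bot_id)
    return model_table[key], dataset_table[key]


key = ("agri_classification", "bot-42")
model_table[key] = ModelRecord(
    base_model="distilbert-base-uncased",  # placeholder base model
    task_type="classification",
    model_format="onnx",
    model_name="agri_classification",
    epochs=3,
)
dataset_table[key] = "dataset-123"

record, dataset_id = lookup("agri_classification", "bot-42")
```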

ChakshuGautam commented Aug 21, 2024

@suresh12 to review the Doc

@Gautam-Rajeev

Clicking the train button on the Admin Panel

ML Pod:

  • Modify the train API to support versioning



KDwevedi commented Sep 5, 2024

Scoping the Model Registry from MLflow, so that we can lift it and use it directly.

Desirable Features

  • Model Metadata Storage
  • Version Management + Finetuning
  • Deployment
  • Utilising CDNs for making BIN and other model files available
  • Recording metrics for model use
  • PoC

KDwevedi assigned KDwevedi and unassigned suresh12 Sep 9, 2024