model training

KimJeongChul edited this page Apr 30, 2019 · 1 revision

Model Training

Libraries : boto3 (AWS), azure.functions and azure.storage.file (Azure), or google.cloud.storage (Google Cloud), plus sklearn, pandas, time, os, re

  • aws : build your deployment package

See aws-build-deployment-package; the package needs to include pandas and sklearn.

  • google : requirements.txt
google-cloud-storage
gcsfs
scikit-learn
pandas
numpy
  • azure : requirements.txt
az==0.1.0.dev1
azure-functions==1.0.0b3
azure-functions-worker==1.0.0b3
grpcio==1.14.2
grpcio-tools==1.14.2
protobuf==3.6.1
six==1.12.0
azure_storage_blob==1.0.0
azure-storage-file==1.0.0
cryptography==2.0
numpy
pandas
scikit-learn

Workload Input: Text

Workload Output: JSON

Lambda payload (test event) example:

dataset_object_key : object key of the Amazon Fine Food Reviews dataset, one of reviews10mb.csv, reviews20mb.csv, reviews50mb.csv, reviews100mb.csv. The source data is available at https://snap.stanford.edu/data/web-FineFoods.html

model_object_key : file name of the logistic regression model, example : lr_model.pk

{
    "dataset_object_key": [DATASET_OBJECT_KEY],
    "dataset_bucket": [DATASET_BUCKET_NAME],
    "model_bucket": [MODEL_BUCKET_NAME],
    "model_object_key": [MODEL_OBJECT_KEY]
}
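
The four payload fields can be checked before any work starts. The following is a minimal sketch, assuming the event arrives as a Python dict (or as a JSON string on some platforms). The names `REQUIRED_KEYS` and `parse_event`, and the bucket names in the example, are hypothetical and not part of the original function.

```python
import json

# Keys taken from the payload example; the helper names are illustrative only.
REQUIRED_KEYS = (
    "dataset_object_key",
    "dataset_bucket",
    "model_bucket",
    "model_object_key",
)


def parse_event(event):
    """Return the four payload fields as a dict, or raise ValueError."""
    if isinstance(event, str):
        # Some platforms deliver the body as a JSON string rather than a dict.
        event = json.loads(event)
    missing = [key for key in REQUIRED_KEYS if key not in event]
    if missing:
        raise ValueError("missing payload keys: " + ", ".join(missing))
    return {key: event[key] for key in REQUIRED_KEYS}


if __name__ == "__main__":
    example = {
        "dataset_object_key": "reviews10mb.csv",
        "dataset_bucket": "my-dataset-bucket",  # placeholder bucket name
        "model_bucket": "my-model-bucket",      # placeholder bucket name
        "model_object_key": "lr_model.pk",
    }
    print(parse_event(example))
```

Failing early on a missing key avoids downloading a large dataset only to discover that the model destination was never specified.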

Lambda Output : prediction, latency
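
As a rough illustration of how a latency value could be produced, the sketch below cleans review text, fits a TF-IDF and logistic regression model with scikit-learn, and times the training step. It is an assumption about the workflow, not the original function's code. The helper names `cleanup` and `train_and_time`, the tiny in-memory dataset, and the `min_df=1` and `max_iter=1000` settings are chosen only for illustration. Loading the dataset from cloud storage and uploading the trained model are left out.

```python
import re
from time import time

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Any run of characters other than lowercase letters becomes a single space.
_NON_LETTERS = re.compile(r"[^a-z]+")


def cleanup(sentence):
    """Lowercase the text and strip digits, punctuation, and extra spaces."""
    return _NON_LETTERS.sub(" ", sentence.lower()).strip()


def train_and_time(texts, labels):
    """Fit TF-IDF + logistic regression; return (vectorizer, model, latency)."""
    start = time()
    cleaned = [cleanup(text) for text in texts]
    vectorizer = TfidfVectorizer(min_df=1)  # illustrative setting
    features = vectorizer.fit_transform(cleaned)
    model = LogisticRegression(max_iter=1000)  # illustrative setting
    model.fit(features, labels)
    latency = time() - start  # seconds spent on cleanup and training
    return vectorizer, model, latency


if __name__ == "__main__":
    # Made-up reviews and scores standing in for the real dataset.
    texts = ["Great taste, will buy again!", "Awful. Stale and bland.",
             "Loved it, great snack", "Bad smell, awful taste"]
    labels = [5, 1, 5, 1]
    vectorizer, model, latency = train_and_time(texts, labels)
    prediction = model.predict(vectorizer.transform([cleanup("great snack")]))
    print({"prediction": int(prediction[0]), "latency": latency})
```

Timing only the cleanup and fit steps keeps storage transfer time out of the reported latency. Whether the original function measures it the same way is not stated here.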