
MLE Epam

This project is based on the MLE Basic Example by MarkKorvin. The requirements page mentioned that it can be used as a template, so I kept the main structure from that template even though I do not agree with some parts of it (I explain why in later sections).


Overview

Before going into how to run training and inference, first clone this repository to your local environment. In this example I am working on Windows, but I will use only universal commands that work the same on Linux and macOS.

git clone https://github.com/mtech00/MLE_Epam.git

After cloning, you will have the following directories and files:

  • inference: Dockerfile, inference.csv, inference.py, requirements.txt
  • training: Dockerfile, train.csv, train.py, requirements.txt
  • .gitignore
  • readme.md
  • data_loader.py

The project requirements say the data must already be prepared for training and inference, but they also call for a data loader script. After cloning this repo, you normally will not need that script, since the CSV files are already included; if anything goes wrong with the datasets, just run data_loader.py. It automatically downloads the Iris dataset, splits it into train and inference subsets, and writes the resulting CSV files into the training and inference directories, roughly as sketched below.
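
The sketch below shows roughly what that script does. It is only an illustration: the real data_loader.py may be implemented differently, and the use of scikit-learn here is my assumption; only the output file names follow the repo layout listed above.

    # Rough sketch of data_loader.py -- illustrative only, not the repo's actual code.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split


    def main():
        # Load the Iris dataset as a DataFrame (features plus the "target" column).
        iris = load_iris(as_frame=True)
        df = iris.frame

        # Hold out part of the data to serve as the inference set.
        train_df, inference_df = train_test_split(df, test_size=0.2, random_state=42)

        # Write the subsets where the training and inference Dockerfiles expect them.
        train_df.to_csv("training/train.csv", index=False)
        inference_df.to_csv("inference/inference.csv", index=False)


    if __name__ == "__main__":
        main()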

Since this is a containerized project built on Docker, you must have the Docker Engine/daemon running. That is all you need; everything else is handled by the Dockerfiles and the Docker Engine.
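
If you want to confirm that the daemon is actually reachable before building anything, a quick check is:

    docker info

If it prints server information instead of a connection error, Docker is ready.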


Project Workflow

  1. Go to the training folder
  2. Build training_image
  3. Run training_container
  4. Copy the trained model to the inference folder
  5. Build inference_image
  6. Run inference_container
  7. Copy the results.csv file to your local machine

Your results.csv file will be ready to analyze afterward.


Steps to Train

  1. Go to the repo folder

    cd MLE_Epam
  2. Go to the training folder

    cd training
  3. Build the training Docker image

    docker build -t training_image .

    Building will take about one minute because of PyTorch, even though I used the CPU-only version. (If you are on an ARM-based CPU, use the corresponding Torch build. Since the industry is mostly x86-based, I used the universal Torch package from the CPU index: this model is very small, and downloading a GPU build would be overkill. However, pinning only the CPU version in the requirements file is not ideal because of platform dependencies such as x86 vs. ARM architectures. Instead of explicitly pointing 'torch' at the CPU index, it would be better to determine the appropriate CPU architecture automatically, but I went with the universal Torch library so that Apple Silicon Mac users are covered too. A sketch of what a CPU-index pin could look like is shown after these steps.)

  4. Run the training Docker container

    docker run --name training_container training_image

    The MarkKorvin example uses the container ID together with a specified name, but that approach is unnecessary and overly complicated; even if a name conflict occurs, it is easily resolved by renaming or removing the old container. The container runs and stops automatically. If there is an error, the script prints information about it; if the run succeeds, you will see output for every epoch, the dataset metrics, and where the model (model.pt) is saved (inside the container).

  5. Copy the trained model to the inference folder

    docker cp training_container:/app/model.pt ../inference/model.pt

    MarkKorvin’s example used docker cp, and I am not sure it is the best option; I think we could use a shared volume or Docker networking and avoid manual commands, but I followed the project requirements. The command copies files from inside a Docker container into the local directory, and it works with relative paths. A volume-based alternative is sketched after these steps.
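
To illustrate the Torch note from step 3: on x86 machines the image can be kept smaller by pointing pip at PyTorch's CPU-only wheel index. The lines below are only a sketch of how such a pin could look in a requirements file; the version number is an example, and this is not necessarily what this repo's requirements.txt contains.

    # Hypothetical requirements.txt entries -- not the repo's actual pins.
    --extra-index-url https://download.pytorch.org/whl/cpu
    torch==2.2.2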
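As an alternative to docker cp in step 5, the model could be written straight into the inference folder by bind-mounting it when the training container runs. This is only a sketch: it assumes train.py is changed to save the model into a mounted /output directory, which is not how the project is currently wired.

    # Sketch only: assumes train.py saves model.pt into /output instead of /app.
    docker run --rm --name training_container -v "$(pwd)/../inference:/output" training_image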


Steps to Run Inference

  1. Go back to the parent directory

    cd ..
  2. Go to the inference folder

    cd inference
  3. Build the inference Docker image

    docker build -t inference_image .

    Again, this build will take about one minute. For this container, I think a more lightweight inference stack could be used, for example exporting the model to ONNX and serving it with ONNX Runtime (a sketch is shown after these steps). I pinned every dependency to a specific version in requirements.txt for reliability: choosing known-good combinations is safer, whereas leaving versions unspecified pulls the latest releases and can cause compatibility issues we cannot predict.

  4. Run the inference Docker container

    docker run --name inference_container inference_image

    This handles errors such as a missing model file, prints information about the inference dataset, and reports where results.csv is written (inside the container).

  5. Copy the results file to your local machine

    docker cp inference_container:/app/results.csv ../inference/results.csv

    Now you have your results.csv file in your local environment.
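
To illustrate the ONNX idea from step 3: the trained PyTorch model could be exported once to ONNX and then served with onnxruntime, so the inference image would not need the full Torch dependency. The sketch below is hypothetical; the network architecture, input size (4 Iris features), and file names are illustrative and not taken from this repo's code.

    # Hypothetical sketch -- not part of this repo.
    import numpy as np
    import torch
    import torch.nn as nn
    import onnxruntime as ort

    # Stand-in for the trained network (4 Iris features -> 3 classes).
    model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
    model.eval()

    # Export to ONNX once, after training.
    dummy_input = torch.randn(1, 4)
    torch.onnx.export(model, dummy_input, "model.onnx")

    # The inference container would then only need numpy + onnxruntime.
    session = ort.InferenceSession("model.onnx")
    input_name = session.get_inputs()[0].name
    batch = np.random.rand(10, 4).astype(np.float32)
    scores = session.run(None, {input_name: batch})[0]
    print(scores.argmax(axis=1))  # predicted class per row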


Conclusion

Without the headache of code dependencies, environment setup, or even a local Python installation, you can train the neural network and run inference purely with Docker. The final results.csv file contains your inference results.
