Automatically set up and deploy an MLflow server. This includes:
- An MLflow server
- A Postgres database to store MLflow metadata, such as:
  - experiment data;
  - run data;
  - model registry data.
- Minio, an open-source object storage system, to store and retrieve large files, such as:
  - model artifacts;
  - datasets.
- `uv` or Miniconda3 (I will be using `uv` here)
- Docker & Docker Compose

You could use WSL2 on a Windows machine as an alternative to an Ubuntu machine.
- Clone this repo and navigate inside:

  ```shell
  git clone https://github.com/pandego/mlflow-postgres-minio.git
  cd ./mlflow-postgres-minio
  ```
- Rename the `.env.example` file to `.env` and fill it with your credentials:

  ```shell
  cp .env.example .env
  ```
- (Optional) Add these to the `~/.bashrc` environment; edit to your own preferred secrets:

  ```shell
  export AWS_ACCESS_KEY_ID=minio
  export AWS_SECRET_ACCESS_KEY=minio123  # CHANGE THIS ON THE .env FILE
  export MLFLOW_S3_ENDPOINT_URL=http://localhost:9000
  ```

- Save/close (`control + X > y > Enter`), and then refresh the `~/.bashrc` file:

  ```shell
  source ~/.bashrc
  ```
- Launch the `docker compose` command to build and start all containers needed for the MLflow service:

  ```shell
  docker compose --env-file .env up -d --build --force-recreate
  ```
- Give it a few minutes and, once `docker compose` is finished, check the containers' health:

  ```shell
  docker ps
  ```
- You should see all containers up, with `(healthy)` in their `STATUS` column.
- You should also be able to navigate to:
  - The MLflow UI -> http://localhost:5050
  - The Minio UI -> http://localhost:9001
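If you prefer checking the endpoints from code rather than a browser, a small stdlib-only probe can confirm both UIs respond. This is my own addition (the `is_up` helper is not part of the repo); the ports match the URLs above:

```python
import urllib.request

def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except OSError:
        # Connection refused, DNS failure, timeout, ... -> not up (yet)
        return False

if __name__ == "__main__":
    for name, url in [("MLflow UI", "http://localhost:5050"),
                      ("Minio UI", "http://localhost:9001")]:
        print(f"{name}: {'up' if is_up(url) else 'down'}")
```

Both should report `up` once `docker compose` has finished starting the stack.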
That's it! 🥳 You can now start using MLflow!
The first thing you need to do is to create a Python MLflow environment.
- Depending on your package manager, `conda`, `poetry`, or `uv`, you will need different commands to create and activate your Python environment:
- Assuming you have `uv` installed, simply run the following commands to install and activate the MLflow environment:

  ```shell
  uv sync
  source .venv/bin/activate
  ```
- Alternatively, create and activate your MLflow experiment environment with `conda` + `poetry`:

  ```shell
  conda env create -f environment.yml
  conda activate mlflow_env
  poetry install --no-root
  ```
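Whichever manager you used, a quick stdlib-only smoke test tells you whether `mlflow` is importable from the active environment (this check is my addition, not part of the repo):

```python
import importlib.util

# find_spec returns a ModuleSpec if the package is importable, None otherwise
spec = importlib.util.find_spec("mlflow")
print("mlflow importable:", spec is not None)
```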
- From your previously created Python environment, run the example provided in this repo:

  ```shell
  python train.py
  ```
- Note: You should be able to see the model in the MLflow UI -> http://localhost:5050
- Within the MLflow experiment you will find examples of how to use and validate the model. For convenience, I added the example in the `validate.py` script. To use it, simply run the following command:

  ```shell
  python validate.py
  ```
  - Note 1: You will need to add the `RUN_ID` to the `.env` file. You can find it in the `Artifacts` section in the MLflow UI.
  - Note 2: MLflow will spin up a temporary Python environment and validate the model with the provided `input_example` data defined during training.
  - Note 3: The script uses `uv` to run the model validation; if you are using `conda`, you need to change `env_manager` in the script to `conda`.
- You can also test-run a prediction directly from the MLflow artifacts:

  ```shell
  python predict.py \
    --model-uri "s3://mlflow/1/<RUN_ID>/artifacts/model" \
    --input-file "wine_quality_data.csv"
  ```
  - Note: We are using the `wine_quality_data.csv` file provided in the repo as input, and by default the `--model-uri` will be defined by the `MODEL_REPO_DIR` in the `.env` file.
As before, the commands will depend on the Python package manager of your choice.
- Start your MLflow API by running the following command, replacing the `<RUN_ID>` with your own:

  ```shell
  source .venv/bin/activate
  mlflow models serve -m <RUN_ID> -h localhost -p 1234 --timeout 0 --no-conda
  ```
- Using `conda` + `poetry` is a bit more complex, as you need to install `pyenv` first.
Install pyenv
Pyenv is used with MLflow to manage different Python versions and packages in isolated environments.
- Remove previous installations (optional):

  ```shell
  rm -rf ~/.pyenv
  ```
- Install any necessary packages:

  ```shell
  sudo apt-get update -y
  sudo apt-get install -y make build-essential libssl-dev zlib1g-dev libbz2-dev \
    libreadline-dev libsqlite3-dev wget curl llvm libncursesw5-dev xz-utils \
    tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev
  ```
- Automatically install `pyenv`:

  ```shell
  curl https://pyenv.run | bash
  ```
- Edit the `~/.bashrc` file so your shell recognizes `pyenv`:

  ```shell
  sudo nano ~/.bashrc
  ```
- And copy/paste the following lines at the end of the file:

  ```shell
  # Config for PyEnv
  export PYENV_ROOT="$HOME/.pyenv"
  export PATH="$PYENV_ROOT/bin:$PATH"
  eval "$(pyenv init --path)"
  ```
- Save/close (`control + X > y > Enter`), and then refresh the `~/.bashrc` file:

  ```shell
  source ~/.bashrc
  ```
- Finally, start your MLflow API by running the following command, replacing the `<RUN_ID>` with your own:

  ```shell
  conda activate mlflow_env
  mlflow models serve -m <RUN_ID> -h localhost -p 1234 --timeout 0
  ```
MLflow also allows you to build a dockerized API based on a model stored in one of your runs.
- The following command allows you to build this dockerized API:

  ```shell
  mlflow models build-docker \
    --model-uri <RUN_ID> \
    --name adorable-mouse
  ```
- All that is left to do is to run this container:

  ```shell
  docker run -p 1234:8080 adorable-mouse
  ```
- In case you just want to generate a Dockerfile for later use, use the following command:

  ```shell
  mlflow models generate-dockerfile \
    --model-uri <RUN_ID> \
    --output-directory ./adorable-mouse
  ```
- You can then include it in a `docker-compose.yml`, for instance:

  ```yaml
  services:
    mlflow-model-serve-from-adorable-mouse:
      build: ./adorable-mouse
      image: adorable-mouse
      container_name: adorable-mouse_instance
      restart: always
      ports:
        - 1234:8080
      healthcheck:
        test: ["CMD", "curl", "-f", "http://localhost:8080/health/"]
        interval: 30s
        timeout: 10s
        retries: 3
  ```
- You can run this `docker-compose.yml` with the following command:

  ```shell
  docker compose up -d
  ```
Let's now test the served model (API) we just built.
- In a different terminal, send a request to test the served model:

  ```shell
  curl -X POST -H "Content-Type: application/json" --data '{"dataframe_split": {"data": [[7.4,0.7,0,1.9,0.076,11,34,0.9978,3.51,0.56,9.4]], "columns": ["fixed acidity","volatile acidity","citric acid","residual sugar","chlorides","free sulfur dioxide","total sulfur dioxide","density","pH","sulphates","alcohol"]}}' http://localhost:1234/invocations
  ```
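The same request can be sent from Python with only the standard library. The payload below mirrors the `dataframe_split` JSON of the `curl` call; the `score` helper is my own wrapper (call it once the server is up):

```python
import json
import urllib.request

COLUMNS = ["fixed acidity", "volatile acidity", "citric acid", "residual sugar",
           "chlorides", "free sulfur dioxide", "total sulfur dioxide",
           "density", "pH", "sulphates", "alcohol"]
ROW = [7.4, 0.7, 0, 1.9, 0.076, 11, 34, 0.9978, 3.51, 0.56, 9.4]

# MLflow's scoring server expects the 'dataframe_split' orientation
payload = {"dataframe_split": {"data": [ROW], "columns": COLUMNS}}

def score(url: str = "http://localhost:1234/invocations") -> dict:
    """POST the payload to the served model and return the parsed response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Once the server is running, score() should return something like
# a dict with a "predictions" key.
print(json.dumps(payload)[:60], "...")
```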
- The output should be something like the following:

  ```shell
  {"predictions": [5.57]}
  ```
🎊 Et voilà ! 🎊