diff --git a/docs/quick-start/index.md b/docs/getting-started/index.md similarity index 57% rename from docs/quick-start/index.md rename to docs/getting-started/index.md index 21ee661f2..ed8e3e389 100644 --- a/docs/quick-start/index.md +++ b/docs/getting-started/index.md @@ -1,12 +1,15 @@ -# MLServer Quick-Start Guide +# Getting Started with MLServer This guide will help you get started creating machine learning microservices with MLServer -in less than 30 minutes. Our use case is to create a service that helps us compare the similarity +in less than 30 minutes. Our use case will be to create a service that helps us compare the similarity between two documents. Think about whenever you are comparing a book, news article, blog post, or tutorial to read next: wouldn't it be great to have a way to compare it with -similar ones that you have already liked (without having to rely on a recommendation's system)? +similar ones that you have already read and liked (without having to rely on a recommendation system)? That's what we'll focus on in this guide: creating a document similarity service. 📜 + 📃 = 😎👌🔥 +The code is showcased as if it were cells inside a notebook, but you can run each of the steps +inside Python files with minimal effort. + ## 00 What is MLServer? MLServer is an open-source Python library for building production-ready asynchronous APIs for machine learning models. @@ -37,8 +40,9 @@ We will also need to download the language model separately once we have spaCy i python -m spacy download en_core_web_lg ``` -If you're going over this tutorial inside a notebook, don't forget to add an exclamation mark `!` -in front of the two commands above. +If you're going over this guide inside a notebook, don't forget to add an exclamation mark `!` +in front of the two commands above. If you are in VS Code, you can keep them as they are and +change the cell type to bash. ## 02 Set Up @@ -53,7 +57,7 @@ Let's create a directory for our model. 
```python -mkdir -p models_hub/similarity_model +mkdir -p similarity_model ``` Before we create a service that allows us to compare the similarity between two documents, it is @@ -70,17 +74,17 @@ import spacy nlp = spacy.load("en_core_web_lg") ``` -Now that we have our model loaded, let's look at the similarity of the abstracts of [Barbieheimer](https://en.wikipedia.org/wiki/Barbenheimer) -using the Wikipedia API to see how similar these two movies actually are. - -To do this, we will be using the Wikipedia API Python library to find the summary for each -of the movies. The main requirement of the API is that we pass into the main class, `Wikipedia()`, -a project name, an email and the language we want information to be returned in. After that, -we can search the for the movie summaries we want by passing the title of the movie to the -`.page()` method and accessing the summary of it with the `.summary` attribute. +Now that we have our model loaded, let's look at the similarity of the abstracts of +[Barbenheimer](https://en.wikipedia.org/wiki/Barbenheimer) using the `wikipedia-api` +Python library. The main requirement of the API is that we pass into the main class, +`Wikipedia()`, a project name, an email and the language we want information to be +returned in. After that, we can search for the movie summaries we want by passing +the title of the movie to the `.page()` method and accessing its summary with +the `.summary` attribute. Feel free to change the movies for other topics you might be interested in. +You can run the following lines inside a notebook or, alternatively, add them to an `app.py` file. ```python import wikipediaapi @@ -100,15 +104,14 @@ print(barbie) print() print(oppenheimer) ``` +If you created an `app.py` file with the code above, make sure you run `python app.py` from +the terminal. - Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig and written by Gerwig and Noah Baumbach. 
Based on the Barbie fashion dolls by Mattel, it is the first live-action Barbie film after numerous computer-animated direct-to-video and streaming television films. The film stars Margot Robbie as Barbie and Ryan Gosling as Ken, and follows the two on a journey of self-discovery following an existential crisis. The film also features an ensemble cast that includes America Ferrera, Kate McKinnon, Issa Rae, Rhea Perlman, and Will Ferrell. - A live-action Barbie film was announced in September 2009 by Universal Pictures with Laurence Mark producing. Development began in April 2014, when Sony Pictures acquired the film rights. Following multiple writer and director changes and the casting of Amy Schumer and later Anne Hathaway as Barbie, the rights were transferred to Warner Bros. Pictures in October 2018. Robbie was cast in 2019, and Gerwig was announced as director and co-writer with Baumbach in 2021. The rest of the cast were announced in early 2022. Filming took place primarily at Warner Bros. Studios, Leavesden, in England and on the Venice Beach Skatepark in Los Angeles from March to July 2022. - Barbie premiered at the Shrine Auditorium in Los Angeles on July 9, 2023, and was theatrically released in the United States on July 21, 2023, by Warner Bros. Pictures. Its simultaneous release with Oppenheimer led to the "Barbenheimer" phenomenon on social media, which encouraged audiences to see both films as a double feature. The film received positive reviews, and has grossed $382 million worldwide. - - Oppenheimer is a 2023 biographical thriller film written and directed by Christopher Nolan. Based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin, the film chronicles the life of J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project, and thereby ushering in the Atomic Age. 
Cillian Murphy stars as Oppenheimer, with Emily Blunt as Oppenheimer's wife Katherine "Kitty" Oppenheimer; Matt Damon as General Leslie Groves, director of the Manhattan Project; and Robert Downey Jr. as Lewis Strauss, a senior member of the United States Atomic Energy Commission. The ensemble supporting cast includes Florence Pugh, Josh Hartnett, Casey Affleck, Rami Malek, Gary Oldman and Kenneth Branagh. - The project was announced in September 2021 after Universal Pictures won a bidding war for Nolan's screenplay. Murphy signed on to portray Oppenheimer in October, with others in the main cast joining between November 2021 and April 2022. Pre-production was underway by January 2022, with filming taking place from February to May. Oppenheimer was filmed in a combination of IMAX 65 mm and 65 mm large-format film, including, for the first time in history, sections in IMAX black-and-white film photography. As with his previous works, Nolan used extensive practical effects and minimal computer-generated imagery. - Oppenheimer premiered at Le Grand Rex in Paris on July 11, 2023, and was theatrically released in the United States and United Kingdom on July 21, 2023, by Universal Pictures. Its simultaneous release with Barbie led to the "Barbenheimer" phenomenon on social media, which encouraged audiences to see both films as a double feature. The film has grossed over $192 million worldwide and received critical acclaim, with particular praise for its cast, screenplay, and visuals. +``` +Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig and written by Gerwig and Noah Baumbach. Based on the Barbie fashion dolls by Mattel, it is the first live-action Barbie film after numerous computer-animated direct-to-video and streaming television films. The film stars Margot Robbie as Barbie and Ryan Gosling as Ken, and follows the two on a journey of self-discovery following an existential crisis. 
The film also features an ensemble cast that includes America Ferrera, Kate McKinnon, Issa Rae, Rhea Perlman, and Will Ferrell... +Oppenheimer is a 2023 biographical thriller film written and directed by Christopher Nolan. Based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin, the film chronicles the life of J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project, and thereby ushering in the Atomic Age. Cillian Murphy stars as Oppenheimer, with Emily Blunt as Oppenheimer's wife Katherine "Kitty" Oppenheimer; Matt Damon as General Leslie Groves, director of the Manhattan Project; and Robert Downey Jr. as Lewis Strauss, a senior member of the United States Atomic Energy Commission. The ensemble supporting cast includes Florence Pugh, Josh Hartnett, Casey Affleck, Rami Malek, Gary Oldman and Kenneth Branagh... +``` Now that we have our two summaries, let's compare them using spacy. @@ -145,8 +148,8 @@ Time to create a machine learning API for our use-case. 😎 MLServer allows us to wrap machine learning models into APIs and build microservices with replicas of a single model, or different models all together. -To create a service with MLServer, we will define a class with two async functions, one that -loads the model and another one to run inference (i.e. predict) with. The former will load the +To create a service with MLServer, we will define a class with two asynchronous functions, one that +loads the model and another one to run inference (or predict) with. The former will load the `spacy` model we tested in the last section, and the latter will take in a list with the two documents we want to compare. Lastly, our function will return a `numpy` array with a single value, our similarity score. We'll write the file to our `similarity_model` directory and call @@ -154,7 +157,7 @@ it `my_model.py`. 
```python -# models_hub/similarity_model/my_model.py +# similarity_model/my_model.py from mlserver.codecs import decode_args from mlserver import MLModel @@ -186,7 +189,7 @@ and add the name and the implementation of our model to it. ```json -# models_hub/similarity_model/model-settings.json +# similarity_model/model-settings.json { "name": "doc-sim-model", @@ -194,8 +197,6 @@ and add the name and the implementation of our model to it. } ``` - Writing models_hub/similarity_model/model-settings.json - Now that everything is in place, we can start serving predictions locally to test how things would play out for our future users. We'll initiate our server via the command line, and later on we'll see how to @@ -212,11 +213,11 @@ To learn more about gRPC, please see this tutorial [here](https://realpython.com To start our service, open up a terminal and run the following command. ```bash -mlserver start models_hub/similarity_model/ +mlserver start similarity_model/ ``` Note: If this is a fresh terminal, make sure you activate your environment before you run the command above. -If you run the command above from your notebook (e.g. `!mlserver start models_hub/similarity_model/`), +If you run the command above from your notebook (e.g. `!mlserver start similarity_model/`), you will have to send the request below from another notebook or terminal since the cell will continue to run until you turn it off. @@ -233,7 +234,7 @@ import requests Please note that the request below uses the variables we created earlier with the summaries of Barbie and Oppenheimer. If you are sending this POST request from a fresh python file, make -sure you move those lines of code into your request file. +sure you move those lines of code above into your request file. 
```python @@ -242,7 +243,7 @@ inference_request = { StringCodec.encode_input(name='docs', payload=[barbie, oppenheimer], use_bytes=False).dict() ] } -inference_request +print(inference_request) ``` @@ -252,8 +253,12 @@ inference_request 'shape': [2, 1], 'datatype': 'BYTES', 'parameters': {'content_type': 'str'}, - 'data': ['Barbie is a 2023 American fantasy comedy film directed by Greta Gerwig and written by Gerwig and Noah Baumbach. Based on the Barbie fashion dolls by Mattel, it is the first live-action Barbie film after numerous computer-animated direct-to-video and streaming television films. The film stars Margot Robbie as Barbie and Ryan Gosling as Ken, and follows the two on a journey of self-discovery following an existential crisis. The film also features an ensemble cast that includes America Ferrera, Kate McKinnon, Issa Rae, Rhea Perlman, and Will Ferrell.\nA live-action Barbie film was announced in September 2009 by Universal Pictures with Laurence Mark producing. Development began in April 2014, when Sony Pictures acquired the film rights. Following multiple writer and director changes and the casting of Amy Schumer and later Anne Hathaway as Barbie, the rights were transferred to Warner Bros. Pictures in October 2018. Robbie was cast in 2019, and Gerwig was announced as director and co-writer with Baumbach in 2021. The rest of the cast were announced in early 2022. Filming took place primarily at Warner Bros. Studios, Leavesden, in England and on the Venice Beach Skatepark in Los Angeles from March to July 2022.\nBarbie premiered at the Shrine Auditorium in Los Angeles on July 9, 2023, and was theatrically released in the United States on July 21, 2023, by Warner Bros. Pictures. Its simultaneous release with Oppenheimer led to the "Barbenheimer" phenomenon on social media, which encouraged audiences to see both films as a double feature. 
The film received positive reviews, and has grossed $382 million worldwide.', - 'Oppenheimer is a 2023 biographical thriller film written and directed by Christopher Nolan. Based on the 2005 biography American Prometheus by Kai Bird and Martin J. Sherwin, the film chronicles the life of J. Robert Oppenheimer, a theoretical physicist who was pivotal in developing the first nuclear weapons as part of the Manhattan Project, and thereby ushering in the Atomic Age. Cillian Murphy stars as Oppenheimer, with Emily Blunt as Oppenheimer\'s wife Katherine "Kitty" Oppenheimer; Matt Damon as General Leslie Groves, director of the Manhattan Project; and Robert Downey Jr. as Lewis Strauss, a senior member of the United States Atomic Energy Commission. The ensemble supporting cast includes Florence Pugh, Josh Hartnett, Casey Affleck, Rami Malek, Gary Oldman and Kenneth Branagh.\nThe project was announced in September 2021 after Universal Pictures won a bidding war for Nolan\'s screenplay. Murphy signed on to portray Oppenheimer in October, with others in the main cast joining between November 2021 and April 2022. Pre-production was underway by January 2022, with filming taking place from February to May. Oppenheimer was filmed in a combination of IMAX 65 mm and 65 mm large-format film, including, for the first time in history, sections in IMAX black-and-white film photography. As with his previous works, Nolan used extensive practical effects and minimal computer-generated imagery.\nOppenheimer premiered at Le Grand Rex in Paris on July 11, 2023, and was theatrically released in the United States and United Kingdom on July 21, 2023, by Universal Pictures. Its simultaneous release with Barbie led to the "Barbenheimer" phenomenon on social media, which encouraged audiences to see both films as a double feature. 
The film has grossed over $192 million worldwide and received critical acclaim, with particular praise for its cast, screenplay, and visuals.']}]} + 'data': [ + 'Barbie is a 2023 American fantasy comedy...', + 'Oppenheimer is a 2023 biographical thriller...' + ] + }] + } @@ -291,8 +296,9 @@ Our movies are 98.6691% similar Let's decompose what just happened. -The `URL` for our service might seem a bit odd if you've never heard of the V2/Open Inference Protocol (OIP). This -protocol is a set of specifications that allows machine learning models to be shared and deployed in a +The `URL` for our service might seem a bit odd if you've never heard of the +[V2/Open Inference Protocol (OIP)](https://docs.seldon.io/projects/seldon-core/en/latest/reference/apis/v2-protocol.html). +This protocol is a set of specifications that allows machine learning models to be shared and deployed in a standardized way. This protocol enables the use of machine learning models on a variety of platforms and devices without requiring changes to the model or its code. The OIP is useful because it allows us to integrate machine learning into a wide range of applications in a standard way. @@ -321,16 +327,17 @@ To learn more about the OIP and how MLServer content types work, please have a l ## 05 Creating Model Replicas Say you need to meet the demand of a high number of users and one model might not be enough, or is not using -all of the resources of the instance it was allocated on. What we can do in this case is to create multiple -replicas of our model to increase the throughput of the requests that come in. This can be particularly useful -at the peak times of our server. To do this we need to tweak the Settings of our server via the `settings.json` -file. In it, we'll add the number of independent model we want to have to the parameter `"parallel_workers": 3`. +all of the resources of the virtual machine instance it was allocated to. 
What we can do in this case is +to create multiple replicas of our model to increase the throughput of the requests that come in. This +can be particularly useful at the peak times of our server. To do this, we need to tweak the settings of +our server via the `settings.json` file. In it, we'll add the number of independent models we want to +have to the parameter `"parallel_workers": 3`. Let's stop our server, change the settings of it, start it again, and test it. ```json -# models_hub/similarity_model/settings.json +# similarity_model/settings.json { "parallel_workers": 3 @@ -339,17 +346,18 @@ Let's stop our server, change the settings of it, start it again, and test it. ```bash -mlserver start models_hub/similarity_model -`````` +mlserver start similarity_model +``` ![multiplemodels](../assets/multiple_models.png) -As you can see in the output of the terminal, we now have 3 models running in parallel. The reason you might see 4 -is because, by default, MLServer will print the name of the initialized model if it is one or more, and it will also -print one for each of the replicas specified in the settings. +As you can see in the output of the terminal in the picture above, we now have 3 models running in +parallel. The reason you might see 4 is because, by default, MLServer will print the name of the +initialized model if it is one or more, and it will also print one for each of the replicas +specified in the settings. -Let's get a few more [twin films examples](https://en.wikipedia.org/wiki/Twin_films) to test our server. Get -as creative as you'd like. 💡 +Let's get a few more [twin films examples](https://en.wikipedia.org/wiki/Twin_films) to test our +server. Get as creative as you'd like. 
💡 ```python @@ -424,8 +432,8 @@ for movie1, movie2 in zip((deep_impact, antz, the_dark_night), (armageddon, a_bu ![serving3](../assets/serving_2.png) -For the last step of this quick-start guide, we are going to package our model and service into a -docker image that we can reuse in another project, or share it with colleagues immediately. This step +For the last step of this guide, we are going to package our model and service into a +docker image that we can reuse in another project or share with colleagues immediately. This step requires that we have docker installed and configured in our PCs, so if you need to set up docker, you can do so by following the instructions in the documentation [here](https://docs.docker.com/get-docker/). @@ -434,26 +442,24 @@ the directory we've been using for our service (`similarity_model`). ```python -# models_hub/similarity_model/requirements.txt +# similarity_model/requirements.txt mlserver spacy==3.6.0 https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.6.0/en_core_web_lg-3.6.0-py3-none-any.whl ``` - Writing models_hub/similarity_model/requirements.txt - The next step is to build a docker image with our model, its dependencies and our server. If you've never heard -of docker assets before, here's a short description. +of **docker images** before, here's a short description. > A Docker image is a lightweight, standalone, and executable package that includes everything needed to run a piece of software, including code, libraries, dependencies, and settings. It's like a carry-on bag for your application, containing everything it needs to travel safely and run smoothly in different environments. Just as a carry-on bag allows you to bring your essentials with you on a trip, a Docker image enables you to transport your application and its requirements across various computing environments, ensuring consistent and reliable deployment. 
-MLServer has a convenient function that lets us create docker assets with our services. Let's use it. +MLServer has a convenient function that lets us create docker images with our services. Let's use it. ```python -mlserver build models_hub/similarity_model/ -t 'fancy_ml_service' +mlserver build similarity_model/ -t 'fancy_ml_service' ``` We can check that our image was successfully built not only by looking at the logs of the previous @@ -461,7 +467,7 @@ command but also with the `docker images` command. ```bash -docker assets +docker images ``` Let's test that our image works as intended with the following command. Make sure you have closed your @@ -476,7 +482,7 @@ Now that you have a packaged and fully functioning microservice with our model, to a production serving platform like [Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/#), or via different offerings available through the many cloud providers out there (e.g. AWS Lambda, Google Cloud Run, etc.). You could also run this image on KServe, a Kubernetes-native tool for model serving, or -anywhere else where you can bring in your docker image with you. +anywhere else where you can bring your docker image with you. To learn more about MLServer and the different ways in which you can use it, head over to the [examples](https://mlserver.readthedocs.io/en/latest/examples/index.html) section
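
As a side note on the request body printed earlier in this guide: the same structure can be assembled by hand, which may help when calling the service from a client that does not have `mlserver` installed. The sketch below is only an illustration under a few assumptions: the helper names `build_inference_request` and `infer_url` are hypothetical, the host and port (`localhost:8080`) assume a default local MLServer setup, and the field values simply mirror the printed output above (`shape` of `[2, 1]`, `datatype` of `BYTES`, `content_type` of `str`).

```python
import json


def build_inference_request(docs, input_name="docs"):
    """Hypothetical helper: build a V2/OIP-style request body for a list of strings."""
    return {
        "inputs": [
            {
                "name": input_name,
                # One row per document, one string per row.
                "shape": [len(docs), 1],
                "datatype": "BYTES",
                "parameters": {"content_type": "str"},
                "data": list(docs),
            }
        ]
    }


def infer_url(model_name, host="localhost", port=8080):
    """Hypothetical helper: V2/OIP inference endpoint (host and port are assumptions)."""
    return f"http://{host}:{port}/v2/models/{model_name}/infer"


request_body = build_inference_request(["first document", "second document"])
endpoint = infer_url("doc-sim-model")

print(endpoint)
print(json.dumps(request_body, indent=2))
```

With the server running, the body could then be sent with something like `requests.post(endpoint, json=request_body)`. The guide's own approach of using `StringCodec.encode_input` remains the safer choice, since it keeps the payload in step with the library.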