
Hey folks, now that you have built some Machine Learning models thanks to the R SDK from Prevision.io, we are going to show you how to deploy them. If no experiment has been created yet within your project, please refer to the previous blog post from this series, available on our website.

So, starting from here, we should have two versions of our experiment named “day_ahead”, which you can verify by typing the following:

length(get_experiment_info(experiment_id = experiment$`_id`)$version_ids)

Reminder: if you have forgotten the experiment_id, you can get it back thanks to the get_experiment_id_from_name() function 🤓
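If you are starting from a fresh R session, here is a minimal sketch of how those objects could be recovered, assuming the project object from the previous post is still around (the exact argument names of get_experiment_id_from_name() are an assumption on my part, so adapt them to your SDK version):

library(previsionio) # Prevision.io R SDK, as in the previous posts

# project is assumed to be the project object created in the previous post
experiment_id = get_experiment_id_from_name(project_id = project$`_id`,
                                            experiment_name = "day_ahead")

# experiment object used throughout the rest of this post
experiment = get_experiment_info(experiment_id = experiment_id)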

Before deploying a model into Prevision.io, we need to define what a deployed model is, and then select which model we want to deploy.

What is a deployed model?

A deployed model is a model that is exposed to end users and IT applications “outside” of the Prevision.io context. Typically, Tableau, PowerBI, or any other data visualisation tool could call a deployed model through APIs in order to provide predictive KPIs. Another example of usage is sending batch predictions via pipelines that can be scheduled to run on a regular basis (more to come on this subject in another blog post 😇).

Since a deployed model is meant to be a production-grade model, everything related to it will be monitored and communicated to the end user. That means we will provide two main levels of monitoring:

  • IT-based monitoring, which covers
    • CPU usage
    • RAM usage
    • Disk usage
    • Number of API calls (think “prediction” here)
    • Number of errors
    • Average response time

Usage example of a deployed model

  • Data Science-based monitoring, which covers
    • Target drift
    • Feature drift
    • Data integrity
    • Back testing & performance monitoring

Feature distribution monitoring of a deployed model

We will cover more of that in a dedicated blog post later in this series, once our model has actually made a couple of predictions 😉

Which model to deploy?

Well, all of this seems appealing, but before monitoring our model, we need to select a good candidate for deployment and, why not, a good challenger too 🧐

Let me explain. Most of the time, people first want to deploy a single model that solves a given business problem. However, when creating a new version of it, typically after a retrain, we may want to check whether the newly trained model provides the same level of performance and stability over time. To do so, Prevision.io supports what we call a “champion challenger” approach that allows you to deploy two models at a time: a champion that effectively answers the API calls, alongside a challenger that only makes the predictions and stores them for analytical purposes. If everything goes as planned, you will be able to “hot swap” the champion with the challenger, without service interruption. We are going to do it right after we select our candidate models.

That being said, we have one experiment that has two versions. The first one is really not a good model, so I want to discard it and only focus on the second version. The latter has some promising models that we may choose for deployment.
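In the code below, this second version is referred to through an experiment_version_info_2 object. As a reminder, here is a minimal sketch of how it could be obtained; the get_experiment_version_info() helper and its argument are assumptions carried over from the previous post, so adapt it to the way you stored your own version objects:

# Ids of the two versions of the experiment
version_ids = get_experiment_info(experiment_id = experiment$`_id`)$version_ids

# Keep only the second, more promising version (hypothetical helper name)
experiment_version_info_2 = get_experiment_version_info(version_ids[[2]])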

The most common criteria involved when selecting a model for deployment include:

  • Algorithm family: for regulatory reasons, we may not be allowed to deploy a blend of 10 neural networks 🙃
  • Predictive performance: it’s not a bad idea to deploy a model that has high predictive power
  • Stability: while some models may have a lower error on average, they may be more unstable than others. Checking the standard deviation of the error grouped by fold is a good first approach (see the sketch after this list)
  • Response time: while less critical for batch-based applications, real-time predictions are better served in “real time” (< 300 ms) than after 5 seconds 😉
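To make the stability point a bit more concrete, here is a toy illustration of that check. The fold_errors data frame is entirely made up for the example; in practice, you would build it from the cross-validation results of your own models:

# Toy per-fold errors for two hypothetical models
fold_errors = data.frame(
  model = rep(c("model_A", "model_B"), each = 5),
  fold  = rep(1:5, times = 2),
  rmse  = c(10.2, 10.4, 10.1, 10.3, 10.2,  # stable
            8.1, 12.8, 7.7, 12.2, 9.5)     # lower on average, but unstable
)

# Mean and standard deviation of the error per model
aggregate(rmse ~ model, data = fold_errors,
          FUN = function(x) c(mean = mean(x), sd = sd(x)))

Here the second model wins on average error but loses badly on stability, which is exactly the trade-off we want to surface before deploying.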

For the sake of this tutorial, let’s pretend that we aren’t restricted on the algorithm family we can choose. We want to challenge the best performing model (lowest RMSE) with the fastest model (the one with the lowest unitary prediction time).

To do so, we first need to get the model_id of the two models that fit our criteria. Fortunately, Prevision.io sets “tags” on trained models and, guess what, there is a tag available for us that makes the model retrieval easy.

The id of the model with the best predictive power can be retrieved by the following call:

best_model_id = experiment_version_info_2$best_model$`_id`

The id of the model with the lowest unitary prediction time can be retrieved by the following (and more complicated) call:

models = get_experiment_version_models(experiment_version_info_2$`_id`)

# Loop over all models of this experiment version and keep the one
# that Prevision.io has tagged as "fastest"
for(model in models) {
  if(!is.null(model$tag)) {
    if(!is.null(model$tag$fastest)) {
      fast_model_id = model$`_id`
    }
  }
}

If you want to retrieve more information about each of these models, you can do so with the get_model_infos() function. However, for deployment, only the model_id is needed.
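For instance, a quick sanity check on our two candidates could look like the following (the model_id argument name is an assumption; check the help page of get_model_infos() in your SDK version):

# Inspect the two candidate models before deploying them
best_model_info = get_model_infos(model_id = best_model_id)
fast_model_info = get_model_infos(model_id = fast_model_id)

str(best_model_info, max.level = 1)
str(fast_model_info, max.level = 1)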

Deploying models

To actually deploy the best model as the “main” and the fastest as the challenger, you need to type the following:

deploy = create_deployment_model(project_id = project$`_id`,
                                 name = "R SDK Deployment", # should be unique = you may need to rename
                                 experiment_id = experiment$`_id`,
                                 main_model_experiment_version_id = experiment_version_info_2$`_id`,
                                 challenger_model_experiment_version_id = experiment_version_info_2$`_id`,
                                 description = "Champion challenger deployment",
                                 main_model_id = best_model_id,
                                 challenger_model_id = fast_model_id,
                                 access_type = "fine_grained")

 

Let’s break down this call:

  • project_id, id of the project
  • name, name of the deployment. Please note that this should be unique because Prevision.io will generate a custom URL based on it (which will host a “default” application built for you on top of your model!) 😎
  • experiment_id, id of the experiment hosting models
  • main_model_experiment_version_id, id of the experiment version of the main model
  • challenger_model_experiment_version_id, id of the experiment version of the challenger model (don’t provide it if you don’t want a challenger to be deployed 🙃)
  • main_model_id, id of the main model
  • challenger_model_id, id of the challenger model (again, only if you want one to be deployed)
  • access_type, level of rights of the deployment. Can be “fine_grained” to match each user’s project rights, “private” for anyone who is logged in to the specific instance and accessing the URL, or “public” to provide access without even logging in.

This step basically performs all the DevOps steps that are usually required to deploy a model, such as Docker packaging, documentation generation, compute provisioning and model exposure. This can take a couple of minutes to complete, but afterwards you will have your two models deployed behind a custom URL, accessible to users of your project.

Successful model deployment

Since the name of my deployment is R SDK Deployment, the generated URL will be https://r-sdk-deployment.cloud.prevision.io/
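As far as I can tell, the URL is simply derived from the deployment name, lower-cased and with spaces replaced by hyphens; here is a small sketch of that (assumed) naming rule:

deployment_name = "R SDK Deployment"
slug = gsub(" ", "-", tolower(deployment_name))
paste0("https://", slug, ".cloud.prevision.io/")
# [1] "https://r-sdk-deployment.cloud.prevision.io/"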

If you click on your custom URL, you will land on an already generated default UI that allows you to make unitary predictions with the model and also get the explanation of each one! This is ideal for testing purposes or for sharing an AI model without any web development.

Default mini web application available for each model deployed with Prevision.io (note that the displayed language will differ depending on your locale)

Feel free to toy with it. You may see, for instance, that predicting with a lower temperature, closer to 0, raises the estimated electricity consumption; all good 🧐

After making a couple of predictions, you can go to the Prevision.io UI in order to see either the Data Science or the IT monitoring (or even the embedded documentation if you want to interact with the provided APIs). For me (because of my deployment name), this documentation is available at https://r-sdk-deployment.cloud.prevision.io/api/documentation/

API documentation automatically generated in Swagger

Now that we have a deployed model that is exposed and fully monitored, we have two ways to interact with it:

  • Either by using the APIs provided above, with tokens that can be generated thanks to the create_deployment_api_key() function. This requires further web-based development that falls outside of this blog post series (see the sketch below for a starting point)
  • Or by feeding “bulk” predictions via a pipeline, which we are going to showcase in our next blog post 😇
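For the API route, a minimal sketch could look like the following. Only create_deployment_api_key() is mentioned above; its argument name, the token field and the prediction route are assumptions of mine, so please check the Swagger documentation generated for your own deployment before using it:

# Generate credentials for the deployment (argument name assumed)
api_key = create_deployment_api_key(deployment_id = deploy$`_id`)

library(httr)

# Hypothetical call to the deployed model's prediction endpoint
response = POST(
  url = "https://r-sdk-deployment.cloud.prevision.io/predict",  # hypothetical route
  add_headers(Authorization = paste("Bearer", api_key$token)),  # hypothetical token field
  body = list(temperature = 0),                                 # toy payload
  encode = "json"
)

content(response)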