


This is the fourth in a six-part series on how to use the Python SDK to build production-ready, fully monitored AI models using your real-world business data. If you already have a project called “Electricity Forecast” containing an experiment version called ‘day_ahead’ with a set of trained models, you are ready to go. Otherwise, head over to the third blog post, follow the instructions and come back!


What Are We Doing Today?


In this blog post, we are going to see how we can easily deploy a model using the Python SDK. Let’s dig in! 


Some Context Before Starting?


Are you wondering what a deployed ML model is, and which of your trained models is the best one to deploy? If you already have the answers, feel free to skip this section. Otherwise, stick with me and you’ll have answers to your questions 😀!


  1. What is ML Model Deployment?

A deployed model is a model that is exposed to end users and IT applications “outside” of the context of the platform. Typically, Tableau, PowerBI, or any other data visualisation tool can call a deployed model through its API in order to provide predictive KPIs. Another use of deployed models is to send batch predictions via pipelines that can be scheduled on a regular basis (more to come on this subject in another blog post).

Since a deployed model is meant to be a production-grade model, everything relating to it is monitored and communicated to the end user. That means we will provide two main levels of monitoring:

  • IT based monitoring, which is about
    • CPU usage
    • RAM usage
    • Disk usage
    • Number of API calls (think “prediction” here)
    • Number of errors
    • Average response time
    • …etc.

Usage example of a deployed model

  • Data Science based monitoring, which is about
    • Target drift
    • Feature drift
    • Data integrity
    • Back testing & performance monitoring
    • …etc.

Feature distribution monitoring of a deployed model

We will cover this in more detail in a dedicated blog post later in this series, once our model has made a couple of predictions.
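In the meantime, here is a rough illustration of what a feature drift metric can look like. This sketch computes a Population Stability Index (PSI) between a training sample and a production sample of one numeric feature. It is a generic technique, not necessarily the metric the platform computes, and the function name, the bin count and the sample values are all assumptions made for the example.

```python
import math


def population_stability_index(reference, current, n_bins=10):
    """PSI between a reference sample and a current sample of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # fall back to a width of 1.0 when the reference feature is constant
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # values outside the reference range are clamped into the edge bins
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # a small floor keeps the logarithm defined when a bin is empty
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = bin_shares(reference), bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# hypothetical temperatures seen at training time vs. in production
train_temps = [float(t) for t in range(0, 30)]
prod_temps = [t + 10 for t in train_temps]

same = population_stability_index(train_temps, train_temps)
shifted = population_stability_index(train_temps, prod_temps)
print(same, shifted)
```

The larger the value, the further the production distribution has moved away from the training one. Any alerting threshold would have to be chosen for your own use case.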

  2. Which Model Is The Best to Deploy?


Well, all of this seems appealing, but before monitoring our model, we need to select a good candidate for deployment.

Let me explain. Most of the time, people start by deploying a single model that solves a given business problem. However, when creating a new version of it, typically after a retrain, we may want to check that the newly trained model provides the same level of performance and stability over time. To do so, the platform supports what we call a “Champion / Challenger” approach that allows you to deploy two models at a time: a Champion that actually answers the API calls, alongside a Challenger that only makes the prediction and stores it for analytical purposes. If everything goes as planned, you will be able to “hot swap” the Champion with the Challenger, without service interruption. We are going to do this right after we select our candidate models.
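To make the idea concrete, here is a minimal sketch of the Champion / Challenger pattern in plain Python. It only illustrates the routing logic described above and says nothing about how the platform implements it. The class name, the `shadow_log` attribute and the toy models (simple functions) are hypothetical.

```python
class ChampionChallenger:
    """Route predictions to a champion while silently recording a challenger."""

    def __init__(self, champion, challenger=None):
        self.champion = champion
        self.challenger = challenger
        self.shadow_log = []  # challenger predictions kept for later analysis

    def predict(self, features):
        answer = self.champion(features)
        if self.challenger is not None:
            # the challenger predicts too, but its output is only stored
            self.shadow_log.append((features, answer, self.challenger(features)))
        return answer  # only the champion's answer reaches the caller

    def swap(self):
        """Promote the challenger; the former champion becomes the challenger."""
        if self.challenger is None:
            raise ValueError("no challenger to promote")
        self.champion, self.challenger = self.challenger, self.champion


# toy models: each one simply echoes a lagged consumption value
router = ChampionChallenger(lambda row: row["LAG_1"], lambda row: row["LAG_7"])
row = {"LAG_1": 59833, "LAG_7": 46642}
print(router.predict(row))  # answered by the champion
router.swap()
print(router.predict(row))  # answered by the former challenger
```

Because callers only ever talk to `predict`, the swap is invisible to them, which is the point of a hot swap.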

Most common criteria involved when deploying a model include:

  • Algorithm family: for regulatory reasons, we may not be allowed to deploy a blend of 10 neural networks
  • Predictive performance: it’s not a bad idea to deploy a model that has high predictive power
  • Stability: while some models may have a lower error on average, they may be more unstable than others. Checking the standard deviation of the error grouped by fold is a good first approach
  • Response time: while less critical for batch-based applications, real-time predictions are better served in “real time” (< 300 ms) than after 5 seconds
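The stability check mentioned in the list can be sketched in a few lines of standard-library Python. This is only an illustration under assumptions: the input is taken to be a list of (fold, actual, predicted) rows that you have exported yourself, and the function names are made up for the example.

```python
import math
import statistics
from collections import defaultdict


def rmse_by_fold(rows):
    """rows: iterable of (fold, actual, predicted). Returns {fold: rmse}."""
    squared_errors = defaultdict(list)
    for fold, actual, predicted in rows:
        squared_errors[fold].append((actual - predicted) ** 2)
    return {
        fold: math.sqrt(statistics.fmean(errors))
        for fold, errors in squared_errors.items()
    }


def stability_report(rows):
    """Average RMSE across folds and its standard deviation."""
    per_fold = rmse_by_fold(rows)
    scores = list(per_fold.values())
    return {
        "mean_rmse": statistics.fmean(scores),
        "std_rmse": statistics.pstdev(scores),
        "per_fold": per_fold,
    }


# hypothetical cross-validation output: (fold, actual, predicted)
rows = [(1, 10, 12), (1, 10, 8), (2, 10, 10), (2, 10, 14)]
print(stability_report(rows))
```

Two models with a similar mean RMSE can then be separated by their `std_rmse`: the lower it is, the more consistent the model is from one fold to the next.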


Let The Fun Begin!

Now that you have a firm grasp on what ML model deployment means, launch your Python code environment and follow the steps!

Step 1. Pick & Retrieve Candidates

For the sake of this tutorial, let’s pretend that we aren’t restricted on the algorithm family we can choose. We want to challenge the best performing model (lowest RMSE) with the fastest model (the one with the lowest unitary prediction time).

Now that we’ve chosen the candidate models to deploy, let’s retrieve their ids as well as detailed information about them using these two criteria (best model & fastest model), either by logging in to your instance as showcased below:

Or by simply typing the following lines of code: 

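# pio is the previsionio SDK; the client is assumed to be initialised as in the previous posts
import previsionio as pio

# retrieve the experiment version from its id
ev = pio.Supervised.from_id('619258d61253c7001c5753ed')

# retrieve both best and fastest models
best_model = ev.best_model
fastest_model = ev.fastest_model

# print their ids (assuming each model object exposes its id as `.id`)
print("the best model id is {} and the fastest model id is {}".format(best_model.id, fastest_model.id))

# print detailed information about both models including their id
print("the best model is {} and the fastest model is {}".format(best_model, fastest_model))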



Step 2. Deploy Models

As you have already guessed, the best model is going to be considered as the ‘main’ model and the fastest model as the ‘Challenger’ model. 

Now that we have both main and Challenger models from an existing experiment, we can deploy them by just typing the following line of code:

# restore the previous project
project = pio.Project.from_name(name="Electricity Forecast")

# deploy the experiment models! Note you should choose another deployment name
experiment_deployment = project.create_experiment_deployment(
    name='my_deployed_experiment',
    main_model=best_model,
    challenger_model=fastest_model,
    access_type='public'
)

The above lines of code create a new experiment deployment in the current project, using the parameters below:

  • name: experiment deployment name. Please note that this should be unique, because the platform will generate a custom URL based on it (that will host a “default” application built for you on top of your model!)
  • main_model: the retrieved main model, which is in our case the best model object.
  • challenger_model: the retrieved Challenger model, which is in our case the fastest model object. Please note that specifying a Challenger is optional. However, if one is specified, both main and Challenger models must come from the same experiment.
  • access_type: level of rights of the deployment. Can be “fine_grained” to match the user’s project rights, “private” for anyone who is logged in on the specific instance and accessing the URL, or “public” to provide access without even logging in. The access type is ‘public’ by default.

This step basically performs all the DevOps steps that are usually required to deploy a model, such as Docker packaging, documentation generation, compute provisioning and model exposition. This can take a couple of minutes, but afterwards you will have your two models deployed behind a custom URL, accessible to all your mates (including the project’s users).

Since the name of the experiment deployment is “my_deployed_experiment”, the generated application URL will be :

NB: As mentioned in the comment of the code block above, please note that you should choose your own deployment name, because we can’t deploy on the same URL more than once.


To make sure your deployment has been successful, and to access the generated default UI that allows you to make unitary predictions with the model and get explanations for them, you can either type the generated application URL or do it this way:

Default mini web application available for each deployed model (note that the displayed language depends on your locale)

Isn’t this amazing and exciting? Take your time to toy with different inputs and share your insights with us. I’ll start: predicting with a lower temperature, closer to 0, raises the estimated electricity consumption.

For Code Players, you can still make bulk predictions from your deployed models using the following lines of code:

# retrieve experiment deployment and test dataset
experiment_deployment = pio.experiment_deployment.ExperimentDeployment.from_id('619b95beae3c4c001c9a82a5')

test_dataset = pio.Dataset.from_id('618145a05f8f22001ced1ecd')

# make predictions
deployment_prediction = experiment_deployment.predict_from_dataset(test_dataset)

# retrieve prediction from main model
prediction_df = deployment_prediction.get_result()

# retrieve prediction from challenger model (if any), kept in a separate variable
challenger_prediction_df = deployment_prediction.get_challenger_result()
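Once both sets of predictions are available, a natural follow-up is to compare them against the actual values. The helper below is a hedged sketch that works on plain Python lists. It deliberately assumes nothing about the columns of the objects returned by the SDK, so you would first need to extract the actual and predicted values yourself. The function name and the numbers are hypothetical.

```python
import math


def compare_predictions(actual, main_pred, challenger_pred):
    """Compare main and challenger predictions against the actual values."""

    def rmse(pred):
        return math.sqrt(
            sum((a - p) ** 2 for a, p in zip(actual, pred)) / len(actual)
        )

    # how far apart the two models are, regardless of which one is right
    gaps = [abs(m - c) for m, c in zip(main_pred, challenger_pred)]
    return {
        "main_rmse": rmse(main_pred),
        "challenger_rmse": rmse(challenger_pred),
        "mean_abs_gap": sum(gaps) / len(gaps),
    }


# made-up values standing in for electricity consumption
actual = [100, 200, 300]
main_pred = [110, 190, 300]
challenger_pred = [120, 180, 330]
print(compare_predictions(actual, main_pred, challenger_pred))
```

A Challenger that consistently shows a lower RMSE than the main model on fresh data is a reasonable candidate for a swap.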

You can also make unitary predictions from the main model, as follows:

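# create an api key for your model
experiment_deployment.create_api_key()

# retrieve the last client id and client secret
creds = experiment_deployment.get_api_keys()[-1]

# initialize the deployed model with its url, your client id and client secret
# (the keyword names below are assumptions; check them against the Deployed Model reference manual)
model = pio.DeployedModel(
    prevision_app_url=experiment_deployment.url,
    client_id=creds['client_id'],
    client_secret=creds['client_secret'],
)

# make a prediction
# (use_confidence and explain are assumed flags, matching the three returned values)
prediction, confidence, explain = model.predict(
    predict_data={'TS': '2020-01-01T03:30:00Z', 'PUBLIC_HOLIDAY': 1, 'TEMPERATURE': 1.66622, 'LAG_1': 59833, 'LAG_7': 46642, 'fold': 7},
    use_confidence=True,
    explain=True,
)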

For the curious people out there: if you are interested in trying out more functionalities, don’t hesitate to take a look at our Experiment Deployment and Deployed Model reference manuals, try them out and share your experience with us!

After making a couple of predictions, you can go to the UI to see the Data Science monitoring, the IT monitoring, or even the embedded documentation if you want to interact with the provided API, as showcased below:

API documentation automatically generated as a Swagger page


What’s Coming Next?


We now have a deployed model that is exposed and fully monitored. In this post, we interacted with it through the provided API, using the tokens generated by the create_api_key() function. In the next post, we are going to showcase the second way of doing things: feeding “bulk” predictions via a pipeline.

Zina Rezgui

Data Scientist