performed 16 times faster and costs 32 times less than Vertex AI, for a better predictive performance.

When talking about AI management platform capabilities, one that is often overlooked is automated Machine Learning, aka AutoML. However, it is worth considering for fast prototyping, or when you don’t want to search for optimal algorithms and hyperparameters by hand or do all the feature engineering yourself. Unfortunately, not all AutoML offerings are equal in terms of performance or transparency. In fact, they are often described as black boxes.


In this article, we are going to show that our platform brings more transparency, explainability and predictive power to the models found by its AutoML than what you can achieve with Vertex AI.


To support this argument, we are going to work with binary classification datasets of limited size. That means you can replicate this experiment at a very low cost, and even for free if you register on the offering. By the way, if you don’t have an account yet, go here. I highly recommend taking the full version in GCP (which comes with free credits) to replicate this tutorial.

Data set overview

We are going to predict whether a prospective customer will accept a marketing offer (or not!). You can download the train set and the validation set. Both data sets come from an original, fully labelled data set that has been split randomly into 80% (train) and 20% (test).
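A random 80/20 split like the one described above can be sketched in a few lines. This is only an illustration, not the procedure actually used to build the provided files: the function name `split_rows`, the seed and the toy rows are assumptions.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible (the seed is arbitrary).
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy data standing in for the labelled rows of the original data set.
rows = list(range(100))
train, test = split_rows(rows)
print(len(train), len(test))  # 80 20
```

Every row lands in exactly one of the two sets, so the test set stays unseen during training.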

Quick overview of the train set, generated for you by the platform after the dataset import

This train set has around 68,000 rows, 106 features and 1 binary target (~ 20% of positive values) → we want to create a binary classification model.

Distribution of the TARGET feature in the train set, as displayed by the platform

Note: Vertex AI is not well suited to handle data sets with missing columns or “.” in column names. The provided data sets have been reworked to make sure they work on both platforms.
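If you want to apply the same kind of rework to your own data, one simple approach is to replace the dots in column names before import. The sketch below is an assumption about how this could be done, not the exact transformation applied to the provided files; the helper name and the column `AGE.GROUP` are hypothetical.

```python
def sanitize_column_names(names):
    """Replace '.' with '_' in column names and refuse ambiguous results."""
    cleaned = [name.replace(".", "_") for name in names]
    # Two different original names may collapse into the same cleaned name.
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("renaming produced duplicate column names")
    return cleaned


# "AGE.GROUP" is a made-up column used for illustration only.
print(sanitize_column_names(["S5", "AGE.GROUP", "TARGET"]))
# ['S5', 'AGE_GROUP', 'TARGET']
```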

Vertex AI AutoML

Now that we have an understanding of the dataset, we want to actually create Machine Learning models based on it. Let’s do this in Vertex AI.


After importing the data, you can train a binary classification model using Vertex AI AutoML.

Select AutoML

Select target column name: “TARGET”

Vertex AI does not provide a cross-validation strategy, so we’ll stick with the random assignment option.

On the next screen, for comparison purposes, make sure to check AUC as the metric to optimize:

Select AUC (default is log loss)

Finally, set a training budget of a couple of hours. In my case, to avoid spending thousands of dollars on this experiment, I asked for a maximum of 3 compute hours with early stopping enabled. That means that if Vertex AI stops improving its performance, it will stop by itself before reaching 3 compute hours.


After launching the training phase, we have the following level of information:

Data preprocessing stage

So we are now at the “Data preprocessing” stage, but we have no clue what’s next, so we will have to wait.

In fact, we waited more than 4 hours to get the final results. That is odd, since we asked for only 3 compute hours. Even though the training took longer than configured, at least we are not billed for the additional time.

4 hours later…

Vertex AI only shows performance metrics of the AutoML model that has been trained. It would have been nice to have more information about the model itself such as applied feature transformations, model technology, hyperparameters and feature importance. In regulated areas it is even more important to have a complete “identity card” of the model. 

Some analytics on the model behavior

“AutoML” type of model: neither the technology nor the hyperparameters are communicated to the user. Note that the elapsed time exceeds the budget

Predictive power

As our platform evaluates estimators with cross-validation and Vertex AI with an 80/10/10 split, we chose to measure the real predictive power on the test dataset to have a fair comparison.
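To make the difference between the two validation schemes concrete: with a single 80/10/10 split, each row is used for evaluation at most once, in one fixed slice, whereas with k-fold cross-validation every row is used for validation exactly once and for training in the other folds. The sketch below only generates the fold indices; the number of folds (5) is an assumption, as the article does not state how many folds are used.

```python
def k_fold_indices(n_rows, k=5):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds.

    Rows are assumed to be shuffled beforehand.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_rows))
        yield train, valid
        start += size


for train_idx, valid_idx in k_fold_indices(10, k=3):
    print(len(train_idx), len(valid_idx))
```

Averaging the metric over the folds gives the cross-validated estimate, which is then compared with the score on the untouched test set.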


To compute real predictive power, a batch prediction has been requested:

Batch prediction achieved in 24 minutes…

The batch prediction took 24 minutes for only 17K rows. While this is really long, I also tested predictions on 1,000 rows and it still took 23 minutes, so there seems to be a significant overhead when the job spawns.

Also, predictions are generated in multiple .CSV files. If you want to run a simple analysis, you’ll need to download them all, concatenate them, and then create your analytics.
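Once the files are downloaded, the concatenation step can be done with the standard library alone. This is a minimal sketch, assuming every file shares the same header row; the function name and the file pattern in the usage note are hypothetical and do not reflect the actual naming of the Vertex AI output files.

```python
import csv
import glob


def concatenate_csv(pattern, output_path):
    """Merge all CSV files matching `pattern` into one file with a single header.

    Returns the number of data rows written.
    """
    paths = sorted(glob.glob(pattern))
    header = None
    rows_written = 0
    with open(output_path, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"unexpected header in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `concatenate_csv("downloads/predictions_*.csv", "merged/predictions.csv")` would then produce one file; write the output outside the matched pattern so that it is not picked up on a later run.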

Why can’t we have a simple .CSV?
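With the predictions merged, the AUC on the test set can be computed from the true labels and the predicted scores. In a notebook you would typically call scikit-learn's `roc_auc_score`; the dependency-free sketch below shows what that number means, using the rank-based formulation (the probability that a randomly chosen positive is scored above a randomly chosen negative, ties counting for one half). The function name and the toy values are illustrative, not taken from the experiment.

```python
def roc_auc(labels, scores):
    """Compute the ROC AUC from binary labels (0/1) and predicted scores."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    # Sum the ranks of the positives, giving tied scores their average rank.
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j

    # Mann-Whitney U statistic normalized by the number of (pos, neg) pairs.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```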

The true performance on this test set is 73.7% AUC (computed in my notebook). In the Vertex AI UI, model performance was estimated at 88.5%. This discrepancy is due to the validation method used, so please be careful about model estimation with Vertex AI.

AutoML

Remember, if you don’t have an account yet, go here and select the full version in GCP to replicate this tutorial.


After creating a project and importing the data (already done earlier), you need to create an “experiment” that uses the platform’s AutoML engine. Locate the “experiment” link in the left sidebar, then configure your experiment. Make sure to give it a name and select the AutoML engine and tabular classification before creating it:

Configuration of the experiment

Then, create the first version of your experiment. For this blog post, only one version will be created.


The minimal configuration should include the data we use (classif_80) for training, the data for validation in order to make sure we don’t overfit (classif_20) and the name of the target, as seen here:

Minimal configuration of the experiment’s version.

If you want to go further, we offer lots of options that can be reached from the top nav bar, such as model selection, feature engineering selection, etc.

Examples of models you can add or reject in your experiment

Examples of feature engineering you want to add or reject in your experiment

That being said, don’t spend too much time here, the goal is to compare the analytics and the overall predictive power (even if we have a lot of great features when creating our experiments 😎).


Your job is now done, and our engine will execute the graph that was displayed in the right part of the previous screen. That may look like nothing, but you know what’s going to happen and which step you are in. For instance, after a couple of minutes of training, the execution graph will look like this:

Execution diagram of our experiment, you’ll know what task will be done by the engine

Data preparation is done, some simple models have been trained, and now the models’ hyperparameters are being optimized.


After some time, the training and the experiment are complete. The output presents 11 trained models using different technologies (XGBoost, logistic regression or decision trees):

Finished experiment, with models trained

This process took 15 minutes to complete (remember, it took 4 hours using Vertex AI):

15 minutes of experiment

By clicking on a bar representing a model, you can access the complete analytics to understand the model and how it behaves as you can see below:

Model information and its hyperparameters, as optimized by the AutoML

Feature importance: S5 is the feature containing the most information

Threshold probability, cost matrix & density chart

Confusion matrix & score table

Lift analytics, per bin or cumulated alongside a ROC curve

More analytics are available and will differ depending on the type of modeling you are working on (regression, multiclass classification, images, time series…)


Also, for advanced and custom analytics, you can retrieve the cross validation from the model by clicking the appropriate button. Or why not use our notebooks to create your own analytics, which can be retrieved through our APIs with the R SDK or the Python SDK?

Predictive power

As our platform evaluates estimators with cross-validation and Vertex AI with an 80/10/10 split, we chose to compare the real predictive power on the test dataset.

Estimation (cross-validation): ~75.5% AUC
Real performance on holdout: ~76.0% AUC (73.7% on Vertex AI)


Clearly, no overfitting here and done in 8 seconds 😉

Performance automatically computed
No need to download multiple files and compute AUC in a notebook…


As we have seen in this blog post, our platform is more robust and offers you more ways to understand model characteristics and performance, thanks to analytics that are automatically generated.


Also, the experiment took only 15 minutes on our platform vs 4 hours on Vertex AI. The Vertex AI model achieved a lower predictive power on the holdout, despite its estimated performance being higher, which indicates a potential overfitting issue.


In summary:

The comparison was done on December 1st, 2021, with the generally available commercial versions of both Vertex AI and our platform, using Google Cloud.

Florian Laroumagne

About the author

Florian Laroumagne

Senior Data Scientist & Co-founder

An engineer in computer science and applied mathematics, Florian specialized in the field of Business Intelligence, then Data Science. He is certified by ENSAE and obtained 2nd place in the Best Data Scientist of France competition in 2018. Florian is the co-founder of a startup specializing in the automation of Machine Learning. He is now leading several applied predictive modeling projects for various clients.