MLOps #02: 7 things you need to learn about Continuous Training & Continuous Deployment

Learn how to leverage a meaningful Feedback Loop in MLOps pipelines

Rihab Feki
5 min read · Feb 22, 2023
Photo by Sigmund on Unsplash

In this series about MLOps, each blog post covers one step of the MLOps pipeline and takes a deep dive into it. Today, you will learn about continuous training and continuous deployment 🚀

If you want to learn more about MLOps best practices, then you are in the right place! In this article, I will cover the following points:

- MLOps life-cycle

- Continuous Training & its challenges

- Continuous Deployment

- Why set up a Feedback Loop?

- Monitoring & the importance of setting alerts

MLOps life-cycle

I like to separate the MLOps life-cycle into two phases: Training & Serving. The Feedback Loop is the joining point between these two phases.

MLOps life-cycle (image by the author)

The training phase consists of two major building blocks:

  • Experimentation/Development: Comprises data processing, model training, training pipeline building, and deployment.
  • Continuous Training (CT): Comprises incorporating production data into the training dataset and re-training the model with it.

The serving phase consists of two major building blocks:

  • Continuous Deployment (CD): Comprises building the prediction service and continuously deploying the model produced by Continuous Training to the target environment.
  • Continuous Monitoring: Comprises monitoring of the models that are deployed to ensure they perform as expected.

In my previous article, the MLOps levels and life-cycles were presented in detail, so feel free to check it out here👇

Continuous Training

Continuous Training is a process of machine learning operations that automatically and continuously retrains machine learning models to adapt to changes in the data.

  • Over time, model performance may decay due to data drift, concept drift, etc.
  • Re-training on data captured from production keeps the model up to date with the data it actually sees.

Challenges of Continuous Training

  • If it is not supervised, it can produce a new model whose predictions are poorer than the original model's. Supervised means keeping a human in the loop, because it is important to ensure the quality and accuracy of the data the model is continuously trained on.
  • It can involve unplanned infrastructure costs, especially if you perform hyperparameter tuning.
  • Continuous Training can be expensive and resource-intensive, so it is recommended to introduce it with caution, depending on the available budget and computing resources.
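One way to guard against the first challenge is to compare the re-trained model with the one currently in production before promoting it. The snippet below is a minimal sketch of such a gate, not a definitive implementation: the `EvalResult` class, the `should_promote` function, the choice of F1 as the metric, and the `min_gain` margin are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Metrics of one model on a shared, human-verified holdout set (hypothetical)."""
    name: str
    f1: float


def should_promote(champion: EvalResult, challenger: EvalResult,
                   min_gain: float = 0.01) -> bool:
    """Promote the re-trained model only if it beats the current one by a margin.

    min_gain is an assumed threshold; pick it from your own business needs.
    """
    return challenger.f1 >= champion.f1 + min_gain


current = EvalResult("model-v1", f1=0.84)
retrained = EvalResult("model-v2", f1=0.81)

# The re-trained model is worse here, so it should not replace the current one.
print(should_promote(current, retrained))  # False
```

Both models should be evaluated on the same reviewed holdout data, otherwise the comparison says little about which one is actually better.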

Continuous Deployment & its different modes

Continuous Deployment is a process of machine learning operations that automatically and continuously deploys re-trained models resulting from CT into production.

There are four modes of Continuous Deployment:

  • Automatic deployment
  • Manual deployment
  • Canary deployment
  • Shadow deployment
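To make the canary and shadow modes more concrete, here is a small sketch of how requests could be routed in each case. It assumes models are plain Python callables; the function names, the `canary_share` parameter, and the two toy models are hypothetical and only serve the illustration.

```python
import random
from typing import Callable, List, Optional, Tuple

Model = Callable[[float], float]


def canary_predict(x: float, stable: Model, candidate: Model,
                   canary_share: float = 0.1,
                   rng: Optional[random.Random] = None) -> Tuple[str, float]:
    """Canary: serve a small share of requests from the candidate model."""
    source = rng if rng is not None else random
    if source.random() < canary_share:
        return "candidate", candidate(x)
    return "stable", stable(x)


def shadow_predict(x: float, stable: Model, candidate: Model,
                   shadow_log: List[Tuple[float, float, float]]) -> float:
    """Shadow: always answer with the stable model, but record what the
    candidate would have predicted so both can be compared offline."""
    served = stable(x)
    shadow_log.append((x, served, candidate(x)))
    return served


def stable_model(x: float) -> float:      # toy stand-in for the production model
    return 2.0 * x


def candidate_model(x: float) -> float:   # toy stand-in for the re-trained model
    return 2.1 * x


log: List[Tuple[float, float, float]] = []
print(shadow_predict(3.0, stable_model, candidate_model, log))  # 6.0
print(len(log))  # 1
print(canary_predict(3.0, stable_model, candidate_model, rng=random.Random(0)))
```

In canary mode a fraction of users really receive the new model's predictions, while in shadow mode nobody does, which makes shadow mode the safer way to observe a new model on live traffic.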

Automated MLOps pipelines

Now that the different building blocks of MLOps pipelines have been introduced, it is time to see the bigger picture and how to connect the pieces into an automated MLOps pipeline, as shown in the figure below.

Automated MLOps pipelines (image by the author)

The joining piece of the loop which has not been covered yet is the Feedback Loop, so let’s see how it works👇

The Feedback Loop

A meaningful Feedback Loop is key to successful Continuous Training and, as a result, to better-performing models. For that reason, it is very important to be selective about the quality of the production data that is fed into CT.

Why do you need a Feedback Loop?

  • The data in production can be different from the training data, which means your model’s performance might decay with time.
  • To prevent that, it is recommended to enrich the training dataset with data that comes from production, to keep models up to date.

When to set up the Feedback Loop & how does it work?

  • After inference requests, predictions must be human-reviewed.
  • After the review, the predictions are submitted to the Feedback Loop and merged with the initial training dataset.
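The two steps above can be sketched as follows. This is a simplified illustration under my own assumptions: the `Prediction` class, its `reviewed_label` field, and the `merge_feedback` function are hypothetical names, and a real pipeline would read from and write to a data store rather than Python lists.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Example = Tuple[Tuple[float, ...], str]  # (features, label)


@dataclass
class Prediction:
    features: Tuple[float, ...]
    predicted_label: str
    # Filled in by a human reviewer; None means "not reviewed yet".
    reviewed_label: Optional[str] = None


def merge_feedback(training_set: List[Example],
                   predictions: List[Prediction]) -> List[Example]:
    """Return a new training set extended with human-reviewed predictions only.

    The reviewer's label is used, not the model's, so model mistakes are not
    fed back into training.
    """
    reviewed = [(p.features, p.reviewed_label)
                for p in predictions if p.reviewed_label is not None]
    return training_set + reviewed


initial = [((0.1, 0.2), "cat")]
served = [
    Prediction((0.3, 0.4), "dog", reviewed_label="cat"),  # reviewer corrected the label
    Prediction((0.5, 0.6), "dog"),                        # not reviewed yet, so skipped
]
updated = merge_feedback(initial, served)
print(len(updated))  # 2
```

Skipping unreviewed predictions is the design choice that keeps the loop selective about data quality.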

But the Feedback Loop does not make sense without monitoring!

Continuous Monitoring

“It is not enough to monitor MODELS’ performance, DATA monitoring also matters”

How do you ensure your models continue performing at their best? By planning for and incorporating a monitoring system that supports the iteration and improvement of your models.

As a matter of fact, data changes over time, so it is almost inevitable that the training data becomes outdated and that your models in production start to suffer from data they have never seen. Some of the consequences of that are:

Image by the author
  • CT & CD are data-centric. So if the right metrics are set for monitoring, model performance decay will be detected early, along with data drift and concept drift.
  • The monitoring is human-reviewed.
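As an example of data monitoring, the sketch below compares the distribution of one numeric feature in the training data with the same feature in production, using the Population Stability Index (PSI). PSI is my choice for illustration, not a metric prescribed above, and the bin count, the `1e-6` floor for empty bins, and the `0.2` alert threshold (a common rule of thumb) are assumptions to tune for your own data.

```python
import math
from typing import List, Sequence

PSI_ALERT = 0.2  # assumed rule-of-thumb threshold for a significant shift


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index between a reference and a production sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range go into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


train = [i / 100 for i in range(100)]        # feature values seen in training
prod = [0.5 + i / 100 for i in range(100)]   # same feature, shifted in production

print(psi(train, train))             # 0.0
print(psi(train, prod) > PSI_ALERT)  # True
```

A check like this can run on a schedule per feature, with a human reviewing any alert before new production data is used for re-training.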

The importance of setting the right alerts

Setting alerts helps to catch performance decay early. An alert is also a good signal that it is time to push a new model to production to recover from the deficiency observed during monitoring.

There are no universally right or wrong metrics to set alerts on. The choice depends on the metrics that define the success of the prediction service and preserve its business value (e.g., latency in tasks that demand a real-time response, accuracy, precision, etc.).

Source: https://ml-ops.org/content/mlops-principles

“Model monitoring can be implemented by tracking the precision, recall, and F1-score of the model prediction along with the time. The decrease of the precision, recall, and F1-score triggers the model retraining, which leads to model recovery.”
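The idea in this quote can be sketched in a few lines. The snippet computes precision, recall, and F1 for a window of reviewed predictions and flags re-training when F1 falls too far below a baseline. The function names and the `max_drop` threshold are hypothetical, and binary labels with `1` as the positive class are assumed.

```python
from typing import Sequence, Tuple


def precision_recall_f1(y_true: Sequence[int], y_pred: Sequence[int],
                        positive: int = 1) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def needs_retraining(baseline_f1: float, window_f1: float,
                     max_drop: float = 0.05) -> bool:
    """Raise the alert when F1 dropped by more than max_drop (assumed threshold)."""
    return baseline_f1 - window_f1 > max_drop


# Human-reviewed labels vs. model predictions for the latest time window.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)                    # 0.5 0.5 0.5
print(needs_retraining(0.90, f1))               # True
```

Computing these metrics requires ground-truth labels, which is one more reason the human-reviewed predictions from the Feedback Loop are valuable.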

Conclusion

Overall, the best way to get better models is to enhance the quality of the data by leveraging a meaningful Feedback Loop. This is where monitoring comes into play: it both shows you how to enhance the quality of your datasets and warns you about what is still missing to reach a satisfying level of model performance.

Now that we have been through the most important aspects of MLOps, you should be aware of the practices that help your model's performance improve over time.

Thanks for reading 🙏 until next time 👋
