Challenge 5: Make it work and make it scale

Introduction

Having a model is only the first step. To use it, the model has to be deployed to an endpoint. Vertex AI Endpoints provide a managed service for serving predictions.

Description

Create a new Vertex AI Endpoint and deploy the freshly trained model. Use the smallest instance size but make sure that it can scale to more than 1 instance.
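The challenge can be completed from the console, but for reference, here is a minimal sketch of the same deployment using the Vertex AI Python SDK (`google-cloud-aiplatform`). The machine type `n1-standard-2`, the replica counts, and the function and argument names are illustrative assumptions, not values prescribed by this challenge; check which machine types your environment offers.

```python
from typing import Any, Dict


def build_deploy_kwargs(machine_type: str = "n1-standard-2",  # assumed small machine type
                        min_replicas: int = 1,
                        max_replicas: int = 3) -> Dict[str, Any]:
    """Build the autoscaling-related arguments for Model.deploy()."""
    if min_replicas < 1:
        raise ValueError("at least one replica is required")
    if max_replicas <= min_replicas:
        # The endpoint can only scale out if the maximum exceeds the minimum.
        raise ValueError("max_replicas must be greater than min_replicas")
    return {
        "machine_type": machine_type,
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
        "traffic_percentage": 100,
    }


def deploy_model(project: str, location: str, model_name: str,
                 endpoint_display_name: str):
    """Create an endpoint and deploy an existing model to it (hypothetical helper)."""
    # Imported lazily so build_deploy_kwargs() works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    endpoint = aiplatform.Endpoint.create(display_name=endpoint_display_name)
    model = aiplatform.Model(model_name=model_name)
    # Blocks until the deployment finishes.
    model.deploy(endpoint=endpoint, **build_deploy_kwargs())
    return endpoint
```

Setting `max_replica_count` above `min_replica_count` is what allows the endpoint to add instances under load; with equal values it stays at a fixed size.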

Deploying the model takes about 10 minutes. Note that the Qwiklab environment we're using has a quota on endpoint throughput (30K requests per minute); **do not exceed it**.
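When generating load against the endpoint, it helps to pace requests explicitly so the quota is never reached. The following is a rough sketch of a single-threaded paced sender. `send` is a hypothetical callable that you would implement to issue one prediction request, and the target rate is an assumption you choose yourself, well below the quota.

```python
import time
from typing import Callable

QUOTA_PER_MINUTE = 30_000  # endpoint throughput quota stated in this challenge


def request_interval(target_per_minute: int,
                     quota_per_minute: int = QUOTA_PER_MINUTE) -> float:
    """Return the pause in seconds between requests for the target rate."""
    if not 0 < target_per_minute < quota_per_minute:
        raise ValueError("target rate must be positive and below the quota")
    return 60.0 / target_per_minute


def run_load(send: Callable[[], None],
             total_requests: int,
             target_per_minute: int,
             sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() total_requests times, pausing after every request."""
    interval = request_interval(target_per_minute)
    for _ in range(total_requests):
        send()
        # A fixed pause after each call keeps the rate at or below the target,
        # even if send() itself returns quickly.
        sleep(interval)
    return total_requests
```

If you run several such senders in parallel, their combined target rates must still stay below the quota.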

Success Criteria

  1. The model has been deployed to an endpoint and can serve requests
  2. Show that the Endpoint has scaled to more than 1 instance under load
  3. No code change is needed for this challenge

Tips

Learning Resources