Challenge 5: Make it work and make it scale
Introduction
Having a model is only the first step. To use it, the model has to be deployed to an endpoint. Vertex AI Endpoints provide a managed service for serving predictions.
Description
Create a new Vertex AI Endpoint and deploy the freshly trained model. Use the smallest instance size but make sure that it can scale to more than 1 instance.
The deployment of the model will take ~10 minutes to complete.
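The challenge can be completed entirely from the console, but for reference the same deployment can be sketched with the Vertex AI Python SDK (`google-cloud-aiplatform`). This is a minimal sketch under assumptions: the project ID, region, model ID, display name, and replica counts are hypothetical placeholders, and `n1-standard-2` is only assumed to be the smallest machine type available in your environment.

```python
# Sketch: create an endpoint and deploy a model so that it can scale out.
# All constants below are hypothetical placeholders, not values from the challenge.
PROJECT_ID = "my-project"
REGION = "us-central1"
MODEL_ID = "1234567890"


def deploy_kwargs(machine_type="n1-standard-2", min_replicas=1, max_replicas=3):
    """Build arguments for Model.deploy(); max_replicas > min_replicas allows scale-out."""
    if max_replicas <= min_replicas:
        raise ValueError("max_replicas must exceed min_replicas for autoscaling")
    return {
        "machine_type": machine_type,
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
        "traffic_percentage": 100,  # send all traffic to the newly deployed model
    }


def deploy_model():
    """Create a new endpoint and deploy the model to it (blocks until finished)."""
    # Imported lazily so deploy_kwargs() can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=PROJECT_ID, location=REGION)
    endpoint = aiplatform.Endpoint.create(display_name="challenge5-endpoint")
    model = aiplatform.Model(model_name=MODEL_ID)
    model.deploy(endpoint=endpoint, **deploy_kwargs())
    return endpoint
```

Calling `deploy_model()` requires the SDK to be installed and authenticated credentials for the project. The key setting is `max_replica_count` being larger than `min_replica_count`, since that is what allows the endpoint to add instances under load.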
Note that the Qwiklab environment we're using has a quota on endpoint throughput of 30K requests per minute. **Do not exceed it.**
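To make the quota concrete, the per-minute limit can be converted into a sustained request rate and a total request budget for a load test. The sketch below is plain arithmetic; the 80% safety margin and the 5-minute test duration are assumptions chosen for illustration, not values from the challenge.

```python
QUOTA_PER_MINUTE = 30_000  # endpoint throughput quota stated in the challenge


def max_requests_per_second(quota_per_minute=QUOTA_PER_MINUTE, safety_percent=80):
    """Sustained request rate that stays a margin below the per-minute quota."""
    if not 0 < safety_percent <= 100:
        raise ValueError("safety_percent must be in (0, 100]")
    return quota_per_minute * safety_percent / (100 * 60)


def request_budget(duration_seconds, quota_per_minute=QUOTA_PER_MINUTE, safety_percent=80):
    """Total number of requests to spread evenly over a test of the given duration."""
    if not 0 < safety_percent <= 100:
        raise ValueError("safety_percent must be in (0, 100]")
    return quota_per_minute * safety_percent * duration_seconds // (100 * 60)


print(max_requests_per_second(safety_percent=100))  # hard ceiling: 500.0
print(max_requests_per_second())                    # with a 20% margin: 400.0
print(request_budget(300))                          # 5-minute test: 120000
```

In other words, the quota corresponds to at most 500 requests per second on average, so a load test should be sized comfortably below that.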
Success Criteria
- The model has been deployed to an endpoint and can serve requests
- Show that the Endpoint has scaled to more than 1 instance under load
- No code change is needed for this challenge
Tips
- To generate load you can use any tool you want, but the easiest approach is to install ApacheBench (`ab`) on Cloud Shell or in your notebook environment.
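As one possible way to follow this tip, the helper below assembles an `ab` command line for the endpoint's REST `:predict` URL. It is a sketch under assumptions: the project, region, endpoint ID, token, and payload file name are hypothetical placeholders, the payload file is assumed to contain a JSON body of the form `{"instances": [...]}` matching your model's input, and the guard that caps a single run at the per-minute quota is a deliberately conservative choice rather than a requirement of the challenge. On Debian-based images, `ab` is typically provided by the `apache2-utils` package.

```python
import shlex


def predict_url(project, region, endpoint_id):
    """REST URL for online predictions on a Vertex AI endpoint."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )


def ab_command(url, token, payload_file, total_requests, concurrency,
               per_minute_limit=30_000):
    """Build an ApacheBench command that POSTs the payload file to the endpoint."""
    if total_requests > per_minute_limit:
        # Conservative guard: a single run can then never exceed the quota.
        raise ValueError("total_requests exceeds the per-minute quota")
    return [
        "ab",
        "-n", str(total_requests),         # total number of requests
        "-c", str(concurrency),            # concurrent requests
        "-p", payload_file,                # file containing the POST body
        "-T", "application/json",          # Content-Type of the POST body
        "-H", f"Authorization: Bearer {token}",
        url,
    ]


# Hypothetical values; the token would come from `gcloud auth print-access-token`.
cmd = ab_command(
    predict_url("my-project", "us-central1", "1234567890"),
    token="ACCESS_TOKEN",
    payload_file="request.json",
    total_requests=20_000,
    concurrency=20,
)
print(shlex.join(cmd))
```

The printed command can be pasted into Cloud Shell after substituting real values. Since `ab` sends requests as fast as the concurrency level allows, keeping `-n` below the quota is the simplest way to stay within the limit for a single run.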
Learning Resources