Prerequisites
Before you begin, ensure you have configured Tensorkube on your AWS account. If you haven't done that yet, follow the Getting Started guide.

Deploying ResNet18 on Tensorfuse
Each Tensorkube deployment requires two things: your code and your environment (as a Dockerfile). When deploying machine learning models, it is beneficial to make the model part of your container image as well, since this reduces cold-start times by a significant margin. To enable this, in addition to a FastAPI app and a Dockerfile, we will also write code to download the model and place it in our image.

Download the model
We will write a small script that downloads the ResNet-18 model from the Hugging Face model hub and saves it in the ./models/resnet-18 directory.
download.py
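The original script is not reproduced here, but a minimal sketch of such a download step might look like the following. It assumes the microsoft/resnet-18 checkpoint on the Hugging Face hub and the transformers library; both are assumptions, not something this guide specifies.

```python
# download.py - fetch ResNet-18 from the Hugging Face hub and save it locally.
# Sketch only: assumes the `transformers` library and the
# "microsoft/resnet-18" checkpoint (both assumptions).
from transformers import AutoImageProcessor, ResNetForImageClassification

MODEL_ID = "microsoft/resnet-18"
SAVE_DIR = "./models/resnet-18"

# Download the model weights and the matching image preprocessor.
model = ResNetForImageClassification.from_pretrained(MODEL_ID)
processor = AutoImageProcessor.from_pretrained(MODEL_ID)

# Persist both so the container can load them without network access.
model.save_pretrained(SAVE_DIR)
processor.save_pretrained(SAVE_DIR)
```

Running this script during the image build bakes the weights into the image, which is what cuts the cold-start time.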
Code files
We will write a small FastAPI app that loads the model and serves predictions. The FastAPI app will have three endpoints: /readiness, /gpu_check, and /predict/. Remember that the /readiness endpoint is very important, as it is used by Tensorkube to check the health of your deployments.
app.py
Environment files (Dockerfile)
Next, create a Dockerfile for your FastAPI app. Remember that Tensorkube assumes that your server runs on port 80. Given below is a simple Dockerfile that you can use:

Dockerfile
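The Dockerfile contents are not shown here; a minimal sketch along these lines would satisfy the constraints described above. The base image, the requirements.txt file, and the uvicorn server are assumptions, not requirements stated by this guide.

```dockerfile
# Sketch of a Dockerfile for the FastAPI app (assumed base image and deps).
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Bake the model weights into the image to cut cold-start times.
COPY download.py .
RUN python download.py

COPY app.py .

# Tensorkube expects the server on port 80.
EXPOSE 80
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]
```

Copying download.py and running it before copying app.py keeps the heavyweight model-download layer cached even when the application code changes.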