Prerequisites
Before you begin, ensure you have configured Tensorkube on your AWS account. If you haven't done that yet, follow the Getting Started guide.

Deploying Jina with Tensorfuse
Each Tensorkube deployment requires two things: your code and your environment (as a Dockerfile). When deploying machine learning models, it is beneficial to make the model a part of your container image, as this reduces cold-start times by a significant margin. We are using the Huggingface Text Embeddings Inference (TEI) toolkit to make our models utilise the full GPU capacity. You can try any of the supported models here.

Code files
We will use an nginx server in front of our app. We will configure the /readiness endpoint to return a 200 status code. Remember that Tensorfuse uses this endpoint to check the health of your deployment. The Huggingface TEI toolkit serves embeddings at /embed, and hence we configure all other endpoints to route to the TEI toolkit, which is running on port 8000.

nginx.conf
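Below is a minimal sketch of such a config. The listen port (80) is an assumption; adjust it to whatever port your Tensorkube deployment expects traffic on.

```
# Minimal sketch: answer /readiness locally, proxy everything else to TEI.
events {}

http {
    server {
        # Port 80 is an assumption; match it to your deployment's service port.
        listen 80;

        # Tensorfuse polls this endpoint to check deployment health.
        location /readiness {
            return 200 'OK';
        }

        # Route all other traffic (including /embed) to the TEI server on port 8000.
        location / {
            proxy_pass http://127.0.0.1:8000;
        }
    }
}
```

Note that answering /readiness directly in nginx reports healthy as soon as nginx is up; baking the model into the image (next step) keeps the window before TEI is actually ready short.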
Environment files (Dockerfile)
Next, create a Dockerfile. Given below is a simple Dockerfile that you can use:

Dockerfile
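Here is a minimal sketch, assuming the official TEI GPU image and the jinaai/jina-embeddings-v2-base-en model (both are assumptions: pick the image tag that matches your GPU and swap in any supported model). It installs nginx, bakes the model weights into the image at build time, and starts nginx in front of the TEI server:

```
# Minimal sketch: official TEI GPU image (tag is an assumption) serving an
# assumed Jina model behind nginx.
FROM ghcr.io/huggingface/text-embeddings-inference:1.5

# Model to serve (assumption: swap in any model supported by TEI).
ENV MODEL_ID=jinaai/jina-embeddings-v2-base-en
# Hugging Face cache directory that TEI reads weights from.
ENV HUGGINGFACE_HUB_CACHE=/data

# Install nginx (to front TEI) and pip (to pre-download the model weights).
RUN apt-get update && \
    apt-get install -y --no-install-recommends nginx python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Bake the model weights into the image at build time to cut cold starts.
RUN pip3 install --no-cache-dir huggingface_hub && \
    huggingface-cli download $MODEL_ID

# Use the nginx config from the previous step.
COPY nginx.conf /etc/nginx/nginx.conf

EXPOSE 80

# nginx daemonizes by default; TEI then runs in the foreground on port 8000.
ENTRYPOINT ["/bin/sh", "-c", "nginx && exec text-embeddings-router --model-id $MODEL_ID --port 8000"]
```

Once the deployment is live, embedding requests are plain POSTs to /embed (for example, with a JSON body like {"inputs": "What is Deep Learning?"}), which nginx forwards to TEI.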