We will store the authentication token (FLUX_API_KEY) as a Tensorfuse secret.
Prerequisites
Before you begin, ensure you have configured Tensorfuse on your AWS account. If you haven't done that yet, follow the Getting Started guide.

Deploying FLUX.1-dev with Tensorfuse
Each Tensorkube deployment requires:
- Your environment (as a Dockerfile).
- Your code (in this example, the models directory).
- A deployment configuration (deployment.yaml).
Step 1: Prepare the Dockerfile
We will use the official NVIDIA Triton Server image as our base image. It comes with all the dependencies needed to run the model, and the image tag can be found in the NVIDIA container catalog. On top of the base image, we will install a couple of Python packages, set additional environment variables, and copy the models directory into the Docker image.

Dockerfile
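A minimal sketch of such a Dockerfile, assuming a recent tritonserver tag from the NVIDIA container catalog and the diffusers/transformers stack for FLUX.1-dev; adjust the tag and package list to your setup:

```dockerfile
# Base image: official NVIDIA Triton Server (the tag is an assumption;
# pick a current one from the NVIDIA container catalog).
FROM nvcr.io/nvidia/tritonserver:24.08-py3

# Python packages needed to run FLUX.1-dev (assumed set; trim as needed).
RUN pip install --no-cache-dir torch diffusers transformers accelerate sentencepiece protobuf

# Additional environment variables, e.g. a cache location for model weights.
ENV HF_HOME=/root/.cache/huggingface

# Copy the Triton model repository into the image.
COPY models /models

# Serve the model repository with Triton.
ENTRYPOINT ["tritonserver", "--model-repository=/models"]
```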
Step 2: Prepare the models directory
We will use the Python backend for Triton Server to serve the model. Create a models directory and add the model.py and config.pbtxt files to it. For more details about the Triton Python backend, refer to the Triton docs.

models/flux/1/model.py
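Below is a sketch of model.py using the Triton Python backend API (a TritonPythonModel class with initialize and execute). The tensor names prompt and generated_image, and the use of diffusers' FluxPipeline, are assumptions that must stay consistent with config.pbtxt:

```python
import base64
import io

import numpy as np
import torch
import triton_python_backend_utils as pb_utils
from diffusers import FluxPipeline


class TritonPythonModel:
    def initialize(self, args):
        # Load FLUX.1-dev once per model instance. bfloat16 keeps the
        # memory footprint manageable on a single GPU.
        self.pipe = FluxPipeline.from_pretrained(
            "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
        ).to("cuda")

    def execute(self, requests):
        responses = []
        for request in requests:
            # "prompt" must match the input name declared in config.pbtxt.
            prompt = (
                pb_utils.get_input_tensor_by_name(request, "prompt")
                .as_numpy()[0]
                .decode("utf-8")
            )

            image = self.pipe(prompt).images[0]

            # Serialize the PIL image to PNG and base64-encode it so the
            # bytes survive Triton's JSON response intact.
            buf = io.BytesIO()
            image.save(buf, format="PNG")
            encoded = base64.b64encode(buf.getvalue())

            out = pb_utils.Tensor(
                "generated_image", np.array([encoded], dtype=np.object_)
            )
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```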
models/flux/config.pbtxt
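A matching config.pbtxt sketch; the model name, tensor names, and dims mirror the assumptions made in model.py above:

```protobuf
name: "flux"
backend: "python"
max_batch_size: 0

input [
  {
    name: "prompt"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]

output [
  {
    name: "generated_image"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]

instance_group [
  {
    count: 1
    kind: KIND_GPU
  }
]
```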
Step 3: Create Secrets
We will create a secret to store the authentication token, which we will use to authenticate inference requests.
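A sketch of the secret-creation step; the exact tensorkube secret syntax is an assumption here, so check `tensorkube secret create --help` for the flags your CLI version supports:

```bash
# Store the auth token as a Tensorfuse secret (syntax assumed).
tensorkube secret create flux-api-key FLUX_API_KEY=<your-token>
```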
Step 4: Deployment config
Although you can deploy Tensorfuse apps from the command line, we recommend using a config file so that you can follow a GitOps approach to deployment.

deployment.yaml
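An illustrative deployment.yaml; every key below is an assumption about the Tensorkube config schema, so confirm the field names against the Tensorfuse docs before deploying:

```yaml
# All field names below are assumptions; verify against the Tensorfuse docs.
gpus: 1
gpu_type: a100       # FLUX.1-dev needs a large GPU
min_scale: 0         # scale to zero when idle
max_scale: 3         # autoscale ceiling
port: 8000           # Triton's default HTTP port
secret:
  - flux-api-key     # injects FLUX_API_KEY into the container
```

You would then deploy with something like `tensorkube deploy --config-file deployment.yaml` (flag name assumed).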
Step 5: Accessing the deployed app
Voila! Your autoscaling, production-ready text-to-image service using FLUX.1-dev is live. Once the deployment is successful, you can check the status of your app by running:
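One way to do that, assuming the CLI exposes a deployment listing command (verify against `tensorkube --help`):

```bash
# Assumed command; lists deployments and their status.
tensorkube deployment list
```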
Remember to configure a TLS endpoint with a custom domain before going to production.

To test the endpoint, replace the DEPLOYMENT_URL in the code and set the FLUX_API_KEY as an environment variable before running the client.py file.
client.py
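A minimal client.py sketch that calls Triton's KServe v2 HTTP endpoint. The model name flux, the tensor names, and the Bearer-token auth scheme are assumptions carried over from the earlier sketches; how the key is actually enforced depends on your setup:

```python
import base64
import os

import requests

DEPLOYMENT_URL = "https://<your-deployment-url>"  # replace with your app's URL
API_KEY = os.environ["FLUX_API_KEY"]

# Triton KServe v2 inference payload; names match config.pbtxt above.
payload = {
    "inputs": [
        {
            "name": "prompt",
            "shape": [1],
            "datatype": "BYTES",
            "data": ["A serene lake at sunrise, photorealistic"],
        }
    ]
}

resp = requests.post(
    f"{DEPLOYMENT_URL}/v2/models/flux/infer",
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()

# The model returns the PNG base64-encoded; decode and save it.
image_b64 = resp.json()["outputs"][0]["data"][0]
with open("output.png", "wb") as f:
    f.write(base64.b64decode(image_b64))
print("Saved generated image to output.png")
```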