Learn how to deploy your containerized applications as serverless, auto-scaling API endpoints on Tensorfuse.
Every Deployment starts from two files:

- **`Dockerfile`**: A Dockerfile defines your application's environment. It specifies the base image, system dependencies, Python packages, and the command needed to start your service. Tensorfuse uses this file to build a container image that is identical for development and production. (A minimal sketch follows this list.)
- **`deployment.yaml`**: This YAML file defines the infrastructure and runtime settings for your Deployment. Here, you specify the required resources (like GPU type and count), scaling parameters, secrets to inject, and health check endpoints.
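
As a concrete illustration, here is a minimal Dockerfile sketch for a Python GPU service. The base image, package list, and start command are placeholders for your own stack, not a Tensorfuse requirement:

```dockerfile
# Minimal sketch -- swap in your own base image, dependencies, and app.
FROM nvidia/cuda:12.1.1-runtime-ubuntu22.04

# System dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Python packages
COPY requirements.txt .
RUN pip3 install --no-cache-dir -r requirements.txt

# Application code and the command that starts the service
COPY . .
CMD ["python3", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
```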
When you run the `tensorkube deploy` command, Tensorfuse performs the following steps automatically:

1. Builds your `Dockerfile` into a container image.
2. Deploys that image with the resources and settings defined in your `deployment.yaml` file.
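
For a quick experiment, you can also pass settings directly on the command line. A sketch, assuming `--gpus` and `--gpu-type` flags (verify the exact names with `tensorkube deploy --help`):

```bash
# One-off test deploy driven entirely by CLI flags (flag names assumed).
tensorkube deploy --gpus 1 --gpu-type a10g
```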
While CLI flags are useful for quick tests, we strongly recommend using a `deployment.yaml` file for production workloads. This allows you to version-control your infrastructure configuration alongside your code, following a GitOps approach.
To deploy with your config file, run the command below. The `--config-file` flag name is taken from the Tensorfuse CLI; if your version differs, `tensorkube deploy --help` lists the supported options.
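
```bash
# Deploy using the settings in deployment.yaml
# (--config-file assumed from Tensorfuse examples; check --help).
tensorkube deploy --config-file ./deployment.yaml
```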
The `deployment.yaml` file sketched below specifies the required GPUs, attaches secrets, and defines a readiness probe. For a full list of available configuration options, refer to the Deployment Configuration Reference.
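
A minimal sketch, assuming key names (`gpus`, `gpu_type`, `secret`, `readiness`) seen in typical Tensorfuse examples; treat every value as a placeholder and confirm the schema against the reference above:

```yaml
# Sketch of a deployment.yaml -- key names assumed, values are placeholders.
gpus: 1                 # number of GPUs per replica
gpu_type: a10g          # GPU model to request
secret:
  - huggingface-secret  # name of a secret previously created with the CLI
min_scale: 0            # scale to zero when idle
max_scale: 3            # upper bound for autoscaling
readiness:
  httpGet:
    path: /readiness    # must match an endpoint your server exposes
    port: 80
```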
Without a `readiness` endpoint, Tensorfuse will not know when your container is truly ready, which can lead to failed requests. Always include a `readiness` block in your `deployment.yaml` to ensure your deployments are robust and reliable.
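
On the application side, the readiness endpoint only needs to return a 200 once the service can take traffic. A minimal sketch using FastAPI (the framework and the `/readiness` path are illustrative assumptions; match the path to whatever your `readiness` block probes):

```python
# Sketch of an app-side readiness endpoint; FastAPI is an illustrative
# choice, not something Tensorfuse mandates.
from fastapi import FastAPI, Response

app = FastAPI()
model = None  # stand-in for a model that is slow to load


@app.on_event("startup")
def load_model():
    # Load weights once at startup; readiness fails until this completes.
    global model
    model = object()  # replace with real model loading


@app.get("/readiness")
def readiness(response: Response):
    # Return 200 only when the model is loaded, so the probe gates traffic.
    if model is None:
        response.status_code = 503
        return {"status": "loading"}
    return {"status": "ready"}
```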