Getting Started
Deploy serverless GPU applications on your AWS account
Built with developer experience in mind, Tensorkube simplifies the process of deploying serverless GPU apps. In this guide, we will walk you through the process of setting up Tensorkube and deploying a FastAPI app on it.
Prerequisites
Before you begin, ensure you have the following:
- AWS credentials set up on your machine
- Python and pip installed on your machine
You can run the following commands to set up AWS credentials on your machine:
If you’re an IAM user:
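A minimal sketch using the standard AWS CLI, assuming it is already installed. The first command prompts interactively for your access key ID, secret access key, and default region; the second is an optional check that the credentials work:

```shell
# Prompts for access key ID, secret access key, default region, and output format.
aws configure

# Optional check: prints the account and identity the CLI is authenticated as.
aws sts get-caller-identity
```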
or you can manually export them as environment variables:
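For an IAM user, the exports would look roughly like this. The variable names are the standard ones read by the AWS CLI and SDKs; the values in angle brackets are placeholders to replace with your own:

```shell
# Long-lived IAM user credentials (placeholders, replace with your own values).
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"

# Region that Tensorkube supports out of the box.
export AWS_DEFAULT_REGION="us-east-1"
```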
If you’re an Identity Center user:
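A minimal sketch using the standard AWS CLI, assuming AWS CLI v2 is installed. The profile name my-sso-profile is a hypothetical placeholder; use the name you choose during the interactive setup:

```shell
# Interactive setup: asks for your SSO start URL, SSO region, account, and role.
aws configure sso

# Later sessions: refresh the cached SSO credentials for that profile.
# "my-sso-profile" is a placeholder profile name.
aws sso login --profile my-sso-profile
```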
or you can manually export them as environment variables:
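Identity Center issues temporary credentials, so a session token is needed in addition to the key pair. The variable names are the standard ones read by the AWS CLI and SDKs; the values in angle brackets are placeholders:

```shell
# Temporary credentials copied from the Identity Center access portal
# (placeholders, replace with your own values).
export AWS_ACCESS_KEY_ID="<your-access-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-access-key>"
export AWS_SESSION_TOKEN="<your-session-token>"

# Region that Tensorkube supports out of the box.
export AWS_DEFAULT_REGION="us-east-1"
```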
You can refer to the AWS documentation for more information: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-sso.html
Tensorkube currently only supports the us-east-1 region on AWS out of the box. If you want to use a different region, please reach out to us at [email protected].
Installation
First, install the tensorkube Python package:
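The package name tensorkube comes from the sentence above. Installing into a virtual environment is a suggestion rather than a requirement:

```shell
# Install the tensorkube package from PyPI.
pip install tensorkube
```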
Next, run the following command to log in to Tensorfuse and get your token.
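Assuming the package installs a command-line tool named tensorkube with a login subcommand (confirm the exact name with tensorkube --help for your version), the login step would look like this:

```shell
# Assumed subcommand: starts the sign-in flow and stores the token locally.
tensorkube login
```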
You just have to sign in using your Google Workspace account and Tensorfuse will automatically manage the token for you.
After this, run the following command to set up Tensorkube on your AWS account:
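Assuming the setup subcommand is named configure (an assumption; confirm it with tensorkube --help), and that your AWS credentials from the prerequisites are active in the current shell, the command would be:

```shell
# Assumed subcommand: provisions the Tensorkube infrastructure in your AWS account.
tensorkube configure
```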
This one-time setup creates a CloudFormation stack, provisions a Kubernetes cluster, and deploys the custom resources needed to enable serverless GPUs on your AWS account. It typically takes 8 to 15 minutes to complete. The command needs to be run only once per AWS account, so if your team uses Tensorfuse, only one team member has to run it.
Deploying your first Tensorkube app
Each Tensorkube deployment requires two things: your code and your environment (as a Dockerfile).
Code files
Let’s create a simple FastAPI app and deploy it on Tensorkube. Before deploying your app, ensure you have a /readiness endpoint configured in your FastAPI app.
Tensorkube uses this endpoint to check the health of your deployments. Given below is a simple FastAPI app that you can deploy:
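A minimal sketch of such an app, written to disk from the shell. The file name main.py, the root endpoint, and the response bodies are illustrative assumptions; only the /readiness path is required by the text above:

```shell
# Write a minimal FastAPI app to main.py (the file name is an assumption).
cat > main.py <<'EOF'
from fastapi import FastAPI

app = FastAPI()


@app.get("/")
def root():
    # Example application endpoint.
    return {"message": "Hello from Tensorkube"}


@app.get("/readiness")
def readiness():
    # Health check endpoint; responds with HTTP 200 once the app can serve traffic.
    return {"status": "ready"}
EOF
```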
Environment files
Add your Python dependencies to requirements.txt:
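For a FastAPI app served by uvicorn, two packages are enough. The choice of uvicorn as the server is an assumption, and the versions are left unpinned here for brevity:

```shell
# Create requirements.txt with one dependency per line.
printf '%s\n' fastapi uvicorn > requirements.txt
```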
Next, create a Dockerfile for your FastAPI app. Given below is a simple Dockerfile that you can use:
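A minimal sketch of such a Dockerfile, written to disk from the shell. The base image, the port, and the main:app module path are assumptions; adjust them to your project and to the port your Tensorkube setup expects:

```shell
# Write a minimal Dockerfile for a FastAPI app served by uvicorn.
cat > Dockerfile <<'EOF'
# Base image is an assumption; pick the Python version your app targets.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code.
COPY main.py .

# Port 80 is an assumption; use the port your deployment expects.
EXPOSE 80
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
EOF
```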
Deploying the app
This is the easiest part. Navigate to your project root and run the following command:
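Assuming the subcommand is named deploy (an assumption; confirm it with tensorkube --help), run it from the directory that contains your code and Dockerfile:

```shell
# Assumed subcommand: builds the image from the Dockerfile in the
# current directory and deploys it.
tensorkube deploy
```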
Voila! Your first deployment is ready to go. You can access your app at the URL provided in the output.
Deploying with GPUs
If you want to deploy your app with GPUs, you can specify the number of GPUs you want to use in your deployment:
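A sketch of a single-GPU deployment. The --gpu-type flag is named in this guide; the --gpus flag name and the a10g value are assumptions, so check tensorkube deploy --help for the options your version accepts:

```shell
# Request one GPU of a given type for the deployment.
# "--gpus" and "a10g" are assumed; "--gpu-type" comes from this guide.
tensorkube deploy --gpus 1 --gpu-type a10g
```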
The --gpu-type argument supports all the GPU types that are available on AWS. You can find the list of supported GPU types here.
Check the status of your deployment
You can list all your deployments using the following command:
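The exact subcommand is not shown here and may differ between versions. Assuming it follows a list deployments form (an assumption; tensorkube --help prints the subcommands your version supports), it would look like this:

```shell
# Assumed subcommand form: lists the deployments in your account.
tensorkube list deployments
```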
You can also query specific details about a particular deployment, such as the endpoint URL or the status of the deployment:
And that’s it. Your automatic GPU serverless deployment is now up and running. Enjoy!