As machine learning (ML) workloads and large language models (LLMs) become more prevalent, the need for GPU-based environments has grown significantly. Traditionally, developers could test their containerized applications on local machines, which are typically CPU-based. However, ML workloads often require GPUs for efficient training and inference, making local testing difficult or impossible.

Without dev containers, the typical workflow involves:

  1. Launching a cloud instance (e.g., an EC2 instance).
  2. Copying your code to the instance.
  3. Installing dependencies, Docker, and the NVIDIA container runtime.
  4. Starting the container manually.
  5. Repeating these steps whenever you make changes to your code.

This process is not only time-consuming but also inefficient because changes to the code are not reflected in real-time. Developers must restart containers to see their updates, which slows down iteration.
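
As a rough sketch, assuming an Ubuntu-based EC2 instance and placeholder names (instance address, image tag, and paths are illustrative), the manual workflow looks something like this:

# Copy your code to the instance
rsync -av ./ ubuntu@<instance-ip>:~/app

# On the instance: install Docker and the NVIDIA container runtime
# (the toolkit requires NVIDIA's apt repository to be configured first)
sudo apt-get update && sudo apt-get install -y docker.io nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker && sudo systemctl restart docker

# Build and run the container manually
cd ~/app
sudo docker build -t my-app .
sudo docker run --gpus all -p 80:80 my-app

Every code change means repeating the sync, rebuild, and run steps, which is exactly the loop dev containers remove.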

Dev containers solve this problem by starting a GPU-enabled instance in the cloud and running your code in a container with hot-reloading enabled. Any changes you make to your code are instantly reflected in the running container, making testing and development much faster and more efficient.

How dev containers work

Dev containers take as input the GPU type, your project files, and a Dockerfile. They then start an instance with the specified GPU type and start a container built from your Dockerfile. The container runs a hot-reloading server that watches for changes in your project files. When a change is detected, the server syncs the updated code into the running container so your changes take effect immediately.

Getting started with dev containers

To get started with dev containers, you need to have the Tensorfuse CLI installed on your machine. You can install the CLI using the following command:

pip install --upgrade pip
pip install --upgrade tensorkube
tensorkube login

Configuration for AWS

You can run the following command to set up AWS credentials on your machine:

aws configure

or you can manually export them as environment variables:

export AWS_ACCESS_KEY_ID=your_access_key_id
export AWS_SECRET_ACCESS_KEY=your_secret_access_key
export AWS_DEFAULT_REGION=your_default_region
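
Either way, you can verify that the credentials are being picked up by running a standard AWS CLI call:

aws sts get-caller-identity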

Configuration for Lambda

You can run the following command to set up Lambda credentials on your machine:

tensorkube reset --cloud lambda

This will prompt you to enter your Lambda API key, which you can get from the Lambda console.

Running the dev container

Let’s walk through the steps to start a dev container. You will need a Dockerfile and your project files in a directory. You can use the example below as a reference, where we test a simple Python server that returns the GPU details of the container.

Step 1: Get your project files ready

Create a directory with your project files and a Dockerfile. For this example, we will create a simple FastAPI server that returns the GPU details of the container and a Dockerfile to define its environment. You can create a .dockerignore file to exclude unnecessary files from the container, as shown below.
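
For example, a minimal .dockerignore for a project like this might look like the following (adjust the entries to your own project):

.dockerignore
__pycache__/
*.pyc
.git/
.venv/
.env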

main.py
from fastapi import FastAPI
import GPUtil
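# GPUtil reads GPU information from nvidia-smi, so this only returns data when run on a GPU-backed instance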

app = FastAPI()

@app.get("/gpus")
def get_gpus():
    gpus = GPUtil.getGPUs()
    gpu_info = [
        {
            'id': gpu.id,
            'name': gpu.name,
            'load': f"{gpu.load * 100}%",
            'memory_free': f"{gpu.memoryFree}MB",
            'memory_used': f"{gpu.memoryUsed}MB",
            'memory_total': f"{gpu.memoryTotal}MB",
            'temperature': f"{gpu.temperature} °C",
            'uuid': gpu.uuid
        }
        for gpu in gpus
    ]
    return gpu_info
Dockerfile
# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . .

# Install FastAPI, Uvicorn, and GPUtil
RUN pip install --no-cache-dir fastapi uvicorn gputil

# Always expose port 80 as Tensorkube forwards this port by default
EXPOSE 80

# Command to run the FastAPI app with Uvicorn
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]
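
Although you need a GPU instance to exercise the /gpus endpoint, you can still sanity-check that the image builds on your local machine (gpu-info is just an illustrative tag):

docker build -t gpu-info .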

Step 2: Start the dev container

Navigate to the directory containing your project files and run the following command:

tensorkube dev --cloud aws start --gpu-type v100 --port 8080

Arguments

  • --cloud aws: The cloud provider where the dev container will be started. Currently, only aws and lambda are supported.
  • --gpu-type V100: The type of GPU to use for the dev container. You can choose from H100, A100, V100, L40S, A10G, L4, and T4. Specify None if you don’t need a GPU.
  • --port 8080: The port on which you can access the dev container from your local machine. 8080 is the default port.

This command will:

  1. Automatically start an instance with the specified GPU type (V100 in this case) in the cloud.
  2. Create necessary VPCs, security groups, and key pairs for secure access.
  3. Sync your local codebase from this folder to the instance.
  4. Start a Docker container with hot-reloading enabled and forward it to your local port (8080 in this case).
  5. Once started, any changes made locally will be reflected instantly in the running container without needing to restart it.
  6. You can access this container using curl http://localhost:8080/gpus.
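
If you wanted to run the same dev container on Lambda with an A100 instead, the command would look like this (assuming the same flag syntax):

tensorkube dev --cloud lambda start --gpu-type a100 --port 8080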

You can see the logs from your container in your terminal and access the container server at http://localhost:<port>. Try making changes to your main.py file and see them reflected in real time in the running container.
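
For example, while the dev container is running you could add a new endpoint to main.py:

@app.get("/health")
def health():
    # A trivial endpoint added while the dev container is running
    return {"status": "ok"}

Save the file and the change is synced into the running container; curl http://localhost:8080/health should respond without rebuilding or restarting anything.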

If you stop the process by pressing Ctrl+C, the dev container will be stopped but the instance will continue to run. To avoid incurring charges, make sure to stop or delete the dev container when you’re done testing.

Step 3: List all the running dev containers

You can list all the running dev containers using the following command:

tensorkube dev --cloud aws list

Step 4: Stop and restart the dev container

When you’re done testing but want to retain the state of your dev container for later use:

tensorkube dev --cloud aws stop

This pauses the running dev container and stops the instance but keeps its state intact so it can be resumed later without losing progress. Your Docker build will be cached, so you don’t have to rebuild the container from scratch.

To restart the dev container:

tensorkube dev --cloud aws start --gpu-type v100 --port 8080

Step 5: Delete the dev container

When you’re done with the dev container and want to delete it:

tensorkube dev --cloud aws delete

This command purges all resources associated with the dev container from your cloud account (including instances).

Conclusion

Dev containers provide an efficient way to test ML workloads that require GPUs by using cloud-based instances with hot-reloading. Tensorkube enables you to test your GPU code in real-time without the need to restart containers, making development faster and more efficient.