Guidelines and best practices for deploying applications on Tensorkube
The `hf_transfer` library provides an efficient way to handle model transfers from the Hugging Face Hub to your deployment environment. Leveraging `hf_transfer` allows you to optimize your ML model deployments and ensure faster startup times.
We recommend downloading your model during app startup instead of baking it into your Docker image: the speedup from `hf_transfer`, combined with a smaller Docker image, easily offsets any slowdown caused by downloading the model at startup.
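As a minimal sketch of this pattern (the model ID and target directory below are placeholders, not Tensorkube defaults):

```python
import os

# Enable hf_transfer before importing huggingface_hub, since the
# environment variable is read when the library is loaded.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"

from huggingface_hub import snapshot_download

# Placeholder model and path; replace with your own.
MODEL_ID = "sentence-transformers/all-MiniLM-L6-v2"
LOCAL_DIR = "/models/all-MiniLM-L6-v2"

def download_model() -> str:
    """Fetch the model snapshot at app startup (a no-op if already cached)."""
    return snapshot_download(repo_id=MODEL_ID, local_dir=LOCAL_DIR)

if __name__ == "__main__":
    path = download_model()
    print(f"Model available at {path}")
```

Calling `download_model()` in your app's startup hook keeps the image small while still having the weights on disk before the first request is served.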
## Using hf_transfer

Using `hf_transfer` is extremely easy. All you need to do is install the `hf-transfer` Python package and set the `HF_HUB_ENABLE_HF_TRANSFER` environment variable to `1` in your deployment.
This can be achieved using the commands shown below.
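For example, in a Dockerfile-based deployment (a sketch for a pip-based image; adapt it to your own base image):

```dockerfile
# Install the Hub client along with the Rust-based transfer backend.
RUN pip install huggingface_hub hf-transfer

# Route Hugging Face Hub downloads through hf_transfer.
ENV HF_HUB_ENABLE_HF_TRANSFER=1
```

With the variable set, any `huggingface_hub` download call in your application will use `hf_transfer` automatically.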
## Running nvidia-smi

You may run into errors when trying to run the `nvidia-smi` command inside your deployment.
This happens because GPU device files typically belong to the root user and a specific group. The NVIDIA Management Library (NVML) requires specific permissions to initialize properly; without them, commands like `nvidia-smi` will fail.
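To see the same failure from Python, here is a small diagnostic sketch using the NVML bindings (an assumption: the `nvidia-ml-py` package is installed in your image; it is not provided by Tensorkube):

```python
import pynvml  # provided by the nvidia-ml-py package

def check_nvml() -> None:
    """Try to initialize NVML and list GPUs, or report why it failed."""
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError as err:
        # Typically a permissions problem: the container user cannot
        # access the /dev/nvidia* device files owned by root.
        print(f"NVML failed to initialize: {err}")
        return
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            print(pynvml.nvmlDeviceGetName(handle))
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    check_nvml()
```

If `nvmlInit()` raises an error here, `nvidia-smi` will fail in the same environment for the same reason, since both go through NVML.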