Fine-tune LoRA adapters for popular models using axolotl-styled declarative configs.
| Model | GPU Requirements |
| --- | --- |
| Llama 3.1 70B | 4x L40S (recommended) |
| Llama 3.1 8B | 1-2x A10G |
Training jobs in Tensorfuse run in the `keda` environment, so the secrets below must be created in that environment:

### Access to Llama 3.1

Llama 3.1 is a gated model, so request access to it on its Hugging Face model page before you start.

### Set your Hugging Face token

Generate a WRITE token from your Hugging Face profile and store it as a secret in Tensorfuse using the command below. Make sure the secret key is named `HUGGING_FACE_HUB_TOKEN`, as Tensorfuse assumes the same. If you do not intend to push the trained adapters to the Hugging Face Hub, a READ token is sufficient.
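A minimal sketch of the secret-creation command, assuming a `tensorkube secret create <name> KEY=VALUE` syntax; check `tensorkube secret --help` for the exact form on your CLI version:

```bash
# Store the Hugging Face token in the keda environment (where training jobs run).
# The secret key must be HUGGING_FACE_HUB_TOKEN; the secret name is an example.
tensorkube secret create hugging-face-secret HUGGING_FACE_HUB_TOKEN=hf_XXXXXXXX --env keda
```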
### Set your W&B authentication token

If you want to track training metrics in Weights & Biases, store your W&B authentication token as a Tensorfuse secret as well.
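A similar hedged sketch for the W&B secret; the key name `WANDB_API_KEY` is an assumption (it is the variable the `wandb` client conventionally reads), not confirmed by this guide:

```bash
# Store the W&B token in the keda environment as well.
# WANDB_API_KEY is assumed; verify the expected key name for your setup.
tensorkube secret create wandb-secret WANDB_API_KEY=XXXXXXXX --env keda
```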
By default, the trained adapter weights are stored in `float32` format. To store the adapter weights in `bfloat16` format instead, set the `store_weights_as_bf16` flag to `True`.
You can check the status of a fine-tuning job using the `get_job_status` function. The function returns the status of the job as `QUEUED`, `PROCESSING`, `COMPLETED`, or `FAILED`.
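A small polling loop around `get_job_status`; the `job_id` argument, the import path, and the assumption that the status comes back as a plain string are mine, but the four status values are the ones listed above:

```python
import time

from tensorkube import get_job_status  # import path is an assumption

def wait_for_job(job_id: str, poll_seconds: int = 30) -> str:
    """Poll until the fine-tuning job leaves the QUEUED/PROCESSING states."""
    while True:
        status = get_job_status(job_id)  # QUEUED, PROCESSING, COMPLETED, or FAILED
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(poll_seconds)
```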
Trained adapters are stored in an S3 bucket named `tensorkube-keda-train-bucket`; all your trained LoRA adapters will reside here. We construct the adapter ID from your `job-id` and the type of GPUs used for training, so your adapter URLs look like this:

`s3://<bucket-name>/lora-adapter/<job_name>/<job_id>`
For example, a job with job-name `fine-tuning-job` and job-id `unique_id`, trained on `4` GPUs of type `l40s`, would produce an adapter at `s3://tensorkube-keda-train-bucket/lora-adapter/fine-tuning-job/unique_id`.
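To inspect or download that adapter, you can use the standard AWS CLI against the path above, assuming your credentials can read the training bucket:

```bash
# List the adapter artifacts for the example job.
aws s3 ls s3://tensorkube-keda-train-bucket/lora-adapter/fine-tuning-job/unique_id/

# Download the adapter locally.
aws s3 cp --recursive \
  s3://tensorkube-keda-train-bucket/lora-adapter/fine-tuning-job/unique_id/ ./adapter/
```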
To push the trained adapters to the Hugging Face Hub, ensure that you have stored a WRITE token as the `HUGGING_FACE_HUB_TOKEN` secret, and pass your organization ID via the `hf_org_id` parameter in the `create_fine_tuning_job` function. Adapters are uploaded in the `{HF_ORG_ID}/{job_name}_{job_id}` format. So for the above example, the adapter would get uploaded to `{ORG_ID_HERE}/fine-tuning-job_unique_id`. The repo is private by default.
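Since the repo is private, downloading it later requires a valid token. A sketch using the real `huggingface_hub` API, where the repo ID follows the `{HF_ORG_ID}/{job_name}_{job_id}` format described above and `your-org` is a placeholder for your organization ID:

```python
from huggingface_hub import snapshot_download

# Download the private adapter repo produced by the example job.
local_dir = snapshot_download(
    repo_id="your-org/fine-tuning-job_unique_id",
    token="hf_XXXXXXXX",  # the same token stored earlier as a secret
)
print(local_dir)
```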
For deployments, secrets must be created in the `default` environment, so pass the `--env default` flag in the secret creation command. Once your deployment is live, you can find its URL with `tensorkube deployment list` and send a test request with `curl`. This will query the base model without any adapters.
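A hedged example of the test request, assuming the deployment exposes an OpenAI-compatible completions endpoint (common for vLLM-based LoRA servers); the URL, route, and model name are placeholders, not confirmed by this guide:

```bash
# Query the deployed base model (no adapter) via an assumed OpenAI-compatible route.
curl -X POST https://<your-deployment-url>/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta-llama/Llama-3.1-8B",
        "prompt": "Hello, world",
        "max_tokens": 32
      }'
```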