Finetuning
Use Axolotl-style configs to fine-tune models on your AWS account
Tensorfuse supports Axolotl-style declarative configs for fine-tuning models on your AWS account. This guide explains how to fine-tune models.
In this example, we will train a LoRA adapter for the LLaMA-3.1-8B model on a SQL dataset.
This guide is intended for users who want to perform one-off training runs and experiment with different hyperparameters. If you are looking to deploy a production-ready model, please refer to the programmatic access guide here. We only support highly tested configurations for programmatic access. If you would like us to add a new configuration, please reach out to us at [email protected].
Fine-tuning involves three steps:
- Dataset Preparation: Prepare a dataset in your S3 bucket. Ensure the dataset is accessible to the IAM user who created the TensorKube stack.
- Create a Hugging Face secret: Use your Hugging Face token, ensuring it has access to the model you want to fine-tune.
- Prepare a `config.yaml` file: Define training parameters. Refer to the Axolotl documentation for a list of supported parameters.
Using untested Axolotl configurations can lead to OOM errors and other compatibility issues. Use the log inspection commands described below to debug any issues. If you need support, contact us at [email protected]. Well-tested configurations are available in the programmatic access guide.
Dataset Preparation
We will use a SQL dataset for this guide. The dataset follows the ChatML format. Supported dataset formats can be found here. Each dataset should be in JSONL format. Below is an example datapoint:
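A minimal illustrative datapoint in the ChatML messages format (the content below is invented for illustration; the rows in your actual dataset will differ):

```json
{"messages": [{"role": "system", "content": "You are a helpful assistant that writes SQL."}, {"role": "user", "content": "List the names of all employees hired after 2020."}, {"role": "assistant", "content": "SELECT name FROM employees WHERE hire_date > '2020-12-31';"}]}
```

Each line of the JSONL file holds one such JSON object.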
Upload this dataset to your S3 bucket and note the object path. In this example it looks like this: `s3://testing-prod-123456789012/cli_sql_dataset.jsonl`.
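One way to upload the file is with the AWS CLI, assuming your local file is named cli_sql_dataset.jsonl and your credentials belong to the IAM user that created the TensorKube stack:

```bash
# Upload the JSONL dataset to the bucket used in this guide
aws s3 cp cli_sql_dataset.jsonl s3://testing-prod-123456789012/cli_sql_dataset.jsonl
```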
Create a Hugging Face secret
Use your Hugging Face token to create a secret. Ensure the token provides access to the model you wish to fine-tune.
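A sketch of this step is below. The subcommand and key name shown (`tensorkube create secret`, `HUGGING_FACE_HUB_TOKEN`) are assumptions for illustration, not confirmed CLI syntax, so check the Tensorkube CLI help for the exact form:

```bash
# Hypothetical invocation: subcommand and key name are assumptions
tensorkube create secret hugging-face-secret HUGGING_FACE_HUB_TOKEN=hf_xxxxxxxxxxxxxxxx
```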
Verify that the secret was created successfully by running `tensorkube list secrets`.
Prepare a config.yaml file
Here is an example configuration file for fine-tuning a LoRA adapter on the LLaMA-3.1-8B model. The following configuration file is tested to work on a single A10G GPU. If you want to run it on different hardware, please experiment with the `micro_batch_size` and `gradient_accumulation_steps` values.
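The sketch below shows the general shape such a file can take. It uses standard Axolotl fields, but the specific values (model ID, LoRA rank, dataset path, batch sizes) are illustrative assumptions, not the tested configuration referenced above:

```yaml
# Illustrative Axolotl config: a sketch, not the exact tested file
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct  # assumed model ID
load_in_4bit: true              # 4-bit base weights help an 8B model fit on a 24 GB A10G
adapter: lora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj

datasets:
  - path: cli_sql_dataset.jsonl # assumed local path to the dataset
    type: chat_template         # for ChatML-style "messages" records

sequence_len: 2048
sample_packing: true
micro_batch_size: 1             # tune together with gradient_accumulation_steps
gradient_accumulation_steps: 4  # effective batch = micro_batch_size x this
num_epochs: 1
learning_rate: 0.0002
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
bf16: auto
gradient_checkpointing: true
flash_attention: true
output_dir: ./outputs/lora-out
```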
Finetuning
Use the `tensorkube train create` command to initiate the fine-tuning process; it supports options for pointing the job at your config file and dataset. To start fine-tuning, run a command of the following form:
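This is a sketch of the invocation; the flag names here (`--config-path`, `--dataset-url`, `--job-id`) are assumptions for illustration and may not match the actual CLI options, so verify them before running:

```bash
# Hypothetical flags: check the Tensorkube CLI help for the real option names
tensorkube train create \
  --config-path ./config.yaml \
  --dataset-url s3://testing-prod-123456789012/cli_sql_dataset.jsonl \
  --job-id llama-sql-demo
```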
This command initiates fine-tuning on a single A10G GPU. Monitor the logs to ensure the training runs successfully.
Checking status
You can run `tensorkube train list` to check the status of the training job.
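For example, to list all training jobs along with their current status:

```bash
tensorkube train list
```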
Checking logs
You can also check the logs of a training job by running `tensorkube train logs --job-id <JOB_ID>`. For example:
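Substitute your own job ID; the ID below is the hypothetical one used in the create step above:

```bash
tensorkube train logs --job-id llama-sql-demo
```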
Checking the adapter
Look for the `tensorkube-train-bucket-<unique_id>` bucket in your S3 console. All your trained LoRA adapters will reside here. The adapter ID is constructed from your `job-id`. Your adapter URLs will look like this:
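An illustrative location (the bucket suffix and job ID are hypothetical, and any intermediate prefix inside the bucket may differ; the reliable part is the final `ax-{job-id}` segment):

```text
s3://tensorkube-train-bucket-a1b2c3/ax-llama-sql-demo
```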
The last portion of your URL is essentially `ax-{job-id}`. You can download the adapter from the S3 console and use it for inference, or serve it with LoRAX.
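As a sketch of the LoRAX route: assuming a LoRAX server is already serving the LLaMA-3.1-8B base model on localhost:8080 and has S3 credentials for the training bucket, you can request generations with the adapter applied (the adapter path is the hypothetical one from above):

```bash
curl http://localhost:8080/generate \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": "List the names of all employees hired after 2020.",
        "parameters": {
          "adapter_id": "s3://tensorkube-train-bucket-a1b2c3/ax-llama-sql-demo",
          "adapter_source": "s3",
          "max_new_tokens": 128
        }
      }'
```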