WOLFcon 2024 - Understanding and Using AI Workflows with FOLIO

23 September 2024


Training Large Language Models (LLMs)

One method for customizing and providing better context when using LLMs is to train the model on your data. There are a number of methods for fine-tuning LLMs, including low-rank adaptation (LoRA) [1], which allows you to fine-tune a small subset of a model's many parameters without needing a large number of GPUs.

OpenAI allows you to train one of their models through their API and custom GPTs. Google Gemini offers fine-tuning through the Gemini API, as does Anthropic, as explained in their documentation.

Training LLaMA models locally on a personal computer is possible depending on the resources of your local computer. Unfortunately, the process isn't easy or straightforward and requires running Python code. If you have an Apple computer, the mlx-lm package, which uses the Apple Silicon GPU, can be used for fine-tuning open-source models like LLaMA. Another possibility is HuggingFace's peft package, which, along with HuggingFace's transformers, can be used to fine-tune models.
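As a minimal sketch of the peft approach, the following wraps a base model with a LoRA adapter so that only the small adapter matrices are trained. The model name and the LoRA hyperparameters (rank, alpha, target modules) are illustrative assumptions; adjust them for your model and hardware:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load a base causal language model (model name is an example)
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Configure LoRA: train small low-rank adapter matrices instead of
# updating all of the model's parameters
lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the total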

Steps for Fine-tuning

OpenAI provides directions for fine-tuning [2] a ChatGPT model that are general enough for most fine-tuning tasks:

  1. Prepare and upload training data
  2. Train a new fine-tuned model
  3. Evaluate results and go back to step 1 if needed
  4. Use your fine-tuned model

Create Inventory Instance Training Set

The training data comes from the FOLIO community's Quesnelia Bugfest instance. The training set consists of the denormalized Instance records along with a text prompt.

Depending on the model and service, you may need to reformat the training set to match the expected inputs to the model.

For OpenAI, the expected format is JSON objects, saved one per line in the JSON-L format, that look like the following for this training set:

{"messages": [{"role": "system", "content": "As an expert cataloger, you will create the FOLIO inventory JSON record."}, 
              {"role": "user", "content": "For Vacancies in certain judgeships. Published in 1935 by [publisher not identified], Washington Subjects are Courts, Judges, United States--Officials and employees, Electronic books."},
              {"role": "assistant", "content": "{'title': 'Vacancies in certain judgeships.', 'contributors': [], 'subjects': [{'value': 'Courts'}, {'value': 'Judges'}, {'value': 'United States--Officials and employees'}, {'value': 'Electronic books'}], 'classifications': [], 'publication': [{'publisher': '[publisher not identified]', 'place': 'Washington', 'dateOfPublication': '1935', 'role': 'Publication'}], 'instanceTypeText': 'text', 'modeOfIssuanceText': 'single unit'}}"}]}

For training using mlx-lm, the data format is simpler than the OpenAI format but still uses JSON-L:

{"text": "As an expert cataloger, you will create the FOLIO inventory JSON record. Q:For Statistical applications for chemistry, manufacturing and controls (CMC) in the pharmaceutical industry /Richard K. Burdick [and others]. Published in 2017 by Springer, Cham A:{'title': 'Statistical applications for chemistry, manufacturing and controls (CMC) in the pharmaceutical industry /Richard K. Burdick [and others].', 'contributors': [], 'publication': [{'publisher': 'Springer', 'place': 'Cham :', 'dateOfPublication': '2017', 'role': None}], 'instanceTypeText': 'text', 'modeOfIssuanceText': 'Monograph'}"}

Training

To train a ChatGPT model after creating a training set, follow these steps:

  1. Upload a training file to the Files API
  2. Create a fine-tuning job either through the OpenAI user interface or through the Python or Node.js SDKs (see the sketch after this list)
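Here is a minimal sketch of both steps with the OpenAI Python SDK; the model name is an assumption, so check OpenAI's documentation for which models are currently fine-tunable:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: upload the JSON-L training file to the Files API
training_file = client.files.create(
    file=open("training-set.jsonl", "rb"),
    purpose="fine-tune",
)

# Step 2: create the fine-tuning job against the uploaded file
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # assumed model; see OpenAI's docs
)
print(job.id, job.status)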

There are different methods for fine-tuning local LLaMA models. With Apple's Python mlx-lm package, follow the steps in its LoRA documentation.

On other platforms, assuming you have a supported GPU, unsloth is a good option; you can use unsloth on public cloud providers as well. HuggingFace's peft is another good choice and includes the loralib package by the authors of the original LoRA paper [3]. Both need to be used with PyTorch to fine-tune models.

Using a GPT4ALL Model

Once you have a fine-tuned model and have installed GPT4ALL, you can use that model
on a Mac by copying the fine-tuned model to this location:

/Users/{user-name}/Library/Application Support/nomic.ai/GPT4All/

Running the Model in Inference Mode
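One way to run inference is through the gpt4all Python package, which can load a local model file. A minimal sketch, assuming the fine-tuned model was exported to GGUF format and copied to the directory above (file name is a placeholder):

from gpt4all import GPT4All

# Load the fine-tuned model from the GPT4All models directory
# (file name and {user-name} path are assumptions for this example)
model = GPT4All(
    model_name="folio-instance-model.gguf",
    model_path="/Users/{user-name}/Library/Application Support/nomic.ai/GPT4All/",
    allow_download=False,
)

with model.chat_session():
    reply = model.generate(
        "For Vacancies in certain judgeships. Published in 1935 by "
        "[publisher not identified], Washington",
        max_tokens=512,
    )
    print(reply)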