Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a Natural Language Processing (NLP) technique to enhance and localize the use of generative based Large Language Models (LLMs) by including specific external content in the LLM's context window.

RAG is often employed to improve the accuracy and reliability of LLMs by better contextualizing the generative abilities of these models.

Key features of RAG:

First proposed in a 2020 paper¹
Provides a means to ground LLM output
Allows LLMs to act as a narrative interface for document corpora or data repositories
Enables LLMs to cite specific documents or data, reducing the chance of incorrect answers

Embeddings and Vector Databases

While the RAG technique doesn't require the use of embeddings or vector databases, adding an information retrieval component to your RAG system can improve overall performance of compound AI systems. To create a vector database, first the documents or data is converted into a mathematical representation as a text embedding, or set of numeric vectors, that are used for matching and creating relationships.

These text embeddings are then stored in a database or datastore and then can be queried by first converting a query into a text embedding and then finding the stored embeddings that most closely match the query. The resulting embeddings are then converted back to textual documents and then sent as a part of the context window to the LLM.

Application to FOLIO

Creating a vector database of FOLIO JSON documents offers a number of advantages to AI workflows; both in the generative aspects of creating new documents like invoices or inventory records from text or automatic prompts, and minimize errors in the LLMs output.

Workshop Exercise

Download the following zip file and extract to a local directory named folio-docs
Open gpt4all
Click on the LocalDocs button on left-hand side
In the upper-right corner select the green button, + Add Collection
Name the new collection and browse to the folio-docs folder you created in step 1
Create collection
Create a new Chat and select the collection to query the model using RAG

WOLFcon 2024 - Understanding and Using AI Workflows with FOLIO

23 September 2024

Retrieval Augmented Generation (RAG)

Key features of RAG:

Embeddings and Vector Databases

Application to FOLIO

Workshop Exercise

Navigation