AI · LlamaIndex · ChatGPT

How I "trained" ChatGPT to my own documents in 10 mins

By Johannes Hayer

Imagine being able to seamlessly connect your personal data with a powerful language model such as OpenAI's GPT and query it programmatically. This is precisely what LlamaIndex offers. By leveraging LlamaIndex, developers can bypass time-consuming fine-tuning and excessive prompting. It achieves this with so-called Retrieval-Augmented Generation (RAG).


This is what LlamaIndex can do for you

LlamaIndex serves as a comprehensive "data framework" designed specifically for building applications on top of large language models (LLMs). It offers a range of essential tools to support your development process, including:

  1. Data Connectors: LlamaIndex provides connectors that enable you to seamlessly integrate your existing data sources and formats, such as APIs, PDFs, documents, and SQL databases.
  2. Data Structuring: With LlamaIndex, you can easily structure your data using documents and nodes. A Document is a generic container around any data source - for instance, a PDF. A Node represents a “chunk” of a source Document, whether that is a text chunk, an image, or something else.
  3. Advanced Retrieval and Query Interface: LlamaIndex offers a powerful retrieval and query interface that enhances your LLM interactions. Simply input your desired prompt, and LlamaIndex retrieves relevant context and augments the output with enriched knowledge. A minimal sketch of all three pieces in action follows right after this list.
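
To make these building blocks concrete, here is a minimal sketch of the load-index-query loop. It assumes a folder called ./resources with a few documents in it and an OPENAI_API_KEY in your environment; the folder name and the question are just examples, while the API calls are the same ones we use in the full app below.

from llama_index import SimpleDirectoryReader, VectorStoreIndex

# 1. Data connector: read every supported file in the folder
documents = SimpleDirectoryReader("./resources").load_data()

# 2. Data structuring: split the documents into nodes and build a vector index
index = VectorStoreIndex.from_documents(documents)

# 3. Retrieval and query interface: ask a question against your own data
query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about firmware updates?"))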

Now, let's take a look at an example of how to use LlamaIndex to chat with our own documents.

Implementation - Let’s go!

First things first, let's install all the dependencies we need for this project.

Create a new folder called bot. In it, create a requirements.txt file with this content:

streamlit
openai
llama-index==0.8.4
nltk
pypdf
python-dotenv

Now install all of these dependencies with:

pip3 install -r requirements.txt

Let's create a file called chatbot.py.

We will import all relevant libraries from llama_index, OpenAI, and Streamlit. Then we will define some global variables, such as where the documents are stored, the model we want to use, and so on.

Make sure that you have created a .env file and placed your OpenAI API key inside it.

OPENAI_API_KEY=YOUR_KEY

import os

import openai
import streamlit as st
from dotenv import load_dotenv
from llama_index import SimpleDirectoryReader, VectorStoreIndex, ServiceContext, Document
from llama_index.llms import OpenAI

load_dotenv()

# Constants
TEMPERATURE = 0
PATH_TO_DOCS = "./resources"  # folder containing the documents to index - create it and place your files there
MODEL = "gpt-3.5-turbo"

# Global variables
chat_engine = None
llm_predictor = None

@st.cache_resource(show_spinner=False)
def construct_index():
    documents = SimpleDirectoryReader(
        input_dir=PATH_TO_DOCS, recursive=True
    ).load_data()
    service_context = ServiceContext.from_defaults(llm=OpenAI(
        model=MODEL, temperature=TEMPERATURE))
    index = VectorStoreIndex.from_documents(
        documents, service_context=service_context)
    return index

def handle_user_input():
    global chat_engine
    # Prompt for user input and save to chat history
    if prompt := st.chat_input("Your question"):
        st.session_state.messages.append({"role": "user", "content": prompt})

        # Display the prior chat messages
        for message in st.session_state.messages:
            with st.chat_message(message["role"]):
                st.write(message["content"])

        # If last message is not from assistant, generate a new response
        if st.session_state.messages[-1]["role"] != "assistant":
            with st.chat_message("assistant"):
                with st.spinner("Searching..."):
                    response = chat_engine.chat(prompt)
                    st.write(response.response)
                    message = {"role": "assistant",
                               "content": response.response}
                    # Add response to message history
                    st.session_state.messages.append(message)

def init_openai():
    openai.api_key = os.getenv("OPENAI_API_KEY")

def build_ui():
    global chat_engine
    st.set_page_config(
        page_title="Your personal cloud assistant",
        page_icon=":clouds:",
    )

    # Initialize the chat messages history
    if "messages" not in st.session_state.keys():
        st.session_state.messages = [
            {
                "role": "assistant",
                "content": "Hi there! I'm your personal cloud assistant. How can I help you?",
            }
        ]

def init_chat_engine():
    global chat_engine
    index = construct_index()
    chat_engine = index.as_chat_engine(chat_mode="context")

def main():
    init_openai()
    init_chat_engine()
    build_ui()
    handle_user_input()

if __name__ == "__main__":
    main()

What's going on here? Let's break it down

Data ingestion with Data Loaders

@st.cache_resource(show_spinner=False)
def construct_index():
    documents = SimpleDirectoryReader(
        input_dir=PATH_TO_DOCS, recursive=True
    ).load_data()
    service_context = ServiceContext.from_defaults(llm=OpenAI(
        model=MODEL, temperature=TEMPERATURE))
    index = VectorStoreIndex.from_documents(
        documents, service_context=service_context)
    return index

  1. To ingest the entire folder of desired data, LlamaIndex provides the SimpleDirectoryReader, which selects the appropriate file reader based on the file extensions. We create a new instance of this reader, where we provide the directory path and set the recursive lookup option. Then, we call the load_data() function to retrieve the documents.

  2. With the retrieved documents, we can construct an instance of the ServiceContext, which is a collection of resources used during a RAG pipeline's indexing and querying stages. ServiceContext allows us to adjust settings such as the LLM and embedding model. Here, we pass the global variables defined at the top as parameters for easier modification and better overview.

  3. With the ServiceContext object, we can finally create the index using VectorStoreIndex.from_documents(…), which structures the data in a way that helps the model quickly retrieve context from the data.

    Finally, we return the index in order to create the chat engine.
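
Two things worth knowing about this step, shown in the hedged sketch below: the ServiceContext also accepts tuning options such as chunk_size, and the finished index can be persisted to disk so you don't re-embed every document on each cold start. The persist_dir path and the chunk_size value are my own example choices, not part of the original app; the constants are the same ones defined at the top of chatbot.py. If you keep this version inside the Streamlit app, keep the @st.cache_resource decorator on it as before.

import os

from llama_index import (
    ServiceContext,
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.llms import OpenAI

# Same constants as in chatbot.py
TEMPERATURE = 0
PATH_TO_DOCS = "./resources"
MODEL = "gpt-3.5-turbo"

PERSIST_DIR = "./storage"  # example location for the saved index

def construct_index():
    service_context = ServiceContext.from_defaults(
        llm=OpenAI(model=MODEL, temperature=TEMPERATURE),
        chunk_size=512,  # example value: smaller chunks mean more granular retrieval
    )
    if os.path.exists(PERSIST_DIR):
        # Reload the previously built index instead of re-embedding everything
        storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
        return load_index_from_storage(storage_context, service_context=service_context)

    documents = SimpleDirectoryReader(
        input_dir=PATH_TO_DOCS, recursive=True
    ).load_data()
    index = VectorStoreIndex.from_documents(
        documents, service_context=service_context)
    index.storage_context.persist(persist_dir=PERSIST_DIR)  # save for the next run
    return index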

Chat Engine

def init_chat_engine():
    global chat_engine
    index = construct_index()
    chat_engine = index.as_chat_engine(chat_mode="context")

Let's create a new function to initialize the chat engine. We declare chat_engine as global so that we assign to the global variable defined at the top. Next, we build our index with construct_index(). With the returned index, we can call as_chat_engine(…) on it to create a LlamaIndex chat engine.
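
The "context" chat mode retrieves relevant nodes from the index for every user message and feeds them to the model as context; LlamaIndex also ships other modes such as "condense_question". If you want to sanity-check the engine without Streamlit, a small smoke test could look like the sketch below (the question is just an example):

# Quick smoke test, run outside the Streamlit app
index = construct_index()
chat_engine = index.as_chat_engine(chat_mode="context")

response = chat_engine.chat("What topics do my documents cover?")  # example question
print(response.response)

chat_engine.reset()  # clears the conversation history kept inside the engine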

How to use the chat engine

def handle_user_input():
    global chat_engine
    # Prompt for user input and save to chat history
    if prompt := st.chat_input("Your question"):
        st.session_state.messages.append({"role": "user", "content": prompt})

        # Display the prior chat messages
        for message in st.session_state.messages:
            with st.chat_message(message["role"]):
                st.write(message["content"])

        # If last message is not from assistant, generate a new response
        if st.session_state.messages[-1]["role"] != "assistant":
            with st.chat_message("assistant"):
                with st.spinner("Searching..."):
                    response = chat_engine.chat(prompt)
                    st.write(response.response)
                    message = {"role": "assistant",
                               "content": response.response}
                    # Add response to message history
                    st.session_state.messages.append(message)

The function first declares chat_engine as a global variable, since we want to use the chat_engine instance that we initialized earlier.

Next, the function prompts the user for input using the st.chat_input() function from the Streamlit library. If the user provides input, the input is saved to the chat history by adding a new message object to the st.session_state.messages list. This message object includes the role of the message (either "user" or "assistant") and the content of the message.

The function then displays the previous chat messages by iterating over the st.session_state.messages list and using the st.chat_message() and st.write() functions to show each message.

If the last message in the chat history is not from the assistant, the function generates a new response by calling the chat() method of the chat_engine object. This method takes the user's input as an argument and returns a response object containing the generated response.

Finally, the function displays the generated response using the st.write() function and saves the response to the chat history by adding a new message object to the st.session_state.messages list.
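
Because this is RAG, every answer is grounded in retrieved chunks of your documents, and the chat response also carries those chunks. As a hedged extension (assuming source_nodes is populated for the chat mode you use, and that the file reader stores a file_name in the node metadata), you could list the sources under the answer inside the assistant block:

with st.spinner("Searching..."):
    response = chat_engine.chat(prompt)
    st.write(response.response)

    # Show which document chunks the answer was based on
    for source_node in response.source_nodes:
        file_name = source_node.node.metadata.get("file_name", "unknown source")
        st.caption(f"Source: {file_name}")

    # ...then append the response to st.session_state.messages as before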

The Reward

Let's start our Streamlit app with streamlit run chatbot.py. I have indexed the AWS whitepaper wellarchitected-iot-lens, 2023 edition. Let's ask a question about firmware updates.

An excerpt from the IoT whitepaper

The answer - nice work!



Credits:

https://blog.streamlit.io/build-a-chatbot-with-custom-data-sources-powered-by-llamaindex/

By Caroline Frasca, Krista Muir, and Yi Ding
