In a previous blog entry, we used LangChain to build a Q&A bot from the content of your website.

The GitHub repository containing the code for the previous as well as this blog entry can be found here.

It was trending on Hacker News on March 22nd, and you can check out the discussion here.

This blog post builds on the previous entry and turns the Q&A bot into a chatbot which you can interactively ask questions, similar to how ChatGPT works.

We already created the relevant document embeddings of our website content and saved them in a file called faiss_store.pkl, so we'll assume that file already exists.
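For reference, the store is serialized with Python's pickle module, and the chat app simply loads it back at startup. A minimal sketch of that roundtrip, using a plain dict as a stand-in for the real LangChain vector store object:

```python
import pickle

# Stand-in for the real vector store; the actual faiss_store.pkl holds
# a LangChain FAISS wrapper created in the previous post.
store = {"texts": ["post one", "post two"], "dim": 1536}

# Write the object to disk, the same way faiss_store.pkl was written.
with open("demo_store.pkl", "wb") as f:
    pickle.dump(store, f)

# Load it back; this mirrors what the chat app does at startup.
with open("demo_store.pkl", "rb") as f:
    loaded = pickle.load(f)
```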

Framing our chatbot

Given a question from the user, we combine it with the previous conversation to form a standalone question.

This is necessary so that the previous context is taken into account.

To do so, we use the CONDENSE_QUESTION_PROMPT template below.

In addition, we need to have a template which we can use to prime the chatbot for the topics it should answer, so in my case I primed it to answer machine learning and technical questions in its template:

from langchain.prompts.prompt import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import ChatVectorDBChain

_template = """Given the following conversation and a follow up question,
rephrase the follow up question to be a standalone question.
Chat History:
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)

template = """You are an AI assistant for answering questions about machine learning
and technical blog posts. You are given the following extracted parts of 
a long document and a question. Provide a conversational answer.
If you don't know the answer, just say "Hmm, I'm not sure.".
Don't try to make up an answer. If the question is not about
machine learning or technical topics, politely inform them that you are tuned
to only answer questions about machine learning and technical topics.
Question: {question}
Answer in Markdown:"""
QA = PromptTemplate(template=template, input_variables=["question", "context"])
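To see what the condense step actually sends to the model, here is a plain-Python sketch that fills the condense template with a short history. The format_history helper is hypothetical (LangChain does this flattening internally); the tuple-based history format matches the (question, answer) pairs we append in the chat loop.

```python
condense_template = """Given the following conversation and a follow up question,
rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""

def format_history(chat_history):
    # chat_history is a list of (question, answer) tuples, flattened
    # into alternating Human/Assistant lines.
    return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in chat_history)

history = [("What is FAISS?", "A library for similarity search.")]
prompt = condense_template.format(
    chat_history=format_history(history),
    question="How do I install it?",
)
```

The resulting prompt contains both the earlier exchange and the follow-up, which is what lets the model resolve references like "it" into a standalone question.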

def get_chain(vectorstore):
    llm = OpenAI(temperature=0)
    qa_chain = ChatVectorDBChain.from_llm(
        llm,
        vectorstore,
        qa_prompt=QA,
        condense_question_prompt=CONDENSE_QUESTION_PROMPT,
    )
    return qa_chain

Running the chatbot

Now we can write a small app which uses our templates and our previously generated embeddings (faiss_store.pkl):

import pickle

if __name__ == "__main__":
    with open("faiss_store.pkl", "rb") as f:
        vectorstore = pickle.load(f)
    qa_chain = get_chain(vectorstore)
    chat_history = []
    print("Chat with the bot:")
    while True:
        print("Your question:")
        question = input()
        result = qa_chain({"question": question, "chat_history": chat_history})
        chat_history.append((question, result["answer"]))
        print(f"AI: {result['answer']}")

This works by taking in the question and the previous chat history and first rephrasing them into a standalone question that takes the context into account. Then it searches your FAISS store for documents which are semantically similar to that question, as they are potential candidates to answer it.

Finally, the most relevant document excerpts together with the question are sent to the OpenAI API to retrieve the answer.
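The "semantically similar" lookup boils down to comparing embedding vectors. FAISS typically scores by inner product or L2 distance; cosine similarity is a close stand-in for normalized vectors, so here is a toy sketch of the ranking step. The tiny 3-dimensional vectors are made up for illustration (real OpenAI embeddings have 1536 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical document embeddings; real ones come from OpenAI's API.
docs = {
    "faiss post": [0.9, 0.1, 0.0],
    "cooking post": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

# Rank documents by similarity to the question embedding, best first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])  # → faiss post
```

The top-ranked excerpts are what get pasted into the {context} slot of the QA template before the prompt is sent to the OpenAI API.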

For more details about the approach and the full source code, check out my previous blog post as well as this GitHub repository.