Creating a custom chatbot with OpenAI

Rahul Agarwal
5 min read · May 23, 2023
California super bloom in the SF Bay Area

My social feeds are full of generative AI, so, with FOMO in mind, here is my attempt at building a chatbot of my own using the OpenAI APIs, augmented with custom data.

There are far too many options, so this exercise was mostly about figuring out which tools to pick and how to stitch them together. I only understand the basics, so treat this as a hello-world example.

Overview

The fundamental problem: how can I get ChatGPT to answer questions about things it does not know? As you would expect, the custom data needs to be made available, indexed, and then incorporated into the queries. The following shows these steps.
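To make the "index, then incorporate into the query" idea concrete, here is a minimal sketch in plain Python. It is not what LlamaIndex or LangChain do internally: `toy_embed` is a hypothetical stand-in (a bag-of-words count) for a real embedding model, and the two sample documents and the prompt wording are made up for illustration.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, index, top_k=1):
    """Return the top_k document texts most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]

def build_prompt(question, index):
    """Incorporate the retrieved custom data into the query sent to the model."""
    context = "\n".join(retrieve(question, index))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Step 1: make the custom data available (made-up sample sentences).
docs = [
    "VMware Cloud Partner Navigator is a SaaS delivery platform for partners.",
    "To change the payment method, go to Billing and Subscriptions.",
]
# Step 2: index it (text paired with its vector).
index = [(d, toy_embed(d)) for d in docs]
# Step 3: incorporate the best match into the query.
prompt = build_prompt("How do I change my payment method?", index)
print(prompt)
```

The prompt that would go to the model now carries the relevant snippet, which is how it can answer about data it was never trained on.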

Prerequisite

Indexing custom data

This process is known as creating embeddings: each chunk of text is converted into a vector of numbers, and the output is some very large vector files. Note: for this demo I am using data from VMware's public docs. It will go to OpenAI, so be careful what you use. Even though it's public, ChatGPT does not have this data, so it works well for my example. I am using some PDF files, but it can be any format.

This is where two tools, LlamaIndex and LangChain, come in. The names are very descriptive: as they imply, the first helps to create the index and the second chains it with OpenAI to produce the conversation. That is good enough to know for now.

See build_index_openai.py, where LlamaIndex invokes the OpenAI embeddings calls and then saves the results. The source PDFs are under local-data, and the index output is stored in local-index as vector files.
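The shape of that build step can be sketched without any libraries. This is not the contents of build_index_openai.py: it assumes plain .txt files instead of PDFs, and `toy_embed` (letter frequencies), the chunk size, and the index.json file name are all hypothetical stand-ins for what LlamaIndex and the OpenAI embeddings API would produce.

```python
import json
import tempfile
from pathlib import Path

def chunk_text(text, size=200):
    """Split text into pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def toy_embed(text):
    """Hypothetical stand-in for an embeddings API call: a-z letter counts."""
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1
    return counts

def build_index(data_dir, index_dir):
    """Embed every chunk of every .txt file and save the vectors as JSON."""
    records = []
    for path in sorted(Path(data_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        for n, chunk in enumerate(chunk_text(text)):
            records.append({
                "source": path.name,
                "chunk": n,
                "text": chunk,
                "vector": toy_embed(chunk),
            })
    out = Path(index_dir)
    out.mkdir(parents=True, exist_ok=True)
    index_file = out / "index.json"  # assumed file name, for illustration
    index_file.write_text(json.dumps(records), encoding="utf-8")
    return index_file

# Demo in a throwaway directory that mirrors the local-data / local-index layout.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "local-data"
    data.mkdir()
    (data / "doc.txt").write_text("abc " * 100, encoding="utf-8")
    index_file = build_index(data, Path(tmp) / "local-index")
    saved = json.loads(index_file.read_text(encoding="utf-8"))

print(len(saved), "chunks indexed")
```

Saving the vectors to disk is the point of the step: the embedding calls cost money, so you pay for them once at build time and the chatbot only has to load the result.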

Creating embeddings from documents
Embeddings vector

Chat Bot

A simple command line would have sufficed, but given how simple some of these libraries are, I built a chat interface using Gradio. See local-index-chat-openai.py. It loads the local index built earlier and, using LangChain, combines it with OpenAI to provide a conversational interface.

Chains involved in responding to a prompt
Chatbot using custom data embeddings

In your terminal you will see how it executes and when it decides to use a “tool”, which in this case is your local index.

> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? No
AI: Hello Rahul! How can I assist you today?

> Finished chain.

> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Local Index
Action Input: "cpn vmware"
Observation:
VMware Cloud Partner Navigator is a unified software-as-a-service (SaaS) delivery platform for Cloud Services Provider partners in the Partner Connect Program. VMware Cloud Partner Navigator simplifies the management and delivery of multi-cloud services and the consumption of infrastructure-as-a-service (IaaS), across an expanding set of cloud endpoints and cross-cloud services. With VMware Cloud Partner Navigator, Cloud Services Providers can bring their VMware cloud estate under a single management umbrella and streamline business operations by getting a unified view and single-sign-on experience across all supported VMware Cloud services and offerings, with the capability to provision cloud resources to end customers from public or private clouds and the option to expand cloud infrastructure on demand.

> Finished chain.

> Entering new AgentExecutor chain...
Thought: Do I need to use a tool? Yes
Action: Local Index
Action Input: "vmware payment method change"
Observation:
To change the payment method for your Organization, navigate to Billing & Subscriptions > Manage Payment Methods and then add or change a payment method prior to the end of your billing period. You can also select Add Payment Methods and then Confirm to link a PBI account, fund, or credit card in the next screen. To change the default payment method for your Organization, navigate to Manage Payment Methods > Default Payment Method and click Change Default Payment Method. Select an available payment method listed and click Confirm. You can also select Add Payment Methods and then Confirm to link a PBI account, fund, or credit card in the next screen.

> Finished chain.
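The trace above follows a simple text protocol: the model either replies directly ("AI: ...") or names a tool and its input ("Action: ..." and "Action Input: ..."). Here is a rough sketch of how such output could be routed. This is not LangChain's actual parser; `parse_agent_step`, `run_step`, and the lambda standing in for the index lookup are hypothetical names of my own.

```python
import re

def parse_agent_step(text):
    """Return ("tool", tool_name, tool_input) or ("reply", message, None)."""
    action = re.search(r"^Action:\s*(.+)$", text, re.MULTILINE)
    action_input = re.search(r"^Action Input:\s*(.+)$", text, re.MULTILINE)
    if action and action_input:
        tool_input = action_input.group(1).strip().strip('"')
        return ("tool", action.group(1).strip(), tool_input)
    # No tool requested: everything after "AI:" is the reply to the user.
    reply = re.search(r"^AI:\s*(.+)$", text, re.MULTILINE | re.DOTALL)
    return ("reply", reply.group(1).strip() if reply else text.strip(), None)

def run_step(text, tools):
    """Call the named tool if one was requested, else return the reply."""
    kind, name_or_message, tool_input = parse_agent_step(text)
    if kind == "tool":
        return tools[name_or_message](tool_input)
    return name_or_message

# Hypothetical tool registry; a real one would query the local index.
tools = {"Local Index": lambda query: f"(index results for: {query})"}

no_tool = "Thought: Do I need to use a tool? No\nAI: Hello Rahul! How can I assist you today?"
with_tool = 'Thought: Do I need to use a tool? Yes\nAction: Local Index\nAction Input: "cpn vmware"'

print(run_step(no_tool, tools))
print(run_step(with_tool, tools))
```

In the real agent, the tool's result is fed back to the model as the "Observation" so it can compose the final answer.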

There is a lot more to learn, but I hope this helps you to get started!

OpenAI APIs

I work on cloud commerce, so I’m always interested in how services are monetized. OpenAI has a simple on-demand, pay-as-you-go model, and for text it meters by tokens, counting both the prompt you send and the response generated. A token is not necessarily a word delimited by whitespace: they use byte pair encoding (BPE) as the tokenization mechanism. So the more tokens you send and receive, the more it will cost. Within a few minutes the usage gets rated and displayed, which is impressive. I have not received an invoice yet, but given the token-counting complexity I wonder how they deal with billing disputes.
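To see why a token is not a word, here is a toy version of the BPE idea: start from single characters and repeatedly merge the most frequent adjacent pair. It is a teaching sketch only, not OpenAI's tokenizer, which works on bytes with a fixed, pre-trained merge table. The cost helper assumes prompt and completion tokens are both billed per 1,000 tokens; the prices passed in are hypothetical placeholders, not OpenAI's actual rates.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Return the most common adjacent token pair, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge_pair(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

def toy_bpe(text, num_merges):
    """Tokenize `text` by applying up to `num_merges` greedy pair merges."""
    tokens = list(text)
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens

def estimate_cost(prompt_tokens, completion_tokens,
                  prompt_price_per_1k, completion_price_per_1k):
    """Usage cost when both directions are metered per 1,000 tokens."""
    return (prompt_tokens / 1000 * prompt_price_per_1k
            + completion_tokens / 1000 * completion_price_per_1k)

text = "aaabdaaabac"
print(toy_bpe(text, 0))  # 11 single-character tokens
print(toy_bpe(text, 3))  # fewer, longer tokens after three merges

# Hypothetical prices, purely for illustration.
print(estimate_cost(1000, 500, prompt_price_per_1k=0.002, completion_price_per_1k=0.002))
```

The token count depends on the merges rather than on whitespace, which is why counting words only gives a rough estimate of what a request will cost.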

OpenAI API usage cost

If these topics interest you, reach out to me; I would appreciate any feedback. If you would like to work on such problems, you will generally find open roles as well. Please refer to LinkedIn.
