Private Chat with GPT-4: A Case Study

Brief

This case study presents a solution to leverage advanced AI models in corporate environments while addressing data privacy concerns and cost issues. By implementing a private chat system using Open WebUI, LiteLLM, and Azure OpenAI, we created a secure, cost-effective platform that allows employees to utilize state-of-the-art AI models without compromising sensitive information. The solution resulted in significant cost savings, improved productivity, and maintained data compliance, demonstrating a successful integration of AI technology in a corporate setting.

Overview

When ChatGPT came out, several headlines claimed that employees at companies such as Samsung had leaked internal data through it.

The story: https://www.forbes.com/sites/siladityaray/2023/05/02/samsung-bans-chatgpt-and-other-chatbots-for-employees-after-sensitive-code-leak/

It feels unfair to call it a leak, because the data was not shared with anyone outside the company; it was just used to train a model that could generate text indistinguishable from the real thing.

Still, having your conversations used for training is a deal breaker, especially in corporate environments.

I sensed the same among my colleagues. Many had tried it and generally agreed it was impressive and had potential, but they were hesitant to use it for anything serious.

Furthermore, the best value came from GPT-4, which meant a flat 25 USD per month per head. Imagine that expense for a team of 100 people. Would the cost be justified by the value? Unlikely.

Problem

It's pretty easy to see two problems.

  1. We have a novel technology that hopefully carries as much potential as hype, yet those who would benefit from it the most are also at the highest risk when using it. Under policies like the GDPR, confidential information can neither be legally sent out, nor would it be safe to do so.

  2. The cost of using the best models is prohibitive for most companies relative to the value they bring.

How to leverage the bleeding edge while respecting the data?

Solution

To be honest, I came up with the answer before I asked the question.

As a developer, I wanted both: the best models and the ability to switch easily whenever a new SOTA model comes out. Of course, without breaking my or my employer's bank.

It wasn't long before I came across Hugging Face's Chat UI and Portkey Gateway, which, like two Lego pieces, fit together perfectly. After a weekend of hacking, I had a solution that was both secure and cost-effective. That was towards the end of 2023. At the time of writing, I have upgraded to Open WebUI and LiteLLM, which make staying up to date a breeze.

As an employee handling sensitive data, I wanted a custom data handling agreement. Fortunately, it was relatively easy to get one with Azure OpenAI, and later with Gemini on Vertex AI.

Let's break down what exactly these Lego pieces are.

Open WebUI

It's a modern, open-source web interface that closely mimics ChatGPT. It is built on top of OpenAI-compatible APIs: in a nutshell, one only needs to start it and put in an API key, and it's ready to go. It can also run on a local machine, which is a huge plus for security.

Hugging Face's Chat UI is a great interface, but I felt it was tailored to needs different from my use case, hence the switch.

LiteLLM

LiteLLM is an OpenAI-compatible proxy. In plain language, it takes OpenAI's API standard and translates it to one of many supported providers. That means I can easily prompt Claude 3 from Anthropic or Gemini 1.5 from Google just by adding the respective API keys and updating the endpoints.
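To make the "hot swap" idea concrete, here is a minimal sketch of a client talking to a LiteLLM proxy. The URL and model names (`azure-gpt-4`, `claude-3-opus`) are hypothetical placeholders; they depend on your own LiteLLM configuration. The point is that only the `model` field changes between providers.

```python
import json
from urllib import request

# Hypothetical endpoint; LiteLLM's proxy listens on port 4000 by default.
LITELLM_URL = "http://localhost:4000/v1/chat/completions"

def chat_payload(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat completion request.

    Swapping providers behind LiteLLM means changing only the
    `model` field; the rest of the request stays identical.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def send(payload: dict, api_key: str = "sk-anything") -> dict:
    """POST the payload to the proxy (requires a running LiteLLM instance)."""
    req = request.Request(
        LITELLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Usage (requires the proxy to be running):
#   reply = send(chat_payload("azure-gpt-4", "Review this SQL query."))
#   reply = send(chat_payload("claude-3-opus", "Review this SQL query."))
```

The same shape works with the official `openai` client library by pointing its base URL at the proxy.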

Worth mentioning that Portkey Gateway does exactly the same and is a great alternative should you want to host it on the edge, such as on Cloudflare Workers. I chose LiteLLM because it lets me pipe all API keys into its configuration file once, after which the only thing I need to change is the model property. Portkey Gateway, by contrast, is completely unaware of what requests will be coming, so every request has to include special headers with the endpoint, API keys, and so on.
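The "pipe all API keys in once" part looks roughly like the sketch below: a LiteLLM `config.yaml` listing each provider with its key. The deployment names, resource URL, and model aliases here are hypothetical examples, not our actual configuration.

```yaml
# Sketch of a LiteLLM proxy config.yaml (names are illustrative).
model_list:
  - model_name: azure-gpt-4              # alias that clients send as "model"
    litellm_params:
      model: azure/my-gpt4-deployment    # hypothetical Azure deployment name
      api_base: https://my-resource.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY  # read from an environment variable
  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus-20240229
      api_key: os.environ/ANTHROPIC_API_KEY
  - model_name: gemini-1.5-pro
    litellm_params:
      model: vertex_ai/gemini-1.5-pro
```

With this in place, switching a conversation from Azure to Anthropic is just a different `model` alias in the request.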

Azure OpenAI

While not necessary for the majority of my questions, there were use cases where, as a Danish company adhering to European standards, we had to respect the GDPR and ideally not send our data outside the EU. The ability to opt out of monitoring and data persistence, offered through a custom agreement with Microsoft Azure, was a huge plus. It's one more Lego piece that fits perfectly.

How It Was Done

Once everything was connected, we had a chat interface hosted on our servers and accessible through an internal subdomain. Users like me could hot-swap models when prompting with coding challenges, and users handling confidential data could easily choose one of the Azure OpenAI models.

It's also worth noting that, even though it is no longer the case, in the early days access to GPT-4 was more exclusive, and Azure granted us that access in early 2023. That gave us an operating edge, especially with regard to the color pages project.

Implementation Challenges

While setting up this private chat system, we encountered several challenges:

  1. Integration Complexity: Initially, connecting all the components (Open WebUI, LiteLLM, and Azure OpenAI) required careful configuration and troubleshooting to ensure seamless communication between the different parts of the system.

  2. User Authentication: Implementing a secure authentication system to ensure only authorized employees could access the chat interface took additional time and resources.

  3. Model Selection: Choosing the right models to offer a balance between performance and cost-effectiveness required extensive testing and evaluation.

  4. Data Compliance: Ensuring that our setup complied with GDPR and other data protection regulations required careful review of each component's data handling practices and close collaboration with our legal team.

  5. User Training: Introducing the new system to employees and providing guidance on its proper use, especially regarding sensitive information, required developing comprehensive training materials and conducting sessions.

We overcame these challenges through a combination of thorough research, collaboration with IT and legal departments, and iterative testing and refinement of the system. The result was a robust, secure, and user-friendly private chat solution that met our organization's needs.

DIY Approach

If you want to replicate what I did, here's a simplified process:

  1. Get Open WebUI: Download the latest release from the GitHub repository.
  2. Set Up LiteLLM: Follow the instructions on the LiteLLM GitHub repository.
  3. Configure Azure OpenAI: Reach out to Microsoft Azure to set up a custom data handling agreement. It's a one-time process and can be done through the Azure Portal.
  4. Connect the Dots: Update the LiteLLM configuration file with your Azure OpenAI API key and endpoint. Start Open WebUI and point it to the LiteLLM endpoint. You're ready to go!
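The steps above can be sketched as a single Docker Compose file. Treat this as an illustrative starting point, not our exact deployment: the service wiring and environment variable names reflect common LiteLLM and Open WebUI setups, and you should verify them against each project's documentation.

```yaml
# Hypothetical docker-compose.yml wiring Open WebUI to a LiteLLM proxy.
services:
  litellm:
    image: ghcr.io/berriai/litellm:main-latest
    volumes:
      - ./litellm-config.yaml:/app/config.yaml   # the model_list config
    command: ["--config", "/app/config.yaml"]
    environment:
      - AZURE_API_KEY=${AZURE_API_KEY}           # referenced via os.environ/...

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"                              # UI on http://localhost:3000
    environment:
      # Point Open WebUI at the proxy instead of api.openai.com.
      - OPENAI_API_BASE_URL=http://litellm:4000/v1
      - OPENAI_API_KEY=anything                  # or your LiteLLM master key
    depends_on:
      - litellm
```

From here, putting the UI behind an internal subdomain and your company's authentication is a standard reverse-proxy exercise.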

Final thoughts

As I write this in my last month at Flügger, I have come to learn that it is perhaps the most loved little piece of software I have set up. It was clearly communicated that we have to ensure it remains up and running, and I am happy to say that it has run without a hitch for the past 6 months.

Since its takeoff we have had more than 234 chats and processed more than 1.5 million tokens. These numbers likely understate reality, since deleted chats are gone; the figures are calculated from the stats stored in the database for chats that were not deleted.

There are 12 people using it.

We saved around 555 USD compared to everyone having their own paid subscription. And that doesn't take into account the ability to switch between powerful models such as Gemini 1.5 Pro, featuring a 2-million-token context window, or Claude 3 Opus, with its exceptional writing capabilities.

Links to aforementioned things

Chat UIs:

Proxies:

Azure opt out from data collection: