Using LLMs in the real world: preventing hallucinations


LLMs (Large Language Models) like ChatGPT might be the most impactful innovation since the internet. For the first time we have a technology that can bridge the understanding gap between humans and computers.

However, their Achilles heel is their propensity to make things up. People increasingly call this behavior "hallucination", because of the vividness and conviction with which LLMs deliver answers that turn out to be false. When you're designing real products for real people, this is a major confidence-underminer.

We know that lots of tech and product leaders are scrambling to work out how to add AI features to their products, and for many, hallucinations are a big uncertainty. Read on to learn more about how (and whether) to handle them.

Do hallucinations matter?

At Harriet, our product is a virtual HR operations assistant who lives in Slack and makes your organization run like a well-oiled machine. Individual team members ask Harriet both for information about their company ("I left my laptop on the bus, what do I need to do to claim on insurance?") and to do things for them ("I'm renting a new apartment, can I get copies of my payslips for the last three months?").

It's actually fine if Harriet can't help and says that. What's problematic is if she offers to do something that she can't or says she's done something that she hasn't.

We've found that users who get hallucinatory responses invariably learn not to trust the information that Harriet provides. So for us, hallucination prevention is a big concern.

This doesn't necessarily apply to all AI products. In some cases you might not care too much about hallucinations:

  • When the user is sufficiently well-trained to recognize a bad response and can just hit "try again"
  • Where facts are not important, e.g. ideation ("generate me 10 great names for our new onion-cinnamon-raisin bagel")

This post is intended as a gentle introduction to the topic for product and tech leaders rather than an in-depth technical how-to guide. If you'd like a deeper technical dive in future, please tell us by messaging [email protected].

What causes hallucinations?

To fix an issue, we need to understand what causes it. Hallucinations are a by-product of how LLMs work. Their job is to guess the next word you'd want to see. They're not wired to know what's true or false; they just make educated guesses. Think of it as telling you what you want to hear rather than what you need to know.
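If you're curious what "guessing the next word" looks like in practice, here's a minimal sketch using the Hugging Face transformers library and GPT-2 (our choice here, purely for illustration; any causal language model behaves the same way). It prints the model's most likely next tokens, and nothing in the process checks whether a continuation is true.

```python
# Minimal sketch of next-token prediction (GPT-2 via Hugging Face transformers,
# chosen purely for illustration).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The number of planets in our solar system is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, sequence_length, vocab_size)

# Probabilities over the next token only: the model scores plausible
# continuations, it never checks facts.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(token_id)])!r}  p={prob.item():.3f}")
```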

When you're chatting with users, there are normally three ways that hallucinations get introduced:

1. Built-in model deficiencies

User: "How many planets are in our solar system?" LLM: "Nine planets."
(The correct answer, after Pluto was demoted, is eight. NB a real LLM wouldn't get this wrong.)

On one level, an LLM is a lossy compression of its training data (shout-out to Ted Chiang's great essay "ChatGPT Is a Blurry JPEG of the Web"). When you query it, you get an approximate reconstruction, and some information simply isn't there in the first place.

2. Confusion from previous user inputs

User 1: "I believe there are 10 planets." [Later in the conversation...]

User 2: "How many planets again?" LLM: "There are 10 planets."
(LLM remembered the incorrect number mentioned earlier.)

3. Confusion from the model's previous responses

User: "Tell me the names of some fictional planets."
LLM: "Alderaan, Gallifrey, and Krypton."
[Later...]
User: "List real planets."
LLM: "Alderaan, Gallifrey, Krypton..."
(The LLM is mistakenly reusing its earlier list of fictional planets.)
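To make causes (2) and (3) concrete, here's roughly what the model sees on each turn. The role/content message format below is the common chat-API convention and the details are illustrative, but the key point holds for any chat interface: the whole history is replayed to the model every time, whether or not it's true.

```python
# Everything in this list is re-sent to the model on every turn.
messages = [
    # Cause (2): a user states something wrong...
    {"role": "user", "content": "I believe there are 10 planets."},
    # Cause (3): ...and the model's own earlier answers stay in the history too.
    {"role": "user", "content": "Tell me the names of some fictional planets."},
    {"role": "assistant", "content": "Alderaan, Gallifrey, and Krypton."},
    # ...many turns later...
    {"role": "user", "content": "How many planets are there? List the real ones."},
]
# Given this context, "There are 10 planets: Alderaan, Gallifrey, ..." is a
# perfectly plausible continuation -- plausible, but not true.
```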

In general, the tools we can use to solve (1) also help with (2) and (3). However, we have access to additional techniques for chat-history-derived issues; we'll cover those in a future blog post.

Different Chat Environments: Closed vs. Open

There are broadly two different chat environments:

  • General knowledge (open). Here you want to extract information that the model absorbed during training, without supplying it yourself.
  • Summarization/question answering over a document corpus (closed). Here you supply the relevant documents and ask an LLM to summarize or answer from them in response to a specific prompt (see the sketch after this list).
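A rough sketch of the difference, with an invented policy excerpt and prompt wording; the point is simply whether you put the facts into the prompt yourself:

```python
question = "I left my laptop on the bus, what do I need to do to claim on insurance?"

# Open: the model has to answer from whatever it absorbed during training.
open_prompt = question

# Closed: we look up the relevant document ourselves and ask the model to
# answer only from it. (The policy text here is invented for illustration.)
policy_excerpt = "Lost or stolen equipment must be reported to IT within 48 hours..."
closed_prompt = (
    "Answer the question using only the context below. "
    "If the context does not contain the answer, say so.\n\n"
    f"Context:\n{policy_excerpt}\n\n"
    f"Question: {question}"
)
```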

Several tricks have been tried that apply to both:

  • Making the model think twice: asking it to draft an answer and then check its own work, just as you'd double-check an answer yourself.
  • Telling it clearly: instructing it up front, e.g. "Don't guess, just use the facts." This sort of prompt engineering is widely used in, for example, LangChain. Both tricks are sketched below.
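Roughly, the two tricks look like this in practice (the wording is ours and purely illustrative; as we note below, neither is robust on its own):

```python
# "Telling it clearly": an explicit instruction not to guess.
system_prompt = (
    "You are a careful assistant. Answer only from the provided context. "
    "If you are not sure of the answer, say 'I don't know' -- do not guess."
)

# "Making the model think twice": a second pass that asks the model to check
# its own draft before the user sees it.
verify_prompt = (
    "Here is a draft answer:\n{draft}\n\n"
    "Here is the context it should be based on:\n{context}\n\n"
    "Reply OK if every claim in the draft is supported by the context; "
    "otherwise list the unsupported claims."
)
```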

We have found that these techniques do not work robustly. For other use cases or user experiences, they might be fine, but on their own, errors sneak in.

In fact, we have found that nothing is 100% successful when you are relying on the LLM's general knowledge. There's a lot of ongoing work on fine-tuning models to give better references, but we're not terribly optimistic.

The good news is that when you are supplying the LLM with context, there are techniques that work. The one we've found most successful is to use a second LLM to double-check the first:

[Diagram: LLM checking]
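In code, a minimal sketch of that pattern looks something like this. We use the OpenAI Python client purely as an example; the model name, prompt wording and fallback message are placeholders, not our production setup.

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder; use whichever model suits you

def answer_with_check(question: str, context: str) -> str:
    # First LLM call: draft an answer from the supplied context.
    draft = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "Answer using only the context provided."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    ).choices[0].message.content

    # Second LLM call: check the draft against the same context.
    verdict = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You are a strict fact checker."},
            {"role": "user", "content": (
                f"Context:\n{context}\n\nAnswer to check:\n{draft}\n\n"
                "Is every claim in the answer supported by the context? "
                "Reply with exactly one word: SUPPORTED or UNSUPPORTED."
            )},
        ],
    ).choices[0].message.content

    if "UNSUPPORTED" in verdict:
        # Better to admit defeat than to pass on a hallucination.
        return "Sorry, I couldn't find a reliable answer to that."
    return draft
```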

You can extend this approach to make sure that the LLM provides appropriate references to the information it based its answer on.
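For example, if you split the context into numbered snippets before it reaches the model, the checker can be asked to cite them (again a sketch; how you chunk and number your documents is up to you):

```python
# Extension of the checker prompt: ask for the supporting snippet IDs so the
# final answer can carry references back to the source documents.
reference_check_prompt = (
    "Context snippets, each with an ID:\n{numbered_snippets}\n\n"
    "Answer to check:\n{draft}\n\n"
    "For each claim in the answer, give the ID of the snippet that supports it. "
    "Flag any claim with no supporting snippet as UNSUPPORTED."
)
```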

What's next?

Using LLMs is like chatting with a super-smart friend who occasionally tells tall tales. While they're great, it's essential to be aware of their quirks. By understanding where they can trip up and using clever fixes, like a buddy system with another LLM, we can build extraordinary user experiences.

Over the next weeks we're going to dive into related topics - chat history segmentation, referencing, testing, data scrubbing and more. Follow the Harriet page on LinkedIn to get notified when we publish something new!