Researchers Uncover Reasons for Hallucinations from Generative AI

Image: a mural depicting two abstracted faces in colorful shapes, by rawpixel.com, under a CC0 1.0 Universal license.

In mid-September, researcher Leon Chlon and his colleagues, Ahmed Karim and Maggie Chlon, published a paper arguing that large language models (LLMs) sometimes invent facts because of flaws in the way they compress information. The problem comes partly from the order in which models read prompts. Key details that appear late in a prompt can slip down the list of priorities or be missed altogether, leading the system to fill in gaps with guesswork, the team wrote.

The study introduces a new way to pinpoint when a model lacks the information it needs to answer a question and is therefore more likely to hallucinate. On his Substack, Chlon suggested a practical fix: presenting the same prompt several times with its contents reshuffled into different orders. By averaging the model’s answers across these orderings, researchers can detect when its confidence is genuine and when it is merely a side effect of prompt wording or order.
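
The sketch below shows one way this reorder-and-compare idea could look in practice. It is not Chlon’s implementation: the ask_model helper, the order_sensitivity function, the agreement score, and the number of shuffles are all assumptions made for illustration, and the helper must be wired to whatever model API is actually in use.

```python
import random
from collections import Counter

def ask_model(prompt: str) -> str:
    """Hypothetical helper: send the prompt to an LLM and return its answer.
    Replace the body with a call to the model API you actually use."""
    raise NotImplementedError

def order_sensitivity(facts: list[str], question: str, n_shuffles: int = 5) -> tuple[str, float]:
    """Ask the same question with the facts in several random orders.

    Returns the majority answer and the fraction of runs that agree with it.
    Low agreement suggests the model's confidence is an artifact of ordering
    rather than of the underlying information."""
    answers = []
    for _ in range(n_shuffles):
        shuffled = random.sample(facts, k=len(facts))  # a fresh permutation each run
        prompt = "\n".join(shuffled) + "\n\n" + question
        answers.append(ask_model(prompt).strip())
    majority, count = Counter(answers).most_common(1)[0]
    return majority, count / n_shuffles
```

A low agreement score flags exactly the situation the paper describes: an answer that shifts with the position of the facts rather than with the facts themselves.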

Using a medical-diagnosis example, Chlon demonstrated how an LLM misfires when symptoms appear late in a prompt, a phenomenon he calls the Extraction Pathway Problem.
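
As a concrete illustration of that failure mode (an invented example, not the paper’s actual case), the hypothetical ask_model helper from the sketch above can be queried twice, with the decisive symptom placed first and then last in an otherwise identical note:

```python
# Illustrative prompt only; the patient details are invented for this example.
history = [
    "Patient is a 58-year-old with a 20-year history of type 2 diabetes.",
    "Reports mild fatigue over the past month.",
    "Takes metformin daily; no known drug allergies.",
    "Presents with crushing chest pain radiating to the left arm.",  # decisive detail
]
question = "What is the most urgent diagnosis to rule out?"

# Same facts, with the decisive symptom leading vs. buried at the end.
symptom_early = "\n".join([history[-1]] + history[:-1]) + "\n\n" + question
symptom_late = "\n".join(history) + "\n\n" + question

print("Symptom early:", ask_model(symptom_early))
print("Symptom late: ", ask_model(symptom_late))
# If the two answers differ, the output tracks where the key detail sits in
# the prompt rather than the clinical facts themselves.
```
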
Google Staff Architect Mohammad Ghodratigohar analyzed Chlon’s paper in a YouTube video, walking through technical, code-level approaches to predicting and reducing hallucinations in a model.
