Reading AI summaries makes people more likely to buy something — despite alarming 60% hallucination rate


Even though most Americans say they don’t trust artificial intelligence (AI), researchers have discovered a surprising new metric that seems to show the opposite: people are more likely to buy something after reading an AI summary of online reviews than after reading one written by a human. However, the AI hallucinated 60% of the time when asked about the products.
The team, from the University of California San Diego (UCSD), says this is the first study to show that cognitive biases introduced by large language models (LLMs) have real consequences for user behavior, and the first to quantitatively measure the impact of that influence on people.
First, the scientists prompted the AI models to summarize product reviews and media interviews, then asked them to fact-check the resulting summaries to determine whether they were accurate. In a second task, the models were given descriptions of news articles along with falsified versions of the same descriptions, and were tasked with fact-checking both.
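In rough pseudocode, the two tasks look something like the following; the prompts, the TRUE/FALSE answer format and the ask_model() helper are hypothetical stand-ins rather than the study’s actual setup.

```python
# A minimal sketch of the two fact-checking tasks described above. The prompts,
# the TRUE/FALSE answer format, and the ask_model() helper are hypothetical
# stand-ins, not the study's actual setup.

def ask_model(prompt: str) -> str:
    """Placeholder for whichever LLM API is under test."""
    raise NotImplementedError("plug in an LLM client here")

def summarize_and_self_check(source_text: str) -> tuple[str, str]:
    """Task 1: summarize a review or interview, then ask the model to fact-check it."""
    summary = ask_model(f"Summarize the following text:\n\n{source_text}")
    verdict = ask_model(
        "Does this summary faithfully reflect the source? Answer TRUE or FALSE.\n\n"
        f"Source:\n{source_text}\n\nSummary:\n{summary}"
    )
    return summary, verdict

def check_real_vs_falsified(real: str, falsified: str) -> tuple[str, str]:
    """Task 2: fact-check a real news description and a falsified version of it."""
    def ask(description: str) -> str:
        return ask_model(
            f"Did the event described below actually happen? Answer TRUE or FALSE.\n\n{description}"
        )
    return ask(real), ask(falsified)
```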
“The strikingly low accuracy across both real and falsified information highlights a critical limitation: the persistent inability to reliably differentiate fact from fabrication,” the scientists wrote in the study.
The most striking finding concerned online product reviews. Participants were significantly more likely to express interest in purchasing a product after reading an AI-generated summary than after reading the original review written by a human.
Distorted consumer judgment
The researchers proposed two reasons why people were more likely to buy based on the AI summaries. First, LLMs tend to weight the beginning of input text more heavily than the middle, a phenomenon known as “lost in the middle.” Lead author Abeer Alessa, a research assistant in machine learning and human-computer interaction, points to this phenomenon in prior research.
Second, LLMs become less reliable when processing information not included in their training data.
“Models tend to be wrong about whether the news description happened or not,” Alessa told Live Science in an interview. “It can falsely indicate that an event never happened, even if it happened after the model had finished training.”
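Both effects lend themselves to simple checks. As a toy illustration of the first, positional bias, the sketch below measures which third of a source text a summary draws its vocabulary from; it is an illustrative heuristic, not the paper’s method.

```python
import re

def third_overlap(source: str, summary: str) -> dict[str, float]:
    """Fraction of the summary's distinct words found in each third of the source."""
    words = re.findall(r"[a-z']+", source.lower())
    n = max(len(words) // 3, 1)
    thirds = {
        "beginning": set(words[:n]),
        "middle": set(words[n:2 * n]),
        "end": set(words[2 * n:]),
    }
    summary_words = set(re.findall(r"[a-z']+", summary.lower()))
    return {k: len(summary_words & v) / max(len(summary_words), 1)
            for k, v in thirds.items()}
```

A “lost in the middle” effect would show up as a consistently low score for the middle third across many source-summary pairs.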
During testing, the team found that chatbots changed the sentiment of real user reviews 26.5% of the time and hallucinated 60% of the time when asked about the reviews.
The team selected product reviews with strongly positive or strongly negative conclusions, and 70 participants read either the original reviews of common consumer products or chatbot-generated summaries of them. Those who read the original reviews said they would purchase the product 52% of the time, while those who read the AI-generated summaries said they would make a purchase 84% of the time.
The project used six LLMs, 1,000 electronics reviews, 1,000 media interviews and a news database of 8,500 articles. The team measured bias by quantifying framing changes in content sentiment, overreliance on text appearing earlier in samples, and hallucinations.
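As a rough sketch of how a framing change in sentiment might be quantified, one could compare off-the-shelf sentiment scores for a review and its summary; the snippet below uses NLTK’s VADER scorer and an arbitrary drift threshold, not necessarily the metric the paper used.

```python
# Flag a sentiment framing change between a review and its summary using
# NLTK's VADER scorer; the 0.5 drift threshold is an arbitrary choice.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

def framing_change(original: str, summary: str, threshold: float = 0.5) -> bool:
    """Flag summaries whose overall sentiment drifts far from the original's."""
    a = sia.polarity_scores(original)["compound"]  # compound score in [-1, 1]
    b = sia.polarity_scores(summary)["compound"]
    return abs(a - b) >= threshold or a * b < 0    # large drift or sign flip

print(framing_change(
    "Absolutely love this laptop, the battery easily lasts all day.",
    "The laptop is adequate, though some users note battery concerns.",
))
```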
Broken down by sentiment, participants who read summaries of positive product reviews said they would buy the product 83.7% of the time, compared with 52.3% of those who read the original reviews.
The scientists concluded that even subtle changes in framing can significantly distort consumer judgment and purchasing behavior.
The authors acknowledged that their tests were in a low-stakes scenario, but cautioned that the impact could be more extreme in higher-risk situations.
“Some high-stakes scenarios include summarizing student health documents or profiles during school admissions,” Alessa said. “In these contexts, changes in framing can affect how a person or case is perceived.”
In a separate statement, the team said the paper represents a step toward careful analysis and mitigation of LLM-induced content alteration and its effects on people. They said this could reduce the risk of systemic bias in areas such as media, education and public policy.
Quantifying the induction of cognitive biases in content generated by LLM, Alessa et al., IJCNLP-AACL 2025
