Massive study detects AI fingerprints in millions of scientific papers

Researchers discover how AI has influenced word choice in scholarly publications

Words showing increased frequency in 2024. (A) Frequencies in 2024 and frequency ratios (r). Both axes are on a log scale. Only a subset of points is labeled for visual clarity. The dotted line shows the threshold defining excess words (see text). Words with r > 90 are shown at r = 90. Excess words were manually annotated as content words (blue) and style words (orange). (B) The same, but with the frequency difference (Δ) as the vertical axis. Words with Δ > 0.05 are shown at Δ = 0.05. Credit: Science Advances (2025). DOI: 10.1126/sciadv.adt3813

There is a good chance that you have unknowingly encountered compelling online content that was created, entirely or in part, by some version of a large language model (LLM). As AI tools such as ChatGPT and Google Gemini become more proficient at generating near-human-quality writing, it has become more difficult to distinguish purely human writing from content that was modified or entirely generated by LLMs.

This rise in questionable authorship has raised concerns in the academic community that AI-generated content has been quietly creeping into peer-reviewed publications.

To shed light on how widespread LLM content is in academic writing, a team of U.S. and German researchers analyzed more than 15 million biomedical abstracts on PubMed to determine whether LLMs have had a detectable impact on specific word choices in journal articles.

Their investigation revealed that since the emergence of LLMs, there has been a corresponding increase in the frequency of certain stylistic word choices in the academic literature. These data suggest that at least 13.5% of the papers published in 2024 were written with some amount of LLM processing. The results appear in the open-access journal Science Advances.

Since the release of ChatGPT less than three years ago, the prevalence of artificial intelligence (AI) and LLM content on the web has exploded, raising concerns about the accuracy and integrity of some research.

Previous efforts to quantify the rise of LLMs in academic writing, however, were limited by their reliance on sets of human-generated and LLM-generated text. This setup, the authors note, can introduce biases, because it requires assumptions about which models scientists use for their LLM-assisted writing and exactly how they prompt them.

To avoid these limitations, the authors of the latest study instead examined changes in the excess use of certain words before and after the public release of ChatGPT to uncover any telltale trends.

The researchers modeled their investigation on earlier COVID-19 public health research, which was able to infer the impact of COVID-19 on mortality by comparing excess deaths before and after the pandemic.

By applying the same before-and-after approach, the new study analyzed patterns of excess word use before and after LLMs emerged. The researchers found that after the release of LLMs, there was a significant shift away from the excess use of "content words" toward an excess use of "stylistic and flowery" word choices, such as "showcasing," "pivotal," and "grappling."
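The excess-word idea can be illustrated with a short Python sketch. It is not the authors' code: the linear projection from two pre-LLM years, the threshold values, the function names, and the toy frequencies are all assumptions made for illustration. The sketch compares a word's observed 2024 frequency with a projected counterfactual and reports the frequency difference (Δ) and the frequency ratio (r) mentioned in the figure caption.

```python
from dataclasses import dataclass


@dataclass
class ExcessStats:
    word: str
    observed: float  # share of 2024 abstracts containing the word
    expected: float  # counterfactual share projected from pre-LLM years
    delta: float     # frequency difference: observed - expected
    ratio: float     # frequency ratio: observed / expected


def project_linear(freq_2021: float, freq_2022: float, years_ahead: int = 2) -> float:
    """Extend the 2021-to-2022 trend forward (an assumed counterfactual), floored at 0."""
    return max(freq_2022 + years_ahead * (freq_2022 - freq_2021), 0.0)


def excess_stats(word: str, freq_2021: float, freq_2022: float,
                 freq_2024: float, floor: float = 1e-6) -> ExcessStats:
    """Compare the observed 2024 frequency with the projected one."""
    expected = project_linear(freq_2021, freq_2022)
    delta = freq_2024 - expected
    # The floor avoids division by zero for words that were absent before.
    ratio = freq_2024 / max(expected, floor)
    return ExcessStats(word, freq_2024, expected, delta, ratio)


def is_excess(stats: ExcessStats, min_delta: float = 0.01, min_ratio: float = 2.0) -> bool:
    """Flag a word as 'excess' using hypothetical thresholds, not the study's."""
    return stats.delta > min_delta or stats.ratio > min_ratio


# Toy frequencies, invented for illustration only.
toy_frequencies = {
    "grappling": (0.0002, 0.0003, 0.006),
    "baseline": (0.30, 0.31, 0.33),
}

for word, (f21, f22, f24) in toy_frequencies.items():
    stats = excess_stats(word, f21, f22, f24)
    print(f"{word}: delta={stats.delta:.4f}, ratio={stats.ratio:.1f}, "
          f"excess={is_excess(stats)}")
```

Using both measures is a deliberate choice in this sketch: a ratio highlights rare words whose use multiplies, while a difference highlights common words whose use grows by a large absolute amount.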

By manually assigning a part of speech to each excess word, the authors determined that before 2024, 79.2% of excess word choices were nouns. In 2024, there was a clearly identifiable shift: 66% of excess word choices were verbs and 14% were adjectives.

The team also identified notable differences in LLM use between research fields, countries, and publication venues.

Written for you by our author Charles Blue and edited by Andrew Zinin, this article is the result of careful human work. We rely on readers like you to keep independent science journalism alive. If this reporting matters to you, please consider a donation (especially monthly). You will get an ad-free account as a thank-you.

More information:
Dmitry Kobak et al, Delving into LLM-assisted writing in biomedical publications through excess vocabulary, Science Advances (2025). DOI: 10.1126/sciadv.adt3813

© 2025 Science X Network

Citation: Massive study detects AI fingerprints in millions of scientific papers (2025, July 6) retrieved 6 July 2025 from https://phys.org/news/2025-07-massive-ai-fingerprints-millions-sienific.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
