GPT-5’s modest gains suggest AI progress is slowing down


GPT-5 is the latest version of OpenAI's large language model
Cheng Xin/Getty Images
The latest step forward in AI is less a giant leap than a tentative shuffle. OpenAI has released its new AI model, GPT-5, two years after the launch of GPT-4, whose success propelled ChatGPT towards global dominance. But despite promises of a similar jump in capability, GPT-5 appears to show little improvement over other leading AI models, suggesting the industry may need a new approach to building smarter AI systems.
OpenAI's own announcements hail GPT-5 as a "significant leap in intelligence" over the company's previous models, claiming improvements in coding, mathematics, writing, health information and visual understanding. It also promises less frequent hallucinations, that is, instances when an AI presents false information as true. On an internal benchmark measuring "performance on complex, economically valuable knowledge work", OpenAI says GPT-5 is "comparable to or better than experts in roughly half the cases … across tasks spanning over 40 occupations, including law, logistics, sales and engineering".
However, GPT-5's performance on public benchmarks is not substantially better than that of leading models from other AI companies, such as Anthropic's Claude or Google's Gemini. It is an improvement on GPT-4, but on many benchmarks the gap is smaller than the jump from GPT-3 to GPT-4. Many ChatGPT users have been unimpressed too, with examples of GPT-5 failing apparently simple requests receiving widespread attention on social media.
"Many people hoped there would be a breakthrough, and it is not a breakthrough," says Mirella Lapata at the University of Edinburgh, UK. "It's an upgrade, and it's somewhat incremental."
The most comprehensive measurements of GPT-5's performance come from OpenAI itself, because only the company has full access to the model. Few details of the internal benchmark have been made public, says Anna Rogers at the IT University of Copenhagen, Denmark. "As such, it isn't something that can be seriously discussed as a scientific claim."
In a press briefing before the model's launch, OpenAI's Sam Altman said that "GPT-5 is the first time that it really feels like talking to an expert in any topic, like a PhD-level expert". But this isn't supported by benchmarks, says Rogers, and it is unclear how having a PhD relates to intelligence more generally. "Very intelligent people do not necessarily have PhDs, and having such a degree does not necessarily guarantee high intelligence," says Rogers.
The apparently modest improvements in GPT-5 could be a sign of broader difficulties facing AI developers. Until recently, it was thought that large language models (LLMs) become more capable when given more training data and computing power. This no longer appears to be borne out by the results of the latest models, and companies have failed to find better designs for AI systems than those that propelled ChatGPT. "Everyone has the same recipe at the moment, and we know what the recipe is," says Lapata, referring to pre-training models on vast amounts of data and then fine-tuning them with post-training processes.
However, it is difficult to say how close LLMs are to stagnating, because we don't know exactly how models like GPT-5 are designed, says Nikos Aletras at the University of Sheffield, UK. "Trying to make generalisations about [whether] large language models have hit a wall could be premature. We can't really make these claims without any information about the technical details."
OpenAI has also worked on other ways to make its product more efficient, such as GPT-5's new routing system. Unlike previous versions of ChatGPT, where people could choose which AI model to use, GPT-5 now analyses requests and directs them to a specific model that will use an appropriate amount of computing power.
This approach could be adopted more widely, says Lapata. "Reasoning models use a lot [of computation], and it takes time and money," she says. "If you can answer with a smaller model, we will see more of this in the future." But the decision has angered some ChatGPT users, prompting Altman to say the company plans to improve the routing process.
There are more positive signs for the future of AI in a separate OpenAI model, which achieved gold-medal scores in elite mathematics and coding competitions in the past month, something the best AI models couldn't do a year ago. Although details of how the model works are again scarce, OpenAI employees have said its success suggests the system has more general reasoning capabilities.
These competitions are useful for testing models on data they haven't seen during training, says Aletras, but they are still narrow tests of intelligence. Boosting a model's performance in one area can also make it worse in others, says Lapata, which can be difficult to track.
One area where GPT-5 has improved considerably is its price, which is now much lower than that of other models: Anthropic's best Claude model, for example, costs roughly 10 times more to process the same number of requests at the time of writing. But that could present its own problems in the long term, if OpenAI's revenue doesn't cover the huge costs it has signed up to in building and running new data centres. "The price is crazy. It's so cheap that I don't know how they can afford it," says Lapata.
Competition between the top AI models is fierce, particularly amid the expectation that the first model to pull ahead of the others will capture the bulk of the market. "All these big companies, they are trying to be the only winner, and it's hard," says Lapata. "You are a winner for three months."