OpenAI claims that its new flagship model, GPT-5, marks “a significant step along the path to AGI” – that is, the artificial general intelligence that AI bosses and self-proclaimed experts regularly claim is just around the corner.
According to OpenAI’s own definition, AGI would be “a highly autonomous system that outperforms humans at most economically valuable work”.
Setting aside whether this is something humanity should be striving for, OpenAI CEO Sam Altman’s arguments for GPT-5 being a “significant step” in this direction sound remarkably unspectacular.
He claims GPT-5 is better at writing computer code than its predecessors. It is said to “hallucinate” a bit less, and to be a bit better at following instructions – especially when they require following multiple steps and using other software. The model is also apparently safer and less “sycophantic”, because it will not deceive the user or provide potentially harmful information just to please them.
Altman does say that “GPT-5 is the first time that it really feels like talking to an expert in any topic, like a PhD-level expert”. But it still doesn’t have a clue about whether anything it says is correct, as you can see from its attempt below to draw a map of North America.
Sam Altman: With GPT-5, you'll have a PhD-level expert in any area you need.
Me: Draw a map of North America, highlighting countries, states, and capitals.
GPT-5:
*Sam Altman forgot to mention that the PhD-level expert used ChatGPT to cheat on all their geography classes… pic.twitter.com/9L9VodXll1
— Luiza Jarovsky, PhD (@LuizaJarovsky) August 10, 2025
It also cannot learn from its own experience, or achieve more than 42% accuracy on a challenging benchmark like “Humanity’s Last Exam”, which contains hard questions on all kinds of scientific (and other) subject matter. That is slightly below the 44% that Grok 4, the model recently released by Elon Musk’s xAI, is claimed to have achieved.
The main technical innovation behind GPT-5 appears to be the introduction of a “router”. This decides which GPT model to delegate to when asked a question, essentially asking itself how much effort to invest in computing its answers (and then improving over time by learning from feedback about its earlier choices).
The options for delegation include the previous leading GPT models and also a new “deeper reasoning” model called GPT-5 Thinking. It is not clear what this new model actually is. OpenAI isn’t saying it is underpinned by any new algorithms or trained on any new data (since virtually all available data was already being used).
One might therefore speculate that this model is essentially just another way of controlling existing models with repeated queries, pushing them to work harder until they produce better results.
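To make the routing idea concrete, here is a minimal Python sketch of what such a mechanism might look like. Everything in it – the model names, the difficulty heuristic and the feedback rule – is an assumption for illustration; OpenAI has not published how GPT-5’s router actually works.

```python
# Illustrative sketch only: the model names, difficulty heuristic and feedback
# rule below are invented assumptions, not OpenAI's actual mechanism.

class Router:
    def __init__(self):
        # learned threshold: how hard a query must look before the expensive
        # "deeper reasoning" model is worth invoking
        self.threshold = 0.7

    def estimate_difficulty(self, prompt: str) -> float:
        # toy heuristic: longer, multi-step prompts look harder
        steps = prompt.count(" then ") + prompt.count("?")
        length_score = min(1.0, len(prompt.split()) / 100)
        return min(1.0, length_score + 0.2 * steps)

    def route(self, prompt: str) -> str:
        difficulty = self.estimate_difficulty(prompt)
        if difficulty >= self.threshold:
            return "gpt-5-thinking"      # slow, expensive, deeper reasoning
        if difficulty >= self.threshold / 2:
            return "gpt-5-standard"      # default model
        return "gpt-5-mini"              # cheap and fast

    def feedback(self, chosen: str, user_satisfied: bool) -> None:
        # crude learning rule: if a cheaper model disappointed the user,
        # lower the bar for escalating to the expensive one next time
        if not user_satisfied and chosen != "gpt-5-thinking":
            self.threshold = max(0.1, self.threshold - 0.05)

router = Router()
prompt = "Summarise this contract, then list the risks, then draft a reply."
choice = router.route(prompt)
print(choice)
router.feedback(choice, user_satisfied=False)
```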
What LLMs are
It was back in 2017 that researchers at Google discovered that a new type of AI architecture was capable of capturing the tremendously complex patterns within long sequences of words that underpin the structure of human language.
By training these so-called large language models (LLMs) on vast amounts of text, they could respond to prompts from a user by mapping a sequence of words to its most likely continuation according to the patterns present in the dataset. This approach to mimicking human intelligence became better and better as LLMs were trained on larger and larger amounts of data – leading to systems like ChatGPT.
Ultimately, these models just encode a humongous table of stimuli and responses. A user prompt is the stimulus, and the model might just as well look it up in a table to determine the best response. Considering how simple this idea seems, it is astounding that LLMs have eclipsed the capabilities of many other AI systems – if not in terms of accuracy and reliability, certainly in terms of flexibility and usefulness.
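As a caricature of that “table” picture, the toy Python sketch below counts which word tends to follow each word in a tiny made-up corpus, then “responds” to a prompt by repeatedly looking up the most likely continuation. Real LLMs learn these statistics implicitly in a neural network over whole contexts rather than in an explicit word-pair table, so this is an analogy, not how they actually work.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus; real models are trained on vast amounts of text.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# The "table": each word mapped to counts of the words seen right after it.
table = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    table[current][nxt] += 1

def continue_prompt(prompt: str, n_words: int = 4) -> str:
    words = prompt.split()
    for _ in range(n_words):
        last = words[-1]
        if last not in table:
            break
        # respond with the most likely continuation according to the table
        words.append(table[last].most_common(1)[0][0])
    return " ".join(words)

print(continue_prompt("the cat"))  # extends the prompt word by word
```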
The jury is still out on whether these systems might ever be capable of true reasoning, of understanding the world in ways similar to ours, or of keeping track of their experiences to refine their behaviour correctly – all arguably necessary ingredients of AGI.
In the meantime, an industry of AI software companies has sprung up that focuses on “taming” general-purpose LLMs to be more reliable and predictable for specific use cases. Having studied how to write the best prompts, their software might prompt a model several times, or use multiple LLMs, adjusting the instructions until it gets the desired result. In some cases, they might “fine-tune” an LLM with small-scale add-ons to make it more effective.
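The sketch below shows the general shape of such a wrapper in Python: prompt a model, check the output, tighten the instructions and try again. The `call_llm` function, the invoice task and the instruction tweaks are hypothetical placeholders, not any particular company’s product.

```python
import json

def call_llm(prompt: str) -> str:
    """Placeholder for a request to some LLM API (hypothetical, not a real client)."""
    raise NotImplementedError

def extract_invoice_total(document: str, max_attempts: int = 3) -> dict:
    instructions = 'Return the invoice total as JSON, e.g. {"total": 123.45}.'
    for _ in range(max_attempts):
        reply = call_llm(f"{instructions}\n\n{document}")
        try:
            data = json.loads(reply)
            if isinstance(data.get("total"), (int, float)):
                return data                      # got the desired result
        except json.JSONDecodeError:
            pass
        # tighten the instructions and try again
        instructions += " Respond with valid JSON only, no extra text."
    raise ValueError("no usable answer after several attempts")
```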
OpenAI’s new router is in the same vein, except that it is built into GPT-5. If this move succeeds, the engineers of companies further down the AI supply chain will be needed less and less. GPT-5 would also be cheaper for customers than its LLM competitors, because it would be more useful without these elaborations.
At the same time, this could be an admission that we have reached a point where LLMs cannot be improved much further to deliver on the promise of AGI. If so, it will vindicate those scientists and industry experts who have been arguing for some time that it won’t be possible to overcome the current limitations in AI without moving beyond LLM architectures.
Old wine in new models?
OpenAI’s new emphasis on routing also harks back to the “meta-reasoning” that gained prominence in AI in the 1990s, based on the idea of “reasoning about reasoning”. Imagine, for example, you were trying to calculate an optimal travel route on a complex map. Heading off in the right direction is easy, but every time you consider another 100 options for the remainder of the route, you will likely only get an improvement of 5% over your previous best choice. At every point of the journey, the question is how much more thinking it is worth doing.
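Written out as a toy Python calculation, using the made-up numbers from the example above (roughly a 5% improvement per extra batch of deliberation, at some fixed cost per batch), the decision looks like this. It illustrates the principle of weighing the value of further thinking against its cost; it is not a real planner.

```python
def refine(route_cost: float) -> float:
    """Pretend to examine another 100 options: assume roughly 5% improvement."""
    return route_cost * 0.95

def plan_with_metareasoning(initial_cost: float, thinking_cost: float = 2.0) -> float:
    cost = initial_cost
    while True:
        expected_gain = cost - refine(cost)   # value of thinking a bit more
        if expected_gain <= thinking_cost:    # further deliberation isn't worth it
            return cost
        cost = refine(cost)                   # think, then ask the question again

# Start with a route costing 100 units; stop refining once an extra round of
# thinking would save less than it costs.
print(plan_with_metareasoning(initial_cost=100.0))
```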
This kind of reasoning is important for dealing with complex tasks by breaking them down into smaller problems that can be solved with more specialised components. This was the predominant paradigm in AI until the focus shifted to general-purpose LLMs.
It is possible that the release of GPT-5 marks a shift in the evolution of AI which, even if it is not a return to this approach, might usher in the end of building ever more complicated models whose thought processes are impossible for anyone to comprehend.
Whether that would put us on a path towards AGI is hard to say. But it could create an opportunity to move towards building AIs we can control using rigorous engineering methods. And it might help us remember that the original vision of AI was not only to replicate human intelligence, but also to better understand it.
Michael Rovatsos, Professor of Artificial Intelligence, University of Edinburgh
This article is republished from The Conversation under a Creative Commons license. Read the original article.