AI’s Heavy Bet on Transformer Models: A Risky Gamble for AGI?

Big AI companies are pouring nearly all of their research and development budgets into pre-trained transformer models, betting that these systems can achieve human-level general intelligence. The strategy rests on backpropagation, the standard gradient-based algorithm for training deep neural networks (sketched below). But Ben Goertzel, who popularized the term "AGI" through his mid-2000s book Artificial General Intelligence (the term itself was suggested by DeepMind co-founder Shane Legg), remains skeptical.

"The commercial AI industry is just betting everything on copying GPT in various permutations, which in my view is a waste of resources because all these LLMs are kind of doing about the same thing."

"When something works, everyone wants to double and triple down on what worked," he says. But this concentration of resources around a single paradigm may be risky."

Why Transformer Models May Not Be the Path to AGI

Transformer models cost billions of dollars in compute to train and vast resources to operate. Labs have seen intelligence gains from scaling compute and data, but each increment of capability is getting more expensive: as models grow larger, the cost-benefit ratio may no longer justify the investment.
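Empirical scaling studies (e.g. Kaplan et al., 2020) describe this pattern with power laws: loss falls smoothly but ever more slowly as compute grows. The sketch below uses invented constants purely to illustrate the shape of such a curve; it is not a fit to any real model.

```python
# Illustrative only: loss under a hypothetical power-law scaling curve,
# loss(C) = L_INF + A / C**ALPHA, where C is training compute in FLOPs.
# The constants are made up; the diminishing-returns shape is the point.
L_INF, A, ALPHA = 1.7, 2.0, 0.05

def loss(compute):
    return L_INF + A / compute ** ALPHA

for c in [1e21, 1e22, 1e23, 1e24]:           # each step is 10x more compute
    print(f"compute {c:.0e} FLOPs -> loss {loss(c):.4f}")
```

Each tenfold increase in compute buys a smaller absolute improvement than the one before it, which is the cost-benefit squeeze described above.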

Goertzel argues that scale alone won’t achieve AGI without the right algorithms. He highlights a key limitation: unlike humans, transformer models cannot continually learn from new experiences. Their parameters are frozen once training ends, so every session starts from the same fixed weights, and nothing learned in one interaction carries over to the next beyond whatever fits in the context window.
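The "reset" Goertzel describes follows from how transformers are deployed: parameters are frozen at inference time, so nothing persists between sessions except text carried in the context window. A minimal PyTorch sketch of the distinction, using a stand-in module rather than a real transformer:

```python
import torch
import torch.nn as nn

# Stand-in for a trained model: any fixed nn.Module illustrates the point.
model = nn.Linear(8, 8)

# --- Deployment: weights are frozen; every session starts identically. ---
model.eval()
before = model.weight.clone()
with torch.no_grad():                        # no gradients, no updates
    _ = model(torch.randn(1, 8))             # "a user interaction"
assert torch.equal(model.weight, before)     # nothing was learned

# --- Continual learning would instead update weights per experience. ---
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
x, target = torch.randn(1, 8), torch.randn(1, 8)
nn.functional.mse_loss(model(x), target).backward()
opt.step()                                   # this step never runs in deployment
assert not torch.equal(model.weight, before)
```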

Exploring Alternatives to Transformers

Researchers at Google DeepMind, Microsoft, and Ilya Sutskever’s Safe Superintelligence are investigating alternative neural network architectures that support continual learning. Goertzel singles out DeepMind for its breadth: "DeepMind has incredible diversity within their AI team," he says, "and possesses a ‘deep bench’ of experience with alternate AI paradigms."

Yet the current AI landscape prioritizes refining existing methods over pursuing fundamentally different architectures, even though those alternatives may be better suited to human-level generalization.

Sakana AI’s Multi-Agent System: A Step Beyond Transformers?

Last week, Tokyo-based startup Sakana AI launched its beta product, Sakana Fugu. The company was founded in 2023 by Llion Jones, one of the eight co-authors of the paper that introduced transformer models, and former Google Brain researcher David Ha. Fugu is a multi-agent orchestration system designed to coordinate multiple frontier foundation models, including those from OpenAI, Google, and Anthropic.
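Sakana has not published Fugu’s internals, so the sketch below is only a generic illustration of what multi-agent orchestration means: a coordinator assigns roles to several models and chains their outputs. The call_model function and model names are hypothetical placeholders, not Sakana’s or any vendor’s actual API.

```python
from dataclasses import dataclass

# Generic multi-agent orchestration sketch. NOT Sakana Fugu's design:
# call_model and the model names are hypothetical placeholders that
# would be real vendor API clients in an actual system.

@dataclass
class Agent:
    name: str
    role: str  # e.g. "draft", "critique", "synthesize"

def call_model(agent: Agent, prompt: str) -> str:
    # Placeholder: a real orchestrator would call the vendor's API here.
    return f"[{agent.name} as {agent.role}] response to: {prompt[:60]}"

def orchestrate(task: str) -> str:
    drafter = Agent("frontier-model-a", "draft")
    critic = Agent("frontier-model-b", "critique")
    editor = Agent("frontier-model-c", "synthesize")

    draft = call_model(drafter, task)
    critique = call_model(critic, f"Critique this answer: {draft}")
    return call_model(editor, f"Revise using the critique: {critique}")

print(orchestrate("Summarize the risks of single-paradigm AI research."))
```

Note that every agent in such a pipeline is still a transformer underneath; the orchestration layer adds coordination, not a new learning algorithm.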

Could AGI Emerge Within Years?

Goertzel remains optimistic about AGI’s near-term potential, but emphasizes that getting there will likely require moving beyond scaled-up LLMs.

The question remains: Will the AI industry’s heavy investment in transformers delay or accelerate the arrival of true AGI?