DeepSeek exposed a fundamental AI scaling myth

If scale is the answer, what is the question?

Tech CEOs have long argued that for exponential improvements and emergent properties to develop in AI, increased scale of computing power is all you need. This narrative has led to billions of dollars of investment in semiconductors. But in recent weeks, Chinese AI company DeepSeek has claimed to have produced models rivalling OpenAI's with a fraction of the compute and cost. Was the Tech CEO narrative about scale simply self-serving all along, and at its core, little more than a money grab? Dr Danyal Akarca here argues that scale is not sufficient to take AI to the new heights we have been promised.

 

It is hard to express the extent to which the last year has seen a seismic shift in AI, reshaping both the capabilities of these systems and the world's perception of what is possible. With perhaps some exceptions, it is fair to say these systems have exceeded many lofty expectations. Public opinion and understanding are evolving at a rapid pace. Governments are priming themselves for a new future.

At the same time, the economic landscape surrounding the building and deployment of AI is increasingly murky. As I write, NVIDIA is recovering from a 17% decrease in its market valuation—worth just shy of $600 billion—triggered by the (delayed) realisation that the breakthrough model from DeepSeek, China's most famous AI company, reportedly required orders of magnitude less capital to build. This was the largest single-day drop by any US-listed company in history. Commentators are now questioning the central economic assumptions about the cost of intelligence.

I believe that the core motif at the centre of this rapid recalibration of perception is our understanding of scale: what to scale, why scale and how to scale intelligence. I think it is increasingly clear that our relationship with what scaling means will effectively come to define the current era of AI acceleration.


 

Acing the Bitter Lesson

We now know that there is no intelligence without – at least some necessary level of – scale. This was Richard Sutton's Bitter Lesson: the most significant and lasting progress in AI will come from methods leveraging computation and general-purpose learning, not human-crafted solutions (as was once the dominant view). This lesson has proven consistently powerful. Indeed, the fundamental paradigm of intelligence today is fuelled by this core belief – scaling the right architecture with enough compute and data will deliver intelligence on tap. Our global society is currently undertaking one of the largest capital allocation projects in history to fuel this vision ad infinitum.

This prompts some fundamental questions about the future of our society. But here I will not focus on the broad implications for society. In the spirit of the current pandemonium caused by DeepSeek's disruption, I am going to focus on what it means for progress in AI moving forward.

 

Emergence with scale

