Saturday, February 22, 2025

Why AI Spending Isn't Going to Slow Down --- Soaring demand for reasoning models will consume electricity, microchips and data-center real estate for the foreseeable future


"Despite a brief period of investor doubt, money is pouring into artificial intelligence from big tech companies, national governments and venture capitalists at unprecedented levels. To understand why, it helps to appreciate the way that AI itself is changing.

The technology is shifting away from conventional large language models and toward reasoning models and AI agents.

Training conventional large language models -- the kind you've encountered in free versions of most AI chatbots -- requires vast amounts of power and computing time. But we're rapidly figuring out ways to reduce the amount of resources they need to run when a human calls on them.

Reasoning models, which are based on large language models, are different in that their actual operation consumes many times more resources, in terms of both microchips and electricity.

Since OpenAI previewed its first reasoning model, called o1, in September, AI companies have been rushing to release systems that can compete. This includes DeepSeek's R1, which rocked the AI world and the valuations of many tech and power companies at the beginning of this year, and Elon Musk's xAI, which just debuted its Grok 3 reasoning model.

DeepSeek caused a panic of sorts because it showed that an AI model could be trained for a fraction of the cost of other models, something that could cut demand for data centers and expensive advanced chips. But what DeepSeek really did was push the AI industry even harder toward resource-intensive reasoning models, meaning that computing infrastructure is still very much needed.

Owing to their enhanced capabilities, these reasoning systems will likely soon become the default way that people use AI for many tasks. OpenAI Chief Executive Sam Altman said the next major upgrade to his company's AI model will include advanced reasoning capabilities.

Why do reasoning models -- and the products they're a part of, like "deep research" tools and AI agents -- need so much more power? The answer lies in how they work.

AI reasoning models can easily use more than 100 times as much computing resources as conventional large language models, Nvidia's vice president of product management for AI, Kari Briski, wrote in a recent blog post. That multiplier comes from reasoning models spending minutes or even hours talking to themselves -- not all of which the user sees -- in a long "chain of thought." The amount of computing resources used by a model is proportional to the number of words generated, so a reasoning model that generates 100 times as many words to answer a question will use that much more electricity and other resources.
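As a rough sketch of that proportionality (a simplified model; the per-token cost and token counts below are illustrative placeholders, not figures from the article):

    # Compute scales roughly linearly with the number of tokens generated.
    # COMPUTE_PER_TOKEN and the token counts are made-up placeholder values.
    COMPUTE_PER_TOKEN = 1.0  # arbitrary units of compute per generated token

    def generation_compute(tokens_generated: int) -> float:
        """Estimate compute as proportional to tokens produced."""
        return tokens_generated * COMPUTE_PER_TOKEN

    short_answer = generation_compute(300)          # conventional LLM reply
    long_reasoning = generation_compute(300 * 100)  # hidden chain of thought included
    print(long_reasoning / short_answer)            # -> 100.0, the multiplier Briski cites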

Things can get even more resource-intensive when reasoning models access the internet, as Google's, OpenAI's and Perplexity's "deep research" models do.

These demands for computing power are just the beginning. As a reflection of that, Google, Microsoft and Meta Platforms are collectively planning to spend at least $215 billion on capital expenditures -- much of that for AI data centers -- in 2025. That would represent a 45% increase in their capital spending from last year.

To understand the projections of future AI demand, we can lay out a simple equation.

The first value in our equation is the amount of computing resources needed to process a single token of information in an AI like the one that powers ChatGPT.

In January, it appeared that the cost per token -- in both computing power and dollars -- would crash in the wake of the release of DeepSeek R1, the Chinese AI model. DeepSeek, with its accompanying paper, showed it was possible to both train and deliver AI in a way that was radically more efficient than the approaches previously disclosed by American AI labs.

On its face, this would seem to indicate that AI's future demand for computing power would be some fraction of its current amount -- say, a tenth, or even less. But the increase in demand from reasoning models when they are answering queries could more than make up for that.

Looked at in the most simplistic way: if new, more efficient AI models based on the insights that went into DeepSeek cut AI's demand for computing power to a tenth of what it is today, but reasoning models become the standard and multiply the computing demand per query by a factor of 100, that still works out to a 10-fold increase in future demand for AI computing power.
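Putting that arithmetic in one place, here is a minimal back-of-envelope sketch (the one-tenth and 100x factors are the illustrative numbers from the text; the normalized baseline of 1 is an assumption added here):

    # The "simple equation": future demand = baseline x (tokens-per-query
    # multiplier) / (per-token efficiency gain). The factors are the article's
    # illustrative numbers; the baseline is a normalization assumed here.
    baseline_demand = 1         # today's AI computing demand, normalized to 1
    reasoning_multiplier = 100  # reasoning models generate ~100x as many tokens per query
    efficiency_divisor = 10     # DeepSeek-style efficiency cuts per-token cost to a tenth

    future_demand = baseline_demand * reasoning_multiplier / efficiency_divisor
    print(future_demand)        # -> 10.0, i.e. still a 10-fold increase in demand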

This is just the starting point. As businesses are discovering that the new AI models are more capable, they're calling on them more and more often. This is shifting demand for computing capacity from training models toward using them -- or what's called "inference" in the AI industry.

Tuhin Srivastava, CEO of Baseten, which provides AI computing resources to other companies, says that this swing toward inference is already well under way.

His customers consist of tech companies that use AI in their apps and services, such as Descript, which allows content creators to edit audio and video directly from a transcript of a recording, and PicnicHealth, a startup that processes medical records. Baseten's customers are finding that they need more AI processing power as demand for their own products rapidly grows, says Srivastava.

"For one customer, we brought their costs down probably 60% six months ago, and within three months, they were already consuming at a higher level than they were consuming initially," he adds.

All of the big AI labs at companies like OpenAI, Google and Meta are still trying to best one another by training ever-more-capable AI models. Whatever the cost, the prize is capturing as much of the still-nascent market for AI as possible.

"I think it's entirely possible that frontier labs need to keep pumping in staggering amounts of money in order to push the frontier forward," says Chris Taylor, CEO of Fractional AI, a San Francisco-based startup that helps other software companies build and integrate custom AIs. His company, like Baseten and many others in the blossoming AI ecosystem, relies on those cutting-edge models to deliver results for its own customers.

Over the next couple of years, new innovations and more AI-specific microchips could mean systems that deliver AI to end customers become 1,000 times more efficient than they are today, says Tomasz Tunguz, a venture capitalist and founder of Theory Ventures. The bet that investors and big tech companies are making, he adds, is that over the course of the coming decade, the amount of demand for AI models could go up by a factor of a trillion or more, thanks to reasoning models and rapid adoption.

"Every keystroke in your keyboard, or every phoneme you utter into a microphone, will be transcribed or manipulated by at least one AI," says Tunguz. And if that's the case, he adds, the AI market could soon be 1,000 times larger than it is today." [1]

1. Mims, Christopher. "Why AI Spending Isn't Going to Slow Down --- Soaring demand for reasoning models will consume electricity, microchips and data-center real estate for the foreseeable future." Wall Street Journal, Eastern edition; New York, N.Y., 22 Feb 2025: B2.

 
