Saturday, December 21, 2024

The Next Great Leap in AI Is Behind Schedule and Crazy Expensive --- OpenAI faces an existential question: If the smartest minds working on artificial intelligence can't make ChatGPT better, is AI itself doomed to disappoint?


"OpenAI's new artificial-intelligence project is behind schedule and running up huge bills. It isn't clear when -- or if -- it'll work. There may not be enough data in the world to make it smart enough.

The project, officially called GPT-5 and code-named Orion, has been in the works for more than 18 months and is intended to be a major advancement in the technology that powers ChatGPT. OpenAI's closest partner and largest investor, Microsoft, had expected to see the new model around mid-2024, say people with knowledge of the matter.

OpenAI has conducted at least two large training runs, each of which entails months of crunching huge amounts of data, with the goal of making Orion smarter. Each time, new problems arose and the software fell short of the results researchers were hoping for, people close to the project say.

At best, they say, Orion performs better than OpenAI's current offerings, but hasn't advanced enough to justify the enormous cost of keeping the new model running. A six-month training run can cost around half a billion dollars in computing costs alone, based on public and private estimates of various aspects of the training.

OpenAI and its brash chief executive, Sam Altman, sent shock waves through Silicon Valley with ChatGPT's launch two years ago. AI promised to continually exhibit dramatic improvements and permeate nearly all aspects of our lives. Tech giants could spend $1 trillion on AI projects in the coming years, analysts predict.

The weight of those expectations falls mostly on OpenAI, the company at ground zero of the AI boom.

The $157 billion valuation investors gave OpenAI in October is premised in large part on Altman's prediction that GPT-5 will represent a "significant leap forward" in all kinds of subjects and tasks.

GPT-5 is supposed to unlock new scientific discoveries as well as accomplish routine human tasks like booking appointments or flights. Researchers hope it will make fewer mistakes than today's AI, or at least acknowledge doubt -- something of a challenge for the current models, which can produce errors with apparent confidence, known as hallucinations.

AI chatbots run on underlying technology known as a large language model, or LLM. Consumers, businesses and governments already rely on them for everything from writing computer code to spiffing up marketing copy and planning parties. OpenAI's is called GPT-4, the fourth LLM the company has developed since its 2015 founding.

While GPT-4 acted like a smart high-schooler, the eventual GPT-5 would effectively have a Ph.D. in some tasks, a former OpenAI executive said. Earlier this year, Altman told students in a talk at Stanford University that OpenAI could say with "a high degree of scientific certainty" that GPT-5 would be much smarter than the current model.

There are no set criteria for determining when a model has become smart enough to be designated GPT-5. OpenAI can test its LLMs in areas like math and coding. It's up to company executives to decide whether the model is smart enough to be called GPT-5 based in large part on gut feelings or, as many technologists say, "vibes."

So far, the vibes are off.

OpenAI and Microsoft declined to comment for this article. In November, Altman said the startup wouldn't release anything called GPT-5 in 2024.

From the moment GPT-4 came out in March 2023, OpenAI has been working on GPT-5.

Longtime AI researchers say developing systems like LLMs is as much art as science. The most respected AI scientists in the world are celebrated for their intuition about how to get better results.

Models are tested during training runs, a sustained period when the model can be fed trillions of word fragments known as tokens. A large training run can take several months in a data center with tens of thousands of expensive and coveted computer chips, typically from Nvidia.

During a training run, researchers hunch over their computers for several weeks or even months, and try to feed much of the world's knowledge into an AI system using some of the most expensive hardware in far-flung data centers.

Altman has said training GPT-4 cost more than $100 million. Future AI models are expected to push past $1 billion. A failed training run is like a space rocket exploding in the sky shortly after launch.

Researchers try to minimize the odds of such a failure by conducting their experiments on a smaller scale -- doing a trial run before the real thing.

From the start, there were problems with plans for GPT-5.

In mid-2023, OpenAI started a training run that doubled as a test for a proposed new design for Orion. But the process was sluggish, signaling that a larger training run would likely take an incredibly long time, which would in turn make it outrageously expensive. And the results of the project, dubbed Arrakis, indicated that creating GPT-5 wouldn't go as smoothly as hoped.

OpenAI researchers decided to make some technical tweaks to strengthen Orion. They also concluded they needed more diverse, high-quality data. The public internet didn't have enough, they felt.

Generally, AI models become more capable the more data they gobble up. For LLMs, that data is primarily from books, academic publications and other well-respected sources. This material helps LLMs express themselves more clearly and handle a wide range of tasks.

For its prior models, OpenAI used data scraped from the internet: news articles, social-media posts and scientific papers.

To make Orion smarter, OpenAI needs to make it larger. That means it needs even more data, but there isn't enough.

"It gets really expensive and it becomes hard to find more equivalently high-quality data," said Ari Morcos, CEO of DatologyAI, a startup that builds tools to improve data selection. Morcos is building models with less -- but much better -- data, an approach he argues will make today's AI systems more capable than the strategy embraced by all top AI firms like OpenAI.

OpenAI's solution was to create data from scratch.

It is hiring people to write fresh software code or solve math problems for Orion to learn from. The workers, some of whom are software engineers and mathematicians, also share explanations for their work with Orion.

Many researchers think code, the language of software, can help LLMs work through problems they haven't already seen.

Having people explain their thinking deepens the value of the newly created data. It's more language for the LLM to absorb; it's also a map for how the model might solve similar problems in the future.

"We're transferring human intelligence from human minds into machine minds," said Jonathan Siddharth, CEO and co-founder of Turing, an AI-infrastructure company that works with OpenAI, Meta and others.

In AI training, Turing executives said, a software engineer might be prompted to write a program that efficiently solves a complex logic problem. A mathematician might have to calculate the maximum height of a pyramid constructed out of one million basketballs. The answers -- and, more important, how to reach them -- are then incorporated into the AI training materials.
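
As an illustration only, here is what one such worked example might look like once packaged for training. The record layout, the field names and the square-pyramid reading of the basketball problem are all assumptions made for this sketch, not details from the article or OpenAI's actual schema.

```python
# A minimal sketch of a worked problem plus human-written reasoning
# packaged as a single training record. The schema is hypothetical.
import json

record = {
    "prompt": ("What is the height of a square pyramid stacked from "
               "one million basketballs?"),
    "reasoning": (
        "A square pyramid with n layers holds 1^2 + 2^2 + ... + n^2 = "
        "n(n+1)(2n+1)/6 balls. The smallest n reaching 1,000,000 is "
        "n = 144 (1,005,720 balls). At roughly 0.24 m per ball, a naive "
        "one-diameter-per-layer estimate gives about 35 m."
    ),
    "answer": "Around 144 layers, on the order of 35 meters.",
}
print(json.dumps(record, indent=2))
```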

OpenAI has worked with experts in subjects like theoretical physics to explain how they would approach some of the toughest problems in their fields. This can also help Orion get smarter.

The process is painfully slow. GPT-4 was trained on an estimated 13 trillion tokens. A thousand people writing 5,000 words a day would take months to produce a billion tokens.
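
Those figures are easy to sanity-check. A back-of-the-envelope calculation, assuming the common rule of thumb of roughly 0.75 English words per token (a ratio the article itself doesn't state):

```python
# Rough check of the article's pace claim; the words-per-token ratio
# is an assumed rule of thumb, not a figure from the article.
writers = 1_000
words_per_day = 5_000
words_per_token = 0.75

tokens_per_day = writers * words_per_day / words_per_token  # ~6.7 million
days_per_billion = 1e9 / tokens_per_day                     # ~150 days

print(f"{tokens_per_day:,.0f} tokens/day -> "
      f"{days_per_billion:.0f} days (~{days_per_billion / 30:.0f} months) "
      f"per billion tokens")
```

At that rate, hand-written text alone would take on the order of 13,000 times longer to match GPT-4's estimated 13 trillion training tokens, which is why such data can only supplement, not replace, scraped sources.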

OpenAI also started developing what is called synthetic data, or data created by AI, to help train Orion. The feedback loop of AI creating data for AI can often cause malfunctions or result in nonsensical answers, research has shown.

Scientists at OpenAI think they can avoid those problems by using data generated by another of its AI models, called o1, people familiar with the matter said.

OpenAI's already-difficult task has been complicated by internal turmoil and near-constant attempts by rivals to poach its top researchers, sometimes by offering them millions of dollars.

Last year, Altman was abruptly fired by OpenAI's board of directors, and some researchers wondered if the company would continue. Altman was quickly reinstated as CEO and set out to overhaul OpenAI's governance structure.

More than two dozen key executives, researchers and longtime employees have left OpenAI this year, including co-founder and Chief Scientist Ilya Sutskever and Chief Technology Officer Mira Murati. This past Thursday, Alec Radford, a widely admired researcher who served as lead author on several of OpenAI's scientific papers, announced his departure after about eight years at the company.

By early 2024, executives were starting to feel the pressure. GPT-4 was already a year old and rivals were starting to catch up. A new LLM from Anthropic was rated by many in the industry as better than GPT-4. Several months later, Google launched the most viral new AI application of the year, called NotebookLM.

As Orion stalled, OpenAI started developing other projects and applications. They included slimmed-down versions of GPT-4 and Sora, a product that can produce AI-generated video.

That led to fighting over limited computing resources between teams working on new products and Orion researchers, according to people familiar with the matter.

Competition among AI labs has grown so fierce that major tech companies publish fewer papers about recent findings or breakthroughs than is typical in science. As money flooded the market two years ago, tech companies started viewing the results of this research as trade secrets that needed guarding. Some researchers take this so seriously they won't work on planes, in coffee shops or anyplace else where someone could peer over their shoulder and catch a glimpse of their work.

That secretive attitude has frustrated many longtime AI researchers, including Yann LeCun, chief AI scientist at Meta. LeCun said work from OpenAI and Anthropic should no longer be viewed as research, but as "advanced product development."

"If you're doing it on a commercial clock, it's not called research," said LeCun on the sidelines of a recent AI conference, where OpenAI had a minimal presence. "If you're doing it in secret, it's not called research."

In early 2024, OpenAI prepared to give Orion another try, this time armed with better data. Researchers launched a couple of smaller-scale training runs over the first few months of the year to build up confidence.

By May, OpenAI's researchers decided they were ready to attempt another large-scale training run for Orion, which they expected to last through November.

Once the training began, researchers discovered a problem in the data: It wasn't as diversified as they had thought, potentially limiting how much Orion would learn.

The problem hadn't been visible in smaller-scale efforts and only became apparent after the large training run had already started. OpenAI had spent too much time and money to start over.

Instead, researchers scrambled to find a wider range of data to feed the model during the training process. It isn't clear if this strategy proved fruitful.

Orion's problems signaled to some at OpenAI that the more-is-more strategy, which had driven much of its earlier success, was running out of steam.

OpenAI isn't the only company worrying that progress has hit a wall. Across the industry, a debate is raging over whether improvement in AIs is starting to plateau.

Sutskever, who recently co-founded a new AI firm called Safe Superintelligence or SSI, declared at a recent AI conference that the age of maximum data is over. "Data is not growing because we have but one internet," he told a crowd of researchers, policy experts and scientists. "You can even go as far as to say that data is the fossil fuel of AI."

And that fuel was starting to run out.

Their struggles on Orion led OpenAI researchers to a new approach to making an LLM smarter: reasoning. Spending a long time "thinking" could allow LLMs to solve difficult problems they haven't been trained on, researchers say.

Behind the scenes, OpenAI's o1 offers several responses to each question and analyzes them to find the best one. It can perform more complex tasks, like writing a business plan or creating a crossword puzzle, while explaining its reasoning -- which helps the model learn a little bit from each answer.
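
The article doesn't disclose how o1 selects among its candidate answers, but the generate-several-then-pick-one pattern it describes resembles what researchers call best-of-N sampling. A minimal sketch, in which generate() and score() are hypothetical stand-ins for a model call and a grader, not OpenAI's API:

```python
import random  # used only to fake the stand-in functions below

def generate(prompt: str) -> str:
    """Stand-in for sampling one model response."""
    return f"candidate {random.randint(0, 999)} for: {prompt}"

def score(prompt: str, answer: str) -> float:
    """Stand-in for a verifier or reward model rating an answer."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Sample n candidate responses, then keep the highest-scoring one.
    # The extra samples are why reasoning models cost more per query.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score(prompt, ans))

print(best_of_n("Draft a business plan for a coffee cart."))
```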

Researchers at Apple recently released a paper arguing that reasoning models, including versions of o1, are most likely mimicking the data they saw in training rather than actually solving new problems.

The Apple researchers said they found "catastrophic performance drops" if questions were changed to include irrelevant details -- like tweaking a math problem about kiwis to note that some of the fruits were smaller than others.
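
For a sense of how small that tweak is, here is a sketch that paraphrases the style of the Apple team's kiwi example rather than quoting the paper exactly:

```python
# Paraphrased perturbation in the style the Apple researchers describe;
# the exact wording is illustrative, not copied from the paper.
original = ("Oliver picks 44 kiwis on Friday and 58 on Saturday. On "
            "Sunday he picks double Friday's amount. How many kiwis?")
perturbed = original.replace(
    "How many kiwis?",
    "Five of Sunday's kiwis were a bit smaller than average. "
    "How many kiwis?")

# The correct answer is unchanged: 44 + 58 + 2 * 44 = 190. The size
# detail is irrelevant, yet the paper reports models often subtract 5.
print(perturbed)
```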

In September, OpenAI launched a preview of its o1 reasoning model and released the full version of o1 earlier this month.

All that added brainpower is expensive. OpenAI is now paying to generate multiple answers to a single query, instead of just one.

In a recent TED talk, one of OpenAI's senior research scientists played up the advantages of reasoning.

"It turned out that having the bot think for just 20 seconds in a hand of poker got the same boost in performance as scaling up the model by 100,000x and training for 100,000 times longer," said Noam Brown, the OpenAI scientist.

A more advanced and efficient reasoning model could form the underpinnings of Orion. OpenAI researchers are pursuing that approach and hoping to combine it with the old method of more data, some of which could come from OpenAI's other AI models. OpenAI could then refine the results with material generated by people.

On Friday, Altman announced plans for a new reasoning model smarter than anything the company has released before. He didn't say anything about when, or whether, a model worthy of being called GPT-5 is coming." [1]

1. Seetharaman, Deepa. "The Next Great Leap in AI Is Behind Schedule and Crazy Expensive." Wall Street Journal, Eastern edition, New York, N.Y., 21 Dec 2024: B1.
