Tuesday, January 28, 2025

Do China’s A.I. Advances Mean U.S. Technology Controls Have Failed? They Do.

 

"DeepSeek’s A.I. models show that China is making rapid gains in the field, despite American efforts to hinder it.

The United States has worked steadily over the past three years to limit China’s access to the cutting edge computer chips that power advanced artificial intelligence systems. Its aim has been to slow China’s progress in developing sophisticated A.I. models.

Now a Chinese firm, DeepSeek, has created that very technology. In recent weeks, DeepSeek released multiple A.I. models and a chatbot whose performance rivals that of the best products made by American firms, all while using far fewer of the high-cost A.I. chips that companies typically need. Over the weekend, DeepSeek’s chatbot shot to the top of Apple’s App Store charts as people downloaded it around the world.

The development has raised big questions about export controls built by the United States in recent years. The Biden administration set up a system of global rules and steadily expanded them to try to keep advanced A.I. technology — particularly chips made by Nvidia — out of Chinese hands. They were concerned that technology would give China an edge not just economically, but also militarily.

DeepSeek’s development has provoked a fierce debate over whether U.S. technology controls have failed. Here’s what to know.

DeepSeek’s innovations suggest the Biden administration may have acted too slowly to keep up with private companies sidestepping its controls.

DeepSeek has said that its most recent model was trained on Nvidia H800s. This is an A.I. chip that Nvidia developed specifically for the Chinese market after export controls were first imposed, and that caused a fair amount of drama in Washington.

When the United States put restrictions on Nvidia’s most advanced chips in 2022, Nvidia quickly adapted by creating slightly downgraded chips that fell just under the threshold the government had set. These chips were technically legal for Chinese companies to use, but allowed them to achieve practically the same results.

This angered Biden officials, and they moved to restrict the new chips as well. But the government moved slowly, and it took them about a year to ban the H800 and other downgraded chips. In the meantime, Chinese companies stockpiled a lot of them.

It’s not clear how DeepSeek obtained its Nvidia H800s, but it would have been legal for the company to buy them in late 2022 or 2023. Now, however, such purchases would not be.

“You can’t control what’s already there,” said Jimmy Goodrich, a senior adviser for technology analysis at the RAND Corporation. “Had the Biden administration more quickly responded and limited the H800 to China, there’s no doubt DeepSeek would have been more challenged in putting this model out.”

DeepSeek also spent years building up its chip supply before Washington’s controls took effect. By 2021, DeepSeek was one of just a handful of Chinese companies that had acquired at least 10,000 Nvidia A100s, the advanced chip Nvidia released in 2020, according to an interview with Liang Wenfeng, the founder of DeepSeek, in the Chinese media outlet 36Kr.

The U.S. has also struggled to stamp out chip smuggling.

There’s no evidence that DeepSeek has used smuggled chips. But many Chinese A.I. companies have. Alexandr Wang, the chief executive of the A.I. training giant Scale AI, told The New York Times that Chinese companies had far more high-end chips than U.S. restrictions allowed, and that DeepSeek probably had about 50,000 Nvidia advanced H100 processors, “which they obviously can’t talk about.”

Both Nvidia and the U.S. government have argued that the scale of smuggling was limited. But The Times last year reported an active trade in China in restricted A.I. technology. In a bustling market in Shenzhen, in southern China, chip vendors reported engaging in sales involving hundreds or thousands of restricted chips.

Representatives of 11 companies said they sold or transported banned Nvidia chips — including A100s and H100s, the company’s most advanced at the time — and The Times found dozens more businesses offering them online. One vendor in Shenzhen showed a reporter screenshots arranging deliveries of servers containing more than 2,000 of Nvidia’s most advanced chips, a transaction totaling $103 million.

Since then, more reports have emerged documenting large-scale smuggling, particularly through other countries in Asia.

The Biden administration released a sweeping regulation this month that aims to deal with the smuggling issue, by setting caps on the number of chips that Nvidia can sell to every country worldwide.

It remains to be seen what the Trump administration will do about it. In a trade executive order President Trump signed on his first day in office, however, he ordered his officials to review the U.S. export control system, including “how to identify and eliminate loopholes in existing export controls.”

U.S. controls appear to have encouraged Chinese ingenuity — but they have also clearly held back China’s A.I. development.

American technology restrictions appear to have accelerated the efforts of Chinese researchers to try to do more with less.

The most notable thing about DeepSeek’s model is that, according to the company, it was developed with just a fraction of the high-priced chips that Western companies have used to make similar technology. DeepSeek’s engineers said they used only about 2,000 Nvidia chips, whereas most top companies have trained chatbots using 16,000 chips or more. Nvidia’s shares plunged sharply on Monday on fears that technology companies will be able to do cutting-edge A.I. in the future while paying Nvidia far less.
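To make the scale gap concrete, here is a minimal arithmetic sketch using only the two chip counts quoted above. The variable names are illustrative, and because the article says "16,000 chips or more", the ratio should be read as a lower bound rather than an exact figure.

```python
# Chip counts as quoted in the article; names are illustrative only.
deepseek_chips = 2_000   # "only about 2,000 Nvidia chips"
typical_chips = 16_000   # "16,000 chips or more" (a lower bound)

# How many times larger a typical training cluster is, at minimum.
chip_ratio = typical_chips / deepseek_chips

# DeepSeek's chip count as a share of a typical cluster, at most.
deepseek_share = deepseek_chips / typical_chips

print(f"Typical cluster is at least {chip_ratio:.0f}x larger")
print(f"DeepSeek used at most {deepseek_share:.1%} of a typical cluster")
```

On these figures, DeepSeek's reported cluster is roughly one-eighth the size of what the article describes as typical for top companies.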

Jeffrey Ding, a professor at George Washington University who studies emerging technologies, said that most global companies have been using ever-larger amounts of computing power and data to improve A.I. performance. But DeepSeek and other Chinese firms had been “forced to go down this other pathway to find out whether we can get good enough performance with lower training costs and less compute,” he said.

The implications of cheaper models like DeepSeek’s could be profound. With DeepSeek openly sharing details about how it built its model, companies in China and around the world will be able to replicate its low-cost approach.

That means “it will be much cheaper and could be far less energy intensive for anyone to build and run A.I., from U.S. hyperscalers to Midwestern small businesses, North Korean hackers and Russia’s military,” said Martin Chorzempa, a senior fellow at the Peterson Institute for International Economics.

Still, China would likely be much further ahead in A.I. without the export controls. In interviews, DeepSeek’s founder has acknowledged that the lack of access to computing power was a limitation for the company.

Unlike American A.I. companies, DeepSeek will not be able to legally purchase the newest generation of A.I. chips that Nvidia is rolling out right now, which multiplies the speed and performance of the previous chips.

“Anyone worried about what DeepSeek can do today would be more worried if they had done it with access to the far superior computing resources their U.S. competitors have,” Mr. Chorzempa said.

DeepSeek’s success suggests that Silicon Valley’s lead on A.I. has shrunk, despite efforts by Washington to limit Chinese access to the advanced chips. But it’s notable that DeepSeek is still building its models on Nvidia chips — not on the rival A.I. chips that the Chinese technology firm Huawei is trying to develop.

Some Chinese computer engineers have suggested it would be possible to run the latest DeepSeek model on a larger number of less advanced chips, including those made by Huawei, even though Huawei’s A.I. chips are much lower performing.

But no Chinese company is yet able to make advanced A.I. chips that rival Nvidia’s, or makes the type of complex machinery needed to make those chips. “The only advantage the United States still has over China at this moment is in hardware,” Mr. Goodrich said." [1]

Conclusion: We can all achieve reasonably good A.I. results with lower training costs and less computation. This is a huge win for the world and a defeat for Biden, who tried to block the world's progress.

1. Do China’s A.I. Advances Mean U.S. Technology Controls Have Failed? Swanson, Ana; Tobin, Meaghan. New York Times (Online), New York Times Company. Jan 28, 2025.

 

The DeepSeek AI Freakout


"Who saw that coming? Not Wall Street, which sold off tech stocks on Monday after the weekend news that a highly sophisticated Chinese AI model, DeepSeek, rivals Big Tech-built systems but cost a fraction to develop. The implications are likely to be far-reaching, and not merely in equities.

The tech-heavy Nasdaq fell 3.1%, driven by a 16.9% dive in Nvidia shares. Nvidia dominates the market in advanced AI chips. Its stock had surged more than 10-fold since early 2023 -- achieving a more than $3.3 trillion market valuation until Monday -- as tech giants announced hefty outlays on AI.

Enter DeepSeek, which last week released a new R1 model that claims to be as advanced as OpenAI's on math, code and reasoning tasks. Tech gurus who inspected the model agreed. One economist asked R1 how much Donald Trump's proposed 25% tariffs will affect Canada's GDP, and it spit back an answer close to that of a major bank's estimate in 12 seconds. Along with the detailed steps R1 used to get to the answer.

More startling, DeepSeek required far fewer chips to train than other advanced AI models and thus cost only an estimated $5.6 million to develop. Other advanced models cost in the neighborhood of $1 billion. Venture capitalist Marc Andreessen called it "AI's Sputnik moment," and he may be right.
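The size of the cost gap can be checked with a short calculation based on the two estimates quoted above. This is only a sketch: both dollar figures are the editorial's rough estimates ("an estimated $5.6 million" and "in the neighborhood of $1 billion"), and the variable names are illustrative.

```python
# Development cost estimates as quoted in the editorial (both approximate).
deepseek_cost_usd = 5.6e6   # "an estimated $5.6 million"
other_cost_usd = 1e9        # "in the neighborhood of $1 billion"

# How many times more the other models are estimated to cost.
cost_ratio = other_cost_usd / deepseek_cost_usd

# DeepSeek's estimated cost as a fraction of the other models' cost.
cost_fraction = deepseek_cost_usd / other_cost_usd

print(f"Other advanced models cost roughly {cost_ratio:.0f}x more")
print(f"DeepSeek's estimate is about {cost_fraction:.2%} of that figure")
```

Taken at face value, the quoted estimates imply a gap of roughly two orders of magnitude, which is the "fraction" the editorial refers to.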

DeepSeek is challenging assumptions about the computing power and spending needed for AI advances. OpenAI, Oracle and SoftBank last week made headlines when they announced a joint venture, Stargate, to invest up to $500 billion in building out AI infrastructure. Microsoft plans to spend $80 billion on AI data centers this year.

CEO Mark Zuckerberg on Friday said Meta would spend about $65 billion on AI projects this year and build a data center "so large that it would cover a significant part of Manhattan." Meta expects to have 1.3 million advanced chips by the end of this year. DeepSeek's model reportedly required as few as 10,000 to develop.

DeepSeek's breakthrough means these tech giants may not have to spend as much to train their AI models. But it also means these firms, notably Google's DeepMind, might lose their first-mover, technological edge. Google shares fell 4% on Monday. DeepSeek's model is open-source, meaning that other developers can inspect and fiddle with its code and build their own applications with it.

This could help give more small businesses access to AI tools at a fraction of the cost of closed-source models like OpenAI and Anthropic, which Amazon has backed. There are advantages to such closed-source systems, especially for privacy and national security. But open-source can foster more collaboration and experimentation.

It's notable that DeepSeek is a startup founded by Liang Wenfeng, a Chinese hedge fund trader. Americans think of China's economy as run top-down, and much of it is. But its growth over the last few decades, especially in tech, has been spurred by entrepreneurs. Alibaba, Tencent and ByteDance were all once startups that now rival U.S. tech giants.

This is another reason for the U.S. to avoid the trap of thinking it must imitate Chinese industrial policy to succeed in the AI race. A bipartisan Senate AI report last spring called for Congress to pass $32 billion a year in "emergency" spending for non-defense AI, supposedly to better compete with China. What a waste of money that would be.

---

DeepSeek is vindicating President Trump's decision to rescind a Biden executive order that gave government far too much control over AI. Companies developing AI models that pose a "serious risk" to national security, economic security, or public health and safety would have had to notify regulators when training their models and share the results of "red-team safety tests."

Mr. Biden said such tests are needed to eliminate biases, limitations and errors. But open-source models allow the public to review and test systems. Some have pointed out that DeepSeek doesn't answer questions on subjects that are politically sensitive to Beijing.

DeepSeek should also cause Republicans in Washington to rethink their antitrust obsessions with big tech. Bureaucrats aren't capable of overseeing thousands of AI models, and more regulation would slow innovation and make it harder for U.S. companies to compete with China. As DeepSeek shows, it's possible for a David to compete with the Goliaths. Let a thousand American AI flowers bloom." [1]

1. The DeepSeek AI Freakout. Wall Street Journal, Eastern edition; New York, N.Y. 28 Jan 2025: A18.