Thursday, January 30, 2025

DeepSeek AI Is the Competition America Needs


"The success of DeepSeek, the Chinese rival to American goliaths with radically more cost-effective artificial intelligence, reveals the futility of U.S. sanctions policies.

Under the Biden administration, the American government was captured by some of the world's most ham-handed national-security socialists, while the Chinese private sector under Xi Jinping commands some of the world's most nimble capitalists.

The entrepreneur behind DeepSeek's apparent breakthrough is Liang Wenfeng, who founded the High-Flyer hedge fund in 2015. Since DeepSeek's launch less than two years ago, the venture has received no further outside funding.

China has roughly nine times as many engineers as the U.S. and perhaps 15 times as many science and technology graduates.

That means Mr. Liang had a cornucopia of technical talent at his disposal, all galvanized by the challenge of doing AI without violating U.S. restrictions on the memory bandwidth of their Nvidia graphics processing units. These chips, like the leading GPUs in U.S. AI data centers, are nearly all fabricated by Taiwan Semiconductor Manufacturing Co.

"Do more with less" is the Chinese entrepreneurial answer to American "Stargate" program socialism, mobilizing a half-trillion dollars to do more with more, as governments and politicians usually try to do.

By discrediting U.S. sanctions and subsidies, again, Chinese capitalists are performing a service for U.S. capitalism.

American entrepreneurs are hamstrung by a putative $6 trillion in global climate-change mandates and subsidies for obsolete technologies, such as windmills and solar panels, specified by zero-sum Green New Dealers.

The U.S. has been dissipating the bonanzas conferred in recent decades on our economy by Chinese manufacturing prodigies from Foxconn in Shenzhen and other Chinese fabricators. Chinese factories have been crucial to enabling American companies to command as much as 70% of global equity market capitalization, compared with 10% at best for China.

DeepSeek, by using microchips more efficiently, is similarly favorable to the U.S. economy. As my chip-guru colleague John Schroeter wrote in his newsletter -- and both Nvidia's Jensen Huang and Microsoft's Satya Nadella have said -- semiconductors are an example of the Jevons Paradox. William Stanley Jevons, a 19th-century British economist, discovered that when a resource is rendered more efficient, we use more of it, often so much more that total spending on the resource rises. When people used only fire for lighting, the world was a very dark place. Nobel laureate William Nordhaus has pointed out that as we progressed from candles to oil lamps to incandescent lights and now LEDs, the cost of lighting dropped by 99.97%, yet we buy more of it than ever.

Advancing at an even faster pace, the number of transistors a dollar buys has increased by several million percent in 70 years. At the same time, annual global spending on semiconductors has grown from less than a few hundred million dollars to nearly $700 billion. The cheaper computing became, the more it was demanded.
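The Jevons argument above can be illustrated with a minimal sketch. It assumes a constant-elasticity demand curve, a textbook simplification that the article does not specify; the function names and the elasticity values are hypothetical and chosen only for illustration. Under that assumption, total spending rises as the price falls only when the elasticity is greater than 1.

```python
def quantity_demanded(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Assumed constant-elasticity demand curve: q = scale * price ** (-elasticity)."""
    return scale * price ** (-elasticity)


def total_spending(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spending on the resource: price times quantity demanded."""
    return price * quantity_demanded(price, elasticity, scale)


def price_drop_factor(percent_drop: float) -> float:
    """How many times cheaper a good is after a given percentage price drop."""
    return 1.0 / (1.0 - percent_drop / 100.0)


if __name__ == "__main__":
    # Hypothetical elasticities: 1.5 (elastic demand) and 0.5 (inelastic demand).
    for elasticity in (1.5, 0.5):
        before = total_spending(1.0, elasticity)
        after = total_spending(0.5, elasticity)  # the price halves
        trend = "rises" if after > before else "falls"
        print(f"elasticity={elasticity}: spending {trend} when the price halves")

    # The 99.97% drop in lighting cost cited in the article, as a ratio.
    print(f"lighting is about {price_drop_factor(99.97):,.0f} times cheaper")
```

With the assumed elasticity of 1.5, halving the price increases total spending, which is the Jevons case; with 0.5, spending falls. The 99.97% decline in the cost of lighting corresponds to a roughly 3,333-fold reduction in price.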

Today, the key breakthrough in technology isn't some ingenious trope in AI software but the emergence of an era altogether beyond microchips. Called wafer-scale integration, it obviates the usual data-center welter of chips and "chiplets" in plastic packages backed by snarls of wire and racks of computer servers. Instead, the new regime banishes chips and integrates the essence of an entire data center on a single 12-inch wafer. A wafer is a silicon slice that serves as the target for semiconductor lithography usually inscribing the design of thousands of separate chips. In wafer scale, by contrast, it is just one integral system.

Pioneering this breakthrough are U.S. companies such as Cerebras and Tesla. Cerebras, an AI computer innovator beyond chips, has demonstrated wafer-scale computing on about four trillion interconnected transistors. With finance from G42, a tech company in the United Arab Emirates, Cerebras had planned an initial public offering until it ran into resistance from the U.S. government based on possible links between China and the U.A.E.

The most advanced wafer-scale project is Tesla's Dojo system for AI training. It is based on the vast accumulation of video data from the cameras on Tesla's automobiles. This system is based not on chips or internet data, but on real sensory inputs and "training tiles," which are interconnected across entire wafers. Since large language models such as DeepSeek and ChatGPT use unreliable internet data, they are inherently less likely to achieve intelligence in the real world than the pixel processors on Tesla's Dojo tiles.

Working with Taiwan Semiconductor Manufacturing Co. to overthrow the existing data-center era, these ventures promise processing economies of a scale millions of times greater than anything contemplated at DeepSeek or other AI companies.

As outlined in a January 2024 article in the journal Nature, a team from Georgia Tech led by a Dutchman, Walter de Heer, achieved a further wafer-scale breakthrough using a layer of graphene atop a silicon carbide wafer. Because graphene, a two-dimensional carbon sheet, switches 1,000 times faster than silicon, Mr. de Heer's technology, the fruit of roughly 20 years of research, foreshadows a new epoch in the materials science behind information technology.

The chief obstacle to the success of such ventures is the U.S. national-security apparatus, which somehow imagines that by inflicting sanctions on China, it can help Americans. Beyond the huge challenges of replacing the existing paradigm of semiconductor fabrication, Mr. de Heer's main obstacle is his previous links with Tianjin University in China and his Chinese students at Georgia Tech. He is under investigation by a congressional committee on China for alleged links between his research and the Chinese military. Mr. de Heer said several of his students are back in China, collecting about $350 million in investments for a wafer-scale project.

Technology is the key adventure of human progress, and it is intrinsically global. The key test of the Trump administration will be whether it can come to terms with this fact of life and enterprise.

---

Mr. Gilder is author of "Gaming AI: Why AI Can't Think but Can Transform Jobs" and "Life After Capitalism: The Information Theory of Economics."" [1]

1. DeepSeek AI Is the Competition America Needs. Gilder, George. Wall Street Journal, Eastern edition; New York, N.Y. 30 Jan 2025: A17.


DeepSeek Isn't the Only Chinese Company Making Advances With AI Models


"When DeepSeek jolted the global tech world with its low-cost model, it also threw a spotlight on China's booming artificial-intelligence market, a sector that the Chinese government has identified as a national priority.

DeepSeek is just one of an array of companies developing AI models and applications. Here's a guide to a number of the biggest players in China.

The tech giants

Alibaba: The e-commerce giant provides conversational chatbot service Qwen, powered by multiple AI models, including some designed for more complex reasoning and coding tasks [1]. This week, Alibaba also released Qwen2.5-Max, an artificial-intelligence model it said was competitive with global leaders, including DeepSeek. It hasn't clarified whether it developed the model with the low cost and high efficiency that DeepSeek has boasted of.

Tencent: China's biggest videogame company has developed multiple versions of its AI model Hunyuan. It said one version released in November delivered performance comparable to Meta's Llama 3.1. According to some researchers, Tencent might use around a tenth of the computing power Meta used to train the model. The company is integrating AI capabilities into its WeChat app, the ubiquitous platform in China providing everything from chats to banking.

Baidu: Baidu, which first emerged as a search-engine company, was the first in China to launch a ChatGPT equivalent, called Ernie Bot. Its technology chief said in November that its model had 430 million users.

ByteDance: ByteDance, which is the owner of TikTok, has a chatbot called Doubao, which can be translated as bean bun. The app has been among the most downloaded chatbots in China, with around 60 million monthly active users, according to Aicpb.com, a website tracking AI products.

The upstarts

DeepSeek: The company surprised the global tech community this month after it said it had trained AI models that delivered high performance at low cost without the most advanced chips. Tuesday, it released a multimodal model, called Janus Pro, that it said could produce results comparable to OpenAI's text-to-image model DALL-E 3.

StepFun: The company, valued at around $2 billion, has a model that is now ranked for performance among the top 10 in the world in Chatbot Arena. Founded by a former senior Microsoft scientist, the company counts Tencent and the Shanghai government as key investors.

Moonshot AI: Moonshot's Kimi chatbot has around 13 million users in China, according to Aicpb.com. The startup, valued at around $3.3 billion and backed by Alibaba and Tencent, was founded by a young Chinese scientist who had stints at Meta and Google. This month, Moonshot released a multimodal reasoning model, called k1.5, that it said outperformed big names such as OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet on some major benchmarks, including a math challenge.

MiniMax: MiniMax is a Shanghai-based startup valued at $3 billion. It invented a Character.ai-like companion chatbot called Talkie, which has become popular in the U.S. This month, it published two open-source models that it claimed to be comparable to OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet, using a technique called Lightning Attention that allows faster computation.

Zhipu: Zhipu, valued at around $3 billion in its latest fundraising round in December, has invented a chatbot as well as a video-generating model called Ying that is similar to OpenAI's Sora. Zhipu was also included this month in a U.S. trade blacklist for developing AI systems that could have military uses. Zhipu said the U.S. move was baseless." [2]

1. "Qwen2-VL: To See the World More Clearly

Aug 29, 2024 — The model is not only capable of solving problems by analyzing pictures but can also interpret and solve complex mathematical problems through .

License

Both the open-source Qwen2-VL-2B and Qwen2-VL-7B are under Apache 2.0."

 



2. U.S. News: DeepSeek Isn't the Only Company Making Advances With AI Models. Huang, Raffaele. Wall Street Journal, Eastern edition; New York, N.Y. 30 Jan 2025: A6.