Monday, 19 January 2026

‘Another DeepSeek moment’: Chinese AI model Kimi K2 stirs excitement

 

“The latest version of the chatbot, developed by start-up Moonshot AI, is open for researchers to build on.

 

Excitement is growing among researchers about another powerful artificial intelligence (AI) model to emerge from China, after DeepSeek shocked the world with its launch of R1 in January 2025.

 

The performance of Kimi K2, launched on 11 July by Beijing-based company Moonshot AI, matches or surpasses that of Western rivals, as well as some DeepSeek models, across various benchmarks, according to the firm. In particular, it seems to excel at coding, scoring highly in tests such as LiveCodeBench.

 

As with DeepSeek’s models, Kimi K2 is open-weight, meaning it can be downloaded and built on by researchers for free. It can be accessed through an application programming interface (API) for a fraction of the price of leading proprietary models, such as Claude 4 from Anthropic in San Francisco, California.

 

“The community can freely use it, fine-tune it and build on it without training their own model from scratch,” says Adina Yakefu, an AI researcher at the open-science platform Hugging Face in New York City. Just one day after its launch, Kimi K2 was downloaded at a rate higher than that for any other model on the platform, Hugging Face data show. Its release is “another ‘DeepSeek moment’”, Yakefu says.

 

Unlike many other powerful models, K2 is not a ‘reasoner’ — a model trained to approach queries using step-by-step logic. Instead, it specializes in being an agentic large language model (LLM), meaning that it promises to carry out multi-step tasks using a variety of tools, such as browsing the web or calling on mathematics software. Some models, including some versions of ChatGPT, can already do this, but they are proprietary. AI researchers are still checking whether they can replicate examples of agentic behaviour that Moonshot AI says Kimi K2 can do.

The next top model

 

Having a second impressive model emerge from China in six months suggests that the feat was not an anomaly. “The DeepSeek R1 release earlier this year was more of a prequel than a one-off fluke in the trajectory of AI,” wrote Nathan Lambert, a machine-learning researcher at the Allen Institute for AI in Seattle, Washington, in his newsletter, Interconnects. Kimi K2 is “the new best open model in the world”, he posted on the social-media site Bluesky.

 

Moonshot AI, founded in March 2023, is a start-up organization that, until now, has been little known in the West. But its Kimi chatbot, based on a previous LLM, was already the third-most-used in China by November, according to Counterpoint, a marketing-research firm in Hong Kong. Chinese technology giants Alibaba and Tencent are reportedly among its investors.

 

Kimi K2 is as hefty as its backers, with one trillion parameters — the adjustable values that denote the strength of associations in the model. This many parameters would be very challenging for smaller laboratories to run, says Lambert. However, K2 activates only 32 billion parameters at a time, using a ‘mixture of experts’ architecture that allows it to use just the relevant parts of the model for each task, which helps to temper the amount of computing power it requires.
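
The routing idea behind a 'mixture of experts' can be illustrated with a toy sketch. This is not Moonshot AI's implementation: the eight experts, the router scores and the choice of two active experts are all hypothetical values chosen for illustration, and in most such architectures the routing happens per token rather than per task. Only the figures of 32 billion active parameters out of one trillion come from the article.

```python
import math


def top_k_route(scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, router_scores, k):
    """Combine the outputs of only the k selected experts; the rest never run."""
    routed = top_k_route(router_scores, k)
    return sum(weight * experts[i](x) for i, weight in routed)


# Hypothetical toy setup: 8 experts, each a simple scaling function.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one input (a real router computes these).
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

chosen = top_k_route(router_scores, k=2)
output = moe_forward(1.0, experts, router_scores, k=2)

# Figures from the article: 32 billion of one trillion parameters are active.
active_fraction = 32e9 / 1e12
print(chosen, output, f"{active_fraction:.1%} of parameters active")
```

In this sketch only two of the eight expert functions are ever called for a given input, which mirrors why a model that activates roughly 3% of its parameters at a time needs far less compute per query than its total size suggests. All the parameters must still be stored, which is consistent with Lambert's point that a model of this size is hard for smaller laboratories to run.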

 

As well as coding, Kimi K2 seems to have a flair for writing. Some AI commenters on the social-media platform X praised its writing style for sounding unlike that of a typical AI. The model currently tops the leaderboard on the Creative Writing v3 benchmark, which tests criteria such as the authenticity of characters and avoidance of cliches, and the EQ-bench 3, which examines models’ emotional intelligence in role-play scenarios.

Not so science-y

 

But K2 does not excel at every task. On SciMuse — a benchmark that evaluates how well AIs predict which ideas human researchers will find interesting — it came in behind cutting-edge Gemini algorithms from Google and OpenAI’s suite of reasoning models, says Mario Krenn, who leads the Artificial Scientist Lab at the Max Planck Institute for the Science of Light in Erlangen, Germany.

 

Still, Moonshot AI is one of several Chinese firms deciding to publish their models openly, says Yakefu. The United States needs an open model of the calibre of those being produced by DeepSeek and Moonshot AI to counteract the country’s diminishing influence in open-source and academic communities, adds Lambert, who refers to this goal as the American DeepSeek Project.

 

“It’s very clear that a large number of top machine-learning researchers and engineers, with exceptional hardware, have been behind this effort,” says Krenn. “I wouldn’t be surprised if more will come [from China] in the next months.”” [1]

 

1. Gibney, E. Nature 643, 889–890 (2025).
