Friday, September 26, 2025

Books: Machines That Learn To Learn


 

“These Strange New Minds

 

By Christopher Summerfield

 

Viking, 384 pages, $32

 

The Scaling Era

 

By Dwarkesh Patel with Gavin Leech

 

Stripe, 248 pages, $35

 

In recent weeks, Mark Zuckerberg has been making head-spinning offers to the artificial-intelligence researchers of rival firms to join Meta's new "superintelligence" lab. You may be wondering what superintelligence is, and why a purveyor of social media would offer to pay a scientist hundreds of millions of dollars. To hear insiders tell it, nothing less than the fate of humanity is in play. At those stakes, nine-figure nerds make perfect sense.

 

The story begins, as it always does, before modern times. As Christopher Summerfield reminds us in "These Strange New Minds: How AI Learned to Talk and What It Means," the ancients treated gods as the source of knowledge. The Greeks, in turn, debated whether wisdom came from reason (Plato) or experience (Aristotle). That debate was taken up some two millennia later in the field of computer science, where many early AI systems followed logic diligently encoded by experts. But the world, being what it is, stubbornly defied such logic, presenting exceptions to every rule.

 

Experience-based approaches gradually took hold in the form of machine learning. Developers realized that if you present a machine-learning system with enough data, it will form its own nuanced understanding of how to, say, play chess or identify a cat.

 

Or use language. The current fervor in AI is over large language models (LLMs), the software engines that power chatbots such as ChatGPT, Claude, Gemini and Grok. Mr. Summerfield notes that the linguist Noam Chomsky has long argued that humans rely on innate wiring for grammar. It turns out LLMs do fine without it (albeit requiring much more training than a child).

 

LLMs learn by first processing lots of text and practicing predicting the next fragment -- an advanced form of autocomplete. Then they're trained to imitate more-curated data, follow instructions, or please human or AI evaluators. Recently, they've gone beyond language to process and generate multimedia and to use software tools, becoming semiautonomous "agents."
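
To make the autocomplete analogy concrete, here is a minimal, hypothetical sketch (not taken from either book) of next-word prediction using simple word counts in Python. Real LLMs replace the counting with an enormous neural network trained on far more text, but the underlying objective -- guess what comes next -- is the same.

    # Toy next-word predictor, purely illustrative of the "advanced
    # autocomplete" idea; the corpus and counting scheme are invented here.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ate".split()

    # Count how often each word follows each other word in the training text.
    next_counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        next_counts[prev][nxt] += 1

    def predict_next(word):
        """Return the most frequent continuation seen during training."""
        counts = next_counts.get(word)
        return counts.most_common(1)[0][0] if counts else "<unknown>"

    print(predict_next("the"))  # -> "cat" (seen twice after "the", vs. "mat" once)

A real model does the same kind of thing with probabilities over every possible next token, learned from trillions of words rather than a nine-word sentence.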

 

Mr. Summerfield, a cognitive neuroscientist at the University of Oxford and a staff research scientist at Google DeepMind, writes in genial, joke-laden prose, but includes the occasional barb. Some are aimed at critics who claim that LLM improvement is hitting a wall, or that LLMs don't really understand anything. Such detractors point to the models' obvious errors, such as the mangling of simple math problems, or note that LLMs think in word frequencies, with no experience navigating the physical world.

 

But AI's progress continues to surprise even experts, with models performing tasks that appear, to many reasonable observers, to require reasoning and common sense. In one example, an LLM explained to Mr. Summerfield why sunflowers could not grow on the moon: "There are no sunflowers or any other forms of life, plant or otherwise, on the moon. The moon lacks the atmosphere, water, and stable temperature required to support Earth-like forms." The author calls it "a watershed moment for humanity, in which the dream of automating knowledge seems to finally be within grasp." Some researchers have pejoratively called LLMs parrots, but the author prefers to use a different avian analogy: "If something swims like a duck and quacks like a duck, then we should assume that it probably is a duck."

 

Still, Mr. Summerfield acknowledges AI's stark differences from humans. LLMs don't have bodies, social relationships or (likely) consciousness. They don't read nonverbal cues or reliably grasp conversational context.

 

Ludwig Wittgenstein described language as a series of games, each with its own rules and goals: to instruct, to speculate, to amuse. LLMs sometimes play the wrong game, inventing, for instance, when they should be informing.

 

Some of Mr. Summerfield's barbs are aimed at AI accelerationists who dismiss the many potential hazards that lie ahead. As the author details, AIs can already confabulate, spread propaganda, invade privacy, seduce users, control killer drones and drive people to suicide. In the long term, they might take over corporations and governments (or entice us to hand over the reins) or even push us to extinction. Mr. Summerfield thinks that if disaster strikes, it will be the work not of one superintelligent agent but a swarm of smaller ones, which could cause flash crashes or other forms of mayhem. "It is almost always groups, and not individuals, that have been able to change the world," he writes. Preventing catastrophe, one assumes, will similarly require collective action. In that regard, we have room for improvement.

 

Some, including Richard Sutton, an eminent computer scientist, say that human replacement is not only inevitable but perhaps welcome. Mr. Sutton has thought about AI's ability to learn skills on its own. In 2019 he wrote that building AI based on "how we think we think" has not worked in the long run.

 

Instead, AI has improved by scaling up computation and focusing less on preprogrammed logic and more on learning.

 

Mr. Sutton's essay, "The Bitter Lesson," is reprinted in an appendix to Dwarkesh Patel's "The Scaling Era," co-authored with Gavin Leech.

 

Mr. Patel's book consists mainly of snippets of interviews conducted with AI insiders -- one woman and 19 men, including Mr. Zuckerberg -- on Mr. Patel's eponymous "Dwarkesh Podcast." In reading "The Scaling Era," at least three points stand out.

 

First, the trick to improving AI, particularly LLMs, is simple: scale.

 

Simple, however, does not mean easy. Training larger models for more time on more data raises technical challenges.

 

Demis Hassabis, the co-founder and chief executive of Google DeepMind, explains that each time you increase the size of a model 10-fold, "you have to adjust the recipe" -- meaning the "hyperparameters" such as how much the model should learn from each training example -- "and that's a bit of an art form."
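
Mr. Hassabis's "recipe" remark can be pictured with a toy, made-up configuration: hyperparameters such as the learning rate are typically retuned rather than copied when a model is scaled up tenfold. The values below are invented purely for illustration and reflect no real model.

    # Hypothetical training "recipes"; the numbers are invented for
    # illustration and do not describe any actual system.
    small_model = {"parameters": 1_000_000_000,  "learning_rate": 3e-4, "batch_size": 1_000}
    large_model = {"parameters": 10_000_000_000, "learning_rate": 1e-4, "batch_size": 4_000}

    # Scaling the model tenfold is not just "multiply everything by ten":
    # each knob must be re-chosen, often by costly trial and error.
    for name, cfg in (("1B-parameter model", small_model), ("10B-parameter model", large_model)):
        print(name, "->", {k: v for k, v in cfg.items() if k != "parameters"})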

 

It also raises resource challenges. Better models require more computer chips, more data, more power, more money. Mr. Patel's interviewees discuss the potential of data centers that cost a trillion dollars and require dedicated nuclear plants.

 

What much of the industry is aiming for is artificial general intelligence. AGI definitions vary, but it often refers to software that would match humans at a wide variety of tasks, such as running a business or designing and conducting scientific experiments. AGI would then presumably improve itself, quickly becoming superintelligent. Even without AGI, we're already finding ways for AI to design new algorithms and chips.

 

The second point that emerges from the book is that even if scaling were simple, what emerges is complex. In a discussion on alignment (the task of making AI ethical) and interpretability (the task of making it understandable), Dario Amodei, the CEO and co-founder of Anthropic (the maker of Claude), offers a clear-eyed perspective: "We really have very little idea what we're talking about." And that's about today's models. Superintelligent AI would be more opaque and capable of writing programs that exceed our grasp. "In those millions of lines of code," says Leopold Aschenbrenner, an AI researcher, "you don't know if it's hacking, exfiltrating itself, or trying to go for the nukes."

 

Third, when you combine AI's capability, complexity and consumption of resources, the problems quickly go from engineering to geopolitics. Mr. Aschenbrenner notes that if you build your data centers in the Middle East, foreign states might copy the trained models or usurp the hardware. Even data centers in the U.S. are at risk of being attacked, he says. He further imagines a country with a slight edge in AI inventing tiny drones that take out its adversaries' nuclear submarines.

 

"The Scaling Era" is a bit unwieldy, in both format and content. There are endnotes, footnotes and sidebar definitions, creating a staccato flow. The intended audience is unclear. Discussions of technical minutia sit alongside explanations of basic terms such as "learning." If you plan to read both books and are not an expert, start with Mr. Summerfield's to get your footing. Still, Mr. Patel is a smart and informed interviewer, nudging his sources to the edge of what they know or can say publicly.

 

Several participants offer estimates for the arrival of AGI, with a cluster of guesses suggesting sometime around 2028. I would have appreciated more on how these participants, and Mr. Patel, define and measure AGI. Otherwise, a date is meaningless. Another nitpick: Both Mr. Patel and Mr. Summerfield describe our mushy brains using the digital metrics of bits and operations per second, which don't really apply and create a false sense of comparability. Fortunately, both authors describe how AI can be, in relation to humans, simultaneously supersmart and superstupid. Intelligence has many domains; IQ isn't all.

 

We once looked to the gods for knowledge. We're now using our knowledge to build something resembling gods. Whether we make a monotheistic one -- a single server farm to rule us all -- or a bunch of infighting demideities, marvelous and flawed in their own ways, is hard to predict. All Mr. Zuckerberg knows is that the value of a high priesthood is priceless.

 

---

 

Mr. Hutson is the author of "The 7 Laws of Magical Thinking: How Irrational Beliefs Keep Us Happy, Healthy, and Sane."” [1]

 

1. Hutson, Matthew. "Books: Machines That Learn To Learn" (review). Wall Street Journal, Eastern edition, New York, N.Y., 13 Sep. 2025: C7.
