

Wednesday, August 20, 2025

Beyond the Turing Test: AI Research Seeks Answer to the Question of Whether Machines Can Think


“The most-read article to date in the renowned philosophy journal Mind is 75 years old and was written by a mathematician. The article is titled "Computing Machinery and Intelligence," and its author is Alan Turing. In this essay, the computer science pioneer outlined the imitation game, long since known as the Turing Test, which essentially examines whether humans can tell whether they are chatting with another human or with a computer program. The test has occupied the AI world ever since, and just in time for the anniversary, the cognitive researchers Cameron Jones and Ben Bergen of the University of California, San Diego have announced in a preprint that it has been passed: two language models have passed the Turing Test, GPT-4.5 clearly, being taken for a human in 73 percent of conversations, and Llama-3.1 less convincingly, at only 56 percent, but still. The old chat program ELIZA reached only 25 percent.

 

Turing had expected this success by the turn of the millennium, though he had also estimated that a digital computer with one gigabyte of memory would suffice. The excitement surrounding the breakthrough is limited. Almost everyone has had experience with chatbots by now and has become accustomed to not being entirely sure who or what they are dealing with. Studies sometimes even show that people consider computer-written poems particularly human because they are so beautifully emotional and not as demanding as the works of human artists, who are constantly trying out new and unusual things. The research from California is therefore less a sensation than carefully documented proof of what was expected: if you prompt language models well, telling them, for example, to take on the role of a tech-savvy nineteen-year-old who uses a bit of slang, but not too much, who doesn't put periods at the end of sentences and occasionally asks questions, then humans have a hard time recognizing the program as a program.
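
To make this kind of persona prompting concrete, here is a minimal sketch in Python, assuming the OpenAI chat-completions client; the persona wording and the model identifier are illustrative assumptions, not the actual configuration used in the study.

# Minimal sketch of a persona-style system prompt, loosely modeled on the
# setup described above. Persona text and model name are assumptions for
# illustration, not the study's actual prompt.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PERSONA = (
    "You are a tech-savvy nineteen-year-old. Use a bit of slang, but not too much. "
    "Don't put periods at the end of sentences. Occasionally ask a question back. "
    "Keep replies short, like casual chat messages."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model id, not the one from the study
    messages=[
        {"role": "system", "content": PERSONA},
        {"role": "user", "content": "hey, what have you been up to today?"},
    ],
)
print(response.choices[0].message.content)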

 

Now, as always when a program is claimed to have passed the Turing Test, the debate revolves around what it actually tests: the intelligence of the program? Its ability to bluff? The naiveté of humans? Or their perceptions of how their fellow humans respond? There is debate about whether the prompts are fair or whether humans have helped too much. What is not discussed is that the actual goal Turing wanted to achieve with this test procedure has not been achieved to this day: to answer the question of whether machines can think.

 

Turing's main goal with the test was to avoid lengthy discussions about the meaning of "think" and "machine" and to make the question of thinking manageable through a clear criterion. If a program passed the test, it would count as a "thinking machine." Of course, one cannot be truly certain, but that is no different with humans: assuming that one's fellow humans think is merely a "polite convention," he wrote. By the year 2000, he assumed, it would have become common practice to speak of thinking machines.

 

The developers of the major language models do indeed like to refer to their latest versions as reasoning models, that is, as models that can reason, or simply as "deep thinkers." Reservations about describing these models as capable of reasoning nevertheless remain strong, even among experts. This may have to do with wounded pride and the attempt to maintain a boundary between humans and machines, a motive that, according to Turing, should be met with consolation rather than arguments. But it may also have to do with the fact that, even after the Turing Test has been passed, the differences between humans and machines remain profound, and that the terms used create confusion that is as unnecessary as it is promotionally effective.

 

"The term 'reasoning' originates from a 2022 paper in which Jason Wei and others introduced a special format for prompting," explains computer scientist Katharina Anna Zweig from the University of Kaiserslautern-Landau. "Instead of giving a machine just one task, you give it several similar tasks and their solutions, and you also describe the intermediate steps." From this, the machine learns to perform other tasks as well, and to solve them with the help of intermediate steps. "However, these do not make sense in terms of content, as they do with humans, but rather purely syntactically. Thus, there is no deliberation behind them, but rather a recognition of linguistic patterns," says Zweig. When such intermediate steps are performed, we now speak of reasoning models.

 

"Reasoning uses a term that usually characterizes human intelligence," states Ute Schmid, Professor of Cognitive Systems at the University of Bamberg and Director at the Bavarian Institute for Digital Transformation. "However, these methods don't claim to model human cognitive processes. Instead, they're about making certain aspects of human intelligence at least partially solvable by computers." Reasoning often involves deriving further rules or facts based on given rules and facts. "After a long focus on purely data-driven machine learning methods, we're currently seeing a renaissance of such approaches. That's a very interesting development," says Schmid.

 

So, more an imitation of cognitive abilities than thinking? The Konstanz philosopher Wolfgang Spohn, following Turing, relies on behavior as the decisive criterion, but considers the question of thinking to have long since dissolved: "Because we can think, we master a myriad of complex activities, different people to varying degrees, usually better than other animals, but by no means always. In some areas, machines are now better than we are; in others they are still terribly bad. Because of this diversity of expressions, however, the question of what thinking is does not make much sense; it breaks down into hundreds of individual questions." In psychology, it has therefore long been customary to speak of cognitive abilities rather than of thinking.

 

Even though the Turing Test has now been passed, the debate about thinking machines, or machines that merely imitate cognitive abilities, continues merrily. It almost completely obscures the fact that, in his essay, Turing described two paths to intelligent machines besides the imitation game: one could either start with abstract activities such as chess, or equip a machine with sense organs and teach it like a child.

 

Perhaps AI research has taken a tricky shortcut with large language models. These models do realize a kind of learning, but one without a body. Perhaps that is why they have such difficulty understanding the world the way we do. Under the name "embodied AI," researchers have long been working on Turing's original vision: with child robots, simulated kindergartens, sandbox friends, and everything that goes with it.

 

Where this leads remains to be seen. In any case, Turing's essay remains visionary even 75 years later, and even now that the test has been passed.” [1]

 

1. Manuela Lenzen, "Jenseits des Turing-Tests: Die KI-Forschung sucht Antwort auf die Frage, ob Maschinen denken können," Frankfurter Allgemeine Zeitung (Frankfurt), 23 July 2025, p. N4.
