Friday, May 31, 2024

The New ChatGPT Offers a Lesson in A.I. Hype: Tech Fix


"When OpenAI unveiled the latest version of its immensely popular ChatGPT chatbot this month, it had a new voice possessing humanlike inflections and emotions. The online demonstration also featured the bot tutoring a child on solving a geometry problem.

To my chagrin, the demo turned out to be essentially a bait and switch. The new ChatGPT was released without most of its new features, including the improved voice (which the company told me it postponed to make fixes). 

The ability to use a phone’s video camera to get real-time analysis of something like a math problem isn’t available yet, either.

Amid the delay, the company also deactivated the ChatGPT voice that some said sounded like the actress Scarlett Johansson, after she threatened legal action, replacing it with a different female voice.

For now, what has actually been rolled out in the new ChatGPT is the ability to upload photos for the bot to analyze. 

Users can generally expect quicker, more lucid responses. 

The bot can also do real-time language translations, but ChatGPT will respond in its older, machine-like voice.

Nonetheless, this is the leading chatbot that upended the tech industry, so it was worth reviewing. After trying the sped-up chatbot for two weeks, I had mixed feelings. It excelled at language translations, but it struggled with math and physics. All told, I didn’t see a meaningful improvement from the last version, ChatGPT-4. I definitely wouldn’t let it tutor my child.

This tactic, in which A.I. companies promise wild new features and deliver a half-baked product, is becoming a trend that is bound to confuse and frustrate people. The $700 Ai Pin, a talking lapel pin from the start-up Humane, which is funded by OpenAI’s chief executive, Sam Altman, was universally panned because it overheated and spat out nonsense. Meta also recently added to its apps an A.I. chatbot that did a poor job at most of its advertised tasks, like web searches for plane tickets.

Companies are releasing A.I. products in a premature state partly because they want people to use the technology to help them learn how to improve it. In the past, when companies unveiled new tech products like phones, what we were shown — features like new cameras and brighter screens — was what we were getting. With artificial intelligence, companies are giving a preview of a potential future, demonstrating technologies that are being developed and working only in limited, controlled conditions. A mature, reliable product might arrive — or might not.

The lesson to learn from all this is that we, as consumers, should resist the hype and take a slow, cautious approach to A.I. We shouldn’t be spending much cash on any underbaked tech until we see proof that the tools work as advertised.

The new version of ChatGPT, called GPT-4o (“o” as in “omni”), is now free to try on OpenAI’s website and app. Nonpaying users can make a few requests before hitting a timeout, and those who have a $20 monthly subscription can ask the bot a larger number of questions.

OpenAI said its iterative approach to updating ChatGPT allowed it to gather feedback to make improvements.

“We believe it’s important to preview our advanced models to give people a glimpse of their capabilities and to help us understand their real-world applications,” the company said in a statement.

(The New York Times sued OpenAI and its partner, Microsoft, last year for using copyrighted news articles without permission to train chatbots.)

Here’s what to know about the latest version of ChatGPT.

Geometry and Physics

To show off ChatGPT-4o’s new tricks, OpenAI published a video featuring Sal Khan, the chief executive of the Khan Academy, the education nonprofit, and his son, Imran. With a video camera pointed at a geometry problem, ChatGPT was able to talk Imran through solving it step by step.

Even though ChatGPT’s video-analysis feature has yet to be released, I was able to upload photos of geometry problems. 

ChatGPT solved some of the easier ones correctly, but it tripped up on more challenging problems.

For one problem involving intersecting triangles, which I dug up on an SAT preparation website, the bot understood the question but gave the wrong answer.

Taylor Nguyen, a high school physics teacher in Orange County, Calif., uploaded a physics problem involving a man on a swing that is commonly included on Advanced Placement Calculus tests. ChatGPT made several logical mistakes to give the wrong answer, but it was able to correct itself with feedback from Mr. Nguyen.

“I was able to coach it, but I’m a teacher,” he said. “How is a student supposed to pick out those mistakes? They’re making this assumption that the chatbot is right.”

I did notice that ChatGPT-4o succeeded at some division calculations that its predecessors did incorrectly, so there are signs of slow improvement. But it also failed at a basic math task that past versions and other chatbots, including Meta AI and Google’s Gemini, have flunked at: the ability to count. When I asked ChatGPT-4o for a four-syllable word starting with the letter “W,” it responded, “Wonderful.”

OpenAI said it was constantly working to improve its systems’ responses to complex math problems.

Mr. Khan, whose company uses OpenAI’s technology in its tutoring software Khanmigo, did not respond to a request for comment on whether he would leave ChatGPT the tutor alone with his son.

Reasoning

OpenAI also highlighted that the new ChatGPT was better at reasoning, or using logic to come up with responses. So I ran it through one of my favorite tests: I asked it to generate a Where’s Waldo? puzzle. When it showed an image of a giant Waldo standing in a crowd, I said that the point is that he’s supposed to be hard to find.

The bot then generated an even larger Waldo.

Subbarao Kambhampati, a professor and researcher of artificial intelligence at Arizona State University, also put the chatbot through some tests and said he saw no noticeable improvement in reasoning compared with the last version.

He presented ChatGPT a puzzle involving blocks:

If block C is on top of block A, and block B is separately on the table, can you tell me how I can make a stack of blocks with block A on top of block B and block B on top of block C, but without moving block C?

The answer is that it’s impossible to arrange the blocks under these conditions, but, just as with past versions, ChatGPT-4o consistently came up with a solution that involved moving block C. With this and other reasoning tests, ChatGPT was occasionally able to take feedback to get the correct answer, which is antithetical to how artificial intelligence is supposed to work, Mr. Kambhampati said.

“You can correct it, but when you do that you’re using your own intelligence,” he said.

OpenAI pointed to test results that showed GPT-4o scored about two percentage points higher at answering general knowledge questions than previous versions of ChatGPT, illustrating that its reasoning skills had slightly improved.

Language

OpenAI also said the new ChatGPT could do real-time language translation, which could help you converse with someone speaking a foreign language.

I tested ChatGPT with Mandarin and Cantonese and confirmed that it was OK at translating phrases, such as “I’d like to book a hotel room for next Thursday” and “I want a king-size bed.” But the accents were slightly off. (To be fair, my broken Chinese is not much better.) OpenAI said it was still working to improve accents.

ChatGPT-4o also excelled as an editor. When I fed it paragraphs that I wrote, it was fast and effective at removing excessive words and jargon. 

ChatGPT’s decent performance with language translation gives me confidence that this will soon become a more useful feature.

Bottom Line

A major thing OpenAI got right with ChatGPT-4o is making the technology free for people to try. Free is the right price: Since we are helping to train these A.I. systems with our data to improve, we shouldn’t be paying for them.

The best of A.I. has yet to come, and it might one day be a good math tutor that we want to talk to. But we should believe it when we see it — and hear it." [1]

1. The New ChatGPT Offers a Lesson in A.I. Hype: Tech Fix. Chen, Brian X.  New York Times (Online) New York Times Company. May 31, 2024.
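
The block-stacking puzzle quoted in the article can be checked by brute force. The sketch below is only an illustration, not anything taken from the article or from Mr. Kambhampati's tests. It assumes the usual blocks-world rules (only a block with nothing on top of it can be moved, one at a time, onto the table or onto another clear block) and reads "without moving block C" as "C is never picked up". The names `reachable`, `legal_moves` and `frozen` are made up for this example.

```python
from collections import deque

BLOCKS = ("A", "B", "C")

def legal_moves(state, frozen):
    """Yield every state reachable in one move; blocks in `frozen` are never picked up."""
    supports = dict(state)  # block -> what it rests on ("table" or another block)
    clear = [b for b in BLOCKS if b not in supports.values()]  # nothing on top
    for block in clear:
        if block in frozen:
            continue
        for target in clear + ["table"]:
            if target == block or supports[block] == target:
                continue  # cannot stack a block on itself; skip no-op moves
            new_supports = dict(supports)
            new_supports[block] = target
            yield tuple(sorted(new_supports.items()))

def reachable(start, goal, frozen=()):
    """Breadth-first search: can `goal` be reached from `start`?"""
    first = tuple(sorted(start.items()))
    seen = {first}
    queue = deque([first])
    while queue:
        state = queue.popleft()
        supports = dict(state)
        if all(supports[b] == s for b, s in goal.items()):
            return True
        for nxt in legal_moves(state, frozen):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Puzzle from the article: C sits on A, B sits on the table by itself.
START = {"A": "table", "B": "table", "C": "A"}
# Wanted: A on top of B, and B on top of C.
GOAL = {"A": "B", "B": "C"}

print(reachable(START, GOAL, frozen=("C",)))  # False: impossible if C stays put
print(reachable(START, GOAL))                 # True: possible once C may be moved
```

With C frozen, the only block that can ever be moved is B, so A stays trapped under C and the goal is never reached. That matches the article's statement that the puzzle has no solution under the stated condition.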

Lithuanian banks' propaganda comforts us every day: inflation has been defeated, tomorrow the European Central Bank will make loans cheaper three times, and we will buy apartments like crazy

Just like the ad outside a seafood restaurant: "Free crabs tomorrow..."

"Consumer prices rose 2.6 percent in the year through May, slightly higher than expected.

The annual rate of inflation in the countries that use the euro accelerated slightly in May, driven by a jump in the cost of services and food.

Consumer prices in the eurozone rose 2.6 percent in the year through May, compared with 2.4 percent in April.

The headline inflation rate was a bit higher than economists expected. The same was true for core inflation, which strips out volatile food and energy prices, which came in at 2.9 percent in May, versus 2.7 percent in April.

The numbers for May showed the first uptick in overall and core inflation this year, highlighting the difficulties policymakers at the European Central Bank face in the final stretch of reaching their aim to bring inflation down to 2 percent. Inflation peaked above 10 percent in 2022.

Three of the area’s largest economies, Germany, France and Spain, all saw annual inflation speed up in May.

The E.C.B. has kept its main rate, known as the deposit rate, at 4 percent, the highest in its history, since September.

“The European Central Bank will be cautious and is unlikely to lower interest rates at the July meeting, given the momentary interruption of disinflation, especially in services, and the strong wage data,” said Riccardo Marcelli Fabiani of Oxford Economics." [1]

Mr. Fabiani knows. Mr. Fabiani takes for me too.

1. Inflation Ticks Up in the Eurozone. Eddy, Melissa. New York Times (Online) New York Times Company. May 31, 2024.