Sekėjai

Ieškoti šiame dienoraštyje

2025 m. vasario 3 d., pirmadienis

How to Reduce AI Chatbot Hallucinations: Some mistakes are inevitable. But there are ways to make it less likely a chatbot will make stuff up

 

"You can't stop an AI chatbot from sometimes hallucinating -- giving misleading or mistaken answers to a prompt, or even making things up. But there are some things you can do to limit the amount of faulty information a chatbot gives you in response to your request.

AI hallucinations arise from a couple of things, says Matt Kropp, chief technology officer at BCG X, a unit of Boston Consulting Group. One is that the data on which an AI chatbot was trained contained conflicting, incorrect or incomplete information about the subject you're asking about. You can't do anything about that. The second is that "you haven't specified enough of what you want," Kropp says -- and that is something you can address.

Below are some techniques that experts say can minimize -- though not eliminate -- hallucinations.

Give the AI detailed instructions

Tell the AI exactly what you are seeking. If your prompt gives it too much freedom to root around its database, it's more likely to respond with erroneous or fabricated information.

"You want detailed instructions, you want precise language, but you also have to make sure that it is concise in that everything in that prompt is directly relevant to the query," says Darin Stewart, an analyst at technology advisory firm Gartner.

When shopping for a car recently I asked an AI for help. It gave me useful comparisons of the size, price and features of SUVs I was interested in. But the miles-per-gallon rating for one of the cars seemed awfully high. Digging a little deeper, I figured out the AI gave me the rating for a diesel-engine vehicle -- a version not sold in the U.S.

Another AI told me about features of an SUV that differed from what I read on the carmaker's website -- the chatbot had based its answer on a model from a number of years ago, not the 2025 one.

To prevent these mistakes, I should have told the AIs I wanted information confined to U.S.-model vehicles and ones currently on the market.

Structure your query in steps

Experts say you should construct your query in the form of small, direct questions instead of a single, open-ended one. Ask these questions one after the other, a process called iterative prompting. This can keep the AI from generating falsehoods as well as produce more-useful results.

"I think about my interactions with the [AI] models not as a one-shot question and answer, but rather as a dialogue," says Kropp. "You're building up context."

If you're in the market, say, for a new dishwasher, don't simply ask "what dishwasher should I buy," but instead start the prompt this way: "I need a new dishwasher. What are the major features I should consider?"

After it answers this question, you could respond with questions such as: "Which brands are known for reliability and which should I avoid? How much should I spend? Are higher-priced models worth the extra cost?"

To further guide the AI you could ask it to build its response in a formal manner, such as:

"Structure your answer this way: an introduction, your key findings, the pros and cons of the various models and your conclusion. Be sure to provide supporting evidence for each of your findings."

When I gave these prompts to an AI it created a comprehensive, 500-word analysis of what dishwashers I should consider and which to stay away from. And as far as I could tell it didn't hallucinate.

Direct the AI to known sources

Tell the AI to use certain types of sources, which may keep it from using sketchy, biased or incorrect material.

In my car search, AIs at times gave citations for their answers to random people writing on car-fan websites and Reddit. Some of those answers seemed uninformed, misleading or too glowing.

Aside from the quality of the sources, citations can be unreliable. They may or may not mean an AI explicitly used that information in its responses, experts say. In fact, AIs don't generally know where the material in their answers came from. Moreover, AIs have been known to cite documents, research or other sources that don't exist.

I repeated my query but told the AIs to stick to professional reviews and named a few sources to use, including Consumer Reports, Car and Driver magazine and the car shopping site Edmunds. I'm not sure if they used them -- experts told me my listing of sources could have guided the AIs to similar, though not identical, material. But the result was more-informed answers that appeared to have few questionable assertions.

Researchers at Johns Hopkins University have found a way to send an AI directly to certain source material in its database. Simply starting a question with the phrase "According to Wikipedia, what is. . ." for general queries or "According to PubMed, tell me about. . ." for health-related ones prompted an AI to quote directly from those sources, the researchers said in a report.

Tell the AI not to make things up

This may sound like a teacher instructing a recalcitrant pupil not to cheat, but some experts say you should instruct the chatbot to "say I don't know" or "don't make up an answer" if it's unsure of something. That might keep it from fabricating a response despite the fact that the evidence for the answer in its database is murky.

"You're actually giving it permission to do something it's not really trained to do, which is to say 'I'm wrong' or 'I don't know,'" Kropp says.

'Meta-prompt' the AI

Here's another trick to improve your questions: Tell the AI to write them. The technique, called meta-prompting, sounds odd -- like asking students to write the questions for their exam. But experts say it can work.

As an example, I asked OpenAI's ChatGPT to give me the wording for a meteorological question. Note that in the sample question it created it tells itself that it is an expert, a technique called giving it a persona. Research has shown that this method can boost the quality, and reduce the errors, of responses.

My prompt to ChatGPT:

"Please create a meta-prompt for this question: Why is it that rain can fall to the ground when the air temperature is below freezing instead of turning into snow or sleet?"

ChatGPT responded:

"You are a meteorology expert. Explain why rain can fall to the ground when the air temperature is below freezing. Your explanation should be clear, concise, and aimed at a general audience with minimal prior knowledge of meteorology. Use simple language and provide examples if possible."

I then fed this prompt to ChatGPT and got back a nontechnical and informative explanation of why rain can fall in below-freezing weather.

Use "chain of thought" prompting

Another way to direct an AI is to tell it to answer a question by breaking it down into logical steps. The technique, called chain-of-thought prompting, can lead to more accurate responses, Google researchers found. It also allows you to examine the AI's thought process to look for errors.

Experts have devised complex ways to guide chain-of-thought reasoning, such as giving the AI a sample of the steps it should take. But an easier, though perhaps less effective, technique is to start your query with the words "Using chain of thought. . ." or "Let's think about the answer step by step. . ."

Tell the AI to double-check its work

In another odd twist, you can tell an AI to quiz itself about the accuracy of its responses. The technique, dubbed chain of verification, can reduce hallucinations, according to scientists at Facebook parent Meta who developed it.

Below is a simplified text for instructing an AI to perform the self-questioning, as written by a company called PromptHub. Simply copy and paste this entire block of text into an AI chatbot and add your question to the top.

"Here is the question: [Type your question here]

"First, generate a response.

"Then, create and answer verification questions based on this response to check for accuracy. Think it through and make sure you are extremely accurate based on the question asked.

"After answering each verification question, consider these answers and revise the initial response to formulate a final, verified answer. Ensure the final response reflects the accuracy and findings of the verification process."

When I asked ChatGPT and Google's Gemini a question using this format they came back with the verification questions they asked themselves about their initial response, their answers to those questions and then their revised final response.” [1]

1. Artificial Intelligence (A Special Report) --- How to Reduce AI Chatbot Hallucinations: Some mistakes are inevitable. But there are ways to make it less likely a chatbot will make stuff up.. Ziegler,Bart.  Wall Street Journal, Eastern edition; New York, N.Y.. 03 Feb 2025: R1.

 

U.S.'s Lead Is Far From Guaranteed to Last --- History suggests laggards can catch up quickly. For investors, that means looking beyond America.

 

"In 1962, high import taxes made it impossible for Jim Marshall, a music-store owner in London, to meet demand for popular American-made Fender amplifiers, and he set out to create his own. The resulting Marshall amplifiers came out with a different sound but were the foundation for bands such as the Who and Led Zeppelin, which spread British hard rock all over the world.

Similar phenomena are now playing out, including China's ability to catch up with the U.S. in the artificial-intelligence race despite Biden-era policies aimed at capping what it can buy from U.S. semiconductor firms.

Investors trying to build a portfolio in an era of protectionism and heightened geopolitical competition should take note: Laggards can overcome constraints -- but need clear aims.

There could be important lessons for Europe's industrial sector, which on Friday was threatened with having further tariffs imposed on it by President Trump.

Wall Street is angsty about a string of Chinese AI firms, including DeepSeek, which, spurred by trade restrictions, have developed more efficient models to rival OpenAI and Alphabet. By Friday's close, the U.S.'s "Magnificent Seven" technology giants had shed $410 billion of value from a week earlier.

Meanwhile, the Stoxx Europe 600 has risen 6.3% this year, more than twice as much as the S&P 500, with its industrial subcomponent up 6.6%. Analysts are starting to wonder if this is akin to the final stretch of the dot-com bubble in 1999, which marked the start of a period of underperformance for U.S. equities relative to other developed markets.

"It's not hard to imagine the news about DeepSeek kick-starting a similarly sustained bout of U.S. underperformance," Thomas Mathews at Capital Economics told clients Friday, but he added it is "too soon" to make that call.

Similar to DeepSeek, the aftermath of the 2022 events in Ukraine has been another example of the limits of resource constraints: While Western nations thought that harsh sanctions would torpedo Russia's gross domestic product, the country returned to growth in 2023. Yes, Moscow has shifted commodity exports and high-tech imports to friendly countries, chiefly China, but it also has managed to foster domestic alternatives in areas such as computers and gas turbines.

That necessity is the mother of invention is hardly a new insight. The Great Depression, a time of scarcity, high tariffs and geopolitical tensions, gave us nylon, Nescafe instant coffee and the jet engine.

Germany's painful reforms in the 2000s were in part a response to an overvalued exchange rate both before and after introduction of the euro, which forced manufacturers to enhance productivity. By contrast, Italian industry was helped by devaluations in the 1980s and 1990s, but as a consequence got stuck in a medium-value segment that was soon filled by Chinese goods.

Now Europe is in need of another reboot. The export-led growth model centered on Germany's industrial core appears broken.

Europe doesn't have many tech firms, and the pivot toward making competitive electric vehicles amid structurally higher energy costs and reduced access to Chinese and American markets is floundering. In Britain, high electricity prices have contributed to a steep fall in export volumes since 2022.

Official data for the fourth quarter of 2024 released Thursday showed the eurozone's gross domestic product flatlining and the U.S.'s expanding by 0.6% from a quarter earlier.

However, the German DAX has surprisingly outperformed the S&P 500 over the past year, powered by its own Magnificent Seven: SAP, Deutsche Telekom, Allianz, Siemens, Siemens Energy, Munich Re and Rheinmetall.

Perhaps this reflects markets anticipating an economic recovery, which recent purchasing managers' surveys suggest could be near. Also, European industrial stocks had gotten ridiculously cheap: In October, their price/earnings ratios relative to U.S. peers hit an extreme discount.

American tech is now a bigger chunk of global stock-market capitalization than the entirety of European equities, Dhaval Joshi at BCA Research points out.

He thinks this is because of the "highly implausible" notion that the big winners of the earlier Web 2.0 revolution will emerge on top of the AI race, rather than going the way of the first digital giants such as Cisco and International Business Machines -- or, looking to former incumbents in other sectors, Kodak, Nokia and Blockbuster.

Europe already has innovators. Airbus, Novo Nordisk and ASML are leaders in aerospace, obesity drugs and photolithography for chips, respectively. Venture capital is propping up AI startups in France, such as Mistral AI.

Like Russia and China, Europe has deep technical expertise and a big internal market. Unlike them, it is already wealthy. If catching up to the U.S. becomes more of an urgent shared political project in the Trump era, Europe's ability to do so might prove underrated.

Last week, the European Union's "Competitive Compass" report confirmed officials' welcome shift toward cutting red tape, investing in innovation and introducing a "Made in Europe" preference in public procurement. Yet the EU's plans are sparse in detail and, crucially, lack the explicit commitment to beat the competition through any means necessary. This is something that Beijing's economic planning, Biden's industrial policies and Trump's tariffs -- as well as his $500 billion "Stargate" AI project -- do have, regardless of their individual merits.

Politics aren't yet playing the tune needed for a true European challenge. But investors shouldn't assume that export restrictions, tariffs or cheaper energy will keep U.S. equities forever on top.” [1]

1.  
U.S.'s Lead Is Far From Guaranteed to Last --- History suggests laggards can catch up quickly. For investors, that means looking beyond America.Sindreu, Jon.  Wall Street Journal, Eastern edition; New York, N.Y.. 03 Feb 2025: B9.