"Earlier this year, the PGA Tour's digital chief witnessed ChatGPT make the digital equivalent of a double bogey when the chatbot flubbed a question on basic golf lore: How many times has Tiger Woods won on the Tour?
Generative AI's foundation models can be trained on vast troves of data from the internet and other sources, but still lack deep, specific knowledge even on topics as mainstream as golf, Scott Gutterman, the Tour's senior vice president of digital and broadcast technology, realized.
"There's missing data, there's generalized data. Those things have just kind of led to generalized responses," Gutterman said.
As AI projects creep from the pilot-project stage into operations, corporate users are discovering that many AI models are about as useful out of the box as a new employee entering orientation.
Companies are finding it is critical to augment today's general models, like those offered by Anthropic or OpenAI, with more industry-specific or business-specific data if they are going to be useful. (News Corp, owner of The Wall Street Journal, has a content-licensing partnership with OpenAI.)
But that augmentation presents a spectrum of options, in which higher levels of accuracy and reliability also bring more costs and complexity, said Ritu Jyoti, general manager and group vice president of AI and data as well as global AI lead at research firm International Data Corp. And the augmentation works only if companies have an impeccable handle on their data, Jyoti said.
Yet another question is how much augmentation is sufficient to make models accurate and reliable enough for a specific use. A range of companies including consultants, cloud providers like Amazon Web Services and model makers like OpenAI itself are positioning themselves to help.
IDC estimates that worldwide spending on AI, including AI-enabled applications, infrastructure and related IT and business services, will more than double by 2028 to $632 billion.
The PGA Tour said it is now using an approach known as retrieval augmented generation, or RAG, to avoid any future AI errors. It taps into Claude on the Amazon Web Services infrastructure, then inputs organization-specific information, for instance, a 190-page document containing the Tour's rules. That way, it can ask a query and require the model to directly refer to the information in the document, rather than information culled from the internet.
All outputs are still reviewed by a human before they are delivered to players or customers, Gutterman said.
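As a rough illustration of the retrieval step described above, the sketch below picks the most relevant passages from a reference document and builds a prompt that tells the model to answer only from them. It is a minimal sketch under stated assumptions, not the Tour's actual system: the function names, the sample passages and the simple word-overlap scoring are all hypothetical, and production systems typically use embedding-based search. The call to the model itself is left out.

```python
import re
from collections import Counter

# Hypothetical reference passages standing in for an organization's documents.
PASSAGES = [
    "Tiger Woods has won 82 PGA Tour events.",
    "Tiger Woods has won 15 major championships.",
    "A round of golf is played over 18 holes.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, passage):
    """Sum the passage occurrences of each distinct query token."""
    passage_counts = Counter(tokenize(passage))
    return sum(passage_counts[token] for token in set(tokenize(query)))


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score, best first."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that restricts the model to the retrieved passages."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the reference passages below. "
        "If they do not contain the answer, say so.\n\n"
        f"Reference passages:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("How many PGA Tour events has Tiger Woods won?", PASSAGES))
```

The resulting prompt text would then be sent to whichever model the organization uses; grounding the question in retrieved passages is what distinguishes this approach from asking the model directly.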
But the RAG approach can go only so far, according to Will McQueen, vice president and head of data assets and analytics at the agriculture division of pharmaceutical and biotech giant Bayer. "It's, for sure, limited," McQueen said.
It works well enough for certain low-stakes uses, like answering new engineers' questions in the onboarding process, he said. The stakes are much higher, though, when giving farmers advice on tending their crops. For that, the company might go to "fine-tuning," McQueen said, or training parts of the model with proprietary data to get a big leap in accuracy and relevance of responses.
But fine-tuning can be even more expensive, require more-specialized talent -- and still fall short of 100% accuracy, Jyoti said. To help, model makers like OpenAI now offer business customers assistance with model fine-tuning and customization.
The highest level of accuracy available today is only possible when a company trains its own models from the ground up, Jyoti said. But the cost and talent required for that approach are prohibitive to most enterprises, she said.
Custom building just a small language model, for instance, could run anywhere from $500,000 to millions of dollars, said Bayer's McQueen. Continuing maintenance adds another layer of expense.
Even so, Rocket Cos. sees potential in the approach, said Shawn Malhotra, chief technology officer. The lending giant is exploring using an AI model to automatically fill in portions of mortgage applications.
But there is a lot of nuance in the language of homeownership that foundation models on their own wouldn't typically understand, Malhotra said. "It's not simply just the name and number," he said. "You're worried about: What kind of dwelling is this? Is this a detached home?"
Building a small model can help the AI understand those nuances, Malhotra said.
"You may have to give examples of 'when a dwelling is described in the following way, it maps to this kind of property; when it's described in a different way, it's matched to a different kind of property'."
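The kind of examples Malhotra describes can be pictured as labeled pairs of a dwelling description and a property type. The sketch below serializes such pairs as one JSON record per example, a common shape for training data. It is illustrative only: the descriptions, the label names and the record fields are hypothetical, and the article does not say how Rocket structures its data.

```python
import json

# Hypothetical labeled examples: (dwelling description, property type).
EXAMPLES = [
    ("Free-standing single-family house on its own lot", "detached_home"),
    ("Unit on the fourth floor of a 12-story building", "condominium"),
    ("Two-story home sharing side walls with its neighbors", "townhouse"),
]


def to_training_records(examples):
    """Serialize each (description, label) pair as one JSON line."""
    return [
        json.dumps({"input": description, "label": label})
        for description, label in examples
    ]


if __name__ == "__main__":
    for record in to_training_records(EXAMPLES):
        print(record)
```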
Companies are still learning how to make the various approaches work and figuring out which ones make sense for which situations.
"Most customers are using RAG in one or another fashion," said Sri Elaprolu, director at Amazon.com's AWS GenAI Innovation Center. "Some customers are starting down the fine-tuning route -- we're starting to see that volume increase, rapidly. And more and more customers are exploring what it means to pretrain a model."
Industry-specific models also are emerging to help solve some of the customization complexities.
Legal-tech company Luminance has an AI model that has been trained specifically on over 150 million legal documents over the past 10 years. Getting something wrong in a legal context could be disastrous, said Luminance CEO Eleanor Lightbody. "That's why having AI you've just trained on legal contracts is so, so important."
But the overall supply of these narrow AI models is still limited, said IDC's Jyoti. For now, the task of augmenting and customizing is falling directly to companies.
Like the PGA Tour, which found that ChatGPT would occasionally say Woods has 15 PGA Tour wins. In fact, he has 82.
"Tiger's won 15 majors. That is not the same thing as winning 82 PGA Tour events," said Gutterman. "Beginning to get the models to understand the difference between major wins and PGA Tour wins was something that we saw we would need to do."" [1]
1. AI Has a Lot of Learning to Do to Be Useful. Bousquette, Isabelle. Wall Street Journal, Eastern edition; New York, N.Y., 15 Oct 2024: B.1.