"More than a dozen companies have popped up to offer services
aimed at identifying whether photos, text and videos are made by humans or
machines.
Andrey Doronichev was alarmed last year when he saw a video
on social media that appeared to show the president of Ukraine surrendering to
Russia.
The video was quickly debunked as a synthetically generated
deepfake, but to Mr. Doronichev, it was a worrying portent. This year, his
fears crept closer to reality, as companies began competing to enhance and
release artificial intelligence technology despite the havoc it could cause.
Generative A.I. is now available to anyone, and it’s
increasingly capable of fooling people with text, audio, images and videos that
seem to be conceived and captured by humans. The risk of societal gullibility
has set off concerns about disinformation, job loss, discrimination, privacy
and broad dystopia.
For entrepreneurs like Mr. Doronichev, it has also become a
business opportunity. More than a dozen companies now offer tools to identify
whether something was made with artificial intelligence, with names like
Sensity AI (deepfake detection), Fictitious.AI (plagiarism detection) and
Originality.AI (also plagiarism).
Mr. Doronichev, a Russian native, founded a company in San
Francisco, Optic, to help identify synthetic or spoofed material — to be, in
his words, “an airport X-ray machine for digital content.”
In March, it unveiled a website where users can check images
to see whether they are actual photographs or were made by artificial intelligence. It
is working on other services to verify video and audio.
“Content authenticity is going to become a major problem for
society as a whole,” said Mr. Doronichev, who was an investor in a
face-swapping app called Reface. “We’re entering the age of cheap fakes.” Since
it doesn’t cost much to produce fake content, he said, it can be done at scale.
The overall generative A.I. market is expected to exceed
$109 billion by 2030, growing 35.6 percent a year on average until then,
according to the market research firm Grand View Research. Businesses focused
on detecting the technology are a growing part of the industry.
Months after being created by a Princeton University
student, GPTZero claims that more than a million people have used its program
to suss out computer-generated text. Reality Defender was one of 414 companies
chosen from 17,000 applications to be funded by the start-up accelerator Y
Combinator this winter.
Copyleaks raised $7.75 million last year in part to expand
its anti-plagiarism services for schools and universities to detect artificial
intelligence in students’ work. Sentinel, whose founders specialized in
cybersecurity and information warfare for the British Royal Navy and the North
Atlantic Treaty Organization, closed a $1.5 million seed round in 2020 that was
backed in part by one of Skype’s founding engineers to help protect democracies
against deepfakes and other malicious synthetic media.
Major tech companies are also involved: Intel’s FakeCatcher
claims to be able to identify deepfake videos with 96 percent accuracy, in part
by analyzing pixels for subtle signs of blood flow in human faces.
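The blood-flow idea rests on photoplethysmography: skin color shifts faintly with each heartbeat, and those periodic shifts are hard for a generator to reproduce. Below is a minimal sketch of that signal extraction, assuming face crops have already been located in each video frame; the function names and threshold are illustrative, not Intel’s actual FakeCatcher pipeline.

```python
# Sketch of the photoplethysmography idea behind blood-flow-based deepfake
# detection (illustrative only, not Intel's implementation): real skin shows
# a faint periodic color change at the heart rate, which a generator may
# fail to reproduce. Assumes face crops are already extracted per frame.
import numpy as np

def pulse_signal(face_frames, fps=30.0):
    """face_frames: list of HxWx3 uint8 face crops from consecutive frames."""
    # Average the green channel over the face region for each frame.
    green = np.array([frame[:, :, 1].mean() for frame in face_frames])
    green = green - green.mean()                      # remove DC offset
    spectrum = np.abs(np.fft.rfft(green))             # magnitude spectrum
    freqs = np.fft.rfftfreq(len(green), d=1.0 / fps)  # frequencies in Hz
    # Restrict to plausible heart rates, roughly 0.7-4 Hz (42-240 bpm).
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return freqs[band], spectrum[band]

def looks_live(face_frames, fps=30.0, ratio_threshold=3.0):
    """Heuristic: a strong, isolated spectral peak in the heart-rate band
    suggests genuine blood flow; a flat spectrum is a warning sign."""
    freqs, spec = pulse_signal(face_frames, fps)
    return spec.max() / (spec.mean() + 1e-9) > ratio_threshold
```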
Within the federal government, the Defense Advanced Research
Projects Agency plans to spend nearly $30 million this year to run Semantic
Forensics, a program that develops algorithms to automatically detect deepfakes
and determine whether they are malicious.
Even OpenAI, which turbocharged the A.I. boom when it
released its ChatGPT tool late last year, is working on detection services. The
company, based in San Francisco, debuted a free tool in January to help
distinguish between text composed by a human and text written by artificial
intelligence.
OpenAI stressed that while the tool was an improvement on
past iterations, it was still “not fully reliable.” The tool correctly
identified 26 percent of artificially generated text but falsely flagged 9
percent of text from humans as computer generated.
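Those two figures are, in effect, a detection rate and a false-positive rate. A small sketch of how such numbers are typically computed from a labeled test set follows; the labels and verdicts are placeholders, not OpenAI’s evaluation data.

```python
# How the two figures quoted above are typically computed: the detection
# rate (share of AI-written samples flagged) and the false-positive rate
# (share of human-written samples flagged). Placeholder data, for
# illustration only.
def detection_rates(labels, flagged):
    """labels: True for AI-generated text, False for human-written.
    flagged: the detector's verdict for each sample (True = 'AI-generated')."""
    ai_total = sum(labels)
    human_total = len(labels) - ai_total
    true_positives = sum(1 for l, f in zip(labels, flagged) if l and f)
    false_positives = sum(1 for l, f in zip(labels, flagged) if not l and f)
    return true_positives / ai_total, false_positives / human_total

# A result like OpenAI's reported numbers would come out of a call such as:
#   tpr, fpr = detection_rates(labels, flagged)   # -> (0.26, 0.09)
```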
The OpenAI tool is burdened with common flaws in detection
programs: It struggles with short texts and writing that is not in English. In
educational settings, plagiarism-detection tools such as TurnItIn have been
accused of inaccurately classifying essays written by students as being
generated by chatbots.
Detection tools inherently lag behind the generative
technology they are trying to detect. By the time a defense system is able to
recognize the work of a new chatbot or image generator, like Google Bard or
Midjourney, developers are already coming up with a new iteration that can
evade that defense. The situation has been described as an arms race or a
virus-antivirus relationship where one begets the other, over and over.
“When Midjourney releases Midjourney 5, my starter gun goes
off, and I start working to catch up — and while I’m doing that, they’re
working on Midjourney 6,” said Hany Farid, a professor of computer science at
the University of California, Berkeley, who specializes in digital forensics
and is also involved in the A.I. detection industry. “It’s an inherently
adversarial game where as I work on the detector, somebody is building a better
mousetrap, a better synthesizer.”
Despite the constant catch-up, many companies have seen
demand for A.I. detection from schools and educators, said Joshua Tucker, a
professor of politics at New York University and a co-director of its Center
for Social Media and Politics. He questioned whether a similar market would
emerge ahead of the 2024 election.
“Will we see a sort of parallel wing of these companies
developing to help protect political candidates so they can know when they’re
being sort of targeted by these kinds of things?” he said.
Experts said that synthetically generated video was still
fairly clunky and easy to identify, but that audio cloning and image-crafting
were both highly advanced. Separating real from fake will require digital
forensics tactics such as reverse image searches and IP address tracking.
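Reverse image search, for instance, often rests on perceptual fingerprints that survive mild edits. Here is a minimal sketch of one generic building block, an “average hash,” assuming the Pillow imaging library; the file names and distance threshold are illustrative, not any specific vendor’s system.

```python
# Illustrative sketch of one building block behind reverse image search:
# a perceptual "average hash" that stays stable under mild resizing or
# recompression, so a circulating copy can be matched back to a known
# original.
from PIL import Image

def average_hash(path, size=8):
    """Downscale to size x size grayscale, then record whether each pixel is
    above or below the mean brightness (a 64-bit fingerprint for size=8)."""
    img = Image.open(path).convert("L").resize((size, size))
    pixels = list(img.getdata())
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming_distance(h1, h2):
    """Number of differing bits; small distances suggest the same image."""
    return sum(a != b for a, b in zip(h1, h2))

# Usage (file names are placeholders):
#   d = hamming_distance(average_hash("original.jpg"), average_hash("copy.jpg"))
#   if d <= 5: print("likely the same underlying image")
```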
Available detection programs are being tested with examples
that are “very different than going into the wild, where images that have been
making the rounds and have gotten modified and cropped and downsized and
transcoded and annotated and God knows what else has happened to them,” Mr.
Farid said.
“That laundering of content makes this a hard task,” he
added.
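That laundering can also be simulated deliberately, so a detector is tested on something closer to what actually circulates. A rough sketch, assuming the Pillow imaging library, with arbitrary example parameters for the cropping, downsizing and re-encoding steps:

```python
# Sketch of the "laundering" described above, applied on purpose: run a
# pristine test image through cropping, downscaling and JPEG re-encoding so
# a detector can be evaluated on in-the-wild-like input. Parameters are
# arbitrary examples.
from io import BytesIO
from PIL import Image

def launder(path, crop_margin=0.05, scale=0.5, jpeg_quality=60):
    img = Image.open(path)
    w, h = img.size
    # Crop a small margin off every edge.
    dx, dy = int(w * crop_margin), int(h * crop_margin)
    img = img.crop((dx, dy, w - dx, h - dy))
    # Downscale, as social platforms often do.
    img = img.resize((int(img.width * scale), int(img.height * scale)))
    # Re-encode as a lower-quality JPEG (transcoding).
    buf = BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=jpeg_quality)
    buf.seek(0)
    return Image.open(buf)
```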
The Content Authenticity Initiative, a consortium of 1,000
companies and organizations, is one group trying to make generative technology
obvious from the outset. (It’s led by Adobe, with members such as The New York
Times and artificial intelligence players like Stability A.I.) Rather than
piece together the origin of an image or a video later in its life cycle, the
group is trying to establish standards that will apply traceable credentials to
digital work upon creation.
Adobe said last week that its generative technology Firefly
would be integrated into Google Bard, where it will attach “nutrition labels”
to the content it produces, including the date an image was made and the
digital tools used to create it.
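At its simplest, such a credential binds a cryptographic hash of the file to a record of when and with what it was made, attached at creation time. The sketch below illustrates only that idea; the standards work behind the initiative (the C2PA specification) is far more elaborate and adds cryptographic signatures, and the field names here are invented for illustration.

```python
# Simplified illustration of the provenance idea: tie a hash of the exact
# bytes to a small record of creation date and tool, written at creation
# time. Field names are illustrative, not the C2PA schema.
import hashlib
import json
from datetime import datetime, timezone

def make_credential(path, tool_name):
    digest = hashlib.sha256(open(path, "rb").read()).hexdigest()
    return json.dumps({
        "content_sha256": digest,                           # ties the record to the exact bytes
        "created": datetime.now(timezone.utc).isoformat(),  # date the content was made
        "generator": tool_name,                             # digital tool used to create it
    }, indent=2)

def matches_credential(path, credential_json):
    """Later, anyone can recompute the hash and check it against the record."""
    digest = hashlib.sha256(open(path, "rb").read()).hexdigest()
    return json.loads(credential_json)["content_sha256"] == digest
```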
Jeff Sakasegawa, the trust and safety architect at Persona,
a company that helps verify consumer identity, said the challenges raised by
artificial intelligence had only begun.
“The wave is building momentum,” he said. “It’s heading
toward the shore. I don’t think it’s crashed yet.”