"As alter egos go, Augustus Caesar is not a bad one for Mark Zuckerberg, pontifex maximus of Meta, owner of the Facebook family of apps. Both men started their march to power as teenagers. Both stopped at nothing to build empires—though unlike the impetuous Mr Zuckerberg, Augustus’s motto was “make haste slowly”. Both gave the illusion of sharing power (Augustus with the Senate, Mr Zuckerberg with shareholders) while wielding it almost absolutely. The Roman emperor is Mr Zuckerberg’s role model. In a recent podcast he used the 200-year era of stability ushered in by Augustus to illustrate why he is making Meta’s generative artificial-intelligence (AI) models available in a way that, with some poetic licence, he calls open source.
On July 23rd Mr Zuckerberg issued a manifesto laying out in greater detail the business case for open-source AI. That coincided with the release by Meta of Llama 3.1, a freely available large language model (LLM) whose most powerful version, it says, rivals the top offering from OpenAI, maker of ChatGPT. Mr Zuckerberg said Meta’s intent was to liberate itself from the sort of gatekeepers that have constrained it in the past, such as Apple and its iPhones. That sounds sensible. It was lost on no one, though, that Meta is Llama’s sole gatekeeper.
Meta’s new model is certainly an attention-grabber. The biggest version has 405bn parameters (a common measure of an LLM’s power), almost six times as many as its predecessor. Mr Zuckerberg claimed that by next year the company’s models will reign supreme, throwing down the gauntlet to rivals like OpenAI that have taken a closed approach. As both open and closed models get bigger, the debate over which is better is developing an almost theological intensity. On one side are the open-source purists in favour of decentralised “little tech”. On the other are closed-source realists who argue that greater centralisation and control are better for safety and national security.
Mr Zuckerberg’s manifesto further stirs that debate. Though questions remain about how genuinely open Meta’s models are, and about its commitment to the approach, he makes a good case.
As he points out, open-source software has an illustrious pedigree. In the 1990s Linux, an obscure operating system created by a university student, eventually became the industry standard for servers, thanks in part to the backing of IBM, a tech giant of its day. The beauty of Linux’s approach was that it provided full access to its source code, enabling developers to modify and improve it.
That differs subtly from Meta’s approach to AI. Percy Liang, co-founder of Together AI, a cloud-computing startup that will use Llama 3.1, calls the tech giant’s models “open weights”, rather than open source. Meta makes available the numerical values used in its models, known as weights, but doesn’t reveal the data on which the models are trained, which is the equivalent of the source code. That may reduce the ability of developers to customise its models. It is better than nothing, though.
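To make the distinction concrete, here is a minimal Python sketch of what “open weights” lets a developer do (the model identifier “meta-llama/Meta-Llama-3.1-8B”, the Hugging Face transformers library and the gated-access step are assumptions for illustration, not details from the article): the published weights can be downloaded and run, but nothing in the release lets anyone inspect or rebuild the training data behind them.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository ID; access to Meta's gated repo requires accepting its licence.
model_id = "meta-llama/Meta-Llama-3.1-8B"

# The weights themselves can be downloaded and run...
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Open weights differ from open source because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# ...but the training corpus and data-curation recipe are not published, so the model
# cannot be audited or rebuilt the way Linux can be from its source code.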
This also raises the question of whether Meta might change its approach, leaving developers who rely on its models high and dry. Meta is not a charity, and building LLMs can be costly. Investors have shown in their hostile reaction to Mr Zuckerberg’s metaverse ambitions that, despite his control of the company’s voting shares, he does not have a blank cheque to splurge on whatever he likes. If Meta does not get the commercial benefits it expects, it may be forced to reconsider its approach.
Openness, meanwhile, raises two big safety concerns. The first is harm prevention. Though Meta has probed Llama 3.1 for dangers, the bigger the models get, the greater the risk that they go rogue or are misused. Once released, such models do not have a kill switch. That, in turn, raises the issue of liability. Who bears responsibility if these models fall into the hands of bad actors? Regulators are grappling with such questions; a clampdown could affect the long-term viability of open-source AI.
Mr Zuckerberg’s rebuttal starts with self-interest. Meta benefits from its interactions with the open-source community, which will suggest ways to make its models better, he argues. Better models should in turn help the firm improve the performance of the AI products it offers to users of Facebook, Instagram and WhatsApp, boosting engagement and profit. Meta’s business is based on advertising, rather than subscriptions, so giving its models away carries little risk of cannibalising its own revenue.
What is more, though he does not say this, making its large language models available for free helps commodify the industry, undercutting the prospects of rival tech giants. As with IBM, which backed Linux against Microsoft’s Windows, Meta’s megabucks and clout are giving open-source AI a tailwind. Big firms such as Nvidia, which designs the chips that power generative AI and sells related offerings, and Amazon Web Services, a cloud provider, are incorporating Llama 3.1 into their products.
Mr Zuckerberg also insists that it is safer for power to rest in the hands of the many rather than the few. When it comes to national security, closing models to prevent China from getting its hands on them would be counter-productive, he writes. It would hurt American innovation, and China might be able to steal the secrets anyway.
Actium man
Mr Zuckerberg’s long-term bet is that openness will be good for the world as well as Meta. He has likened it to Augustus’s Pax Romana. After years of civil war in the wake of Julius Caesar’s death in 44BC, few in Ancient Rome could conceive of the idea of a prolonged period of peace and prosperity. Likewise, he reckons, few investors at this point are able to imagine the long-term potential of Meta giving away its crown jewels. Like his hero, he has battles to fight before his position is secure. But he is on the warpath." [1]
1. “Augustus on the open-source warpath”. The Economist (London), Vol. 452, Iss. 9407, Jul 27, 2024, p. 58.