Saturday, May 2, 2026

Cancer’s cheat code


 

“CANCERS ARE real biological cheats. Whereas most of the cells in a healthy animal’s body get along by following the same set of genetic rules, cancer cells shamelessly ignore them. Healthy cells, for example, can replicate themselves only about 50 times before shutting down. Cancer cells, by contrast, carry a mutation that allows them to divide indefinitely. But recent work has revealed an entirely new level of oncological shenanigans. It now appears that many cancer cells have also stopped obeying Mendel’s laws of inheritance, explaining why many cancers are able to evolve resistance to chemotherapy drugs at seemingly supernatural rates.

 

These laws, worked out in the 19th century by Gregor Mendel, an Augustinian friar, describe how heritable traits pass down through the generations, setting limits on the ways in which children can differ from their parents. Mendel’s initial experiments were on peas in the monastery garden, but his laws have since been found to apply to everything from human height to disease resistance in individual cells.

 

As Paul Mischel of Stanford University describes in a paper in this week’s Cell, some cancer cells refuse to play along. His work reveals that in about 20% of human cancer samples some DNA escapes from the chromosomes to which it is normally bound and forms tiny, circular bodies of extra-chromosomal DNA (ecDNA) that get scattered throughout the nucleus of a cell. Thus scattered, they are no longer subject to the rigours of mitosis, the conventional process by which chromosomes divide into two identical copies, one for each daughter cell. This adds an element of unpredictability to how genes are inherited, allowing mutations to occur faster and on a more dramatic scale.
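
To make the contrast with orderly chromosome division concrete, here is a minimal Python sketch. It rests on an assumption that the article does not state: that once a cell has copied its ecDNA, each circle ends up in one of the two daughter cells independently, with probability 1/2. The function names, the starting copy number and the number of generations are illustrative only.

```python
import random


def divide(copies: int, rng: random.Random) -> tuple[int, int]:
    """Copy every ecDNA circle, then hand each copy to a random daughter cell.

    Assumption for illustration: each circle goes to either daughter with
    probability 1/2, independently of all the others.
    """
    doubled = 2 * copies
    to_first = sum(rng.random() < 0.5 for _ in range(doubled))
    return to_first, doubled - to_first


def copy_numbers_after(generations: int, start: int, seed: int = 0) -> list[int]:
    """Return the ecDNA copy number of every cell after some cell divisions."""
    rng = random.Random(seed)
    cells = [start]
    for _ in range(generations):
        next_cells = []
        for copies in cells:
            next_cells.extend(divide(copies, rng))
        cells = next_cells
    return cells


cells = copy_numbers_after(generations=6, start=10)
# Chromosomal genes would give every cell the same count; here counts spread out.
print(len(cells), "cells; copy numbers range from", min(cells), "to", max(cells))
```

Under this assumption the average copy number per cell stays the same, but individual cells drift far above or below it, which gives selection far more variety to act on than equal division would.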

 

Such cellular skulduggery had previously been seen in bacteria and fungi, which use these tricks to develop resistance to drugs. It was not until Dr Mischel began looking into the subject in 2012, however, that cancer cells were found to be equally sneaky. Since then, he and his colleagues have found that ecDNA fragments overwhelmingly contain information on defence mechanisms that the cancer cell can use to rapidly replicate and to avoid being destroyed. This may be because cells carrying such ecDNA proliferate more easily. It certainly increases the chances of harmful new traits emerging faster than would be permitted by Mendel’s rules.

 

It also reveals a potential vulnerability. Dr Mischel worked in close collaboration with Howard Chang, chief scientific officer at AMGEN, a biotech company, to reveal that daughter cells can benefit from ecDNA only if these circular snippets are able to weave themselves back into their chromosomes after mitosis. The ecDNA does this with the help of constituent “anchor proteins” that return it to the chromosomes and specific DNA sequences that allow it to integrate back into them.

 

Dr Mischel views these sequences and the anchor proteins as prime targets for future treatment. “Introducing drugs that disable or destroy them ought to leave the ecDNA adrift and remove the advantages it brings to tumour cells,” he says. That work is in its infancy, although Dr Mischel says some suitable anchor proteins have already been identified. Clinical trials are pending.

 

As important as ecDNA may be as a mechanism for explaining the behaviour of some aggressive cancers, “It would be an oversimplification to say that it is the only factor,” says Lillian Siu, president of the American Association for Cancer Research and oncologist at the Princess Margaret Cancer Centre in Toronto. In her view, humdrum mutations caused by genome instability and defective DNA-repair jobs contribute to the appearance of ecDNA which, in turn, may enhance such instability. Even if disabling anchor proteins can slow the rapid evolution driven by ecDNA, the forces that cause it to appear in the first place are likely to persist.” [1]

 

1. Cancer’s cheat code. The Economist; London Vol. 459, Iss. 9495,  (Apr 18, 2026): 86.

The last nail in the coffin of any pretence that a rules-based order or international law exists.

“WHEN AMERICA and Israel went to war on February 28th, many expected Iran to halt shipping in the Strait of Hormuz. Few would have predicted that, less than two months later, Donald Trump would impose a blockade of his own, aimed at shipping to and from Iranian ports and coastal areas. It took effect on April 13th. Mr Trump hopes that economic strangulation can force Iran to open the strait where bombs have failed. It is a gamble that could deepen the global energy crisis and trigger a new escalation.

 

America’s logic is simple. Iran’s threats have sharply reduced tanker traffic through Hormuz. Yet Iran continues to export its own oil, albeit in smaller volumes. It has also let some ships pass if they pay a toll. Mr Trump’s message is that if neutral cargoes cannot pass unhindered, neither can Iranian ones. The military side of the plan is “entirely feasible”, says Mark Montgomery, a retired rear admiral. America can board and seize ships fairly easily. “You don’t need to catch every ship,” he adds. “Just enough ships to send the message.”

 

Oil-pressure warning

 

The economic and political aspects are more complicated. The aim is probably to cut Iran’s economic lifeline and force the regime to give ground in peace talks, especially over its nuclear programme. In theory Iran is vulnerable. Given current oil-storage levels, it could be forced to curb production within 10-20 days of a complete, effective blockade, reckons Ernest Censier of Vortexa, a data firm. “As Iranian oil exports collapse, there will be no money for imports, so activity slumps, the currency enters a devaluation spiral and hyperinflation sets in,” says Robin Brooks of the Brookings Institution, an American think-tank. “I have no doubt this will bring the mullahs to the negotiating table in good faith.”

 

Others are less sure. Iran assumed that its oil exports would be disrupted, says Esfandyar Batmanghelidj, chief executive of the Bourse & Bazaar Foundation, a think-tank. By printing money, selling about 100m barrels of oil held in floating storage off the coasts of Malaysia and China, and obtaining informal credit from suppliers of imports, Iran could withstand perhaps six months of pressure.

 

That leaves two big questions. The first is the effect on energy markets, including America’s. The loss of Iranian output alone is not catastrophic. But it comes on top of the far larger supply of Gulf oil stranded by the largely closed strait. With the ceasefire looking shaky, Iran has little incentive to reopen passage and may resume attacks on neutral ships.

 

Importers would be forced to draw down already limited stocks, so Brent crude futures could rise to $150 a barrel by the end of April. Factor in the risk of Iranian strikes on Gulf production facilities, pipelines and ports, as well as possible attacks on Red Sea shipping by Iran’s Houthi allies in Yemen, and the measure is unlikely to last several weeks without a new wave of price rises.

 

The second question is which countries may be blockaded. India, for instance, has denied paying a toll for the passage of its ships, something which, according to Mr Trump on April 12th, would be grounds for interdiction. Yet on the same day America’s Central Command said the blockade would be enforced impartially, as international law requires, against ships of all countries that had passed through Iranian ports or coastal waters.

 

That would include Indian ships. Oil bound for China, Pakistan and Thailand also left Hormuz in the days after the ceasefire. Before that France and Turkey, both American allies, had sent their ships out, apparently with Iran’s consent. America may need to board only a few ships to deter others from trying to run the blockade. But even that could anger some friendly countries. Some American officials think China will not object to the blockade, yet accepting it would set a dangerous precedent. China has long worried about a blockade of the Strait of Malacca in the event of a war in the Pacific.

 

Mr Trump’s decision to impose a blockade, taken after he mused that he might control Hormuz “jointly” with the Iranian regime (a practice that would upend the international law governing such waterways), shows that the principle of freedom of navigation is under enormous strain.

 

Kevin Rowlands, who until last year headed the Royal Navy’s think-tank and now edits the RUSI Journal, a military journal, concludes that it is “another nail in the coffin for any pretence that there is a rules-based order or international law”.” [1]

 

1. Double trouble. The Economist; London Vol. 459, Iss. 9495,  (Apr 18, 2026): 53, 54.

Why high-quality data now matters more than ever in artificial intelligence


“Language models have made great progress. But to take on even more complex tasks and become more versatile, they need more than computing power alone.

 

Whether it is ChatGPT, Gemini or Qwen, all these language models are ultimately built on the same technology. What sets them apart is the training data. The kind of training data that is collected, filtered and generated determines the quality of a language model: how reliably it reproduces facts, how well it performs tasks and when it hallucinates.

 

In short: language models work very well on tasks for which good training data exists. On tasks without good training data, they quickly break down.

 

Language models are currently improving mainly because their weaknesses are identified and data is collected or generated specifically to fill those gaps. For example, language models now perform at expert level in mathematics and programming, two areas in which they had major shortcomings only two years ago.

 

How this works becomes clear when one looks at the training data. As a reminder: language models are trained to predict the next word in a text, first on huge datasets and then, for fine-tuning, on specific examples such as questions and their answers. A step as simple as generating the next word is enough to produce whole texts, answer questions and write programs.

 

The texts for this initial training come mostly from the internet. To obtain them, AI developers first collect all the text that is available. These texts are remarkably varied: Wikipedia, news articles, scientific papers, forum discussions and, of course, a lot of advertising. But a large share of the available text is unintelligible, of poor quality and not directly suitable for training. Such texts are identified, sometimes with the help of small language models, and filtered out. What remains is a small part of the internet, but still a huge amount of text, equivalent to hundreds of millions of books.

 

The texts are then weighted: texts that appear often in very similar forms in the training data are down-weighted, while other texts of very high quality are duplicated so that they appear several times in the training data. This is a delicate process. On the one hand, facts such as “Berlin is the capital of Germany” should appear several times so that the model learns and correctly reproduces them. On the other hand, developers want to avoid texts appearing too often in identical or very similar forms, because otherwise the language model tends to reproduce them word for word.

 

For example, there are articles from The New York Times that OpenAI’s language model GPT-4 reproduces almost verbatim after minimal prompting. This happens when such texts appear very often in the training data. This observation, incidentally, is the basis of The New York Times’s ongoing copyright-infringement lawsuit against OpenAI and Microsoft.

 

When machines generate data

 

But why can language models perform such varied tasks at all? Why can they summarise texts, learn to program and answer questions simply by predicting the next word in texts from the internet?

 

The internet contains so much text that even rare formats, such as question-and-answer pairs or texts with matching summaries, occur in large numbers. But the relative share of such examples is very small. That is why, after this initial training, a model often responds to a question not with an answer but with another question, because many websites consist only of questions, such as quiz sites or practice exercises.

 

To turn a language model trained on internet data into a useful assistant that answers questions and follows instructions, it is fine-tuned. The simplest and most effective way to fine-tune it is to train on data that shows the desired behaviour, for example questions paired with suitable answers. In this way the model learns to answer a question rather than ask one itself.

 

In the first generation of language models, people played an important role in generating such data. They wrote answers to questions and rated different answers as better or worse, allowing the model to learn which answers people prefer.

 

Synthetic data, that is, data generated by language models themselves, is becoming ever more important for training, because more data is generally useful and because the excellent texts available on the internet are already widely used for training. Many texts from the internet are of low quality and are often discarded for training purposes. But such texts can be an excellent basis for generating synthetic data that can then be used for training. Language models are used to turn such poor or mediocre data into high-quality data.

 

Like people, language models learn better when they see information presented in different variants. It can therefore be effective to generate different versions of a text with a language model and use them for training. Such synthetic data plays an increasingly important role in training language models.

 

Learning from synthetic data is very efficient. Synthetic data can also be used to replicate the capabilities of other models. For example, if a company such as OpenAI releases a new, high-quality language model, other companies could use it to generate data that improves their own models, even if OpenAI’s terms of service explicitly forbid this.

 

A striking example is DeepSeek V3. It is a very good language model that the Chinese company DeepSeek made freely available in December 2024. V3 quickly made headlines because DeepSeek’s engineers had managed to train a very good model fairly cheaply. As a result, the share price of Nvidia, a maker of AI chips, fell by 17 percent in a single day in January. The main reason for the Chinese success is that DeepSeek’s staff worked with very high-quality data.

 

Better data makes it possible to train an equally good model with less computing power, and therefore at lower cost.

 

Asked “What model are you?”, V3 replies: “I am an AI language model called ChatGPT, created by OpenAI”, which suggests that some of the training data came from OpenAI models.

 

Our own research supports this assumption: V3 responds to many prompts in a way that is very hard to distinguish from GPT-4’s answers, which suggests that some of DeepSeek’s training data was generated with GPT-4.

 

This could also be because the DeepSeek model was trained on data from the internet, since by 2024 the internet already contained many texts generated by OpenAI models.

 

Analysing the thought steps

 

One of the most important developments in language models over the past year and a half has been training them to carry out long reasoning processes. Most models now have such a “thinking” function: faced with questions that require reflection, the model first works through reasoning steps and then gives an answer based on them. Examples include OpenAI’s o1 model, Google’s Gemini Thinking and DeepSeek’s R1 model. Such reasoning steps are very useful for answering more complex questions.

 

This, too, can be illustrated with a short example. Question: Ona has three pears and buys two more; how many does she have then? The reasoning steps, or reasoning process, run as follows: Ona has three pears. She buys two more. Three plus two equals five. Answer: five pears.

 

Such reasoning processes not only make the answer comprehensible but, more importantly, significantly improve its quality. Much harder questions can be answered by developing the answer through these reasoning processes. They are especially useful for difficult questions, such as mathematical ones. In the reasoning process the model proposes various approaches, rejects some, tries others and corrects errors as they occur. For hard mathematical problems, reasoning processes can easily run to 50 pages or more. They are usually not shown to the user. This comes at a cost: for longer chains of reasoning the language model has to generate more text, which requires more processing power.

 

The difficulty in training a model on such reasoning processes is that very little data of this kind exists in written form. When mathematicians solve a hard problem, for example, they usually write down only the correct solution, not the complicated reasoning and attempts that led to it.

 

To learn such reasoning processes in mathematics, so-called reinforcement learning plays a decisive role. In this process models try out different lines of reasoning, and the lines that lead to correct solutions become correspondingly more likely. This allows language models to learn to generate long and complex reasoning processes that lead to a solution. It works only if the model can already reason soundly about mathematics; otherwise it could not generate any reasoning processes that lead to a solution.

 

Reinforcement learning can therefore be seen as a method by which language models learn from self-generated, synthetic data. A precondition is the ability to assess automatically whether a text or result is good or bad. This is often possible in mathematics and programming, and it is the main reason language models have become so powerful in these fields.

 

Language models are increasingly used in systems in which they interact with computers on their own to carry out tasks such as booking flights, evaluating data or doing research. Such systems are called agents. As with reasoning processes, language models here also learn from examples in which such tasks were solved successfully, some created by people and some by language models themselves.

 

To assess how well a student at the Technical University of Munich understands my lecture material, I, like most of my colleagues, use exams and coursework. Exam-style questions are also often used to test the capabilities of language models. One example is the United States Medical Licensing Examination, the licensing exam for doctors in the United States. The exam consists of multiple-choice questions on fundamentals, clinical knowledge and the treatment of diseases. General language models such as GPT-4, as well as language models specialised for medicine, pass such exams easily and achieve scores similar to those of medical professionals. The same is true of other professions.

 

But just as a good exam grade only partly predicts how well a student can apply lecture material at work or in research, a language model’s high performance says even less about its ability to carry out processes productively in a professional setting such as a hospital.

 

People can transfer knowledge to new situations far more flexibly than language models can. Language models, on the other hand, are particularly effective when the training data closely resembles the task.

 

Language models can therefore certainly be used for very complex processes in professional life; but they must be trained on suitable data.

 

Filling the data gaps

 

What are the consequences of an AI centred on training data? Language models are especially well suited to tasks for which plenty of usable, high-quality data exists. Even the earliest language models excelled at writing things for which the internet offers many good examples, such as recipes, summaries and general knowledge.

 

Today’s models also perform very well on very difficult mathematical or programming tasks, because a great deal of high-quality data exists or has been generated.

 

For tasks where collecting or generating enough data is harder, as in many scientific fields, legal services or internal business processes, the success of language models will depend on filling these data gaps.

 

The fact that general-purpose models keep improving changes little about this. Those who want to use language models for specialised tasks must train them on suitable data. There is therefore still great potential to make language models much better and more useful. A significant part of that work will involve collecting and generating training data.

 

Prof. Dr. Reinhard Heckel holds the Chair of Machine Learning in the Department of Computer Engineering at the Technical University of Munich.

 

Digital economy

Everything that matters about artificial intelligence, the platform economy and digitalisation, along with a wealth of in-depth knowledge, can be found in our PRO digital-economy products.” [1]

 

1. Wieso es in der Künstlichen Intelligenz jetzt erst recht auf hochwertige Daten ankommt. Frankfurter Allgemeine Zeitung; Frankfurt. 09 Feb 2026: 19. By Reinhard Heckel.

Why High-Quality Data Matters Now More Than Ever in Artificial Intelligence


“Language models have achieved remarkable progress. However, for them to tackle even more complex tasks and become more versatile, they require more than just computational power.

 

Whether it’s ChatGPT, Gemini, or Qwen—all these language models ultimately rely on the same underlying technology. What distinguishes them is the training data used. The specific training data that is collected, filtered, and generated determines the quality of a language model: how reliably it reproduces facts, how well it executes tasks, and when it ‘hallucinates.’

 

In short: For tasks where high-quality training data is available, language models perform exceptionally well. For tasks lacking such data, they fail quickly.

 

Currently, language models are improving primarily because we are identifying their weaknesses and deliberately collecting or generating data to bridge those gaps. For instance, language models today operate at an expert level in mathematics and programming—two areas where, just two years ago, they still exhibited significant deficiencies.

 

A look at the training data reveals how this process works. As a reminder: Language models learn to predict the next word in a text—initially by training on massive datasets, and subsequently by fine-tuning on specific examples, such as questions paired with their corresponding answers. As simple as the act of generating the next word may sound, it is sufficient to enable the creation of entire texts, the answering of questions, and even programming.

 

The texts used for this initial training are largely sourced from the internet. To this end, AI developers first gather every available text they can find. This collection is highly diverse, encompassing Wikipedia entries, news articles, scientific papers, forum discussions and, naturally, a great deal of advertising copy. However, a significant portion of the texts obtained in this manner is gibberish or of low quality and is unsuitable for direct use in training. These unsuitable texts are identified, in some cases with the aid of smaller language models, and filtered out. What remains is a mere fraction of the internet, yet still a colossal volume of text, equivalent to hundreds of millions of books.

 

Next, the texts are weighted: texts that appear frequently—and in very similar forms—within the training data are downweighted, while other texts of exceptionally high quality are duplicated, thereby appearing multiple times within the training set. This is a delicate process. On one hand, factual statements—such as "Berlin is the capital of Germany"—should appear multiple times to ensure the model correctly learns and reproduces such facts. On the other hand, developers aim to avoid having texts appear in identical or highly similar forms too frequently; otherwise, the language model tends to reproduce these texts verbatim.
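
The down-weighting half of this step can be sketched as follows. The article does not say how developers detect similar texts; this example assumes one common approach, comparing sets of overlapping three-word sequences (shingles) with Jaccard similarity, and gives each text a sampling weight of one divided by the number of near-duplicates in its group. The threshold, the function names and the sample texts are illustrative choices, and the up-weighting of high-quality texts is left out.

```python
def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Split a text into overlapping runs of k words."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def jaccard(a: set, b: set) -> float:
    """Share of shingles that two texts have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sampling_weights(docs: list[str], threshold: float = 0.5) -> list[float]:
    """Weight each text by 1 / (number of texts that are near-duplicates of it).

    A text always counts as similar to itself, so a unique text gets weight 1.0.
    This compares all pairs and is only meant for small illustrative inputs.
    """
    sets = [shingles(doc) for doc in docs]
    weights = []
    for current in sets:
        similar = sum(1 for other in sets if jaccard(current, other) >= threshold)
        weights.append(1.0 / similar)
    return weights


docs = [
    "berlin is the capital of germany",
    "berlin is the capital of germany today",
    "language models predict the next word",
]
print(sampling_weights(docs))  # prints [0.5, 0.5, 1.0]
```

The two near-identical sentences share one unit of weight between them, so the fact still appears in training but no single wording dominates.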

 

For instance, there are articles from *The New York Times* that the OpenAI language model GPT-4 can reproduce almost word-for-word with only minimal prompting. This occurs when such texts appear with very high frequency within the training data. Incidentally, this observation forms the basis of the ongoing lawsuit filed by *The New York Times* against OpenAI and Microsoft for copyright infringement.

 

When Machines Generate Data

 

But why are language models capable of handling such a diverse range of tasks in the first place? How can they learn to summarize texts, write code, and answer questions—simply by predicting the next word in texts scraped from the internet?

 

The internet contains such a vast volume of text that even rare formats—such as question-and-answer pairs, or texts accompanied by their corresponding summaries—occur in large numbers. However, the relative proportion of such examples remains very small. Consequently, following this initial training phase, a model will often respond to a question not with an answer, but with another question—precisely because the internet also hosts numerous webpages consisting solely of questions, such as quiz sites or practice exercises.

 

To become a useful assistant capable of answering questions and following instructions, a language model trained on internet data must undergo a process known as "fine-tuning." The simplest method of fine-tuning, and an effective one, is to train the model on data that exemplifies the desired behavior, for instance pairs of questions and their corresponding answers. In this way, the model learns to respond to a question instead of asking one itself.

 

In the first generation of language models, humans played a major role in generating such data. They wrote answers to questions and rated various responses as better or worse, enabling the model to learn which answers humans prefer.

 

Synthetic data, that is, data generated by language models themselves, is becoming increasingly important for training, both because having more data generally helps and because the highest-quality texts on the internet have already been largely used for training. Many texts from the internet are of low quality and are therefore discarded from the training set. However, such texts can serve as an excellent foundation for generating synthetic data, which, in turn, can prove beneficial for training purposes.

 

Language models are employed to transform such poor or mediocre data into high-quality data.

 

Much like humans, language models learn more effectively when exposed to information presented in various forms. Consequently, it can be an effective strategy to utilize a language model to generate multiple variations of a single text and subsequently incorporate these into the training process. Such synthetic data is playing an increasingly pivotal role in the training of language models.

 

Training on synthetic data is a highly effective approach. Consequently, synthetic data can also be leveraged to replicate the capabilities of other models. For instance, if a company like OpenAI releases a new, high-performance language model, other companies could utilize it to generate data that enhances their own models—even if OpenAI’s terms of service explicitly prohibit such usage.

 

An intriguing example of this is DeepSeek V3, a highly capable language model that the Chinese company DeepSeek released as a freely accessible resource in December 2024. V3 quickly made headlines, as the innovators at DeepSeek had succeeded in training a top-tier model at a relatively low cost. In the wake of this development, the share price of the AI chip manufacturer Nvidia plummeted by 17 percent in a single day in January. A key factor behind this Chinese success lies in the fact that the DeepSeek team worked with exceptionally high-quality data. Superior data enables the training of a model of equal caliber using less computational power and, consequently, at a lower cost. When asked, "What model are you?", V3 responds: "I'm an AI language model called ChatGPT, created by OpenAI", a response strongly suggesting that its training data was, at least in part, derived from OpenAI models. Our own research supports this hypothesis: V3 responds to many prompts in a manner that is very difficult to distinguish from responses generated by GPT-4, suggesting that a portion of DeepSeek’s training data was generated using GPT-4.

 

However, this could also be attributed to the fact that the DeepSeek model was trained on data from the internet, since even as early as 2024 the internet already contained numerous texts that had been generated by OpenAI models.

 

Breaking Down Thought Processes

 

One of the most significant developments in the field of language models over the past year and a half has been teaching them to execute extended chains of thought. Most models now feature such a "thinking" function: when faced with questions requiring reflection, the model first executes a series of internal thought steps and then provides an answer based on those steps. Examples of this include OpenAI’s o1 model, Google’s Gemini Thinking, and DeepSeek’s R1 model. Such thought steps are highly beneficial for answering more complex questions.

 

Let us illustrate this with a simple example. Question: Anne has three pears and buys two more—how many does she have in total? The thought steps—or the reasoning process—proceed as follows: Anne has three pears. She buys two more. Three plus two equals five. Answer: five pears.

 

Such thought steps not only make an answer comprehensible but, more importantly, significantly enhance its quality. This enables the models to answer far more difficult questions by systematically working out the solution through these intermediate steps. These thought steps prove particularly useful when tackling complex inquiries, such as mathematical problems. During this internal reasoning process, the model proposes various approaches to a solution, discards some, experiments with others, and corrects errors as they arise. For intricate mathematical problems, these thought steps can easily span 50 pages or more; typically, however, they are not displayed to the user. This comes at a price: for longer chains of thought, the language model must generate more text, which requires greater computational power.

 

The difficulty in teaching a model such reasoning processes lies in the fact that very little data of this kind exists in written form. For instance, when mathematicians solve a difficult problem, they typically record only the correct solution—not the intricate lines of reasoning and attempts that contributed to finding that solution.

 

To acquire such mathematical reasoning steps, reinforcement learning plays a pivotal role. In this process, models experiment with various lines of reasoning; those lines of reasoning that lead to correct solutions are subsequently reinforced, making them more likely to occur in the future. Through this mechanism, language models learn to generate long and complex reasoning processes that culminate in a solution. This approach works only if the model already possesses strong mathematical reasoning capabilities; otherwise, it would be unable to generate any reasoning processes that lead to a solution.
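The reinforcement loop described above can be illustrated with a deliberately tiny sketch. It is not how any production model is trained: the three "strategies", the multiplicative weight update, and all names (`STRATEGIES`, `train`, `lr`) are assumptions made purely for illustration. The sketch only shows the core idea that lines of reasoning which reach a verified correct answer become more likely to be sampled again.

```python
import random

# Hypothetical toy "lines of reasoning" for simple addition questions.
# Each strategy maps the two numbers in the question to a candidate answer.
STRATEGIES = {
    "add": lambda a, b: a + b,        # the correct line of reasoning
    "subtract": lambda a, b: a - b,   # a flawed line of reasoning
    "multiply": lambda a, b: a * b,   # another flawed line of reasoning
}

def train(problems, steps=500, lr=0.1, seed=0):
    """Reinforce strategies whose answers turn out to be correct."""
    rng = random.Random(seed)
    weights = {name: 1.0 for name in STRATEGIES}  # start with no preference
    names = list(STRATEGIES)
    for _ in range(steps):
        a, b, target = rng.choice(problems)
        # Sample a line of reasoning in proportion to its current weight.
        name = rng.choices(names, weights=[weights[n] for n in names])[0]
        if STRATEGIES[name](a, b) == target:
            # Reward: make this line of reasoning more likely in the future.
            weights[name] *= 1.0 + lr
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

# Each problem is (first number, second number, known correct sum).
problems = [(3, 2, 5), (4, 7, 11), (10, 1, 11)]
probs = train(problems)
print(probs)
```

In this toy setting the "add" strategy ends up with almost all of the probability mass. The sketch also mirrors the prerequisite named in the text: if the correct strategy started with a weight of zero, it would never be sampled and therefore never reinforced.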

 

Reinforcement learning can therefore be viewed as a method through which language models learn from self-generated, synthetic data. A prerequisite for this is the ability to automatically assess whether a given text or result is good or bad. In mathematics and programming, this is often possible—a key reason why language models have become so powerful in these fields.
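The role of automatic assessment can be sketched in a few lines. The example below assumes a hypothetical setup in which each self-generated sample carries a reasoning text and a final answer, and in which the correct result is known in advance; the names `verify` and `filter_synthetic` are illustrative and not taken from any library. Only samples that pass the check would be kept as synthetic training data.

```python
def verify(candidate_answer: str, expected: int) -> bool:
    """Automatic check: does the final answer equal the known result?"""
    try:
        return int(candidate_answer.strip()) == expected
    except ValueError:
        # Answers that are not integers cannot be verified and are rejected.
        return False

def filter_synthetic(samples, expected):
    """Keep only the self-generated samples whose answer passes the check."""
    return [s for s in samples if verify(s["answer"], expected)]

# Hypothetical model outputs for: "Anne has three pears and buys two more."
samples = [
    {"reasoning": "Three plus two equals five.", "answer": "5"},
    {"reasoning": "Three times two equals six.", "answer": "6"},
    {"reasoning": "I am not sure.", "answer": "unknown"},
]

kept = filter_synthetic(samples, expected=5)
print(kept)
```

A check of this kind is easy to write for arithmetic or for code that must pass tests, which is the point the text makes about mathematics and programming. For open-ended tasks no comparably simple `verify` function exists.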

 

Language models are increasingly being deployed in systems where they interact autonomously with computers to perform tasks such as booking flights, analyzing data, or conducting research. Such systems are referred to as "agents." As with the thought steps discussed earlier, language models in this context also learn from examples where such tasks have been successfully completed, examples generated partly by humans and partly by the language models themselves.

 

To gauge how well a student at TU Munich understands my lecture material, I—like most of my colleagues—rely on exams and assignments. For language models, exam-style tasks are also frequently used to test the models' capabilities. One such example is the United States Medical Licensing Examination (USMLE), the licensing exam for physicians in the United States. This exam consists of multiple-choice questions covering foundational concepts, clinical knowledge, and disease management. General-purpose language models like GPT-4, as well as specialized medical language models, pass such exams with ease, achieving scores on par with those of medical professionals. The same holds true for other professional fields.

 

However, just as a high grade on an exam offers only limited predictive power regarding how effectively a student can apply lecture material in their professional work or research, a language model’s strong exam performance indicates, to an even lesser extent, whether that model is capable of productively executing real-world professional workflows, such as those found in a hospital setting.

 

Humans are able to transfer knowledge to novel situations with far greater flexibility than language models can. Language models, conversely, demonstrate exceptional performance primarily when the training data closely resembles the specific tasks at hand. Therefore, it is certainly possible to deploy language models for highly complex professional workflows; however, they must be trained on appropriate data.

 

**Filling the Data Gaps**

 

What follows from the fact that training data lies at the very heart of artificial intelligence? Language models are particularly well-suited for tasks where an abundance of usable, high-quality data is available. Even the earliest language models excelled at generating content for which numerous good examples existed on the internet—such as recipes, summaries, and general knowledge.

 

Today’s models also perform exceptionally well in highly complex tasks within mathematics or programming, precisely because a vast amount of high-quality data exists for these domains—or could be generated.

 

For tasks where gathering or generating sufficient data proves more challenging—as is often the case in various scientific fields, legal services, or internal corporate workflows—the success of language models will hinge on the ability to fill these data gaps.

 

The fact that general-purpose models are constantly improving does little to alter this fundamental reality. Anyone wishing to deploy language models for specialized tasks must train them using data specifically tailored to those applications. Consequently, there remains immense potential to make language models significantly more effective and useful. And a substantial portion of this effort will involve the collection and generation of training data.

 

Prof. Dr. Reinhard Heckel holds the Chair of Machine Learning within the Department of Computer Engineering at the Technical University of Munich (TU Munich).

 

[1]

 

1. Wieso es in der Künstlichen Intelligenz jetzt erst recht auf hochwertige Daten ankommt [Why high-quality data now matters more than ever in artificial intelligence]. Heckel, Reinhard. Frankfurter Allgemeine Zeitung; Frankfurt. 09 Feb 2026: 19.

Last nail in the coffin for any pretense that there is such a thing as a rules-based order or international law


“WHEN AMERICA and Israel began their war on February 28th, many expected Iran to choke off shipping in the Strait of Hormuz. Few would have predicted that less than two months later, Donald Trump would impose his own blockade, targeting traffic to and from Iranian ports and coastal areas. It went into effect on April 13th. Mr Trump hopes economic strangulation might force Iran to open the strait where bombs have failed. It is a gamble that could compound the global energy crisis and lead to fresh escalation.

 

America’s rationale is simple. Iranian threats have drastically reduced tanker traffic through Hormuz. But Iran has continued to export its own oil, albeit in smaller quantities. It has also allowed some ships to pass if they pay a fee. Mr Trump’s message is that if neutral cargo cannot pass unhindered, Iran’s can’t either. The military aspect of the plan is “absolutely feasible”, says Mark Montgomery, a retired rear admiral. America can board and seize ships relatively easily. “You don’t have to catch every ship,” he adds. “Just enough ships to send the message.”

 

Oil pressure warning

 

The economic and political aspects are trickier. The aim, presumably, is to sever Iran’s economic lifeline and force the regime to make concessions in peace talks, particularly over its nuclear programme. In theory, Iran is vulnerable. Given its current crude-oil storage levels, it may be forced to curb production within 10-20 days of a full, effective blockade, reckons Ernest Censier of Vortexa, a data firm. “As Iran’s oil exports collapse, there’ll be no cash for imports, so activity implodes, the currency goes into a devaluation spiral and hyperinflation ensues,” argues Robin Brooks of the Brookings Institution, an American think-tank. “There’s no doubt in my mind this will bring the mullahs to the negotiating table in good faith.”

 

Others are less sure. Iran had assumed its oil exports would be disrupted, says Esfandyar Batmanghelidj, the chief executive of the Bourse & Bazaar Foundation, a think-tank. It can endure maybe six months of pressure by printing money, selling 100m or so barrels of oil in floating storage off Malaysia and China, and securing informal credit from import-suppliers.

 

That leaves two big questions. One is the impact on energy markets, including in America. The loss of Iranian output alone is not catastrophic. But it compounds the far greater volume of Gulf supply trapped by the largely closed strait. With the ceasefire looking shaky, Iran has little incentive to reopen the passage and could well restart attacks on neutral ships.

 

Importers would be forced to draw down already limited stocks, potentially pushing Brent crude futures towards $150 a barrel by the end of April. Factor in the risk of Iranian strikes on Gulf production facilities, pipelines and ports as well as the possibility of attacks on Red Sea shipping by Iran’s Houthi allies in Yemen, and the measure looks unlikely to survive a few weeks without prompting another price surge.

 

The second issue is which countries might be caught in a blockade. India, for instance, has denied paying a fee to get its ships through, which Mr Trump said on April 12th would be the trigger for interdiction. But on the same day America’s Central Command said that the blockade would be enforced impartially—a requirement in international law—against ships from all countries that had passed through Iranian ports or coastal waters.

 

That would cover the Indian vessels. Oil bound for China, Pakistan and Thailand also moved out of Hormuz in the days after the ceasefire. Prior to that, France and Turkey, both American allies, had sent their ships through, apparently with Iranian consent. America might only need to board a handful of ships to deter others from attempting to break out. But even that could anger some friendly countries in the process. Some American officials think China will not challenge the blockade, but accepting it would set a dangerous precedent. China has long worried about a blockade around the Strait of Malacca in the event of a war in the Pacific.

 

Mr Trump’s decision to impose a blockade, which came after he toyed with the idea that he might “jointly” control Hormuz with Iran’s regime, a practice that would upend international law governing such waterways, suggests that the principle of freedom of navigation is coming under enormous stress. Kevin Rowlands, who ran the Royal Navy’s think-tank until last year and now edits the RUSI Journal, a military journal, concludes that it is “another nail in the coffin for any pretence that there is such a thing as a rules-based order or international law”.” [1]

 

1. Double trouble. The Economist; London Vol. 459, Iss. 9495 (Apr 18, 2026): 53, 54.

Friday, May 1, 2026

Breakthrough Heart Procedure Comes With Risky Tradeoffs --- New valves save lives, but don't last as long as some hoped


“After months of dizziness and arms aching so badly she could barely walk her dog, Susan Glannan lay stunned in a sunny hospital room as a doctor told her she should have open heart surgery.

 

The idea of a surgeon cracking her chest open and stopping her heart terrified her. Glannan, who was 64, lived alone. She didn't have her affairs in order. And just four years earlier, she had had a procedure that she thought would take care of her heart problem -- a diseased aortic valve. "I was disappointed and scared," she said, "and I started worrying, 'Do I have a will?'"

 

That first procedure was called a transcatheter aortic valve replacement, or TAVR.

 

It's considered one of the biggest innovations in cardiovascular medicine, offering a way to spare patients the physical and emotional trauma of open heart surgery.

 

TAVR was approved in 2011 for frail, older patients unlikely to withstand surgery -- people with no more than a few years left to live. The Food and Drug Administration later approved it for healthier patients at intermediate and low risk of dying from surgery.

 

Yet there's limited research on how long the valves might last. And as TAVR has become more widely used among younger and healthier people, some are finding that their valves don't work as well or last as long as they hoped. The procedure they thought would spare them a complicated surgery leads some to the operating table anyway.

 

"It never dawned on me that four years later I'd be like, the murmur is bigger than it ever was," Glannan said. "I never should have gotten the TAVR."

 

After her TAVR in 2021, Glannan was back on her feet in a few days. But by 2025, the TAVR valve was stiff and leaking, restricting the flow of blood to the rest of her body.

 

Glannan is one of more than 710,000 Americans who received a TAVR between 2015 and 2024. The procedure is designed to treat severe aortic stenosis, in which calcium buildup narrows the opening of the aortic valve. Aortic stenosis affects at least 1.5 million Americans, and if severe and left untreated, can lead to heart failure and death.

 

TAVR is an appealing option for patients and doctors because it's a simpler, less invasive procedure than a surgical aortic valve replacement, with shorter recovery times.

 

Aortic stenosis most commonly develops with aging, and most cases are in people over 65. The number of people under 65 who have gotten a TAVR has grown even though surgery is recommended for that age group under U.S. medical guidelines unless they have a shortened life expectancy or a health condition that makes surgery risky. TAVR valves haven't been well studied in that age group. Nearly one-third of patients ages 40 to 65 who need to replace their aortic valve receive a TAVR, according to a recent study.

 

Whether a patient has a TAVR or a surgical aortic valve replacement should depend on their health and projected life expectancy, said Dr. Vinay Badhwar, president of the Society of Thoracic Surgeons and executive chair of the West Virginia University Heart & Vascular Institute. "TAVR is an outstanding lifesaving treatment for high-risk and elderly patients, but surgery may be the better choice for low-risk younger patients," Badhwar said. "Our focus must be on the optimal long-term outcome."

 

TAVR valves, as well as many surgical valves, are made with animal tissue and eventually deteriorate -- faster in younger, more active patients, according to studies. How long they last is important to study in younger patients because "it's clear many of them are going to outlive the durability of the valve," said Dr. Martin Leon, professor of cardiology at Columbia University Irving Medical Center and co-principal investigator of a clinical trial following low-risk patients.

 

Doctors are concerned by recent data from another clinical trial in low-risk patients suggesting that some TAVR valves may wear out faster than their surgical counterparts.

 

Some TAVR patients end up needing an "explant," a complex surgery to remove the original aortic valve along with the TAVR valve and sew in a new surgical one. While uncommon, the explant is the fastest-growing type of cardiac surgery, by rate, in the U.S. today. It is a riskier operation than regular surgical aortic valve replacement.

 

Leon Schiefer, then 55, learned after a 2019 physical that he had severe aortic stenosis. He was told he would need to get his valve replaced to keep working as a long-haul trucker. A cardiologist and a cardiac surgeon near his home in Vassar, Mich., both recommended a TAVR to Schiefer. The recovery from open heart surgery would be long, particularly for Schiefer, who weighed about 350 pounds, the surgeon told him. Reluctant to miss months of work, he agreed. He had the TAVR on a Monday and was back driving Friday.

 

This past December, Schiefer began feeling short of breath and tight in his chest. By January he was in such bad shape he had to stop trucking.

 

He later sought care at University of Michigan Health, where doctors found his heart barely pumping. They told him he had days to live.

 

Schiefer's TAVR valve had become calcified and was leaking, leading Schiefer to heart failure, said Dr. Robert Hawkins, the cardiac surgeon who operated on him in February. He replaced it and fixed three blocked arteries.

 

Schiefer, who has lost 90 pounds since he had the TAVR, still has heart failure, but said he is improving. He feels he might have been better off if he had initially had surgery instead of the TAVR. "I wouldn't have got in the boat I got in," he said.

 

The number of TAVR explants has grown more steeply than cardiac specialists anticipated and is likely to continue to grow as more younger people get TAVRs, said Dr. Shinichi Fukuhara, a cardiac surgeon at University of Michigan Health who has performed more than 175 TAVR explants. "Almost nobody's thinking about what's next," he said.

 

Valve makers and many doctors say when a TAVR valve wears out, a new valve can be inserted inside the old one, like Russian nesting dolls. The number of patients undergoing a second TAVR procedure after their valve wears out is also growing. But little is known about how long second TAVR valves last, and the nesting doll procedure doesn't work for everyone.

 

Edwards Lifesciences, Medtronic and other companies have built TAVR into a global business with $7.02 billion in 2025 revenue, according to Wells Fargo Securities.

 

Last year the FDA approved use of Edwards TAVR valves for patients with severe aortic stenosis but no symptoms. Edwards, the market leader, and Medtronic are exploring use of TAVR for patients with moderate aortic stenosis. Both companies fund physician education and clinical trials, including trials that are currently following valves in low-risk patients for 10 years.

 

"Our goal is to follow the science to see how we can help patients," said Dr. Todd Brinton, Edwards' chief scientific officer. Patients under 65 who get TAVR are a small subset and include people with other health conditions that make surgery not possible or optimal, he said. All patients who get a TAVR, including those under 65, are approved by a "heart team" of cardiac specialists, he said.

 

The real challenge, he said, is "the larger number of people over 65 who are at higher risk and not receiving diagnosis or treatment." Edwards is supporting an initiative by the American Heart Association to improve diagnosis and appropriate treatment, he added.

 

TAVR technology is advancing, with companies developing newer valves to improve long-term outcomes, according to Medtronic and Edwards, which both also make the traditional surgical valves.

 

Jane Booth was diagnosed with severe aortic stenosis at 62. A cardiac surgeon near her home in Orange County, Calif., recommended surgery and warned that the recovery would be long. "He said, you're going to get depressed," recalled Booth, who walks 5 to 6 miles a day and plays tennis.

 

She learned about the possibility of a TAVR at Cedars-Sinai in Los Angeles. She had the procedure there at age 65, when insurance covered it.

 

Her recovery was quick. As for replacing the TAVR valve when it wears out, "I'm just not worrying about it," said Booth, now 69. "The technology continues advancing."

 

As more young or low-risk patients get valve replacements, doctors need to plan for further procedures, either surgery or TAVR, over the patient's lifetime, said Dr. Kendra Grubb, chief medical officer of Medtronic's structural heart business. As a practicing cardiac surgeon, she encountered younger patients who would "come in and say, 'I'm here for a TAVR,'" she said. But "surgery is still the gold standard for those who are under 65," Grubb said.

 

Glannan, who lives in Odessa, Fla., was relieved when a heart team recommended a TAVR in 2021, when she was 60. With a busy job and her dog, the last thing she wanted was to be knocked off her feet for months.

 

After the procedure, she said she felt "maybe 30 to 40 percent better." Follow-up tests every year showed a leak around the valve. In 2024, she started having dizzy spells. She had to rest on the stairs in her three-story townhouse.

 

In spring 2025, her doctors recommended a second TAVR. In November 2025, a new primary care doctor told Glannan her heart sounded like "a bunch of galloping horses," Glannan recalled. Soon after, she was admitted to the hospital because she thought she was having a heart attack.

 

Dr. Eric Wherley saw Glannan needed a new valve. A test showed calcium deposits peppering the leaflets of the TAVR valve, making it hard for it to open and close. Blood was leaking around it. Wherley, a cardiothoracic surgeon with the AdventHealth Pepin Heart Institute in Tampa, Fla., and a team of specialists felt that taking out the TAVR valve and sewing in a surgical one was the best option. He hoped a surgical valve would last longer and then Glannan could have a TAVR when it wore out.

 

She had the surgery in December. Now, Glannan is back working as a business development executive and walking her dog. The dizziness is gone and she has more energy.

 

"I thought I asked all the right questions and dug deep enough," she said of getting a TAVR. "But with the heart, your defense goes down."” [1]

 

1. Breakthrough Heart Procedure Comes With Risky Tradeoffs --- New valves save lives, but don't last as long as some hoped. McKay, Betsy. Wall Street Journal, Eastern edition; New York, N.Y. 25 Apr 2026: A1.

Tycoon capitalism

 

“DARIO, DEMIS, Elon, Mark and Sam. The five most important people in artificial intelligence are so famous that first names alone are enough to identify them. Politicians and journalists hang on their every word. ChatGPT, run by Sam Altman’s OpenAI, has more than 900m weekly users. Dario Amodei’s Anthropic has developed an AI model so good at hacking it has caused panic among policymakers. Demis Hassabis, head of Google’s AI efforts, has won a Nobel prize for his scientific research. Elon Musk, who runs xAI, among other businesses, is the richest person alive. Mark Zuckerberg’s Meta has created the West’s most popular family of open-source models, and is spending enormous sums on AI researchers in an attempt to catch up to the technology’s frontier.

 

In a very real sense, these five men hold the fate of Western civilisation in their hands. Already the American military uses their AI tools, with some of the tycoons (Mr Altman and Mr Musk) showing more enthusiasm for this than others (Mr Amodei). Some economists believe that AI will eventually supercharge economic growth. Others say it will put millions out of work. Plenty of people fret that it might end humanity altogether. Not since the splitting of the atom has a new technology created such angst.

 

It is unnerving that so few men wield such awesome power, particularly men as opportunistic as Mr Altman or as volatile as Mr Musk. But it is hardly unprecedented. AI’s famous five are but the latest example of a common phenomenon in the history of Western capitalism.

 

There are many examples where a small cluster of men has pushed new technologies forward—not necessarily by inventing them, but by bringing them to the masses. In the process, they have accrued enormous power.

 

These technologies have shaped how everyone else lives. Railways helped people move farther and faster than ever before. Oil provided the energy for industrial capitalism. Steel made it easier to build taller buildings. Automobiles helped create mass consumerism. Retail banking gave the world credit. The internet monopolised humanity’s attention. All of these technologies made the world richer. They also upturned social norms.

 

You might think that tycoons are overrated, or worse. Technological progress is the result of the actions of millions of people. No single person invented steel or developed the internet, for instance. A handful of people monopolise the returns from these collective efforts. Popular anger at the uber-rich stems from the belief that, at best, they were in the right place at the right time—and, at worst, that they are leeching off the rest of society. Every billionaire is a policy failure, runs the slogan.

 

This is an uncharitable conclusion. History shows that time and again tycoons have played the decisive role in spreading new technologies to the mass market. They are a necessary condition of innovation. A paper published in 2023 by Shari Eli of the University of Toronto and colleagues finds that Ford’s development of the Model T, a car first launched in 1908 that was far cheaper than any before it, largely explains why Americans were the first to widely adopt automobiles.

 

A paper from last year by Ufuk Akcigit of the University of Chicago and co-authors points to the crucial role of so-called “transformative entrepreneurs” in turning inventions into long-run economic growth. In short, prosperity requires tycoons.

 

To understand how the AI magnates compare with business titans through history, The Economist examined 11 technological waves in America over the past 150 years, from railways to the internet. For each, we picked the top five people responsible for the control, distribution and popularisation of that technology.

 

We quantified the power of each by looking at the revenue, employment and market value of their companies at their peak, as well as a subjective assessment of the degree of corporate control held by the tycoon, along with their personal wealth. We consulted books and historical datasets, alongside figures from Forbes, which began tracking the fortunes of the very rich in 1918. The measures were standardised based on the most relevant benchmark, such as GDP or population at the time. For many earlier tycoons, data were poor; fortunes, for example, were often disguised. What follows therefore represents only our best estimate.

 

Riches alone would not capture the full extent of a tycoon’s power. At his peak the wealth of John D. Rockefeller, founder of Standard Oil, was equivalent to around 1.5% of American GDP. Mr Musk may be richer still, depending on how his wealth is calculated. By our ranking, however, Henry Ford is the most powerful mogul America has seen so far.

 

Visible hands

 

Ford was fabulously rich. We estimate that, at his peak, he held assets worth well over 1% of American GDP. His sprawling estate near his company’s headquarters in Dearborn, Michigan, is beautiful. Rockefeller was richer still, but employed far fewer people: during Ford’s tenure his car company was truly enormous, employing about 0.15% of the American population in 1925. Ford also exercised almost complete control over the firm. After buying out minority shareholders in 1919 his family owned the business in its entirety.

 

No other tycoon, moreover, has done so much to alter society. Ford’s Model T was revolutionary because it was produced at mass scale and aimed at the mass market. In 1917 more than 40% of cars on America’s roads were Model Ts. Ford’s workers were paid enough—the famous $5-a-day wage—to purchase the vehicles that his factories created.

 

You can hardly turn a corner in Dearborn today without encountering the man’s legacy: from the Henry Ford Medical Centre to the numerous roads that are named after members of the family.

 

Most of the other titans in our top ten—among them Cornelius Vanderbilt (a railway magnate), Andrew Carnegie (a steel tycoon) and Alfred P. Sloan (a former boss of General Motors)—died long ago. But two living moguls make the cut. One is Jeff Bezos, the founder of Amazon, who comes fourth in our ranking. Amazon employs over 1m Americans and is worth $2.7trn. Then there is Mr Musk, at number eight, though his elevated rank is more a reflection of his success in carmaking (Tesla) and rocketry (SpaceX) than AI. Not far behind him, in 11th place, is Mr Zuckerberg, which is likewise a result more of Meta’s dominance over social media than its position in AI.

 

By contrast, Mr Altman, Mr Amodei and Sir Demis, whose power is more directly tied to AI, all fall in the bottom half of our ranking. Model-making relies on a small number of clever people and oodles of computing power, meaning that the labs these men run have relatively few workers. None of the three, moreover, enjoys the kind of corporate control held by Ford or Vanderbilt. Mr Altman runs OpenAI at the pleasure of his board (which briefly ousted him in November 2023, though it was subsequently purged). Mr Amodei owns only a small stake in the lab he co-founded. And Sir Demis is not even the most senior employee at his company.

 

In fairness, the technology they wield, unlike the others on our list, is still only in its infancy. Few tycoons of the past had the same potential to shape the direction of numerous industries, from entertainment to defence. And it may be many years until the moguls behind AI reach the apex of their power. In 1913, ten years after it was founded, Ford Motor Company was making an annual profit of roughly $1bn in today’s money. OpenAI, which recently reached the same age, is still a long way from making any profits whatsoever.

 

Power laws

 

Studying tycoons through history also reveals three important commonalities. The first is that many were deeply strange. Ford was odd in a bad way, with his paper, the Dearborn Independent, spreading antisemitic poison. Rockefeller was odd in a better way, obsessing over how to save money even as he became fabulously rich. Vanderbilt liaised with spirits from the nether world; John Pierpont Morgan, a banking titan, consulted astrologers. Thomas Edison, an electricity pioneer, was fanatically opposed to sleep. Steve Jobs, founder of Apple, practised extreme diets. With this in mind, Mr Musk’s conspiracy theories or Mr Zuckerberg’s robotic demeanour do not seem so out of the ordinary.

 

The second commonality is that, as these tycoons popularised new technologies, they introduced new dangers. Some of these were perceived as threats to life and limb. In the early days of railways many scientists worried that humans were biologically incapable of travelling at high speeds. Aviation was highly unsafe at first. So was drilling for oil. Cars killed pedestrians and occupants alike. The contest between Edison’s direct current and George Westinghouse’s alternating current generated a public-safety panic; Edison’s men staged gruesome public electrocutions of animals to persuade Americans that his rival’s technology was lethal.

 

Other risks were financial. Over-investment in railways helped cause repeated market crashes in the 19th century. A bigger banking system spread credit but magnified financial crises. And many of these new technologies automated jobs, putting people on the economic scrapheap. Railways and cars crushed horse-based locomotion. Electrification removed the mechanical constraints that had prevented automation in manufacturing.

 

The third commonality concerns relations between magnates and the state. The tycoons of the 19th century undoubtedly had more latitude than their modern counterparts: more scope to control markets; more ability to discipline labour; more opportunities for cronyism. Carnegie violently suppressed labour unrest. Morgan held so much sway over the financial system that during a market meltdown in 1907 he personally functioned as America’s central bank. Andrew Mellon, another magnate on our list, served as treasury secretary while continuing to steer one of America’s largest industrial empires.

 

Yet from the 20th century onwards, governments curbed many of the earlier tycoons’ worst excesses. In 1911 the Supreme Court ordered the breakup of Standard Oil into 34 independent companies after ruling it had violated antitrust law. In part to avoid another Morgan-style bail-out, in 1913 Congress created the Federal Reserve. Reforms in the 1930s made it harder for magnates to control vast holding companies. In 2000 a judge ordered the breakup of Microsoft for unlawful monopolisation (the software giant narrowly escaped dismemberment on appeal, but was chastened nevertheless). As AI transforms the economy and society, the people behind it may likewise encounter governments that wish to curb their power.

 

In theory, capitalism tends to be presented as impersonal and decentralised. In practice, however, its most important phases are often driven forward by individuals.

 

Time and again, towering, quasi-autocratic figures have gained control over large swathes of the economy.

 

The men currently propelling AI may not necessarily be among their number. But if history is any guide, a Rockefeller or Ford is likely to emerge soon enough.” [1]

 

Which person in Chinese AI could become the Ford of AI applications in the economy?

 

Kai-Fu Lee is the most prominent figure positioned to be the "Ford of AI application" in China, focused on transforming AI from research into industrial applications (manufacturing, supply chains) to drive the real economy. As CEO of Sinovation Ventures, he has pivoted toward GenAI applications in finance.

 

Key figures driving industrial AI adoption in China include:

 

    Kai-Fu Lee (Sinovation Ventures): A prominent voice arguing that China's strength lies in AI implementation, specifically in improving manufacturing and supply chain efficiency, making AI accessible for practical economic use.

    Zhang Yaqin (Tsinghua University): As dean of the Institute for AI Industry Research, he is critical in bridging the gap between cutting-edge research and industrial application.

    Robin Li (Baidu): As leader of one of China’s top "national champions," he oversees major investments in autonomous driving and enterprise-focused AI, crucial to integrating AI into infrastructure.

    Qwen Team (Alibaba): The developers behind Alibaba's Qwen models have created the largest open-source model ecosystem, driving massive adoption of Chinese AI tools across the economy.

 

China's 2025–2030 strategy emphasizes adopting AI in 90% of its economy, aiming for pervasive, affordable AI application over groundbreaking, expensive foundational models.

 

1. Tycoon capitalism. The Economist; London Vol. 459, Iss. 9495 (Apr 18, 2026): 69, 70, 71.