Sunday, February 13, 2022

Millions of new microbes: It's raining genes and viruses

"Metagenomics researchers see our environment as a source of undiscovered viruses and germs: how Big Data biologists are genetically rediscovering the world.

 

John Dennehy and his colleagues from the biology department in the borough of Queens don't know exactly what has happened in the New York sewers over the past two years. One thing is clear: the pandemic has produced a lot that is new and unknown here. Sars-CoV-2 strains that had not yet been described were circulating in the wastewater. Between January and June 2021, Dennehy and his team took samples every two weeks. They had been sent underground as outposts after the first, devastating wave of infections. By then it was known that the face of the pandemic virus, which had spread from Wuhan, China, to the east and west coasts of the United States the year before, had long since changed. New variants with a dozen mutations and significantly different properties (higher infectivity, faster replication) had arisen and spread in different places: alpha, beta, gamma, delta, epsilon, kappa. The world had pricked up its ears; even the virologists were surprised by the coronavirus's astonishing evolutionary "drive", its ability to change.

 

Dennehy and his colleagues didn't have to look far for anything unusual either. In the wastewater samples, they used their molecular filters to fish out countless gene snippets that clearly belonged to the RNA virus first described in Wuhan, and yet did not quite match it. Although increasingly frequent sequence analyses had by then revealed many mutations in patients at the New York Covid clinics, there remained a residue that the researchers called "cryptic virus lineages". Laboratory tests designed to examine the function of the unusual gene sequences showed that the coronavirus had evidently expanded its host range considerably in the first year of the pandemic: with the modified spike proteins on its surface, it could infect not only human cells but also cells from rats and mice.

 

In addition, mutations were discovered that also occur in the omicron variant, which was described only many months later and is now dominant. Laboratory viruses equipped with the corresponding surface molecules, so-called pseudoviruses, were resistant to antibodies. A large number of completely mysterious gene snippets whose properties have not yet been clarified were also found in the wastewater samples and described. Whether they come from a previously unknown animal reservoir or from Covid-19 patients whose viruses had slipped through the inevitably incomplete sequencing net remains unanswered in the evaluation that the New York biologists have now published in the journal "Nature Communications".

 

A huge, valuable data bubble

 

Metagenomics researchers are only too familiar with the experiences of the biologists from Queens. Metagenomics is the umbrella term for specialized techniques used to decode the genetic material of organisms from a specific habitat, organisms that are largely unknown because they cannot be grown, cultivated and examined in the laboratory. Global genome surveys on the fly, so to speak. These methods, which emerged in the 1990s, have recently developed at breakneck speed. Huge fleets of equipment and genome databases have been set up, and software tools have been written for identification. "Big Biology": like Big Data in AI research, this is a resource that grows almost daily, a data bubble that expands as a network between laboratories around the world. Peer Bork from the European Molecular Biology Laboratory (EMBL) in Heidelberg is one of the pioneers in this field. When he formulates the goal of metagenomics, he does not think small.

 

The aim should be nothing less than recording as many genes of organisms as possible worldwide, and thus the molecular basis of all life on earth today. A gene project of truly planetary proportions. Perhaps just a utopia. Considering that the complete human reference genome, i.e. one without gaps that could not be sequenced, was published only last year, the ambition of the metagenomics researchers sounds even more utopian. And yet they are currently spreading almost unprecedented optimism, expressed in huge numbers with lots of zeros.

 

A few days ago, an international team of computational biologists, including scientists from Heidelberg and Tübingen, published their database analyses with "Serratus" in "Nature". This is a new, shared data cloud in which 5.7 million sequenced gene snippets from all over the world are stored so far and in which the genetic information can be searched for virus traces using special tools. Serratus is an Open Science project to uncover the planetary virome, freely and openly.

 

10.2 petabases of genetic information have accumulated over the last thirteen years. A petabase is a one followed by fifteen zeros, so the total comes to roughly ten quadrillion gene building blocks, collected on all continents and in the oceans, that Serratus can search.
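For a sense of scale, the figure can be worked out in a few lines. This is only an illustrative sketch of the arithmetic; the variable names are made up for the example, and the yearly average simply assumes the data arrived evenly over the thirteen years mentioned above, which is certainly not how it actually accumulated.

```python
PETA = 10**15  # SI prefix "peta": a one followed by fifteen zeros

# 10.2 petabases, written with integers to avoid floating-point rounding
total_bases = 102 * PETA // 10

years = 13  # collection period stated in the article

print(f"{total_bases:,} bases in total")
print(f"{len(str(total_bases))} digits")
# Assumption for illustration only: an even spread over the thirteen years
print(f"about {total_bases / years / PETA:.2f} petabases per year on average")
```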

 

Within a few days and with more than 20,000 computer processors working in parallel, the bioinformaticians discovered no fewer than 131,957 new RNA viruses in this jumble. Until now, only 15,000 RNA viruses were known, and very few could be cultivated in laboratories. Among the new viruses from the gene library are at least nine previously undescribed coronaviruses.

 

Hundreds of thousands of variations of RNA viruses

 

Three short sections of the gene responsible for building the RNA-dependent RNA polymerase served, in a sense, as fishing hooks for the unknown RNA viruses. This is the enzyme that RNA viruses generally need in order to replicate. Even this essential and highly conserved molecule can differ significantly among the extremely versatile RNA viruses. Biologists have identified hundreds of thousands of variations. The group of small hepatitis delta viruses, which was relatively manageable with 13 members up to 2018, was expanded in one fell swoop by what is believed to be more than three hundred variants. However, many of these metagenomic discoveries have yet to be confirmed by further laboratory analysis. And even if that were to succeed for all of these discoveries, the virus researchers would still be far from having exhaustive knowledge of the global "virome", the world empire of viruses.
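The "fishing hook" idea can be illustrated with a toy sketch: look for a few short conserved sections, in order, within each sequenced snippet, and tolerate a small number of differences because even conserved regions vary. Everything here is an assumption made for illustration: the hook sequences and reads are invented, the function names are hypothetical, and the real Serratus analysis relies on far more sophisticated alignment methods than this simple mismatch count.

```python
def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_hook(read: str, hook: str, max_mismatches: int = 1) -> int:
    """Return the first position where hook matches read within the
    allowed number of mismatches, or -1 if there is no such position."""
    for i in range(len(read) - len(hook) + 1):
        if hamming(read[i:i + len(hook)], hook) <= max_mismatches:
            return i
    return -1


def looks_like_rna_virus(read: str, hooks: list[str], max_mismatches: int = 1) -> bool:
    """True if all hooks occur in the read, in order and without overlap."""
    start = 0
    for hook in hooks:
        pos = find_hook(read[start:], hook, max_mismatches)
        if pos < 0:
            return False
        start += pos + len(hook)
    return True


# Hypothetical hooks: invented stand-ins for three conserved polymerase sections
HOOKS = ["GATGAC", "TCAGGA", "GGTGAT"]

# Invented reads for the example
reads = {
    "read_a": "AAGATGACCCCTCAGGATTTGGTGATAA",   # all three hooks, exact
    "read_b": "AAGATGTCCCCTCAGGATTTGGTGATAA",   # one mismatch in the first hook
    "read_c": "AAAAAAAACCCCTCAGGATTTGGTGATAA",  # first hook missing
}

hits = [name for name, seq in reads.items() if looks_like_rna_virus(seq, HOOKS)]
exact_hits = [name for name, seq in reads.items()
              if looks_like_rna_virus(seq, HOOKS, max_mismatches=0)]
print(hits)
print(exact_hits)
```

Allowing a mismatch is what lets the hooks catch variants that differ slightly from anything already known; setting `max_mismatches=0` would only rediscover exact copies.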

 

So far, science has probably studied little more than 0.001 percent of earthly viruses in more detail.

 

The situation is not much different with the tiny creatures that Peer Bork and many of his metagenomics colleagues have their eyes on worldwide: the bacteria. Shortly after the human genome became public in its first rough version, Bork and his colleagues traveled practically all over the world to scan the environment for bacterial genetic material. In the years that followed, the intestinal flora of thousands of people was analyzed, and the individual microbial fingerprints of more and more volunteers were examined to see how they changed in the event of illness or under the influence of medication. Logistical and digital mammoth projects, all of which not only deal with very practical medical questions, but also advance the overall genetic view of the planet ever faster.

 

In a new preprint published on “bioRxiv” by the biotechnologist Jaime Huerta-Cepas from Madrid together with Bork and scientists from Berlin to Shanghai, the known gene repertoire of bacteria multiplies at a stroke. The researchers have identified nearly 400 million genes in five large metagenome databases covering germs from 82 “habitats”, from the human gut and vagina to marine and sewage samples. Of course, most of these merely reflect thousands of variations on the same gene. Nevertheless, the gene discoveries in previously uncultivable bacteria greatly expand the known catalog of nature's genes.

 

In any case, Bork and his colleagues have created their own computer database for keeping track of bacterial gene diversity on earth: "Global Microbial Gene Catalog v1.0".

 

No one knows what all these mega gene projects, which laypeople cannot possibly keep track of, will ultimately be good for. One idea is that the spread of antibiotic-resistant bacteria, for example, can be tracked and problematic germs identified at an early stage. The same goes for viruses. However, it cannot be predicted with certainty whether scientists will ever actually be able to use mass sequencing to identify at an early stage, or even predict, the emergence of infectious variants or of those with the potential to escape the immune system. In any case, the computer tools and data troves of metagenomics could be suited to detecting threatening signals in the environment earlier."

 

 


 
