A few months after the COVID-19 pandemic was declared, in early 2020, scientists sequenced the genome of the SARS-CoV-2 coronavirus, but many of its protein-coding genes remained unknown. Now, a comparative genomics study has produced a more accurate and complete genetic map of the virus.
Conducted by researchers from the Massachusetts Institute of Technology (MIT) and published this Tuesday in the journal Nature Communications, the study confirmed several protein-coding genes and found that others, which had been suggested as genes, do not code for any protein.
“We were able to use this robust comparative genomics approach, based on evolutionary signatures, to reveal the true functional protein-coding content of this very important genome,” says Manolis Kellis, senior author of the study, professor of computer science at MIT and a member of the Broad Institute of MIT and Harvard.
In the second part of the study, the research team also looked at about 2,000 mutations that have appeared in SARS-CoV-2 since the outbreak began, which allowed them to assess how significant those mutations are and whether they could help the virus evade the immune system or become more contagious.
Photo: Hannah A. Bullock, Azaibi Tamin / CDC via AP, file
It was already known that the SARS-CoV-2 genome, with almost 30,000 RNA bases, contains several regions that code for proteins, along with other regions that were suspected of being genes but had not been definitively classified.
To determine which parts of the SARS-CoV-2 genome actually contain genes, the researchers turned to comparative genomics. They compared SARS-CoV-2 (which belongs to a subgenus of viruses called Sarbecovirus, most of which infect bats) with SARS-CoV (which caused the 2003 SARS outbreak) and 42 strains of bat sarbecoviruses.
In this way, they confirmed six protein-coding genes in the SARS-CoV-2 genome, in addition to the five genes that are well established in all coronaviruses.
They also determined that the region encoding a gene called ORF3a encodes an additional gene as well, ORF3c, whose RNA bases overlap with those of ORF3a but are read in a different reading frame. Such overlapping genes are rare in large genomes but common in many viruses. The function of ORF3c in SARS-CoV-2 is not yet known.
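To make the idea of overlapping reading frames concrete, here is a minimal Python sketch. It is not the study's method, which relies on evolutionary signatures across related genomes. It only scans an invented toy sequence (not taken from SARS-CoV-2, and written in the DNA alphabet for convenience) for simple start-to-stop open reading frames in the three forward frames, then reports pairs that share bases but sit in different frames. The names `find_orfs`, `overlapping_pairs` and `TOY` are hypothetical.

```python
# Standard stop codons, written in the DNA alphabet (T instead of U).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, frame) for each ATG...stop stretch in the
    three forward reading frames. `end` is exclusive and includes the stop."""
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence three bases at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and keep scanning
    return orfs


def overlapping_pairs(orfs):
    """Return pairs of ORFs that share bases but use different frames."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            s1, e1, f1 = orfs[a]
            s2, e2, f2 = orfs[b]
            if f1 != f2 and s1 < e2 and s2 < e1:
                pairs.append((orfs[a], orfs[b]))
    return pairs


# Invented toy sequence: a long ORF in frame 0 with a short ORF
# nested inside it in frame 1.
TOY = "ATGCATGGCTTAACCGTATGA"

orfs = find_orfs(TOY)
print(orfs)
print(overlapping_pairs(orfs))
```

In the toy sequence, the same stretch of bases is read as two different series of codons depending on where reading starts, which is the situation described for ORF3a and ORF3c. Finding such a stretch is only a first step: as the study stresses, an open reading frame is not necessarily a functional gene.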
The researchers also showed that five other regions that had been proposed as possible genes do not code for functional proteins, and they ruled out the possibility that other protein-coding genes remain to be discovered.
Moreover, the authors found that many earlier studies used not only incorrect sets of genes but also, at times, contradictory names. In a parallel article recently published in the Journal of Virology, they therefore provided recommendations for naming SARS-CoV-2 genes.
In the study, the researchers also analyzed more than 1,800 mutations that have arisen in SARS-CoV-2 and found that, in most cases, genes that were evolving rapidly before the pandemic have continued to do so, while those that tended to evolve slowly have maintained that trend.
They also analyzed the mutations that arose in variants of concern, such as the British, Brazilian and South African variants, and found that many of the mutations that make these variants more dangerous are in the spike protein, where they help the virus spread faster and evade the immune system.
However, each of these variants has “more than 20 other mutations, and it is important to know which ones can do something and which ones cannot,” says Irwin Jungreis, lead author of the study and a researcher at MIT.
For the authors, these data could help other scientists focus their attention on the mutations that seem to have the most important effects on the virus's infectivity.