In June 2000, with all the pomp of a great occasion, US President Bill Clinton, together with British Prime Minister Tony Blair, announced the completion of the sequencing of the human genome, the genetic blueprint of human beings. A revolution was expected in our knowledge of the genetic basis of the biological traits that define us, including diseases. What has happened since that spectacle? Where does personalized medicine stand? And what does mathematics have to do with all of this?
The strategy seemed clear. Until then, only a few genes were known to influence certain aspects of our biology; having the entire genome available would make it possible to extend this knowledge to cases where traits are determined by many genes. The former are the so-called simple traits and the latter are complex traits. Correspondingly, diseases determined by a few genes (or a single gene) are known as Mendelian (e.g., cystic fibrosis), and diseases associated with many genes (e.g., hypertension) as non-Mendelian.
However, the sequence of the same gene can differ from person to person, and this also modifies their traits (height, susceptibility to hypertension, etc.). The ideal, then, is not simply to match genes to traits, but to associate particular sequences (variants of the same gene) with the magnitude of those traits. If we establish this relationship, we achieve two goals. The first is a better understanding of the biological basis of the trait. The second is the ability to predict it in individuals who carry the specific sequence that was identified. Both will contribute to the development of personalized medicine.
However, even though sequencing has become much cheaper (in 2000 it cost about $300 million; today, about $1,000), sequencing the genomes of many people, which is necessary to establish a link between sequence and trait, remains complex. GWAS experiments (genome-wide association studies) provide an alternative: sequencing only the regions of the genome where the most common type of genetic variation appears. These regions are called SNPs (single nucleotide polymorphisms) and consist of a single nucleotide, the basic building block of the genome, which can take one of four states, abbreviated G, A, T and C.
Variation in a SNP does not have to be the cause of the presence or modification of the corresponding biological trait. In most cases, SNPs act as "signals" for the presence of the genetic variants that are the true causes and that lie physically close to them in the genome. This is due to the "linkage" that exists between physically close sequences in the human genome, known as linkage disequilibrium.
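To make the idea of a SNP acting as a "signal" more concrete, here is a minimal sketch of one common way to quantify linkage disequilibrium between two positions: the squared correlation r² between alleles, computed from the coefficient D = p_AB − p_A·p_B. The haplotype data and the names `haplotypes` and `ld_r2` are hypothetical and chosen only for illustration; they do not come from any real study.

```python
def ld_r2(haplotypes):
    """Squared allelic correlation (r^2) between two biallelic sites.

    `haplotypes` is a list of (a, b) pairs, where a and b are 0 or 1 and
    indicate whether the chromosome carries the reference allele at
    site 1 and site 2 respectively.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1 at site 1
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1 at site 2
    p_ab = sum(a * b for a, b in haplotypes) / n   # frequency of carrying both
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # one of the sites does not vary, so r^2 is undefined
    return d * d / denom


# Hypothetical toy sample of eight chromosomes (assumption, not real data).
haplotypes = [(1, 1), (1, 1), (1, 1), (0, 0),
              (0, 0), (0, 0), (1, 0), (0, 1)]
r2 = ld_r2(haplotypes)
print(f"r^2 = {r2:.2f}")
```

A value of r² close to 1 means that knowing the allele at one site almost determines the allele at the other, which is why a measured SNP can stand in for a nearby causal variant that was not measured.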
Using methodologies designed specifically to examine these regions, which are cheaper and simpler to analyze, about one million SNPs can be assessed per individual. However, the first studies were unable to identify groups of SNPs associated with the variation found in complex traits or in susceptibility to non-Mendelian diseases. To our surprise, it appears that most aspects of human biology are determined by many more SNPs, each with a much weaker effect, than we expected. These SNPs also appear to be distributed throughout the genome.
On the other hand, between the genome sequence and the manifestation of a biological trait there are intermediate levels of molecular activity that modulate the eventual expression of that trait, further complicating our understanding of the relationship. This is known as the problem of the genotype-phenotype map.
This is where mathematics comes in. The development of quantitative methods allows for a better understanding of the relationship between sequence and biological trait, incorporating information from the molecular and cellular context in the form of genetic networks. For example, these techniques make it possible to identify polymorphisms whose variation is strongly correlated with susceptibility to a disease. Among these tools we find simple regression models and more complex methodologies that include Bayesian estimation and, more recently, deep neural networks and causal inference.
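As an illustration of the simplest of these tools, the following sketch fits an ordinary least-squares line of a trait against the allele count (0, 1 or 2) of each SNP, one SNP at a time, and ranks the SNPs by the fraction of trait variance explained. The SNP names, genotypes and trait values are invented for the example, and a real association study would also need significance tests, correction for multiple comparisons and control for population structure, none of which is shown here.

```python
def slope_and_r2(x, y):
    """Least-squares slope of y on x and the coefficient of determination."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, 0.0  # no variation, so no association can be estimated
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Hypothetical toy data: allele counts for six individuals at three SNPs.
genotypes = {
    "snp_a": [0, 0, 1, 1, 2, 2],
    "snp_b": [0, 1, 2, 0, 1, 2],
    "snp_c": [1, 1, 1, 1, 1, 1],  # same genotype in everyone
}
# Hypothetical quantitative trait measured in the same six individuals.
trait = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

results = {name: slope_and_r2(g, trait) for name, g in genotypes.items()}
best = max(results, key=lambda name: results[name][1])

for name, (slope, r2) in results.items():
    print(f"{name}: slope={slope:.2f}, r2={r2:.2f}")
print("strongest association:", best)
```

In this toy data set the trait rises by roughly one unit per copy of the allele at `snp_a`, so that SNP comes out on top; `snp_c` does not vary between individuals and therefore carries no information.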
As for the second goal we mentioned, prediction, mathematics is used to develop systems that predict the value of a particular trait from an individual's sequence information. To this end, all the information on the available SNPs, weighted by the strength of their effect, is aggregated into so-called polygenic risk predictors. As their predictive capacity improves, many propose using them as independent biomarkers and to assess the severity of patients' conditions. However, they also have limitations: our understanding of how these predictors work is very limited, given the intricate nature of the genotype-phenotype map described above. Moreover, their development depends on the specific population under study (and on environment-dependent interactions between genes), so they are difficult to generalize.
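In its most basic form, the aggregation described above is a weighted sum: each SNP contributes its allele count multiplied by an estimated effect size. The sketch below shows that calculation under simplifying assumptions (additive effects, no interactions between genes or with the environment). The effect sizes and genotypes are hypothetical numbers, not estimates from any study.

```python
def polygenic_score(allele_counts, effect_sizes):
    """Weighted sum of allele counts (0, 1 or 2 per SNP)."""
    if len(allele_counts) != len(effect_sizes):
        raise ValueError("one effect size is needed per SNP")
    return sum(count * beta for count, beta in zip(allele_counts, effect_sizes))


# Hypothetical per-SNP effect sizes, as would be estimated in an association study.
effect_sizes = [0.5, -0.25, 0.125]

# Hypothetical individual: two copies at the first SNP, one at the second, none at the third.
individual = [2, 1, 0]

score = polygenic_score(individual, effect_sizes)
print(score)  # 0.75
```

Because the effect sizes are estimated in one particular population, the same weights applied to individuals from a different population may predict poorly, which is the generalization problem noted above.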
Thus, polygenic risk predictors are another example, this time in genomics, of the challenges faced by other disciplines whose goal is quantitative prediction based on so-called big data. These tools do their job, but we do not really know why. Warren Weaver, one of the pioneers of information theory, predicted in his writings on science and complexity (from 1947) that this kind of challenge, which he called "organized complexity", would dominate science and technology in the future. Exploring it in depth with the help of mathematics will undoubtedly shape the progress and success we hope for in personalized medicine, but we must always bear in mind the inescapable limitations imposed by complexity.
Juan F. Poyatos directs the Logic of Genomic Systems Laboratory at the National Center for Biotechnology, which is part of the LifeHUB connection of the Spanish National Research Council (CSIC), and is a visiting researcher at ICMAT.
Coffee and Theorems is a section dedicated to mathematics and the environment in which it is created, coordinated by the Institute of Mathematical Sciences (ICMAT), in which researchers and members of the center describe the latest developments in the discipline, share points of contact between mathematics and other social and cultural expressions, and remember those who shaped its development and knew how to turn coffee into theorems. The name evokes the definition given by the Hungarian mathematician Alfréd Rényi: "A mathematician is a machine that turns coffee into theorems."
Editing and Formatting: Ágata A. Timón G Longoria (ICMAT).