Oral Presentation 44th Lorne Genome Conference 2023

Computing spatial mutation intolerance to assess selective pressure across the human proteome (#38)

Aaron Kovacs 1, Michael Silk 2,3, Carlos Rodrigues 1,4, Stephanie Portelli 1,3,4, David Ascher 1,3,4
  1. School of Chemistry & Molecular Biosciences, The University of Queensland, Brisbane, Queensland, Australia
  2. Centre for Population Genomics, Murdoch Children's Research Institute, Melbourne, Victoria, Australia
  3. Systems and Computational Biology, Bio21 Institute, The University of Melbourne, Melbourne, Victoria, Australia
  4. Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia

Advances in genomic sequencing, despite their initial promise, have yet to catalyse routine personalised treatment approaches. A given individual carries millions of benign SNPs, so identifying the few key pathogenic variants in a patient is a challenging task. Current state-of-the-art computational methods rely largely on evolutionary conservation measures, which can effectively identify important functional and interaction loci but cannot reliably identify pathogenic variants. Output from these tools is thus regarded only as weak evidence in the clinical setting.

To address this, we have leveraged genomic sequencing information to develop the missense tolerance ratio (MTR), which identifies genomic regions in humans that are under selective pressure. These regions have been found to be greatly enriched in pathogenic variants. Applying these principles within the 3-dimensional protein structure also proved discriminative; however, the resulting spatial measure of intolerance (MTR3D) was initially limited to experimental protein structures. Utilising recent advances in protein structure prediction, we have now calculated spatial mutational tolerance across the entire human proteome, greatly expanding the predictive scope of the MTR3D score. We are integrating the two scores, along with other known predictors of variant pathogenicity, into a machine learning model, with the aim of producing a predictor of deleteriousness superior to any of the individual features.
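As a rough illustration of how a sequence-based and a spatial intolerance score can share one calculation, the following Python sketch may help. It is not the authors' implementation. It assumes the commonly described form of the missense tolerance ratio (the observed missense fraction among observed missense and synonymous variants, divided by the same fraction among all possible variants), which this abstract does not itself state. The function names, the 31-residue window, and the 8 Å radius are hypothetical choices made for the example only.

```python
import math


def tolerance_ratio(indices, obs_mis, obs_syn, poss_mis, poss_syn):
    """Observed missense fraction divided by expected missense fraction
    over the residues in `indices` (assumed form of the ratio)."""
    om = sum(obs_mis[i] for i in indices)
    osyn = sum(obs_syn[i] for i in indices)
    pm = sum(poss_mis[i] for i in indices)
    ps = sum(poss_syn[i] for i in indices)
    if om + osyn == 0 or pm == 0:
        return None  # no observed variation or no possible missense: undefined
    observed = om / (om + osyn)
    expected = pm / (pm + ps)
    return observed / expected


def sequence_windows(n, window=31):
    """Neighbourhoods along the sequence: a centred sliding window,
    truncated at the protein termini. Window size is illustrative."""
    half = window // 2
    return [list(range(max(0, i - half), min(n, i + half + 1))) for i in range(n)]


def spatial_neighbourhoods(coords, radius=8.0):
    """Neighbourhoods in 3D: all residues whose representative atom lies
    within `radius` of residue i. Radius is illustrative."""
    return [
        [j for j, other in enumerate(coords) if math.dist(centre, other) <= radius]
        for centre in coords
    ]


def score_residues(neighbourhoods, obs_mis, obs_syn, poss_mis, poss_syn):
    """One score per residue, computed over that residue's neighbourhood."""
    return [
        tolerance_ratio(idx, obs_mis, obs_syn, poss_mis, poss_syn)
        for idx in neighbourhoods
    ]
```

Under these assumptions, the only difference between the two scores is how the neighbourhood of each residue is defined: passing `sequence_windows(n)` gives a sequence-window score, while passing `spatial_neighbourhoods(coords)` with coordinates from an experimental or predicted structure gives a spatial one. Values below 1 would indicate fewer missense variants than expected in that neighbourhood.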

We are now also exploring the intersection between conservation within a population (MTR) and across evolution (traditional measures). Cross-species measures lack sufficient sampling of protein functional space, whereas population-level measures do not, suggesting that MTR would be a more sensitive approach for detecting regions under tight evolutionary control. This work will not only guide the integration of protein structure and spatial intolerance into variant characterisation pipelines, but also provide deeper insight into the inter- and intra-species evolutionary forces driving protein structure-function-phenotype relationships.