1.
PLoS Comput Biol ; 20(5): e1012132, 2024 May.
Article in English | MEDLINE | ID: mdl-38805561

ABSTRACT

Accurate models describing the relationship between genotype and phenotype are necessary in order to understand and predict how mutations to biological sequences affect the fitness and evolution of living organisms. The apparent abundance of epistasis (genetic interactions), both between and within genes, complicates this task and how to build mechanistic models that incorporate epistatic coefficients (genetic interaction terms) is an open question. The Walsh-Hadamard transform represents a rigorous computational framework for calculating and modeling epistatic interactions at the level of individual genotypic values (known as genetical, biological or physiological epistasis), and can therefore be used to address fundamental questions related to sequence-to-function encodings. However, one of its main limitations is that it can only accommodate two alleles (amino acid or nucleotide states) per sequence position. In this paper we provide an extension of the Walsh-Hadamard transform that allows the calculation and modeling of background-averaged epistasis (also known as ensemble epistasis) in genetic landscapes with an arbitrary number of states per position (20 for amino acids, 4 for nucleotides, etc.). We also provide a recursive formula for the inverse matrix and then derive formulae to directly extract any element of either matrix without having to rely on the computationally intensive task of constructing or inverting large matrices. Finally, we demonstrate the utility of our theory by using it to model epistasis within both simulated and empirical multiallelic fitness landscapes, revealing that both pairwise and higher-order genetic interactions are enriched between physically interacting positions.
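
As a rough illustration of the biallelic starting point described in the abstract above, the sketch below computes background-averaged epistatic coefficients for a landscape with two states per position using a fast Walsh-Hadamard transform. It does not reproduce the paper's multiallelic extension or its formulae. The function names, the genotype ordering (phenotypes indexed by a binary genotype mask) and the normalisation (each coefficient averaged over all backgrounds, wild type as reference) are assumptions made for this example.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform of a list of length 2**n."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def background_averaged_epistasis(phenotypes):
    """Return one coefficient per subset of sites (subset encoded as a bitmask).

    phenotypes[g] is the phenotype of genotype g, where bit k of g is 1
    if site k carries the mutant allele (assumed ordering).
    """
    n_sites = len(phenotypes).bit_length() - 1
    transformed = fwht(phenotypes)
    coeffs = []
    for mask, value in enumerate(transformed):
        order = bin(mask).count("1")       # number of sites in the subset
        sign = -1 if order % 2 else 1      # make the mutant state positive
        # Average over the 2**(n_sites - order) genetic backgrounds.
        coeffs.append(sign * value / 2 ** (n_sites - order))
    return coeffs


# Hypothetical two-site landscape: genotypes 00, 01, 10, 11.
example = background_averaged_epistasis([0.0, 1.0, 2.0, 5.0])
```

In this toy landscape the last coefficient is the pairwise term (5 - 2 - 1 + 0), and a purely additive landscape such as `[0, 1, 2, 3]` gives a pairwise term of zero.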


Subject(s)
Epistasis, Genetic , Models, Genetic , Epistasis, Genetic/genetics , Computational Biology/methods , Algorithms , Mutation/genetics , Genotype
2.
ArXiv ; 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38699161

ABSTRACT

Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs. Emphasising open-source availability, transparent methodologies, clear variant effect score interpretations, standardised scales, accessible predictions, and rigorous training data disclosure, we aim to improve the usability and interpretability of VEPs, and promote their integration into analysis and evaluation pipelines. We also provide a large, categorised list of currently available VEPs, aiming to facilitate the discovery and encourage the usage of novel methods within the scientific community.

3.
bioRxiv ; 2024 Apr 25.
Article in English | MEDLINE | ID: mdl-38712134

ABSTRACT

Thousands of human proteins function by binding short linear motifs embedded in intrinsically disordered regions. How affinity and specificity are encoded in these binding domains and the motifs themselves is not well understood. The evolvability of binding specificity - how rapidly and extensively it can change upon mutation - is also largely unexplored, as is the contribution of 'fuzzy' dynamic residues to affinity and specificity in protein-protein interactions. Here we report the first complete map of specificity encoding for a globular protein domain. Quantifying >200,000 energetic interactions between a PDZ domain and its ligand identifies 20 major energetically coupled pairs of sites that control specificity. These are organized into six modules, with most mutations in each module reprogramming specificity for a single position in the ligand. Nine of the major energetic couplings controlling specificity are between structural contacts and 11 have an allosteric mechanism of action. The dynamic tail of the ligand is more robust to mutation than the structured residues but contributes additively to binding affinity and communicates with structured residues to enable changes in specificity. Our results quantify the binding specificities of >1,800 globular proteins to reveal how specificity is encoded and provide a direct comparison of the encoding of affinity and specificity in structured and dynamic molecular recognition.

4.
Nature ; 626(7999): 643-652, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38109937

ABSTRACT

Thousands of proteins have been validated genetically as therapeutic targets for human diseases1. However, very few have been successfully targeted, and many are considered 'undruggable'. This is particularly true for proteins that function via protein-protein interactions: direct inhibition of binding interfaces is difficult and requires the identification of allosteric sites. However, most proteins have no known allosteric sites, and a comprehensive allosteric map does not exist for any protein. Here we address this shortcoming by charting multiple global atlases of inhibitory allosteric communication in KRAS. We quantified the effects of more than 26,000 mutations on the folding of KRAS and its binding to six interaction partners. Genetic interactions in double mutants enabled us to perform biophysical measurements at scale, inferring more than 22,000 causal free energy changes. These energy landscapes quantify how mutations tune the binding specificity of a signalling protein and map the inhibitory allosteric sites for an important therapeutic target. Allosteric propagation is particularly effective across the central β-sheet of KRAS, and multiple surface pockets are genetically validated as allosterically active, including a distal pocket in the C-terminal lobe of the protein. Allosteric mutations typically inhibit binding to all tested effectors, but they can also change the binding specificity, revealing the regulatory, evolutionary and therapeutic potential to tune pathway activation. Using the approach described here, it should be possible to rapidly and comprehensively identify allosteric target sites in many proteins.


Subject(s)
Allosteric Site , Protein Folding , Proto-Oncogene Proteins p21(ras) , Humans , Allosteric Regulation/drug effects , Allosteric Regulation/genetics , Allosteric Site/drug effects , Allosteric Site/genetics , Mutation , Protein Binding , Proto-Oncogene Proteins p21(ras)/antagonists & inhibitors , Proto-Oncogene Proteins p21(ras)/chemistry , Proto-Oncogene Proteins p21(ras)/genetics , Proto-Oncogene Proteins p21(ras)/metabolism , Reproducibility of Results , Substrate Specificity/drug effects , Substrate Specificity/genetics , Thermodynamics
5.
Nat Commun ; 14(1): 5551, 2023 09 09.
Article in English | MEDLINE | ID: mdl-37689712

ABSTRACT

An important challenge in genetics, evolution and biotechnology is to understand and predict how mutations combine to alter phenotypes, including molecular activities, fitness and disease. In diploids, mutations in a gene can combine on the same chromosome or on different chromosomes as a "heteroallelic combination". However, a direct comparison of the extent, sign, and stability of the genetic interactions between variants within and between alleles is lacking. Here we use thermodynamic models of protein folding and ligand-binding to show that interactions between mutations within and between alleles are expected in even very simple biophysical systems. Protein folding alone generates within-allele interactions and a single molecular interaction is sufficient to cause between-allele interactions and dominance. These interactions change differently, quantitatively and qualitatively as a system becomes more complex. Altering the concentration of a ligand can, for example, switch alleles from dominant to recessive. Our results show that intra-molecular epistasis and dominance should be widely expected in even the simplest biological systems but also reinforce the view that they are plastic system properties and so a formidable challenge to predict. Accurate prediction of both intra-molecular epistasis and dominance will require either detailed mechanistic understanding and experimental parameterization or brute-force measurement and learning.
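
The within-allele part of the argument above can be illustrated with a minimal two-state folding model: mutations are assumed to add up in folding free energy, yet they interact at the level of the fraction of folded protein because that phenotype is a sigmoidal function of free energy. This is a sketch under stated assumptions, not the authors' model. The function names, the temperature and the example free energies are hypothetical, and between-allele interactions and dominance are not modelled.

```python
import math

# Gas constant in kcal/(mol*K) times an assumed temperature of 298.15 K.
RT = 0.001987 * 298.15


def fraction_folded(dg_fold):
    """Two-state model: fraction folded for a folding free energy in kcal/mol.

    Negative dg_fold means the folded state is favoured.
    """
    return 1.0 / (1.0 + math.exp(dg_fold / RT))


def pairwise_epistasis(dg_wt, ddg_a, ddg_b):
    """Deviation of the double mutant from additivity on the phenotype scale.

    The free energy changes ddg_a and ddg_b are assumed to be additive.
    """
    wt = fraction_folded(dg_wt)
    single_a = fraction_folded(dg_wt + ddg_a)
    single_b = fraction_folded(dg_wt + ddg_b)
    double = fraction_folded(dg_wt + ddg_a + ddg_b)
    return double - single_a - single_b + wt


# Hypothetical values: a stable wild type and two destabilising mutations.
example = pairwise_epistasis(-2.0, 1.5, 1.5)
```

With these hypothetical values each single mutant stays mostly folded while the double mutant is mostly unfolded, so the interaction term is negative even though the energies combine additively.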


Subject(s)
Epistasis, Genetic , Protein Folding , Alleles , Ligands , Biophysics
7.
PLoS One ; 18(7): e0288158, 2023.
Article in English | MEDLINE | ID: mdl-37418460

ABSTRACT

Multiplexed assays of variant effects (MAVEs) have made possible the functional assessment of all possible mutations to genes and regulatory sequences. A core pillar of the approach is generation of variant libraries, but current methods are either difficult to scale or not uniform enough to enable MAVEs at the scale of gene families or beyond. We present an improved method called Scalable and Uniform Nicking (SUNi) mutagenesis that combines massive scalability with high uniformity to enable cost-effective MAVEs of gene families and eventually genomes.


Subject(s)
Genome , Mutagenesis , Mutation
8.
Nat Commun ; 13(1): 7084, 2022 11 18.
Article in English | MEDLINE | ID: mdl-36400770

ABSTRACT

Multiplexed assays of variant effects (MAVEs) guide clinical variant interpretation and reveal disease mechanisms. To date, MAVEs have focussed on a single mutation type, amino acid (AA) substitutions, despite the diversity of coding variants that cause disease. Here we use Deep Indel Mutagenesis (DIM) to generate a comprehensive atlas of diverse variant effects for a disease protein, the amyloid beta (Aβ) peptide that aggregates in Alzheimer's disease (AD) and is mutated in familial AD (fAD). The atlas identifies known fAD mutations and reveals that many variants beyond substitutions accelerate Aβ aggregation and are likely to be pathogenic. Truncations, substitutions, insertions, single- and internal multi-AA deletions differ in their propensity to enhance or impair aggregation, but likely pathogenic variants from all classes are highly enriched in the polar N-terminal region of Aβ. This comparative atlas highlights the importance of including diverse mutation types in MAVEs and provides important mechanistic insights into amyloid nucleation.


Subject(s)
Alzheimer Disease , Amyloid beta-Peptides , Humans , Alzheimer Disease/metabolism , Amyloid/genetics , Amyloid/metabolism , Amyloid beta-Peptides/metabolism , Mutation, Missense
9.
Nat Commun ; 13(1): 3724, 2022 06 28.
Article in English | MEDLINE | ID: mdl-35764656

ABSTRACT

Somatic mutations are an inevitable component of ageing and the most important cause of cancer. The rates and types of somatic mutation vary across individuals, but relatively few inherited influences on mutation processes are known. We perform a gene-based rare variant association study with diverse mutational processes, using human cancer genomes from over 11,000 individuals of European ancestry. By combining burden and variance tests, we identify 207 associations involving 15 somatic mutational phenotypes and 42 genes that replicated in an independent data set at a false discovery rate of 1%. We associate rare inherited deleterious variants in genes such as MSH3, EXO1, SETD2, and MTOR with two phenotypically different forms of DNA mismatch repair deficiency, and variants in genes such as EXO1, PAXIP1, RIF1, and WRN with deficiency in homologous recombination repair. In addition, we identify associations with other mutational processes, such as APEX1 with APOBEC-signature mutagenesis. Many of the genes interact with each other and with known mutator genes within cellular sub-networks. Considered collectively, damaging variants in the identified genes are prevalent in the population. We suggest that rare germline variation in diverse genes commonly impacts mutational processes in somatic cells.


Subject(s)
Neoplastic Syndromes, Hereditary , Genome, Human/genetics , Germ Cells , Humans , Mutagenesis , Mutation , Neoplastic Syndromes, Hereditary/genetics
10.
Nature ; 604(7904): 175-183, 2022 04.
Article in English | MEDLINE | ID: mdl-35388192

ABSTRACT

Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development1-6. An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here we examine binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering 'edgetic' variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.


Subject(s)
Allosteric Site , PDZ Domains , Proteins , Allosteric Regulation/genetics , PDZ Domains/genetics , Protein Binding/genetics , Proteins/chemistry , Thermodynamics
11.
Nat Commun ; 12(1): 7051, 2021 12 03.
Article in English | MEDLINE | ID: mdl-34862370

ABSTRACT

The classic two-hit model posits that both alleles of a tumor suppressor gene (TSG) must be inactivated to cause cancer. In contrast, for some oncogenes and haploinsufficient TSGs, a single genetic alteration can suffice to increase tumor fitness. Here, by quantifying the interactions between mutations and copy number alterations (CNAs) across 10,000 tumors, we show that many cancer genes actually switch between acting as one-hit or two-hit drivers. Third order genetic interactions identify the causes of some of these switches in dominance and dosage sensitivity as mutations in other genes in the same biological pathway. The correct genetic model for a gene thus depends on the other mutations in a genome, with a second hit in the same gene or an alteration in a different gene in the same pathway sometimes representing alternative evolutionary paths to cancer.


Subject(s)
Carcinogenesis/genetics , Genes, Tumor Suppressor , Models, Genetic , Neoplasms/genetics , Oncogenes , Alleles , DNA Copy Number Variations , Datasets as Topic , Haploinsufficiency , Humans , Mutation
12.
Curr Biol ; 31(19): 4256-4268.e7, 2021 10 11.
Article in English | MEDLINE | ID: mdl-34358445

ABSTRACT

An old and controversial question in biology is whether information perceived by the nervous system of an animal can "cross the Weismann barrier" to alter the phenotypes and fitness of its progeny. Here, we show that such intergenerational transmission of sensory information occurs in the model organism C. elegans, with a major effect on fitness. Specifically, perception of social pheromones by chemosensory neurons controls the post-embryonic timing of the development of one tissue, the germline, relative to others in the progeny of an animal. Neuronal perception of the social environment thus intergenerationally controls the generation time of this animal.


Subject(s)
Caenorhabditis elegans Proteins , Caenorhabditis elegans , Animals , Caenorhabditis elegans/genetics , Caenorhabditis elegans Proteins/genetics , Neurons/physiology , Perception , Social Environment
13.
Elife ; 10, 2021 02 01.
Article in English | MEDLINE | ID: mdl-33522485

ABSTRACT

Plaques of the amyloid beta (Aβ) peptide are a pathological hallmark of Alzheimer's disease (AD), the most common form of dementia. Mutations in Aβ also cause familial forms of AD (fAD). Here, we use deep mutational scanning to quantify the effects of >14,000 mutations on the aggregation of Aβ. The resulting genetic landscape reveals mechanistic insights into fibril nucleation, including the importance of charge and gatekeeper residues in the disordered region outside of the amyloid core in preventing nucleation. Strikingly, unlike computational predictors and previous measurements, the empirical nucleation scores accurately identify all known dominant fAD mutations in Aβ, genetically validating that the mechanism of nucleation in a cell-based assay is likely to be very similar to the mechanism that causes the human disease. These results provide the first comprehensive atlas of how mutations alter the formation of any amyloid fibril and a resource for the interpretation of genetic variation in Aβ.


Alzheimer's disease is the most common form of dementia, affecting more than 50 million people worldwide. Despite more than 400 clinical trials, there are still no effective drugs that can prevent or treat the disease. A common target in Alzheimer's disease trials is a small protein called amyloid beta. Amyloid beta proteins are 'sticky' molecules. In the brains of people with Alzheimer's disease, they join to form first small aggregates and then long chains called fibrils, a process which is toxic to neurons. Specific mutations in the gene for amyloid beta are known to cause rare, aggressive forms of Alzheimer's disease that typically affect people in their fifties or sixties. But these are not the only mutations that can occur in amyloid beta. In principle, any part of the protein could undergo mutation. And given the size of the human population, it is likely that each of these mutations exists in someone alive today. Seuma et al. reasoned that studying these mutations could help us understand the process by which amyloid beta forms new aggregates. Using an approach called deep mutational scanning, Seuma et al. mutated each point in the protein, one at a time. This produced more than 14,000 different versions of amyloid beta. Seuma et al. then measured how quickly these mutants were able to form aggregates by introducing them into yeast cells. All the mutations known to cause early-onset Alzheimer's disease accelerated amyloid beta aggregation in the yeast. But the results also revealed previously unknown properties that control how fast aggregation occurs. In addition, they highlighted a number of positions in the amyloid beta sequence that act as 'gatekeepers'. In healthy brains, these gatekeepers prevent amyloid beta proteins from sticking together. When mutated, they drive the protein to form aggregates. 
This comprehensive dataset will help researchers understand how proteins form toxic aggregates, which could in turn help them find ways to prevent this from happening. By providing an 'atlas' of all possible amyloid beta mutations, the dataset will also help clinicians interpret any new mutations they encounter in patients. By showing whether or not a mutation speeds up aggregation, the atlas will help clinicians predict whether that mutation increases the risk of Alzheimer's disease.


Subject(s)
Alzheimer Disease/genetics , Amyloid beta-Peptides/genetics , Amyloid/metabolism , Mutation , DNA Mutational Analysis , High-Throughput Nucleotide Sequencing , Plasmids , Saccharomyces cerevisiae/metabolism
14.
Trends Genet ; 37(7): 657-668, 2021 07.
Article in English | MEDLINE | ID: mdl-33277042

ABSTRACT

The nonsense-mediated mRNA decay (NMD) pathway degrades some but not all mRNAs bearing premature termination codons (PTCs). Decades of work have elucidated the molecular mechanisms of NMD. More recently, statistical analyses of large genomic datasets have allowed the importance of known and novel 'rules of NMD' to be tested and combined into methods that accurately predict whether PTC-containing mRNAs are degraded or not. We discuss these genomic approaches and how they can be applied to identify diseases and individuals that may benefit from inhibition or activation of NMD. We also discuss the importance of NMD for gene editing and tumor evolution, and how inhibiting NMD may be an effective strategy to increase the efficacy of cancer immunotherapy.


Subject(s)
Alternative Splicing/genetics , Genetic Diseases, Inborn/genetics , Neoplasms/genetics , Nonsense Mediated mRNA Decay/genetics , Codon, Nonsense/genetics , Humans , RNA, Messenger/genetics
15.
Nat Commun ; 11(1): 4923, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33004824

ABSTRACT

A goal of biology is to predict how mutations combine to alter phenotypes, fitness and disease. It is often assumed that mutations combine additively or with interactions that can be predicted. Here, we show using simulations that, even for the simple example of the lambda phage transcription factor CI repressing a gene, this assumption is incorrect and that perfect measurements of the effects of mutations on a trait and mechanistic understanding can be insufficient to predict what happens when two mutations are combined. This apparent paradox arises because mutations can have different biophysical effects to cause the same change in a phenotype and the outcome in a double mutant depends upon what these hidden biophysical changes actually are. Pleiotropy and non-monotonic functions further confound prediction of how mutations interact. Accurate prediction of phenotypes and disease will sometimes not be possible unless these biophysical ambiguities can be resolved using additional measurements.


Subject(s)
Biophysical Phenomena/genetics , Genetic Association Studies/methods , Models, Genetic , Thermodynamics , Bacteriophage lambda/genetics , Gene Expression Regulation, Viral , Mutation , Phenotype , Repressor Proteins/genetics , Repressor Proteins/metabolism , Viral Regulatory and Accessory Proteins/genetics , Viral Regulatory and Accessory Proteins/metabolism
16.
Elife ; 9, 2020 10 28.
Article in English | MEDLINE | ID: mdl-33112234

ABSTRACT

Genetic analyses and systematic mutagenesis have revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, consistent with the concept that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between genome sequence variation and exon inclusion across the transcriptome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across primary transcripts but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.


Subject(s)
Alternative Splicing , Disease/genetics , Exons , Mutation , Alleles , Humans , Introns , RNA, Messenger/genetics
17.
Genome Biol ; 21(1): 207, 2020 08 17.
Article in English | MEDLINE | ID: mdl-32799905

ABSTRACT

Deep mutational scanning (DMS) enables multiplexed measurement of the effects of thousands of variants of proteins, RNAs, and regulatory elements. Here, we present a customizable pipeline, DiMSum, that represents an end-to-end solution for obtaining variant fitness and error estimates from raw sequencing data. A key innovation of DiMSum is the use of an interpretable error model that captures the main sources of variability arising in DMS workflows, outperforming previous methods. DiMSum is available as an R/Bioconda package and provides summary reports to help researchers diagnose common DMS pathologies and take remedial steps in their analyses.


Subject(s)
DNA Mutational Analysis/methods , Molecular Diagnostic Techniques/methods , Mutation , Computational Biology , High-Throughput Nucleotide Sequencing/methods , Models, Genetic , Polymerase Chain Reaction , Proteins/genetics , Software
18.
Nat Genet ; 51(11): 1645-1651, 2019 11.
Article in English | MEDLINE | ID: mdl-31659324

ABSTRACT

Premature termination codons (PTCs) can result in the production of truncated proteins or the degradation of messenger RNAs by nonsense-mediated mRNA decay (NMD). Which of these outcomes occurs can alter the effect of a mutation, with the engagement of NMD being dependent on a series of rules. Here, by applying these rules genome-wide to obtain a resource called NMDetective, we explore the impact of NMD on genetic disease and approaches to therapy. First, human genetic diseases differ in whether NMD typically aggravates or alleviates the effects of PTCs. Second, failure to trigger NMD is a cause of ineffective gene inactivation by CRISPR-Cas9 gene editing. Finally, NMD is a determinant of the efficacy of cancer immunotherapy, with only frameshifted transcripts that escape NMD predicting a response. These results demonstrate the importance of incorporating the rules of NMD into clinical decision-making. Moreover, they suggest that inhibiting NMD may be effective in enhancing cancer immunotherapy.


Subject(s)
Gene Editing , Genetic Diseases, Inborn/genetics , Immunotherapy , Neoplasms/genetics , Nonsense Mediated mRNA Decay/genetics , RNA, Messenger/genetics , Humans , Models, Genetic , Neoplasms/immunology , Neoplasms/therapy
19.
Nat Commun ; 10(1): 4319, 2019 Sep 17.
Article in English | MEDLINE | ID: mdl-31530808

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

20.
Front Physiol ; 10: 1067, 2019.
Article in English | MEDLINE | ID: mdl-31551797

ABSTRACT

Vitellogenins are a family of yolk proteins that are by far the most abundant among oviparous animals. In the model nematode Caenorhabditis elegans, the 6 vitellogenins are among the most highly expressed genes in the adult hermaphrodite intestine, which produces copious yolk to provision eggs. In this article we review what is known about the vitellogenin genes and proteins in C. elegans, in comparison with vitellogenins in other taxa. We argue that the primary purpose of abundant vitellogenesis in C. elegans is to support post-embryonic development and fertility, rather than embryogenesis, especially in harsh environments. Increasing vitellogenin provisioning underlies several post-embryonic phenotypic alterations associated with advancing maternal age, demonstrating that vitellogenins can act as an intergenerational signal mediating the influence of parental physiology on progeny. We also review what is known about vitellogenin regulation - how tissue-, sex- and stage-specificity of expression is achieved, how vitellogenins are regulated by major signaling pathways, how vitellogenin expression is affected by extra-intestinal tissues and how environmental experience affects vitellogenesis. Lastly, we speculate whether C. elegans vitellogenins may play other roles in worm physiology.
