Results 1 - 20 of 27
1.
Mol Syst Biol ; 2024 Jun 07.
Article in English | MEDLINE | ID: mdl-38849564
2.
ArXiv ; 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38699161

ABSTRACT

Computational methods for assessing the likely impacts of mutations, known as variant effect predictors (VEPs), are widely used in the assessment and interpretation of human genetic variation, as well as in other applications like protein engineering. Many different VEPs have been released to date, and there is tremendous variability in their underlying algorithms and outputs, and in the ways in which the methodologies and predictions are shared. This leads to considerable challenges for end users in knowing which VEPs to use and how to use them. Here, to address these issues, we provide guidelines and recommendations for the release of novel VEPs. Emphasising open-source availability, transparent methodologies, clear variant effect score interpretations, standardised scales, accessible predictions, and rigorous training data disclosure, we aim to improve the usability and interpretability of VEPs, and promote their integration into analysis and evaluation pipelines. We also provide a large, categorised list of currently available VEPs, aiming to facilitate the discovery and encourage the usage of novel methods within the scientific community.

3.
medRxiv ; 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38645101

ABSTRACT

Background: Multiplexed Assays of Variant Effects (MAVEs) can test all possible single variants in a gene of interest. The resulting saturation-style data may help resolve variant classification disparities between populations, especially for variants of uncertain significance (VUS). Methods: We analyzed clinical significance classifications in 213,663 individuals of European-like genetic ancestry versus 206,975 individuals of non-European-like genetic ancestry from All of Us and the Genome Aggregation Database. Then, we incorporated clinically calibrated MAVE data into the Clinical Genome Resource's Variant Curation Expert Panel rules to automate VUS reclassification for BRCA1, TP53, and PTEN. Results: Using two orthogonal statistical approaches, we show a higher prevalence (p ≤ 5.95e-06) of VUS in individuals of non-European-like genetic ancestry across all medical specialties assessed in all three databases. Further, in the non-European-like genetic ancestry group, higher rates of Benign or Likely Benign and variants with no clinical designation (p ≤ 2.5e-05) were found across many medical specialties, whereas Pathogenic or Likely Pathogenic assignments were higher in individuals of European-like genetic ancestry (p ≤ 2.5e-05). Using MAVE data, we reclassified VUS in individuals of non-European-like genetic ancestry at a significantly higher rate in comparison to reclassified VUS from European-like genetic ancestry (p = 9.1e-03), effectively compensating for the VUS disparity. Further, essential code analysis showed equitable impact of MAVE evidence codes but inequitable impact of allele frequency (p = 7.47e-06) and computational predictor (p = 6.92e-05) evidence codes for individuals of non-European-like genetic ancestry. Conclusions: Generation of saturation-style MAVE data should be a priority to reduce VUS disparities and produce equitable training data for future computational predictors.

4.
Genome Biol ; 25(1): 100, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641812

ABSTRACT

Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.


Subject(s)
Metadata , Research Design , Reproducibility of Results
6.
bioRxiv ; 2024 Jan 02.
Article in English | MEDLINE | ID: mdl-38260256

ABSTRACT

Recent advances in AI-based methods have revolutionized the field of structural biology. Concomitantly, high-throughput sequencing and functional genomics technologies have enabled the detection and generation of variants at an unprecedented scale. However, efficient tools and resources are needed to link these two disparate data types - to "map" variants onto protein structures, to better understand how the variation causes disease and thereby design therapeutics. Here we present the Genomics 2 Proteins Portal (G2P; g2p.broadinstitute.org/): a human proteome-wide resource that maps 19,996,443 genetic variants onto 42,413 protein sequences and 77,923 structures, with a comprehensive set of structural and functional features. Additionally, the G2P portal generalizes the capability of linking genomics to proteins beyond databases by allowing users to interactively upload protein residue-wise annotations (variants, scores, etc.) as well as the protein structure to establish the connection. The portal serves as an easy-to-use discovery tool for researchers and scientists to hypothesize the structure-function relationship between natural or synthetic variations and their molecular phenotype.

7.
Proc Natl Acad Sci U S A ; 120(33): e2203828120, 2023 08 15.
Article in English | MEDLINE | ID: mdl-37549298

ABSTRACT

Cellular omics such as single-cell genomics, proteomics, and microbiomics allow the characterization of tissue and microbial community composition, which can be compared between conditions to identify biological drivers. This strategy has been critical to revealing markers of disease progression, such as cancer and pathogen infection. A dedicated statistical method for differential variability analysis is lacking for cellular omics data, and existing methods for differential composition analysis do not model some compositional data properties, suggesting there is room to improve model performance. Here, we introduce sccomp, a method for differential composition and variability analyses that jointly models data count distribution, compositionality, group-specific variability, and proportion mean-variability association, being aware of outliers. sccomp provides a comprehensive analysis framework that offers realistic data simulation and cross-study knowledge transfer. Here, we demonstrate that mean-variability association is ubiquitous across technologies, highlighting the inadequacy of the very popular Dirichlet-multinomial distribution. We show that sccomp accurately fits experimental data, significantly improving performance over state-of-the-art algorithms. Using sccomp, we identified differential constraints and composition in the microenvironment of primary breast cancer.


Subject(s)
Genomics , Microbiota , Proteomics/methods , Computer Simulation , Algorithms
8.
ArXiv ; 2023 Jun 26.
Article in English | MEDLINE | ID: mdl-37426450

ABSTRACT

Multiplexed Assays of Variant Effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.

9.
Genome Biol ; 24(1): 147, 2023 07 03.
Article in English | MEDLINE | ID: mdl-37394429

ABSTRACT

Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after and, in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein encoding gene and regulatory element in the human genome would create an 'Atlas' of variant effect maps and transform our understanding of genetics and usher in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximize the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.


Subject(s)
Genetic Variation , Genomics , Humans , Genome, Human , High-Throughput Nucleotide Sequencing , Precision Medicine
10.
Gigascience ; 12, 2022 12 28.
Article in English | MEDLINE | ID: mdl-37721410

ABSTRACT

BACKGROUND: Evaluating the impact of amino acid variants has been a critical challenge for studying protein function and interpreting genomic data. High-throughput experimental methods like deep mutational scanning (DMS) can measure the effect of large numbers of variants in a target protein, but because DMS studies have not been performed on all proteins, researchers also model DMS data computationally to estimate variant impacts by predictors. RESULTS: In this study, we extended a linear regression-based predictor to explore whether incorporating data from alanine scanning (AS), a widely used low-throughput mutagenesis method, would improve prediction results. To evaluate our model, we collected 146 AS datasets, mapping to 54 DMS datasets across 22 distinct proteins. CONCLUSIONS: We show that improved model performance depends on the compatibility of the DMS and AS assays, and the scale of improvement is closely related to the correlation between DMS and AS results.


Subject(s)
Amino Acids , Genomics , Amino Acids/genetics , Mutation , Mutagenesis , Linear Models
11.
Am J Hum Genet ; 108(12): 2248-2258, 2021 12 02.
Article in English | MEDLINE | ID: mdl-34793697

ABSTRACT

Clinical interpretation of missense variants is challenging because the majority identified by genetic testing are rare and their functional effects are unknown. Consequently, most variants are of uncertain significance and cannot be used for clinical diagnosis or management. Although not much can be done to ameliorate variant rarity, multiplexed assays of variant effect (MAVEs), where thousands of single-nucleotide variant effects are simultaneously measured experimentally, provide functional evidence that can help resolve variants of unknown significance (VUSs). However, a rigorous assessment of the clinical value of multiplexed functional data for variant interpretation is lacking. Thus, we systematically combined previously published BRCA1, TP53, and PTEN multiplexed functional data with phenotype and family history data for 324 VUSs identified by a single diagnostic testing laboratory. We curated 49,281 variant functional scores from MAVEs for these three genes and integrated four different TP53 multiplexed functional datasets into a single functional prediction for each variant by using machine learning. We then determined the strength of evidence provided by each multiplexed functional dataset and reevaluated 324 VUSs. Multiplexed functional data were effective in driving variant reclassification when combined with clinical data, eliminating 49% of VUSs for BRCA1, 69% for TP53, and 15% for PTEN. Thus, multiplexed functional data, which are being generated for numerous genes, are poised to have a major impact on clinical variant interpretation.


Subject(s)
BRCA1 Protein/genetics , Genetic Testing , Mutation, Missense , PTEN Phosphohydrolase/genetics , Tumor Suppressor Protein p53/genetics , Adult , Data Collection , Datasets as Topic , Genetic Association Studies , Humans , Medical History Taking , Phenotype , Predictive Value of Tests
12.
J Biol Chem ; 297(1): 100900, 2021 07.
Article in English | MEDLINE | ID: mdl-34157285

ABSTRACT

Immune-stimulatory ligands, such as major histocompatibility complex molecules and the T-cell costimulatory ligand CD86, are central to productive immunity. Endogenous mammalian membrane-associated RING-CHs (MARCH) act on these and other targets to regulate antigen presentation and activation of adaptive immunity, whereas virus-encoded homologs target the same molecules to evade immune responses. Substrate specificity is encoded in or near the membrane-embedded domains of MARCHs and the proteins they regulate, but the exact sequences that distinguish substrates from nonsubstrates are poorly understood. Here, we examined the requirements for recognition of the costimulatory ligand CD86 by two different MARCH-family proteins, human MARCH1 and Kaposi's sarcoma herpesvirus modulator of immune recognition 2 (MIR2), using deep mutational scanning. We identified a highly specific recognition surface in the hydrophobic core of the CD86 transmembrane (TM) domain (TMD) that is required for recognition by MARCH1 and prominently features a proline at position 254. In contrast, MIR2 requires no specific sequences in the CD86 TMD but relies primarily on an aspartic acid at position 244 in the CD86 extracellular juxtamembrane region. Surprisingly, MIR2 recognized CD86 with a TMD composed entirely of valine, whereas many different single amino acid substitutions in the context of the native TM sequence conferred MIR2 resistance. These results show that the human and viral proteins evolved completely different recognition modes for the same substrate. That some TM sequences are incompatible with MIR2 activity, even when no specific recognition motif is required, suggests a more complicated mechanism of immune modulation via CD86 than was previously appreciated.


Subject(s)
B7-2 Antigen/chemistry , Ubiquitin-Protein Ligases/metabolism , Viral Proteins/metabolism , B7-2 Antigen/genetics , B7-2 Antigen/metabolism , Cell Membrane/metabolism , Down-Regulation , HEK293 Cells , HeLa Cells , Humans , Mutation , Protein Domains , Protein Transport
13.
Front Immunol ; 12: 667870, 2021.
Article in English | MEDLINE | ID: mdl-33995402

ABSTRACT

In 2016, Delong et al. discovered a new type of neoepitope formed by the fusion of two unrelated peptide fragments. Remarkably, these neoepitopes, called hybrid insulin peptides (HIPs), are recognized by pathogenic CD4+ T cells in the NOD mouse and by human pancreatic islet-infiltrating T cells in people with type 1 diabetes (T1D). Current data implicate CD4+ T-cell responses to HIPs in the immune pathogenesis of human T1D. Because of this role, it is important to identify new HIPs that are recognized by CD4+ T cells in people at risk of, or with, T1D. A detailed knowledge of T1D-associated HIPs will allow HIPs to be used in assays to monitor changes in T-cell-mediated beta-cell autoimmunity. They will also provide new targets for antigen-specific therapies for T1D. However, because HIPs are formed by the fusion of two unrelated peptides, there is an enormous number of potential HIPs, which makes it technically challenging to identify them. Here, we review the discovery of HIPs and how they form, and discuss approaches to identifying new HIPs relevant to the immune pathogenesis of human type 1 diabetes.


Subject(s)
Autoantigens/immunology , Autoimmunity , CD4-Positive T-Lymphocytes/immunology , Diabetes Mellitus, Type 1/immunology , Epitopes , Insulin/immunology , Islets of Langerhans/immunology , Peptide Fragments/immunology , Animals , Autoantigens/metabolism , CD4-Positive T-Lymphocytes/metabolism , Diabetes Mellitus, Type 1/metabolism , Diabetes Mellitus, Type 1/pathology , Humans , Insulin/metabolism , Islets of Langerhans/metabolism , Islets of Langerhans/pathology , Peptide Fragments/metabolism
14.
Bioinformatics ; 37(19): 3382-3383, 2021 Oct 11.
Article in English | MEDLINE | ID: mdl-33774657

ABSTRACT

SUMMARY: Multiplexed assays of variant effect (MAVEs) are capable of experimentally testing all possible single nucleotide or amino acid variants in selected genomic regions, generating 'variant effect maps', which provide biochemical insight and functional evidence to enable more rapid and accurate clinical interpretation of human variation. Because the international community applying MAVE approaches is growing rapidly, we developed the online MaveRegistry platform to catalyze collaboration, reduce redundant efforts, allow stakeholders to nominate targets and enable tracking and sharing of progress on ongoing MAVE projects. AVAILABILITY AND IMPLEMENTATION: MaveRegistry service: https://registry.varianteffect.org. MaveRegistry source code: https://github.com/kvnkuang/maveregistry-front-end.

15.
Proc Natl Acad Sci U S A ; 117(21): 11597-11607, 2020 05 26.
Article in English | MEDLINE | ID: mdl-32385156

ABSTRACT

The distribution of fitness effects of mutation plays a central role in constraining protein evolution. The underlying mechanisms by which mutations lead to fitness effects are typically attributed to changes in protein specific activity or abundance. Here, we reveal the importance of a mutation's collateral fitness effects, which we define as effects that do not derive from changes in the protein's ability to perform its physiological function. We comprehensively measured the collateral fitness effects of missense mutations in the Escherichia coli TEM-1 β-lactamase antibiotic resistance gene using growth competition experiments in the absence of antibiotic. At least 42% of missense mutations in TEM-1 were deleterious, indicating that for some proteins collateral fitness effects occur as frequently as effects on protein activity and abundance. Deleterious mutations caused improper posttranslational processing, incorrect disulfide-bond formation, protein aggregation, changes in gene expression, and pleiotropic effects on cell phenotype. Deleterious collateral fitness effects occurred more frequently in TEM-1 than deleterious effects on antibiotic resistance in environments with low concentrations of the antibiotic. The surprising prevalence of deleterious collateral fitness effects suggests they may play a role in constraining protein evolution, particularly for highly expressed proteins, for proteins under intermittent selection for their physiological function, and for proteins whose contribution to fitness is buffered against deleterious effects on protein activity and protein abundance.


Subject(s)
Evolution, Molecular , Genetic Fitness/genetics , Mutation, Missense/genetics , Mutation, Missense/physiology , Escherichia coli/enzymology , Escherichia coli/genetics , Escherichia coli/physiology , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , beta-Lactamases/chemistry , beta-Lactamases/genetics , beta-Lactamases/metabolism
16.
Article in English | MEDLINE | ID: mdl-33855258
17.
Blood ; 135(4): 287-292, 2020 01 23.
Article in English | MEDLINE | ID: mdl-31697803

ABSTRACT

The single transmembrane domain (TMD) of the human thrombopoietin receptor (TpoR/myeloproliferative leukemia [MPL] protein), encoded by exon 10 of the MPL gene, is a hotspot for somatic mutations associated with myeloproliferative neoplasms (MPNs). Approximately 6% and 14% of JAK2 V617F-negative essential thrombocythemia and primary myelofibrosis patients, respectively, have "canonical" MPL exon 10 driver mutations W515L/K/R/A or S505N, which generate constitutively active receptors and consequent loss of Tpo dependence. Other "noncanonical" MPL exon 10 mutations have also been identified in patients, both alone and in combination with canonical mutations, but, in almost all cases, their functional consequences and relevance to disease are unknown. Here, we used a deep mutational scanning approach to evaluate all possible single amino acid substitutions in the human TpoR TMD for their ability to confer cytokine-independent growth in Ba/F3 cells. We identified all currently recognized driver mutations and 7 novel mutations that cause constitutive TpoR activation, and a much larger number of second-site mutations that enhance S505N-driven activation. We found examples of both of these categories in published and previously unpublished MPL exon 10 sequencing data from MPN patients, demonstrating that some, if not all, of the new mutations reported here represent likely drivers or modifiers of myeloproliferative disease.


Subject(s)
Amino Acid Substitution , Myeloproliferative Disorders/genetics , Receptors, Thrombopoietin/genetics , Animals , Cell Line , Exons , Humans , Mice , Models, Molecular , Mutation , Protein Domains , Receptors, Thrombopoietin/chemistry
18.
Genome Med ; 11(1): 85, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31862013

ABSTRACT

Variants of uncertain significance represent a massive challenge to medical genetics. Multiplexed functional assays, in which the functional effects of thousands of genomic variants are assessed simultaneously, are increasingly generating data that can be used as additional evidence for or against variant pathogenicity. Such assays have the potential to resolve variants of uncertain significance, thereby increasing the clinical utility of genomic testing. Existing standards from the American College of Medical Genetics and Genomics (ACMG)/Association for Molecular Pathology (AMP) and new guidelines from the Clinical Genome Resource (ClinGen) establish the role of functional data in variant interpretation, but do not address the specific challenges or advantages of using functional data derived from multiplexed assays. Here, we build on these existing guidelines to provide recommendations to experimentalists for the production and reporting of multiplexed functional data and to clinicians for the evaluation and use of such data. By following these recommendations, experimentalists can produce transparent, complete, and well-validated datasets that are primed for clinical uptake. Our recommendations to clinicians and diagnostic labs on how to evaluate the quality of multiplexed functional datasets, and how different datasets could be incorporated into the ACMG/AMP variant-interpretation framework, will hopefully clarify whether and how such data should be used. The recommendations that we provide are designed to enhance the quality and utility of multiplexed functional data, and to promote their judicious use.


Subject(s)
Genetic Testing/standards , Genetic Variation , Gene Library , Guidelines as Topic , Humans , Precision Medicine , Quality Control , Sequence Analysis, DNA , Societies, Medical
19.
Genome Biol ; 20(1): 223, 2019 11 04.
Article in English | MEDLINE | ID: mdl-31679514

ABSTRACT

Multiplex assays of variant effect (MAVEs), such as deep mutational scans and massively parallel reporter assays, test thousands of sequence variants in a single experiment. Despite the importance of MAVE data for basic and clinical research, there is no standard resource for their discovery and distribution. Here, we present MaveDB ( https://www.mavedb.org ), a public repository for large-scale measurements of sequence variant impact, designed for interoperability with applications to interpret these datasets. We also describe the first such application, MaveVis, which retrieves, visualizes, and contextualizes variant effect maps. Together, the database and applications will empower the community to mine these powerful datasets.


Subject(s)
Databases, Genetic , Genetic Variation , Genomics , Software
20.
Mol Cell ; 74(2): 393-408.e20, 2019 04 18.
Article in English | MEDLINE | ID: mdl-30956043

ABSTRACT

Multiple layers of regulation modulate the activity and localization of protein kinases. However, many details of kinase regulation remain incompletely understood. Here, we apply saturation mutagenesis and a chemical genetic method for allosterically modulating kinase global conformation to Src kinase, providing insight into known regulatory mechanisms and revealing a previously undiscovered interaction between Src's SH4 and catalytic domains. Abrogation of this interaction increased phosphotransferase activity, promoted membrane association, and provoked phosphotransferase-independent alterations in cell morphology. Thus, Src's SH4 domain serves as an intramolecular regulator coupling catalytic activity, global conformation, and localization, as well as mediating a phosphotransferase-independent function. Sequence conservation suggests that the SH4 domain regulatory interaction exists in other Src-family kinases. Our combined approach's ability to reveal a regulatory mechanism in one of the best-studied kinases suggests that it could be applied broadly to provide insight into kinase structure, regulation, and function.


Subject(s)
Catalytic Domain/genetics , Mutagenesis/genetics , Protein Conformation , src-Family Kinases/chemistry , Allosteric Regulation/genetics , Cell Membrane/chemistry , Cell Membrane/enzymology , HEK293 Cells , Humans , Phosphorylation , src-Family Kinases/genetics