ABSTRACT
Proteins are key to all cellular processes and their structure is important in understanding their function and evolution. Sequence-based predictions of protein structures have increased in accuracy [1], and over 214 million predicted structures are available in the AlphaFold database [2]. However, studying protein structures at this scale requires highly efficient methods. Here, we developed a structural-alignment-based clustering algorithm, Foldseek cluster, that can cluster hundreds of millions of structures. Using this method, we have clustered all of the structures in the AlphaFold database, identifying 2.30 million non-singleton structural clusters, of which 31% lack annotations, representing probable previously undescribed structures. Clusters without annotation tend to have few representatives, covering only 4% of all proteins in the AlphaFold database. Evolutionary analysis suggests that most clusters are ancient in origin but 4% seem to be species specific, representing lower-quality predictions or examples of de novo gene birth. We also show how structural comparisons can be used to predict domain families and their relationships, identifying examples of remote structural similarity. On the basis of these analyses, we identify several examples of human immune-related proteins with putative remote homology in prokaryotic species, illustrating the value of this resource for studying protein function and evolution across the tree of life.
Subject(s)
Algorithms , Cluster Analysis , Proteins , Structural Homology, Protein , Humans , Databases, Protein , Proteins/chemistry , Proteins/classification , Proteins/metabolism , Sequence Alignment , Molecular Sequence Annotation , Prokaryotic Cells/chemistry , Phylogeny , Species Specificity , Evolution, Molecular
ABSTRACT
A newly described coronavirus named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which is the causative agent of coronavirus disease 2019 (COVID-19), has infected over 2.3 million people, led to the death of more than 160,000 individuals and caused worldwide social and economic disruption [1,2]. There are no antiviral drugs with proven clinical efficacy for the treatment of COVID-19, nor are there any vaccines that prevent infection with SARS-CoV-2, and efforts to develop drugs and vaccines are hampered by the limited knowledge of the molecular details of how SARS-CoV-2 infects cells. Here we cloned, tagged and expressed 26 of the 29 SARS-CoV-2 proteins in human cells and identified the human proteins that physically associated with each of the SARS-CoV-2 proteins using affinity-purification mass spectrometry, identifying 332 high-confidence protein-protein interactions between SARS-CoV-2 and human proteins. Among these, we identify 66 druggable human proteins or host factors targeted by 69 compounds (of which, 29 drugs are approved by the US Food and Drug Administration, 12 are in clinical trials and 28 are preclinical compounds). We screened a subset of these in multiple viral assays and found two sets of pharmacological agents that displayed antiviral activity: inhibitors of mRNA translation and predicted regulators of the sigma-1 and sigma-2 receptors. Further studies of these host-factor-targeting agents, including their combination with drugs that directly target viral enzymes, could lead to a therapeutic regimen to treat COVID-19.
Subject(s)
Betacoronavirus/drug effects , Coronavirus Infections/drug therapy , Coronavirus Infections/metabolism , Drug Repositioning , Molecular Targeted Therapy , Pneumonia, Viral/drug therapy , Pneumonia, Viral/metabolism , Protein Interaction Maps , Viral Proteins/metabolism , Animals , Antiviral Agents/classification , Antiviral Agents/pharmacology , Betacoronavirus/genetics , Betacoronavirus/metabolism , Betacoronavirus/pathogenicity , COVID-19 , Chlorocebus aethiops , Cloning, Molecular , Coronavirus Infections/immunology , Coronavirus Infections/virology , Drug Evaluation, Preclinical , HEK293 Cells , Host-Pathogen Interactions/drug effects , Humans , Immunity, Innate , Mass Spectrometry , Pandemics , Pneumonia, Viral/immunology , Pneumonia, Viral/virology , Protein Binding , Protein Biosynthesis/drug effects , Protein Domains , Protein Interaction Mapping , Receptors, sigma/metabolism , SARS-CoV-2 , SKP Cullin F-Box Protein Ligases/metabolism , Vero Cells , Viral Proteins/genetics , COVID-19 Drug Treatment
ABSTRACT
The physical interactome of a protein can be altered upon perturbation, modulating cell physiology and contributing to disease. Identifying interactome differences of normal and disease states of proteins could help understand disease mechanisms, but current methods do not pinpoint structure-specific PPIs and interaction interfaces proteome-wide. We used limited proteolysis-mass spectrometry (LiP-MS) to screen for structure-specific PPIs by probing for protease susceptibility changes of proteins in cellular extracts upon treatment with specific structural states of a protein. We first demonstrated that LiP-MS detects well-characterized PPIs, including antibody-target protein interactions and interactions with membrane proteins, and that it pinpoints interfaces, including epitopes. We then applied the approach to study conformation-specific interactors of the Parkinson's disease hallmark protein alpha-synuclein (aSyn). We identified known interactors of aSyn monomer and amyloid fibrils and provide a resource of novel putative conformation-specific aSyn interactors for validation in further studies. We also used our approach on GDP- and GTP-bound forms of two Rab GTPases, showing detection of differential candidate interactors of conformationally similar proteins. This approach is applicable to screen for structure-specific interactomes of any protein, including posttranslationally modified and unmodified, or metabolite-bound and unbound protein states.
Subject(s)
alpha-Synuclein , Humans , alpha-Synuclein/metabolism , alpha-Synuclein/chemistry , Protein Interaction Mapping , Mass Spectrometry , Protein Binding , Proteolysis , Parkinson Disease/metabolism , rab GTP-Binding Proteins/metabolism , Protein Interaction Maps , Protein Conformation , Amyloid/metabolism , Amyloid/chemistry , Proteome/metabolism
ABSTRACT
Rare genetic diseases affect millions, and identifying causal DNA variants is essential for patient care. Therefore, it is imperative to estimate the effect of each independent variant and improve their pathogenicity classification. Our study of 140 214 unrelated UK Biobank (UKB) participants found that each of them carries a median of 7 variants previously reported as pathogenic or likely pathogenic. We focused on 967 diagnostic-grade gene (DGG) variants for rare bleeding, thrombotic, and platelet disorders (BTPDs) observed in 12 367 UKB participants. By association analysis, for a subset of these variants, we estimated effect sizes for platelet count and volume, and odds ratios for bleeding and thrombosis. Variants causal of some autosomal recessive platelet disorders revealed phenotypic consequences in carriers. Loss-of-function variants in MPL, which cause chronic amegakaryocytic thrombocytopenia if biallelic, were unexpectedly associated with increased platelet counts in carriers. We also demonstrated that common variants identified by genome-wide association studies (GWAS) for platelet count or thrombosis risk may influence the penetrance of rare variants in BTPD DGGs on their associated hemostasis disorders. Network-propagation analysis applied to an interactome of 18 410 nodes and 571 917 edges showed that GWAS variants with large effect sizes are enriched in DGGs and their first-order interactors. Finally, we illustrate the modifying effect of polygenic scores for platelet count and thrombosis risk on disease severity in participants carrying rare variants in TUBB1 or PROC and PROS1, respectively. Our findings demonstrate the power of association analyses using large population datasets in improving pathogenicity classifications of rare variants.
Subject(s)
Genome-Wide Association Study , Thrombosis , Humans , Biological Specimen Banks , Hemostasis , Hemorrhage/genetics , Rare Diseases
ABSTRACT
Bone marrow-derived mesenchymal stem cells (MSCs) differentiate into osteoblasts upon stimulation by signals present in their niche. Because the global signaling cascades involved in the early phases of MSC osteoblast (OB) differentiation are not well-defined, we used quantitative mass spectrometry to delineate changes in the human MSC proteome and phosphoproteome during the first 24 h of OB lineage commitment. The temporal profiles of 6252 proteins and 15,059 phosphorylation sites suggested at least two distinct signaling waves: one peaking within 30 to 60 min after stimulation and a second upsurge after 24 h. In addition to providing a comprehensive view of the proteome and phosphoproteome dynamics during early MSC differentiation, our analyses identified a key role of serine/threonine protein kinase D1 (PRKD1) in OB commitment. At the onset of OB differentiation, PRKD1 initiates activation of the pro-osteogenic transcription factor RUNX2 by triggering phosphorylation and nuclear exclusion of the histone deacetylase HDAC7.
Subject(s)
Cell Differentiation , Mesenchymal Stem Cells/cytology , Mesenchymal Stem Cells/metabolism , Osteoblasts/cytology , Osteoblasts/metabolism , Phosphoproteins/metabolism , Proteome , Proteomics , Humans , Phylogeny , Proteomics/methods
ABSTRACT
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a global threat to human health and has compromised economic stability. In addition to the development of an effective vaccine, it is imperative to understand how SARS-CoV-2 hijacks host cellular machineries on a system-wide scale so that potential host-directed therapies can be developed. In situ proteome-wide abundance and thermal stability measurements using thermal proteome profiling (TPP) can inform on global changes in protein activity. Here we adapted TPP to high biosafety conditions amenable to SARS-CoV-2 handling. We discovered pronounced temporal alterations in host protein thermostability during infection, which converged on cellular processes including cell cycle, microtubule and RNA splicing regulation. Pharmacological inhibition of host proteins displaying altered thermal stability or abundance during infection suppressed SARS-CoV-2 replication. Overall, this work serves as a framework for expanding TPP workflows to globally important human pathogens that require high biosafety containment and provides deeper resolution into the molecular changes induced by SARS-CoV-2 infection.
Subject(s)
COVID-19/metabolism , Host-Pathogen Interactions , Protein Stability , SARS-CoV-2/physiology , Viral Proteins/metabolism , Antiviral Agents/pharmacology , COVID-19/virology , Humans , Proteome , SARS-CoV-2/isolation & purification , SARS-CoV-2/metabolism , Temperature , Virus Replication/drug effects
ABSTRACT
The evolution of gene expression regulation has contributed to species differentiation. The 3' untranslated regions (3'UTRs) of mRNAs include regulatory elements that modulate gene expression; however, our knowledge of their implications in the divergence of bacterial species is currently limited. In this study, we performed genome-wide comparative analyses of mRNAs encoding orthologous proteins from the genus Staphylococcus and found that mRNA conservation was lost mostly downstream of the coding sequence (CDS), indicating the presence of high sequence diversity in the 3'UTRs of orthologous genes. Transcriptomic mapping of different staphylococcal species confirmed that 3'UTRs were also variable in length. We constructed chimeric mRNAs carrying the 3'UTR of orthologous genes and demonstrated that 3'UTR sequence variations affect protein production. This suggested that species-specific functional 3'UTRs might be specifically selected during evolution. 3'UTR variations may occur through different processes, including gene rearrangements, local nucleotide changes, and the transposition of insertion sequences. By extending the conservation analyses to specific 3'UTRs, as well as the entire set of Escherichia coli and Bacillus subtilis mRNAs, we showed that 3'UTR variability is widespread in bacteria. In summary, our work unveils an evolutionary bias within 3'UTRs that results in species-specific non-coding sequences that may contribute to bacterial diversity.
Subject(s)
3' Untranslated Regions/genetics , Evolution, Molecular , Gene Expression Regulation, Bacterial , Staphylococcus/genetics , Animals , Bacterial Proteins/genetics , Bacterial Proteins/metabolism , Base Sequence , DNA Transposable Elements/genetics , Gene Rearrangement/genetics , Genes, Bacterial , Hemolysis , Nucleotides/genetics , Phylogeny , RNA, Messenger/genetics , RNA, Messenger/metabolism , Sheep , Species Specificity
ABSTRACT
Cylindromatosis tumor suppressor protein (CYLD) is a deubiquitinase, best known as an essential negative regulator of the NFkB pathway. Previous studies have suggested an involvement of CYLD in epidermal growth factor (EGF)-dependent signal transduction as well, as it was found enriched within the tyrosine-phosphorylated complexes in cells stimulated with the growth factor. EGF receptor (EGFR) signaling participates in central cellular processes and its tight regulation, partly through ubiquitination cascades, is decisive for a balanced cellular homeostasis. Here, using a combination of mass spectrometry-based quantitative proteomic approaches with biochemical and immunofluorescence strategies, we demonstrate the involvement of CYLD in the regulation of the ubiquitination events triggered by EGF. Our data show that CYLD regulates the magnitude of ubiquitination of several major effectors of the EGFR pathway by assisting the recruitment of the ubiquitin ligase Cbl-b to the activated EGFR complex. Notably, CYLD facilitates the interaction of EGFR with Cbl-b through its Tyr15 phosphorylation in response to EGF, which leads to fine-tuning of the receptor's ubiquitination and subsequent degradation. This represents a previously uncharacterized strategy exerted by this deubiquitinase and tumor suppressor for the negative regulation of a tumorigenic signaling pathway.
Subject(s)
Deubiquitinating Enzyme CYLD/metabolism , ErbB Receptors/metabolism , Proteolysis , Proto-Oncogene Proteins c-cbl/metabolism , Ubiquitination , Chromatography, Liquid , Deubiquitinating Enzyme CYLD/genetics , HeLa Cells , Humans , Phosphorylation , Proteomics , Tandem Mass Spectrometry , Tyrosine/metabolism
ABSTRACT
Modulation of protein activities by reversible post-translational modifications (PTMs) is a major molecular mechanism involved in the control of virtually all cellular processes. One of these PTMs is ubiquitination, which regulates key processes including protein degradation, cell cycle, DNA damage repair, and signal transduction. Because of its importance for numerous cellular functions, ubiquitination has become an intense topic of research in recent years, and proteomics tools have greatly facilitated the identification of many ubiquitination targets. Taking advantage of the StUbEx strategy for exchanging the endogenous ubiquitin with an epitope-tagged version, we created a modified system, StUbEx PLUS, which allows precise mapping of ubiquitination sites by mass spectrometry. Application of StUbEx PLUS to U2OS cells treated with proteasomal inhibitors resulted in the identification of 41 589 sites on 7762 proteins, which thereby revealed the ubiquitous nature of this PTM and demonstrated the utility of the approach for comprehensive ubiquitination studies at site-specific resolution.
Subject(s)
Binding Sites , Peptides/isolation & purification , Ubiquitin/metabolism , Ubiquitination , Cell Line , Humans , Mass Spectrometry , Peptides/metabolism , Protein Processing, Post-Translational
ABSTRACT
Muscle stem cells, or satellite cells, play an important role in the maintenance and repair of muscle tissue and have the capacity to proliferate and differentiate in response to physiological or environmental changes. Although they have been extensively studied, the key regulatory steps and the complex temporal protein dynamics accompanying the differentiation of primary human muscle cells remain poorly understood. Here, we demonstrate the advantages of applying an MS-based quantitative approach, stable isotope labeling by amino acids in cell culture (SILAC), for studying human myogenesis in vitro and characterize the fine-tuned changes in protein expression underlying the dramatic phenotypic conversion of primary mononucleated human muscle cells during in vitro differentiation to form multinucleated myotubes. Using an exclusively optimized triple encoding SILAC procedure, we generated dynamic expression profiles during the course of myogenic differentiation and quantified 2240 proteins, 243 of which were regulated. These changes in protein expression occurred in sequential waves and underlined vast reprogramming in key processes governing cell fate decisions, i.e., cell cycle withdrawal, RNA metabolism, cell adhesion, proteolysis, and cytoskeletal organization. In silico transcription factor target analysis demonstrated that the observed dynamic changes in the proteome could be attributed to a cascade of transcriptional events involving key myogenic regulatory factors as well as additional regulators not yet known to act on muscle differentiation. In addition, we created a dynamic map of the developing myofibril, providing valuable insights into the formation and maturation of the contractile apparatus in vitro. Finally, our SILAC-based quantitative approach offered the possibility to follow the expression profiles of several muscle disease-associated proteins simultaneously and therefore could be a valuable resource for future studies investigating pathogenesis of degenerative muscle disorders as well as assessing new therapeutic strategies.
Subject(s)
Cell Differentiation , Muscle Fibers, Skeletal/metabolism , Proteome/metabolism , Proteomics/methods , Satellite Cells, Skeletal Muscle/metabolism , Amino Acids/metabolism , Blotting, Western , Cells, Cultured , Chromatography, Liquid , Cluster Analysis , Humans , Immunohistochemistry , Infant, Newborn , Isotope Labeling/methods , Kinetics , Muscle Fibers, Skeletal/cytology , Proteome/classification , Satellite Cells, Skeletal Muscle/cytology , Spectrometry, Mass, Electrospray Ionization , Tandem Mass Spectrometry , Time Factors
ABSTRACT
Genome-wide association studies have identified several disease-causing mutations in neurodegenerative diseases, including amyotrophic lateral sclerosis (ALS). However, the contribution of genetic variants to pathway disturbances and their cell type-specific variations, especially in glia, is poorly understood. We integrated ALS GWAS-linked gene networks with human astrocyte-specific multi-omics datasets to elucidate pathognomonic signatures. This integration predicts that KIF5A, a motor protein kinesin-1 heavy-chain isoform previously detected only in neurons, can also potentiate disease pathways in astrocytes. Using postmortem tissue and super-resolution structured illumination microscopy in cell-based perturbation platforms, we provide evidence that KIF5A is present in astrocyte processes and its deficiency disrupts structural integrity and mitochondrial transport. We show that this may underlie cytoskeletal and trafficking changes in SOD1 ALS astrocytes characterised by low KIF5A levels, which can be rescued by c-Jun N-terminal Kinase-1 (JNK1), a kinesin transport regulator. Altogether, our pipeline reveals a mechanism controlling astrocyte process integrity, a pre-requisite for synapse maintenance, and suggests a targetable loss-of-function in ALS.
Subject(s)
Amyotrophic Lateral Sclerosis , Proteogenomics , Humans , Amyotrophic Lateral Sclerosis/genetics , Astrocytes , Genome-Wide Association Study , Kinesins/genetics
ABSTRACT
Cellular functions are governed by molecular machines that assemble through protein-protein interactions. Their atomic details are critical to studying their molecular mechanisms. However, fewer than 5% of hundreds of thousands of human protein interactions have been structurally characterized. Here we test the potential and limitations of recent progress in deep-learning methods using AlphaFold2 to predict structures for 65,484 human protein interactions. We show that experiments can orthogonally confirm higher-confidence models. We identify 3,137 high-confidence models, of which 1,371 have no homology to a known structure. We identify interface residues harboring disease mutations, suggesting potential mechanisms for pathogenic variants. Groups of interface phosphorylation sites show patterns of co-regulation across conditions, suggestive of coordinated tuning of multiple protein interactions as signaling responses. Finally, we provide examples of how the predicted binary complexes can be used to build larger assemblies helping to expand our understanding of human cell biology.
Subject(s)
Protein Interaction Maps , Signal Transduction , Humans , Mutation , Computational Biology/methods
ABSTRACT
Interacting proteins tend to have similar functions, influencing the same organismal traits. Interaction networks can be used to expand the list of candidate trait-associated genes from genome-wide association studies. Here, we performed network-based expansion of trait-associated genes for 1,002 human traits showing that this recovers known disease genes or drug targets. The similarity of network expansion scores identifies groups of traits likely to share an underlying genetic and biological process. We identified 73 pleiotropic gene modules linked to multiple traits, enriched in genes involved in processes such as protein ubiquitination and RNA processing. In contrast to gene deletion studies, pleiotropy as defined here captures specifically multicellular-related processes. We show examples of modules linked to human diseases enriched in genes with known pathogenic variants that can be used to map targets of approved drugs for repurposing. Finally, we illustrate the use of network expansion scores to study genes at inflammatory bowel disease genome-wide association study loci, and implicate inflammatory bowel disease-relevant genes with strong functional and genetic support.
Subject(s)
Cell Biology , Cells , Disease , Genetic Association Studies , Genetic Pleiotropy , Genetic Association Studies/methods , Humans , Ubiquitination/genetics , RNA Processing, Post-Transcriptional/genetics , Cells/metabolism , Cells/pathology , Drug Repositioning/methods , Drug Repositioning/trends , Disease/genetics , Inflammatory Bowel Diseases/genetics , Inflammatory Bowel Diseases/pathology , Genome-Wide Association Study , Phenotype , Autoimmune Diseases/genetics , Autoimmune Diseases/pathology
ABSTRACT
Over the past decades, genome-wide association studies (GWAS) have led to a dramatic expansion of genetic variants implicated in human traits and diseases. These advances are expected to result in new drug targets, but the identification of causal genes and the cell biology underlying human diseases from GWAS remains challenging. Here, we review protein interaction network-based methods to analyse GWAS data. These approaches can rank candidate drug targets at GWAS-associated loci or among interactors of disease genes without direct genetic support. These methods identify the cell biology affected in common across diseases, offering opportunities for drug repurposing, and can be combined with expression data to identify focal tissues and cell types. Going forward, we expect that these methods will further improve from advances in the characterisation of context-specific interaction networks and the joint analysis of rare and common genetic signals.
Subject(s)
Genome-Wide Association Study , Protein Interaction Maps , Humans , Genome-Wide Association Study/methods , Phenotype , Polymorphism, Single Nucleotide
ABSTRACT
Protein degradation is a key component of the regulation of gene expression and is at the center of several pathogenic processes. Proteins are regularly degraded, but there is large variation in their lifetimes, and the kinetics of protein degradation are not well understood. Many different factors can influence protein degradation rates, painting a highly complex picture. This has been partially unravelled in recent years thanks to invaluable advances in proteomics techniques. In this Mini-Review, we give a global vision of the determinants of protein degradation rates with the backdrop of the current understanding of proteolytic systems to give a contemporary view of the field.
ABSTRACT
Since the start of 2020, the world has been upended by the pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of coronavirus disease 2019 (COVID-19). It has not only led to a tragic loss of life and terrible economic costs but has also been met with an unprecedented response of the scientific and medical communities. In an effort to better understand this viral infection, scientists around the world generated the largest surge in research in documented history for any topic (Lever & Altman, 2021). A part of this work has included the need to better understand the impact of the virus on human proteins (the key machinery of the cell) and human physiology. In their recent study, Geyer and colleagues (Geyer et al, 2021) analyzed a total of 720 proteomes from longitudinal serum samples of 31 hospitalized COVID-19 patients and control individuals with COVID-19-like symptoms but not infected with SARS-CoV-2, providing a comprehensive characterization of the plasma proteome changes along the time course of infection.
Subject(s)
COVID-19 , Proteomics , Humans , Pandemics , Proteome , SARS-CoV-2
ABSTRACT
Receptor tyrosine kinases (RTKs) bind growth factors and are critical for cell proliferation and differentiation. Their dysregulation leads to a loss of growth control, often resulting in cancer. Epidermal growth factor receptor (EGFR) is the prototypic RTK and can bind several ligands exhibiting distinct mitogenic potentials. Whereas the phosphorylation of individual EGFR sites and their roles in downstream signaling have been extensively studied, less is known about ligand-specific ubiquitination events on EGFR, which are crucial for signal attenuation and termination. We used a proteomics-based workflow for absolute quantitation combined with mathematical modeling to unveil potentially decisive ubiquitination events on EGFR from the first 30 seconds to 15 minutes of stimulation. Four ligands were used for stimulation: epidermal growth factor (EGF), heparin-binding EGF-like growth factor, transforming growth factor-α and epiregulin. Whereas only minor differences in the order of individual ubiquitination sites were observed, the overall amount of modified receptor differed depending on the ligand used, indicating that the absolute magnitude of EGFR ubiquitination, and not distinctly regulated ubiquitination sites, is a major determinant for signal attenuation and the subsequent cellular outcomes.
Subject(s)
Epidermal Growth Factor/metabolism , Epiregulin/metabolism , Heparin-binding EGF-like Growth Factor/metabolism , Signal Transduction/genetics , Transforming Growth Factor alpha/metabolism , Amino Acid Sequence , Cell Line, Tumor , Epidermal Growth Factor/chemistry , Epidermal Growth Factor/genetics , Epiregulin/chemistry , Epiregulin/genetics , Epithelial Cells/cytology , Epithelial Cells/metabolism , ErbB Receptors/chemistry , ErbB Receptors/genetics , ErbB Receptors/metabolism , Gene Expression , Heparin-binding EGF-like Growth Factor/chemistry , Heparin-binding EGF-like Growth Factor/genetics , Humans , Ligands , Models, Molecular , Mutation , Phosphorylation , Protein Conformation , Protein Processing, Post-Translational , Proteomics , Transforming Growth Factor alpha/chemistry , Transforming Growth Factor alpha/genetics , Ubiquitination
ABSTRACT
Genome-wide association studies have discovered numerous genomic loci associated with Alzheimer's disease (AD); yet the causal genes and variants are incompletely identified. We performed an updated genome-wide AD meta-analysis, which identified 37 risk loci, including new associations near CCDC6, TSPAN14, NCK2 and SPRED2. Using three SNP-level fine-mapping methods, we identified 21 SNPs with >50% probability each of being causally involved in AD risk and others strongly suggested by functional annotation. We followed this with colocalization analyses across 109 gene expression quantitative trait loci datasets and prioritization of genes by using protein interaction networks and tissue-specific expression. Combining this information into a quantitative score, we found that evidence converged on likely causal genes, including the above four genes, and those at previously discovered AD loci, including BIN1, APH1B, PTK2B, PILRA and CASS4.
Subject(s)
Alzheimer Disease/genetics , Adaptor Proteins, Signal Transducing/genetics , Chromosome Mapping , Cytoskeletal Proteins/genetics , Gene Expression , Genetic Predisposition to Disease , Genome-Wide Association Study , Humans , Linkage Disequilibrium , Microglia/physiology , Oncogene Proteins/genetics , Polymorphism, Single Nucleotide , Protein Interaction Maps/genetics , Quantitative Trait Loci , Risk Factors , Tetraspanins/genetics
ABSTRACT
Drug repurposing provides a rapid approach to meet the urgent need for therapeutics to address COVID-19. To identify therapeutic targets relevant to COVID-19, we conducted Mendelian randomization analyses, deriving genetic instruments based on transcriptomic and proteomic data for 1,263 actionable proteins that are targeted by approved drugs or in the clinical phase of drug development. Using summary statistics from the Host Genetics Initiative and the Million Veteran Program, we studied 7,554 patients hospitalized with COVID-19 and >1 million controls. We found significant Mendelian randomization results for three proteins (ACE2, P = 1.6 × 10^-6; IFNAR2, P = 9.8 × 10^-11 and IL-10RB, P = 2.3 × 10^-14) using cis-expression quantitative trait loci genetic instruments that also had strong evidence for colocalization with COVID-19 hospitalization. To disentangle the shared expression quantitative trait loci signal for IL10RB and IFNAR2, we conducted phenome-wide association scans and pathway enrichment analysis, which suggested that IFNAR2 is more likely to play a role in COVID-19 hospitalization. Our findings prioritize trials of drugs targeting IFNAR2 and ACE2 for early management of COVID-19.
Subject(s)
COVID-19/genetics , Drug Repositioning , Mendelian Randomization Analysis/methods , SARS-CoV-2 , Angiotensin-Converting Enzyme 2/genetics , Angiotensin-Converting Enzyme 2/physiology , Genome-Wide Association Study , Humans , Interleukin-10 Receptor beta Subunit/genetics , Interleukin-10 Receptor beta Subunit/physiology , Quantitative Trait Loci , Receptor, Interferon alpha-beta/genetics , Receptor, Interferon alpha-beta/physiology , COVID-19 Drug Treatment
ABSTRACT
The COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is a grave threat to public health and the global economy. SARS-CoV-2 is closely related to the more lethal but less transmissible coronaviruses SARS-CoV-1 and Middle East respiratory syndrome coronavirus (MERS-CoV). Here, we have carried out comparative viral-human protein-protein interaction and viral protein localization analyses for all three viruses. Subsequent functional genetic screening identified host factors that functionally impinge on coronavirus proliferation, including Tom70, a mitochondrial chaperone protein that interacts with both SARS-CoV-1 and SARS-CoV-2 ORF9b, an interaction we structurally characterized using cryo-electron microscopy. Combining genetically validated host factors with both COVID-19 patient genetic data and medical billing records identified molecular mechanisms and potential drug treatments that merit further molecular and clinical study.