Results 1 - 20 of 24
1.
Nature ; 631(8021): 583-592, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38768635

ABSTRACT

Rare coding variants that substantially affect function provide insights into the biology of a gene1-3. However, ascertaining the frequency of such variants requires large sample sizes4-8. Here we present a catalogue of human protein-coding variation, derived from exome sequencing of 983,578 individuals across diverse populations. In total, 23% of the Regeneron Genetics Center Million Exome (RGC-ME) data come from individuals of African, East Asian, Indigenous American, Middle Eastern and South Asian ancestry. The catalogue includes more than 10.4 million missense and 1.1 million predicted loss-of-function (pLOF) variants. We identify individuals with rare biallelic pLOF variants in 4,848 genes, 1,751 of which have not been previously reported. From precise quantitative estimates of selection against heterozygous loss of function (LOF), we identify 3,988 LOF-intolerant genes, including 86 that were previously assessed as tolerant and 1,153 that lack established disease annotation. We also define regions of missense depletion at high resolution. Notably, 1,482 genes have regions that are depleted of missense variants despite being tolerant of pLOF variants. Finally, we estimate that 3% of individuals have a clinically actionable genetic variant, and that 11,773 variants reported in ClinVar with unknown significance are likely to be deleterious cryptic splice sites. To facilitate variant interpretation and genetics-informed precision medicine, we make this resource of coding variation from the RGC-ME dataset publicly accessible through a variant allele frequency browser.


Subject(s)
Exome , Genetic Variation , Proteins , Humans , Alleles , Exome/genetics , Exome Sequencing , Gene Frequency , Genetic Variation/genetics , Heterozygote , Loss of Function Mutation/genetics , Mutation, Missense/genetics , Open Reading Frames/genetics , Proteins/genetics , RNA Splice Sites/genetics , Precision Medicine
2.
Nature ; 612(7939): 301-309, 2022 12.
Article in English | MEDLINE | ID: mdl-36450978

ABSTRACT

Clonal haematopoiesis involves the expansion of certain blood cell lineages and has been associated with ageing and adverse health outcomes1-5. Here we use exome sequence data on 628,388 individuals to identify 40,208 carriers of clonal haematopoiesis of indeterminate potential (CHIP). Using genome-wide and exome-wide association analyses, we identify 24 loci (21 of which are novel) where germline genetic variation influences predisposition to CHIP, including missense variants in the lymphocytic antigen coding gene LY75, which are associated with reduced incidence of CHIP. We also identify novel rare variant associations with clonal haematopoiesis and telomere length. Analysis of 5,041 health traits from the UK Biobank (UKB) found relationships between CHIP and severe COVID-19 outcomes, cardiovascular disease, haematologic traits, malignancy, smoking, obesity, infection and all-cause mortality. Longitudinal and Mendelian randomization analyses revealed that CHIP is associated with solid cancers, including non-melanoma skin cancer and lung cancer, and that CHIP linked to DNMT3A is associated with the subsequent development of myeloid but not lymphoid leukaemias. Additionally, contrary to previous findings from the initial 50,000 UKB exomes6, our results in the full sample do not support a role for IL-6 inhibition in reducing the risk of cardiovascular disease among CHIP carriers. Our findings demonstrate that CHIP represents a complex set of heterogeneous phenotypes with shared and unique germline genetic causes and varied clinical implications.


Subject(s)
COVID-19 , Cardiovascular Diseases , Humans , Clonal Hematopoiesis/genetics , Cardiovascular Diseases/epidemiology , Cardiovascular Diseases/genetics
3.
Nature ; 599(7886): 628-634, 2021 11.
Article in English | MEDLINE | ID: mdl-34662886

ABSTRACT

A major goal in human genetics is to use natural variation to understand the phenotypic consequences of altering each protein-coding gene in the genome. Here we used exome sequencing1 to explore protein-altering variants and their consequences in 454,787 participants in the UK Biobank study2. We identified 12 million coding variants, including around 1 million loss-of-function and around 1.8 million deleterious missense variants. When these were tested for association with 3,994 health-related traits, we found 564 genes with trait associations at P ≤ 2.18 × 10⁻¹¹. Rare variant associations were enriched in loci from genome-wide association studies (GWAS), but most (91%) were independent of common variant signals. We discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer, among others, as well as risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Six genes were associated with brain imaging phenotypes, including two involved in neural development (GBE1, PLD1). Of the signals available and powered for replication in an independent cohort, 81% were confirmed; furthermore, association signals were generally consistent across individuals of European, Asian and African ancestry. We illustrate the ability of exome sequencing to identify gene-trait associations, elucidate gene function and pinpoint effector genes that underlie GWAS signals at scale.


Subject(s)
Biological Specimen Banks , Databases, Genetic , Exome Sequencing , Exome/genetics , Africa/ethnology , Asia/ethnology , Asthma/genetics , Diabetes Mellitus/genetics , Europe/ethnology , Eye Diseases/genetics , Female , Genetic Predisposition to Disease/genetics , Genetic Variation , Genome-Wide Association Study , Humans , Hypertension/genetics , Liver Diseases/genetics , Male , Mutation , Neoplasms/genetics , Quantitative Trait, Heritable , United Kingdom
4.
Nature ; 590(7845): 290-299, 2021 02.
Article in English | MEDLINE | ID: mdl-33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)1. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.


Subject(s)
Genetic Variation/genetics , Genome, Human/genetics , Genomics , National Heart, Lung, and Blood Institute (U.S.) , Precision Medicine , Cytochrome P-450 CYP2D6/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation , Loss of Function Mutation , Mutagenesis , Phenotype , Polymorphism, Single Nucleotide , Population Density , Precision Medicine/standards , Quality Control , Sample Size , United States , Whole Genome Sequencing/standards
5.
Proc Natl Acad Sci U S A ; 119(27): e2123227119, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35759659

ABSTRACT

DNA methyltransferase inhibitors (DNMTis) reexpress hypermethylated genes in cancers and leukemias and also activate endogenous retroviruses (ERVs), leading to interferon (IFN) signaling, in a process known as viral mimicry. In the present study we show that in the subset of acute myeloid leukemias (AMLs) with mutations in TP53, associated with poor prognosis, DNMTis, important drugs for treatment of AML, enable expression of ERVs and IFN and inflammasome signaling in a STING-dependent manner. We previously reported that in solid tumors poly ADP ribose polymerase inhibitors (PARPis) combined with DNMTis to induce an IFN/inflammasome response that is dependent on STING1 and is mechanistically linked to generation of a homologous recombination defect (HRD). We now show that STING1 activity is actually increased in TP53 mutant compared with wild-type (WT) TP53 AML. Moreover, in TP53 mutant AML, STING1-dependent IFN/inflammatory signaling is increased by DNMTi treatment, whereas in AMLs with WT TP53, DNMTis alone have no effect. While combining DNMTis with PARPis increases IFN/inflammatory gene expression in WT TP53 AML cells, signaling induced in TP53 mutant AML is still several-fold higher. Notably, induction of HRD in both TP53 mutant and WT AMLs follows the pattern of STING1-dependent IFN and inflammatory signaling that we have observed with drug treatments. These findings increase our understanding of the mechanisms that underlie DNMTi + PARPi treatment, and also DNMTi combinations with immune therapies, suggesting a personalized approach that stratifies by TP53 status, for use of such therapies, including potential immune activation of STING1 in AML and other cancers.


Subject(s)
Antineoplastic Combined Chemotherapy Protocols , DNA-Cytosine Methylases , Leukemia, Myeloid, Acute , Membrane Proteins , Poly(ADP-ribose) Polymerase Inhibitors , Tumor Suppressor Protein p53 , Antineoplastic Combined Chemotherapy Protocols/therapeutic use , DNA-Cytosine Methylases/antagonists & inhibitors , Homologous Recombination/genetics , Humans , Inflammasomes/metabolism , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/immunology , Membrane Proteins/immunology , Mutation , Poly(ADP-ribose) Polymerase Inhibitors/pharmacology , Poly(ADP-ribose) Polymerase Inhibitors/therapeutic use , Signal Transduction , Tumor Suppressor Protein p53/genetics , Tumor Suppressor Protein p53/metabolism
7.
Proc Natl Acad Sci U S A ; 117(17): 9458-9465, 2020 04 28.
Article in English | MEDLINE | ID: mdl-32291332

ABSTRACT

Archaeological studies estimate the initial settlement of Samoa at 2,750 to 2,880 y ago and identify only limited settlement and human modification to the landscape until about 1,000 to 1,500 y ago. At this point, a complex history of migration is thought to have begun with the arrival of people sharing ancestry with Near Oceanic groups (i.e., Austronesian-speaking and Papuan-speaking groups), and was then followed by the arrival of non-Oceanic groups during European colonialism. However, the specifics of this peopling are not entirely clear from the archaeological and anthropological records, and are therefore a focus of continued debate. To shed additional light on the Samoan population history that this peopling reflects, we employ a population genetic approach to analyze 1,197 Samoan high-coverage whole genomes. We identify population splits between the major Samoan islands and detect asymmetrical gene flow to the capital city. We also find an extreme bottleneck until about 1,000 y ago, which is followed by distinct expansions across the islands and subsequent bottlenecks consistent with European colonization. These results provide for an increased understanding of Samoan population history and the dynamics that inform it, and also demonstrate how rapid demographic processes can shape modern genomes.


Subject(s)
Biological Evolution , Native Hawaiian or Other Pacific Islander/genetics , Archaeology , Demography , Humans , Samoa , Time Factors
8.
Proc Natl Acad Sci U S A ; 117(5): 2560-2569, 2020 02 04.
Article in English | MEDLINE | ID: mdl-31964835

ABSTRACT

De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole-genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) Program, we called 93,325 single-nucleotide DNMs across 1,465 trios from an array of diverse human populations, and used them to directly estimate and analyze DNM counts, rates, and spectra. We find a significant positive correlation between local recombination rate and local DNM rate, and that DNM rate explains a substantial portion (8.98 to 34.92%, depending on the model) of the genome-wide variation in population-level genetic variation from 41K unrelated TOPMed samples. Genome-wide heterozygosity does correlate with DNM rate, but only explains <1% of variation. While we are underpowered to see small differences, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, we did find significantly fewer DNMs in Amish individuals, even when compared with other Europeans, and even after accounting for parental age and sequencing center. Specifically, we found significant reductions in the number of C→A and T→C mutations in the Amish, which seem to underpin their overall reduction in DNMs. Finally, we calculated near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by nonadditive genetic effects and the environment.


Subject(s)
Amish/genetics , Genome, Human , Adult , Cohort Studies , DNA Mutational Analysis , Female , Genetics, Population , Heterozygote , Humans , Male , Mutation , Pedigree , Whole Genome Sequencing , Young Adult
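
The counts quoted in this abstract (93,325 single-nucleotide DNMs across 1,465 trios) imply an average number of DNMs per trio. The sketch below works through that arithmetic and shows how a per-base rate could be derived from it. It is an illustration only, not the authors' method: the function names are hypothetical, and the callable genome size passed in the usage line is a placeholder assumption that does not come from the abstract.

```python
# Counts taken from the abstract above.
TOTAL_DNMS = 93_325
TRIOS = 1_465


def mean_dnms_per_trio(total_dnms: int, trios: int) -> float:
    """Average number of single-nucleotide DNMs called per trio."""
    return total_dnms / trios


def per_base_rate(dnms_per_child: float, callable_haploid_bp: float) -> float:
    """Per-base, per-generation rate: DNMs divided by the two haploid
    genome copies a child inherits. callable_haploid_bp must be supplied
    by the caller; the abstract does not report it."""
    return dnms_per_child / (2.0 * callable_haploid_bp)


mean_count = mean_dnms_per_trio(TOTAL_DNMS, TRIOS)
print(f"mean DNMs per trio: {mean_count:.1f}")

# ASSUMPTION: 2.7e9 callable haploid bases is a placeholder for illustration,
# not a value reported in the abstract.
illustrative_rate = per_base_rate(mean_count, callable_haploid_bp=2.7e9)
print(f"illustrative per-base rate: {illustrative_rate:.2e}")
```

The per-base figure changes in direct proportion to the callable genome size chosen, so it should be read as a demonstration of the formula rather than as a result of the study.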
9.
Nucleic Acids Res ; 48(12): e68, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32392348

ABSTRACT

While the methods available for single-cell ATAC-seq analysis are well optimized for clustering cell types, the question of how to integrate multiple scATAC-seq data sets and/or sequencing modalities is still open. We present an analysis framework that enables such integration across scATAC-seq data sets by applying the CoGAPS Matrix Factorization algorithm and the projectR transfer learning program to identify common regulatory patterns across scATAC-seq data sets. We additionally integrate our analysis with scRNA-seq data to identify orthogonal evidence for transcriptional regulators predicted by scATAC-seq analysis. Using publicly available scATAC-seq data, we find patterns that accurately characterize cell types both within and across data sets. Furthermore, we demonstrate that these patterns are both consistent with current biological understanding and reflective of novel regulatory biology.


Subject(s)
Algorithms , Chromatin Immunoprecipitation Sequencing/methods , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Animals , Chromatin/genetics , Datasets as Topic , Humans , Machine Learning
10.
Proc Natl Acad Sci U S A ; 115(28): E6526-E6535, 2018 07 10.
Article in English | MEDLINE | ID: mdl-29946025

ABSTRACT

Native Americans from the Amazon, Andes, and coastal geographic regions of South America have a rich cultural heritage but are genetically understudied, therefore leading to gaps in our knowledge of their genomic architecture and demographic history. In this study, we sequence 150 genomes to high coverage combined with an additional 130 genotype array samples from Native American and mestizo populations in Peru. The majority of our samples possess greater than 90% Native American ancestry, which makes this the most extensive Native American sequencing project to date. Demographic modeling reveals that the peopling of Peru began ∼12,000 y ago, consistent with the hypothesis of the rapid peopling of the Americas and Peruvian archeological data. We find that the Native American populations possess distinct ancestral divisions, whereas the mestizo groups were admixtures of multiple Native American communities that occurred before and during the Inca Empire and Spanish rule. In addition, the mestizo communities also show Spanish introgression largely following Peruvian Independence, nearly 300 y after Spain conquered Peru. Further, we estimate migration events between Peruvian populations from all three geographic regions with the majority of between-region migration moving from the high Andes to the low-altitude Amazon and coast. As such, we present a detailed model of the evolutionary dynamics which impacted the genomes of modern-day Peruvians and a Native American ancestry dataset that will serve as a beneficial resource to addressing the underrepresentation of Native American ancestry in sequencing studies.


Subject(s)
Indians, South American/genetics , Models, Genetic , Population Dynamics , History, Ancient , Humans , Indians, South American/history , Peru
11.
Cancer ; 125(12): 2076-2088, 2019 06 15.
Article in English | MEDLINE | ID: mdl-30865299

ABSTRACT

BACKGROUND: Although cell lines are an essential resource for studying cancer biology, many are of unknown ancestral origin, and their use may not be optimal for evaluating the biology of all patient populations. METHODS: An admixture analysis was performed using genome-wide chip data from the Catalogue of Somatic Mutations in Cancer (COSMIC) Cell Lines Project to calculate genetic ancestry estimates for 1018 cancer cell lines. After stratifying the analyses by tissue and histology types, linear models were used to evaluate the influence of ancestry on gene expression and somatic mutation frequency. RESULTS: For the 701 cell lines with unreported ancestry, 215 were of East Asian origin, 30 were of African or African American origin, and 453 were of European origin. Notable imbalances were observed in ancestral representation across tissue type, with the majority of analyzed tissue types having few cell lines of African American ancestral origin, and with Hispanic and South Asian ancestry being almost entirely absent across all cell lines. In evaluating gene expression across these cell lines, expression levels of the genes neurobeachin like 1 (NBEAL1), solute carrier family 6 member 19 (SLC6A19), HEAT repeat containing 6 (HEATR6), and epithelial cell transforming 2 like (ECT2L) were associated with ancestry. Significant differences were also observed in the proportions of somatic mutation types across cell lines with varying ancestral proportions. CONCLUSIONS: By estimating genetic ancestry for 1018 cancer cell lines, the authors have produced a resource that cancer researchers can use to ensure that their cell lines are ancestrally representative of the populations they intend to affect. Furthermore, the novel ancestry-specific signal identified underscores the importance of ancestral awareness when studying cancer.


Subject(s)
Biomarkers, Tumor/genetics , Ethnicity/genetics , Ethnicity/statistics & numerical data , Genetics, Population , Health Status Disparities , Mutation , Neoplasms/genetics , Black or African American/genetics , Black or African American/statistics & numerical data , Asian People/genetics , Asian People/statistics & numerical data , Cell Line, Tumor , Female , Gene Expression Profiling , Genetic Predisposition to Disease , Humans , Male , Middle Aged , Neoplasms/ethnology , Polymorphism, Single Nucleotide , Prognosis , White People/genetics , White People/statistics & numerical data
13.
Cancer Res Commun ; 2024 Oct 29.
Article in English | MEDLINE | ID: mdl-39470352

ABSTRACT

BACKGROUND: Aberrant alternative splicing can generate neoantigens, which can themselves stimulate immune responses and surveillance. Previous methods for quantifying splicing-derived neoantigens are limited by independent references and potential batch effects. RESULTS: Here, we introduce SpliceMutr, a bioinformatics approach and pipeline for identifying splicing derived neoantigens from tumor and normal data. SpliceMutr facilitates the identification of tumor-specific antigenic splice variants, predicts MHC-binding affinity, and estimates splicing antigenicity scores per gene. By applying this tool to genomic data from The Cancer Genome Atlas (TCGA), we generate splicing-derived neoantigens and neoantigenicity scores per sample and across all cancer types and find numerous correlations between splicing antigenicity and well-established biomarkers of anti-tumor immunity. Notably, carriers of mutations within splicing machinery genes have higher splicing antigenicity, which provides support for our approach. Further analysis of splicing antigenicity in cohorts of melanoma patients treated with mono- or combined immune checkpoint inhibition suggests that the abundance of splicing antigens is reduced post-treatment from baseline in patients who progress. We also observe increased splicing antigenicity in responders to immunotherapy, which may relate to an increased capacity to mount an immune response to splicing-derived antigens. CONCLUSIONS: We find the splicing antigenicity to be higher in tumor samples when compared to normal, that mutations in the splicing machinery result in increased splicing antigenicity in some cancers, and higher splicing antigenicity is associated with positive response to immune checkpoint inhibitor therapies. Further, this new computational pipeline provides novel analytical capabilities for splicing antigenicity and is openly available for further immuno-oncology analysis.

14.
Nat Genet ; 56(8): 1592-1596, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39103650

ABSTRACT

Coronavirus disease 2019 (COVID-19) and influenza are respiratory illnesses caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and influenza viruses, respectively. Both diseases share symptoms and clinical risk factors1, but the extent to which these conditions have a common genetic etiology is unknown. This is partly because host genetic risk factors are well characterized for COVID-19 but not for influenza, with the largest published genome-wide association studies for these conditions including >2 million individuals2 and about 1,000 individuals3-6, respectively. Shared genetic risk factors could point to targets to prevent or treat both infections. Through a genetic study of 18,334 cases with a positive test for influenza and 276,295 controls, we show that published COVID-19 risk variants are not associated with influenza. Furthermore, we discovered and replicated an association between influenza infection and noncoding variants in B3GALT5 and ST6GAL1, neither of which was associated with COVID-19. In vitro small interfering RNA knockdown of ST6GAL1 (an enzyme that adds sialic acid to the cell surface, which is used for viral entry) reduced influenza infectivity by 57%. These results mirror the observation that variants that downregulate ACE2, the SARS-CoV-2 receptor, protect against COVID-19 (ref. 7). Collectively, these findings highlight downregulation of key cell surface receptors used for viral entry as treatment opportunities to prevent COVID-19 and influenza.


Subject(s)
COVID-19 , Genetic Predisposition to Disease , Genome-Wide Association Study , Influenza, Human , SARS-CoV-2 , Humans , Influenza, Human/genetics , Influenza, Human/epidemiology , Influenza, Human/virology , COVID-19/genetics , COVID-19/virology , Risk Factors , SARS-CoV-2/genetics , Male , Female , Polymorphism, Single Nucleotide , Case-Control Studies , Middle Aged
15.
Nat Genet ; 55(7): 1138-1148, 2023 07.
Article in English | MEDLINE | ID: mdl-37308787

ABSTRACT

Human genetic studies of smoking behavior have been thus far largely limited to common variants. Studying rare coding variants has the potential to identify drug targets. We performed an exome-wide association study of smoking phenotypes in up to 749,459 individuals and discovered a protective association in CHRNB2, encoding the β2 subunit of the α4β2 nicotine acetylcholine receptor. Rare predicted loss-of-function and likely deleterious missense variants in CHRNB2 in aggregate were associated with a 35% decreased odds for smoking heavily (odds ratio (OR) = 0.65, confidence interval (CI) = 0.56-0.76, P = 1.9 × 10⁻⁸). An independent common variant association in the protective direction (rs2072659; OR = 0.96; CI = 0.94-0.98; P = 5.3 × 10⁻⁶) was also evident, suggesting an allelic series. Our findings in humans align with decades-old experimental observations in mice that β2 loss abolishes nicotine-mediated neuronal responses and attenuates nicotine self-administration. Our genetic discovery will inspire future drug designs targeting CHRNB2 in the brain for the treatment of nicotine addiction.


Subject(s)
Nicotine , Tobacco Use Disorder , Humans , Animals , Mice , Smoking/genetics , Tobacco Use Disorder/genetics , Phenotype , Odds Ratio
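
The abstract above reports an odds ratio of 0.65 and describes it as a 35% decrease in odds. The following minimal sketch shows that conversion and applies it to the reported confidence bounds as well. The function name is a hypothetical helper introduced here for illustration; only the three numeric inputs come from the abstract.

```python
def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds implied by an odds ratio.

    Negative values mean lower odds, positive values mean higher odds.
    """
    return (odds_ratio - 1.0) * 100.0


# Values quoted in the abstract for heavy smoking and rare CHRNB2 variants.
reported_or = 0.65
reported_ci = (0.56, 0.76)

point = percent_change_in_odds(reported_or)
low, high = (percent_change_in_odds(bound) for bound in reported_ci)

print(f"point estimate: {point:.0f}% change in odds")
print(f"interval: {low:.0f}% to {high:.0f}% change in odds")
```

The same conversion applied to the common variant (OR = 0.96) gives a change of about 4%, which is why the two signals are described as acting in the same protective direction with different magnitudes.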
16.
bioRxiv ; 2023 Nov 02.
Article in English | MEDLINE | ID: mdl-37214792

ABSTRACT

Coding variants that have significant impact on function can provide insights into the biology of a gene but are typically rare in the population. Identifying and ascertaining the frequency of such rare variants requires very large sample sizes. Here, we present the largest catalog of human protein-coding variation to date, derived from exome sequencing of 985,830 individuals of diverse ancestry to serve as a rich resource for studying rare coding variants. Individuals of African, Admixed American, East Asian, Middle Eastern, and South Asian ancestry account for 20% of this Exome dataset. Our catalog of variants includes approximately 10.5 million missense (54% novel) and 1.1 million predicted loss-of-function (pLOF) variants (65% novel, 53% observed only once). We identified individuals with rare homozygous pLOF variants in 4,874 genes, and for 1,838 of these this work is the first to document at least one pLOF homozygote. Additional insights from the RGC-ME dataset include 1) improved estimates of selection against heterozygous loss-of-function and identification of 3,459 genes intolerant to loss-of-function, 83 of which were previously assessed as tolerant to loss-of-function and 1,241 that lack disease annotations; 2) identification of regions depleted of missense variation in 457 genes that are tolerant to loss-of-function; 3) functional interpretation for 10,708 variants of unknown or conflicting significance reported in ClinVar as cryptic splice sites using splicing score thresholds based on empirical variant deleteriousness scores derived from RGC-ME; and 4) an observation that approximately 3% of sequenced individuals carry a clinically actionable genetic variant in the ACMG SF 3.1 list of genes. We make this important resource of coding variation available to the public through a variant allele frequency browser. 
We anticipate that this report and the RGC-ME dataset will serve as a valuable reference for understanding rare coding variation and help advance precision medicine efforts.

17.
iScience ; 25(1): 103665, 2022 Jan 21.
Article in English | MEDLINE | ID: mdl-35036865

ABSTRACT

Characterization of ancestry-linked peptide variants in disease-relevant patient tissues represents a foundational step to connect patient ancestry with disease pathogenesis. Nonsynonymous single-nucleotide polymorphisms encoding missense substitutions within tryptic peptides exhibiting high allele frequencies in European, African, and East Asian populations, termed peptide ancestry informative markers (pAIMs), were prioritized from 1000 genomes. In silico analysis identified that as few as 20 pAIMs can determine ancestry proportions similarly to >260K SNPs (R² = 0.99). Multiplexed proteomic analysis of >100 human endometrial cancer cell lines and uterine leiomyoma tissues combined resulted in the quantitation of 62 pAIMs that correlate with patient race and genotype-confirmed ancestry. Candidates include a D451E substitution in GC vitamin D-binding protein previously associated with altered vitamin D levels in African and European populations. pAIMs will support generalized proteoancestry assessment as well as efforts investigating the impact of ancestry on the human proteome and how this relates to the pathogenesis of uterine neoplasms.

18.
Genome Med ; 13(1): 129, 2021 08 11.
Article in English | MEDLINE | ID: mdl-34376232

ABSTRACT

BACKGROUND: Tumor response to therapy is affected by both the cell types and the cell states present in the tumor microenvironment. This is true for many cancer treatments, including immune checkpoint inhibitors (ICIs). While it is well-established that ICIs promote T cell activation, their broader impact on other intratumoral immune cells is unclear; this information is needed to identify new mechanisms of action and improve ICI efficacy. Many preclinical studies have begun using single-cell analysis to delineate therapeutic responses in individual immune cell types within tumors. One major limitation to this approach is that therapeutic mechanisms identified in preclinical models have failed to fully translate to human disease, restraining efforts to improve ICI efficacy in translational research. METHOD: We previously developed a computational transfer learning approach called projectR to identify shared biology between independent high-throughput single-cell RNA-sequencing (scRNA-seq) datasets. In the present study, we test this algorithm's ability to identify conserved and clinically relevant transcriptional changes in complex tumor scRNA-seq data and expand its application to the comparison of scRNA-seq datasets with additional data types such as bulk RNA-seq and mass cytometry. RESULTS: We found a conserved signature of NK cell activation in anti-CTLA-4 responsive mouse and human tumors. In human metastatic melanoma, we found that the NK cell activation signature associates with longer overall survival and is predictive of anti-CTLA-4 (ipilimumab) response. Additional molecular approaches to confirm the computational findings demonstrated that human NK cells express CTLA-4 and bind anti-CTLA-4 antibodies independent of the antibody binding receptor (FcR) and that similar to T cells, CTLA-4 expression by NK cells is modified by cytokine-mediated and target cell-mediated NK cell activation. 
CONCLUSIONS: These data demonstrate a novel application of our transfer learning approach, which was able to identify cell state transitions conserved in preclinical models and human tumors. This approach can be adapted to explore many questions in cancer therapeutics, enhance translational research, and enable better understanding and treatment of disease.


Subject(s)
CTLA-4 Antigen/antagonists & inhibitors , Killer Cells, Natural/drug effects , Killer Cells, Natural/metabolism , Lymphocyte Activation/genetics , Models, Biological , Neoplasms/genetics , Transcriptome , Animals , Biomarkers , Cell Line, Tumor , Computational Biology/methods , Databases, Genetic , Disease Models, Animal , Drug Evaluation, Preclinical , Gene Expression Profiling , Gene Expression Regulation/drug effects , Humans , Immune Checkpoint Inhibitors/pharmacology , Immune Checkpoint Inhibitors/therapeutic use , Ipilimumab/pharmacology , Ipilimumab/therapeutic use , Killer Cells, Natural/immunology , Lymphocyte Activation/immunology , Melanoma/drug therapy , Melanoma/genetics , Melanoma/pathology , Mice , Neoplasms/drug therapy , Neoplasms/metabolism , Neoplasms/pathology , Prognosis , ROC Curve , Treatment Outcome
19.
Trends Cancer ; 4(9): 643-654, 2018 09.
Article in English | MEDLINE | ID: mdl-30149882

ABSTRACT

Liquid biopsy, or the capacity to noninvasively isolate and analyze plasma tumor DNA (ptDNA) using blood samples, represents an important tool for modern oncology that enables increasingly safe, personalized, and robust cancer diagnosis and treatment. Here, we review advances in the development and implementation of liquid biopsy approaches, and we focus on the capacity of liquid biopsy to noninvasively detect oncological disease and enhance early detection strategies. In addition to noting the distinctions between mutation-targeted and mutation-agnostic approaches, we discuss the potential for genomic analysis and longitudinal testing to identify somatic lesions early and to guide intervention at more manageable disease stages.


Subject(s)
DNA, Neoplasm , Neoplasms/diagnosis , Animals , Computational Biology , Computer Simulation , Humans , Liquid Biopsy , Neoplasms/genetics , Neoplasms/therapy , Recurrence
20.
Genetics ; 209(3): 845-859, 2018 07.
Article in English | MEDLINE | ID: mdl-29692350

ABSTRACT

Resolving the mechanistic and genetic bases of reproductive barriers between species is essential to understanding the evolutionary forces that shape speciation. Intrinsic hybrid incompatibilities are often treated as fixed between species, yet there can be considerable variation in the strength of reproductive isolation between populations. The extent and causes of this variation remain poorly understood in most systems. We investigated the genetic basis of variable hybrid male sterility (HMS) between two recently diverged subspecies of house mice, Mus musculus domesticus and Mus musculus musculus. We found that polymorphic HMS has a surprisingly complex genetic basis, with contributions from at least five autosomal loci segregating between two closely related wild-derived strains of M. m. musculus. One of the HMS-linked regions on chromosome 4 also showed extensive introgression among inbred laboratory strains and transmission ratio distortion (TRD) in hybrid crosses. Using additional crosses and whole genome sequencing of sperm pools, we showed that TRD was limited to hybrid crosses and was not due to differences in sperm motility between M. m. musculus strains. Based on these results, we argue that TRD likely reflects additional incompatibilities that reduce hybrid embryonic viability. In some common inbred strains of mice, selection against deleterious interactions appears to have unexpectedly driven introgression at loci involved in epistatic hybrid incompatibilities. The highly variable genetic basis to F1 hybrid incompatibilities between closely related mouse lineages argues that a thorough dissection of reproductive isolation will require much more extensive sampling of natural variation than has been commonly utilized in mice and other model systems.


Subject(s)
Infertility, Male/genetics , Mice/classification , Quantitative Trait Loci , Whole Genome Sequencing/methods , Animals , Chromosomes, Mammalian/genetics , Evolution, Molecular , Genetic Speciation , Hybridization, Genetic , Inbreeding , Male , Mice/genetics , Reproductive Isolation