Results 1 - 19 of 19
1.
Cell Rep Med ; : 101518, 2024 Apr 14.
Article in English | MEDLINE | ID: mdl-38642551

ABSTRACT

Population-based genomic screening may help diagnose individuals with disease-risk variants. Here, we perform a genome-first evaluation for nine disorders in 29,039 participants with linked exome sequences and electronic health records (EHRs). We identify 614 individuals with 303 pathogenic/likely pathogenic or predicted loss-of-function (P/LP/LoF) variants, yielding 644 observations; 487 observations (76%) lack a corresponding clinical diagnosis in the EHR. Upon further investigation, 75 clinically undiagnosed observations (15%) have evidence of symptomatic untreated disease, including familial hypercholesterolemia (3 of 6 [50%] undiagnosed observations with disease evidence) and breast cancer (23 of 106 [22%]). These genetic findings enable targeted phenotyping that reveals new diagnoses in previously undiagnosed individuals. Disease yield is greater with variants in penetrant genes for which disease is observed in carriers in an independent cohort. The prevalence of P/LP/LoF variants exceeds that of clinical diagnoses, and some clinically undiagnosed carriers are discovered to have disease. These results highlight the potential of population-based genomic screening.

2.
Diabetes Care ; 47(6): 1042-1047, 2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38652672

ABSTRACT

OBJECTIVE: To identify genetic risk factors for incident cardiovascular disease (CVD) among people with type 2 diabetes (T2D). RESEARCH DESIGN AND METHODS: We conducted a multiancestry time-to-event genome-wide association study for incident CVD among people with T2D. We also tested 204 known coronary artery disease (CAD) variants for association with incident CVD. RESULTS: Among 49,230 participants with T2D, 8,956 had incident CVD events (event rate 18.2%). We identified three novel genetic loci for incident CVD: rs147138607 (near CACNA1E/ZNF648, hazard ratio [HR] 1.23, P = 3.6 × 10⁻⁹), rs77142250 (near HS3ST1, HR 1.89, P = 9.9 × 10⁻⁹), and rs335407 (near TFB1M/NOX3, HR 1.25, P = 1.5 × 10⁻⁸). Among 204 known CAD loci, 5 were associated with incident CVD in T2D (multiple comparison-adjusted P < 0.00024, 0.05/204). A standardized polygenic score of these 204 variants was associated with incident CVD with HR 1.14 (P = 1.0 × 10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.
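
The event rate and the multiple-comparison threshold quoted in this abstract can be re-derived from the reported counts. The following is a minimal Python sketch of that arithmetic; the function names are illustrative and do not come from the paper.

```python
def event_rate(events: int, participants: int) -> float:
    """Proportion of participants who had an incident event."""
    if participants <= 0:
        raise ValueError("participants must be positive")
    return events / participants


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Figures taken from the abstract: 8,956 incident CVD events among
# 49,230 participants, and 204 known CAD variants tested at alpha = 0.05.
rate = event_rate(8956, 49230)
threshold = bonferroni_threshold(0.05, 204)

print(f"event rate: {rate:.1%}")
print(f"Bonferroni threshold: {threshold:.6f}")
```

The threshold evaluates to roughly 0.000245, which the abstract reports truncated as 0.00024.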


Subjects
Cardiovascular Diseases; Diabetes Mellitus, Type 2; Genome-Wide Association Study; Humans; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/epidemiology; Diabetes Mellitus, Type 2/complications; Cardiovascular Diseases/genetics; Cardiovascular Diseases/epidemiology; Female; Male; Middle Aged; Aged; Polymorphism, Single Nucleotide
4.
medRxiv ; 2023 Jul 28.
Article in English | MEDLINE | ID: mdl-37546893

ABSTRACT

BACKGROUND: Type 2 diabetes mellitus (T2D) confers a two- to three-fold increased risk of cardiovascular disease (CVD). However, the mechanisms underlying increased CVD risk among people with T2D are only partially understood. We hypothesized that a genetic association study among people with T2D at risk for developing incident cardiovascular complications could provide insights into molecular genetic aspects underlying CVD. METHODS: From 16 studies of the Cohorts for Heart & Aging Research in Genomic Epidemiology (CHARGE) Consortium, we conducted a multi-ancestry time-to-event genome-wide association study (GWAS) for incident CVD among people with T2D using Cox proportional hazards models. Incident CVD was defined based on a composite of coronary artery disease (CAD), stroke, and cardiovascular death that occurred at least one year after the diagnosis of T2D. Cohort-level estimated effect sizes were combined using inverse variance weighted fixed effects meta-analysis. We also tested 204 known CAD variants for association with incident CVD among patients with T2D. RESULTS: A total of 49,230 participants with T2D were included in the analyses (31,118 European ancestries and 18,112 non-European ancestries) which consisted of 8,956 incident CVD cases over a range of mean follow-up duration between 3.2 and 33.7 years (event rate 18.2%). We identified three novel, distinct genetic loci for incident CVD among individuals with T2D that reached the threshold for genome-wide significance (P<5.0×10⁻⁸): rs147138607 (intergenic variant between CACNA1E and ZNF648) with a hazard ratio (HR) 1.23, 95% confidence interval (CI) 1.15 - 1.32, P=3.6×10⁻⁹, rs11444867 (intergenic variant near HS3ST1) with HR 1.89, 95% CI 1.52 - 2.35, P=9.9×10⁻⁹, and rs335407 (intergenic variant between TFB1M and NOX3) HR 1.25, 95% CI 1.16 - 1.35, P=1.5×10⁻⁸. 
Among 204 known CAD loci, 32 were associated with incident CVD in people with T2D with P<0.05, and 5 were significant after Bonferroni correction (P<0.00024, 0.05/204). A polygenic score of these 204 variants was significantly associated with incident CVD with HR 1.14 (95% CI 1.12 - 1.16) per 1 standard deviation increase (P=1.0×10⁻¹⁶). CONCLUSIONS: The data point to novel and known genomic regions associated with incident CVD among individuals with T2D.
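
The methods in this abstract mention combining cohort-level effect sizes by inverse variance weighted fixed effects meta-analysis. The sketch below shows the textbook form of that pooling step in Python. It is an illustration under stated assumptions, not the authors' pipeline: the function name and the per-cohort numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect pooling.

    betas: per-cohort effect estimates (here, log hazard ratios)
    ses:   the corresponding standard errors
    Returns (pooled_beta, pooled_se).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


# Hypothetical per-cohort log hazard ratios and standard errors,
# for illustration only (these are not study data).
example_betas = [0.20, 0.25, 0.15]
example_ses = [0.05, 0.10, 0.08]

beta, se = fixed_effect_meta(example_betas, example_ses)
hr = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
print(f"pooled HR {hr:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

Pooling is done on the log scale and exponentiated afterwards, which is why the resulting confidence interval is asymmetric around the hazard ratio.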

5.
Nat Genet ; 55(7): 1106-1115, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37308786

ABSTRACT

The current understanding of the genetic determinants of thoracic aortic aneurysms and dissections (TAAD) has largely been informed through studies of rare, Mendelian forms of disease. Here, we conducted a genome-wide association study (GWAS) of TAAD, testing ~25 million DNA sequence variants in 8,626 participants with and 453,043 participants without TAAD in the Million Veteran Program, with replication in an independent sample of 4,459 individuals with and 512,463 without TAAD from six cohorts. We identified 21 TAAD risk loci, 17 of which have not been previously reported. We leverage multiple downstream analytic methods to identify causal TAAD risk genes and cell types and provide human genetic evidence that TAAD is a non-atherosclerotic aortic disorder distinct from other forms of vascular disease. Our results demonstrate that the genetic architecture of TAAD mirrors that of other complex traits and that it is not solely inherited through protein-altering variants of large effect size.


Subjects
Aortic Aneurysm, Thoracic; Aortic Dissection; Veterans; Humans; Genome-Wide Association Study; Pedigree; Aortic Aneurysm, Thoracic/genetics; Aortic Dissection/genetics
6.
Nat Med ; 29(6): 1540-1549, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37248299

ABSTRACT

Preeclampsia and gestational hypertension are common pregnancy complications associated with adverse maternal and child outcomes. Current tools for prediction, prevention and treatment are limited. Here we tested the association of maternal DNA sequence variants with preeclampsia in 20,064 cases and 703,117 control individuals and with gestational hypertension in 11,027 cases and 412,788 control individuals across discovery and follow-up cohorts using multi-ancestry meta-analysis. Altogether, we identified 18 independent loci associated with preeclampsia/eclampsia and/or gestational hypertension, 12 of which are new (for example, MTHFR-CLCN6, WNT3A, NPR3, PGR and RGL3), including two loci (PLCE1 and FURIN) identified in the multitrait analysis. Identified loci highlight the role of natriuretic peptide signaling, angiogenesis, renal glomerular function, trophoblast development and immune dysregulation. We derived genome-wide polygenic risk scores that predicted preeclampsia/eclampsia and gestational hypertension in external cohorts, independent of clinical risk factors, and reclassified eligibility for low-dose aspirin to prevent preeclampsia. Collectively, these findings provide mechanistic insights into the hypertensive disorders of pregnancy and have the potential to advance pregnancy risk stratification.


Subjects
Eclampsia; Hypertension, Pregnancy-Induced; Hypertension; Pre-Eclampsia; Pregnancy; Female; Child; Humans; Hypertension, Pregnancy-Induced/genetics; Pre-Eclampsia/genetics; Pre-Eclampsia/prevention & control; Aspirin; Risk Factors
8.
medRxiv ; 2023 Dec 24.
Article in English | MEDLINE | ID: mdl-38196638

ABSTRACT

It is estimated that as many as 1 in 16 people worldwide suffer from rare diseases. Rare disease patients face difficulty finding diagnosis and treatment for their conditions, including long diagnostic odysseys, multiple incorrect diagnoses, and unavailable or prohibitively expensive treatments. As a result, it is likely that large electronic health record (EHR) systems include high numbers of participants suffering from undiagnosed rare disease. While this has been shown in detail for specific diseases, these studies are expensive and time consuming and have only been feasible to perform for a handful of the thousands of known rare diseases. The bulk of these undiagnosed cases are effectively hidden, with no straightforward way to differentiate them from healthy controls. The ability to access them at scale would enormously expand our capacity to study and develop drugs for rare diseases, adding to tools aimed at increasing availability of study cohorts for rare disease. In this study, we train a deep learning transformer algorithm, RarePT (Rare-Phenotype Prediction Transformer), to impute undiagnosed rare disease from EHR diagnosis codes in 436,407 participants in the UK Biobank and validate it on an independent cohort of 3,333,560 individuals from the Mount Sinai Health System. We applied our model to 155 rare diagnosis codes with fewer than 250 cases each in the UK Biobank and predicted participants with elevated risk for each diagnosis, with the number of participants predicted to be at risk ranging from 85 to 22,000 for different diagnoses. These risk predictions are significantly associated with increased mortality for 65% of diagnoses, with disease burden expressed as disability-adjusted life years (DALY) for 73% of diagnoses, and with 72% of available disease-specific diagnostic tests. 
They are also highly enriched for known rare diagnoses in patients not included in the training set, with an odds ratio (OR) of 48.0 in cross-validation cohorts of the UK Biobank and an OR of 30.6 in the independent Mount Sinai Health System cohort. Most importantly, RarePT successfully screens for undiagnosed patients in 32 rare diseases with available diagnostic tests in the UK Biobank. Using the trained model to estimate the prevalence of undiagnosed disease in the UK Biobank for these 32 rare phenotypes, we find that at least 50% of patients remain undiagnosed for 20 of 32 diseases. These estimates provide empirical evidence of a high prevalence of undiagnosed rare disease, as well as demonstrating the enormous potential benefit of using RarePT to screen for undiagnosed rare disease patients in large electronic health systems.
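
The enrichment figures in this abstract are reported as odds ratios. The abstract does not say how they were computed, so the Python sketch below uses only the textbook 2x2-table definition; the function name and all counts are hypothetical.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 contingency table.

    a: predicted at risk, diagnosed      b: predicted at risk, not diagnosed
    c: not predicted, diagnosed          d: not predicted, not diagnosed
    """
    if b == 0 or c == 0:
        raise ZeroDivisionError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical counts for one rare diagnosis, for illustration only
# (these are not values from the study).
table = {"a": 12, "b": 188, "c": 25, "d": 19775}
print(f"OR = {odds_ratio(**table):.1f}")
```

With sparse tables such as those expected for rare diagnoses, a zero cell is common; this sketch raises an error in that case rather than applying a continuity correction.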

9.
Nat Commun ; 13(1): 6914, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36376295

ABSTRACT

Heart failure is a leading cause of cardiovascular morbidity and mortality. However, the contribution of common genetic variation to heart failure risk has not been fully elucidated, particularly in comparison to other common cardiometabolic traits. We report a multi-ancestry genome-wide association study meta-analysis of all-cause heart failure including up to 115,150 cases and 1,550,331 controls of diverse genetic ancestry, identifying 47 risk loci. We also perform multivariate genome-wide association studies that integrate heart failure with related cardiac magnetic resonance imaging endophenotypes, identifying 61 risk loci. Gene-prioritization analyses including colocalization and transcriptome-wide association studies identify known and previously unreported candidate cardiomyopathy genes and cellular processes, which we validate in gene-expression profiling of failing and healthy human hearts. Colocalization, gene expression profiling, and Mendelian randomization provide convergent evidence for the roles of BCKDHA and circulating branch-chain amino acids in heart failure and cardiac structure. Finally, proteome-wide Mendelian randomization identifies 9 circulating proteins associated with heart failure or quantitative imaging traits. These analyses highlight similarities and differences among heart failure and associated cardiovascular imaging endophenotypes, implicate common genetic variation in the pathogenesis of heart failure, and identify circulating proteins that may represent cardiomyopathy treatment targets.


Subjects
Genome-Wide Association Study; Heart Failure; Humans; Genome-Wide Association Study/methods; Phenotype; Heart Failure/genetics; Heart; Gene Expression Profiling; Polymorphism, Single Nucleotide; Genetic Predisposition to Disease
11.
Nat Genet ; 54(7): 950-962, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35710981

ABSTRACT

More than 800 million people suffer from kidney disease, yet the mechanism of kidney dysfunction is poorly understood. In the present study, we define the genetic association with kidney function in 1.5 million individuals and identify 878 (126 new) loci. We map the genotype effect on the methylome in 443 kidneys, transcriptome in 686 samples and single-cell open chromatin in 57,229 kidney cells. Heritability analysis reveals that methylation variation explains a larger fraction of heritability than gene expression. We present a multi-stage prioritization strategy and prioritize target genes for 87% of kidney function loci. We highlight key roles of proximal tubules and metabolism in kidney function regulation. Furthermore, the causal role of SLC47A1 in kidney disease is defined in mice with genetic loss of Slc47a1 and in human individuals carrying loss-of-function variants. Our findings emphasize the key role of bulk and single-cell epigenomic information in translating genome-wide association studies into identifying causal genes, cellular origins and mechanisms of complex traits.


Subjects
Epigenomics; Kidney Diseases; Animals; Genome-Wide Association Study; Humans; Kidney Diseases/genetics; Mice; Polymorphism, Single Nucleotide/genetics; Transcriptome/genetics
12.
JAMA ; 327(4): 350-359, 2022 Jan 25.
Article in English | MEDLINE | ID: mdl-35076666

ABSTRACT

Importance: Population-based assessment of disease risk associated with gene variants informs clinical decisions and risk stratification approaches. Objective: To evaluate the population-based disease risk of clinical variants in known disease predisposition genes. Design, Setting, and Participants: This cohort study included 72 434 individuals with 37 780 clinical variants who were enrolled in the BioMe Biobank from 2007 onwards with follow-up until December 2020 and the UK Biobank from 2006 to 2010 with follow-up until June 2020. Participants had linked exome and electronic health record data, were older than 20 years, and were of diverse ancestral backgrounds. Exposures: Variants previously reported as pathogenic or predicted to cause a loss of protein function by bioinformatic algorithms (pathogenic/loss-of-function variants). Main Outcomes and Measures: The primary outcome was the disease risk associated with clinical variants. The risk difference (RD) between the prevalence of disease in individuals with a variant allele (penetrance) vs in individuals with a normal allele was measured. Results: Among 72 434 study participants, 43 395 were from the UK Biobank (mean [SD] age, 57 [8.0] years; 24 065 [55%] women; 2948 [7%] non-European) and 29 039 were from the BioMe Biobank (mean [SD] age, 56 [16] years; 17 355 [60%] women; 19 663 [68%] non-European). Of 5360 pathogenic/loss-of-function variants, 4795 (89%) were associated with an RD less than or equal to 0.05. Mean penetrance was 6.9% (95% CI, 6.0%-7.8%) for pathogenic variants and 0.85% (95% CI, 0.76%-0.95%) for benign variants reported in ClinVar (difference, 6.0 [95% CI, 5.6-6.4] percentage points), with a median of 0% for both groups due to large numbers of nonpenetrant variants. 
Penetrance of pathogenic/loss-of-function variants for late-onset diseases was modified by age: mean penetrance was 10.3% (95% CI, 9.0%-11.6%) in individuals 70 years or older and 8.5% (95% CI, 7.9%-9.1%) in individuals 20 years or older (difference, 1.8 [95% CI, 0.40-3.3] percentage points). Penetrance of pathogenic/loss-of-function variants was heterogeneous even in known disease predisposition genes, including BRCA1 (mean [range], 38% [0%-100%]), BRCA2 (mean [range], 38% [0%-100%]), and PALB2 (mean [range], 26% [0%-100%]). Conclusions and Relevance: In 2 large biobank cohorts, the estimated penetrance of pathogenic/loss-of-function variants was variable but generally low. Further research of population-based penetrance is needed to refine variant interpretation and clinical evaluation of individuals with these variant alleles.
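
This abstract defines the risk difference (RD) as the prevalence of disease in individuals with a variant allele (penetrance) minus the prevalence in individuals with a normal allele. A minimal Python sketch of that definition follows; the function names and the counts are hypothetical, and the 0.05 cutoff simply mirrors the figure quoted in the results.

```python
def prevalence(cases: int, total: int) -> float:
    """Fraction of a group that has the disease."""
    if total <= 0:
        raise ValueError("total must be positive")
    return cases / total


def risk_difference(carrier_cases: int, carriers: int,
                    noncarrier_cases: int, noncarriers: int) -> float:
    """Penetrance in carriers minus disease prevalence in non-carriers."""
    return (prevalence(carrier_cases, carriers)
            - prevalence(noncarrier_cases, noncarriers))


# Hypothetical counts for a single variant, for illustration only
# (these are not values from the study).
rd = risk_difference(carrier_cases=6, carriers=40,
                     noncarrier_cases=700, noncarriers=29000)
label = "RD <= 0.05" if rd <= 0.05 else "RD > 0.05"
print(f"risk difference: {rd:.3f} ({label})")
```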


Subjects
Genetic Predisposition to Disease; Genetic Variation; Loss of Function Mutation; Penetrance; Aged; Biological Specimen Banks; Cohort Studies; Female; Humans; Male; Mutation; United Kingdom
13.
Hum Mutat ; 42(8): 969-977, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34005834

ABSTRACT

Biobanks with exomes linked to electronic health records (EHRs) enable the study of genetic pleiotropy between rare variants and seemingly disparate diseases. We performed robust clinical phenotyping of rare, putatively deleterious variants (loss-of-function [LoF] and deleterious missense variants) in ERCC6, a gene implicated in inherited retinal disease. We analyzed 213,084 exomes, along with a targeted set of retinal, cardiac, and immune phenotypes from two large-scale EHR-linked biobanks. In the primary analysis, a burden of deleterious variants in ERCC6 was strongly associated with (1) retinal disorders; (2) cardiac and electrocardiogram perturbations; and (3) immunodeficiency and decreased immunoglobulin levels. Meta-analysis of results from the BioMe Biobank and UK Biobank showed a significant association of deleterious ERCC6 burden with retinal dystrophy (odds ratio [OR] = 2.6, 95% confidence interval [CI]: 1.5-4.6; p = 8.7 × 10⁻⁴), atypical atrial flutter (OR = 3.5, 95% CI: 1.9-6.5; p = 6.2 × 10⁻⁵), arrhythmia (OR = 1.5, 95% CI: 1.2-2.0; p = 2.7 × 10⁻³), and lymphocyte immunodeficiency (OR = 3.8, 95% CI: 2.1-6.8; p = 5.0 × 10⁻⁶). Carriers of ERCC6 LoF variants who lacked a diagnosis of these conditions exhibited increased symptoms, indicating underdiagnosis. These results reveal a unique genetic link among retinal, cardiac, and immune disorders and underscore the value of EHR-linked biobanks in assessing the full clinical profile of carriers of rare variants.


Subjects
Genetic Pleiotropy; Retinal Dystrophies; Arrhythmias, Cardiac; DNA Helicases; DNA Repair Enzymes; Exome; Humans; Poly-ADP-Ribose Binding Proteins; Retinal Dystrophies/genetics; Exome Sequencing/methods
14.
Kidney Med ; 3(4): 653-658, 2021.
Article in English | MEDLINE | ID: mdl-33942030

ABSTRACT

Recent case reports suggest that coronavirus disease 2019 (COVID-19) is associated with collapsing glomerulopathy in African Americans with apolipoprotein L1 gene (APOL1) risk alleles; however, it is unclear whether disease pathogenesis is similar to HIV-associated nephropathy. RNA sequencing analysis of a kidney biopsy specimen from a patient with COVID-19-associated collapsing glomerulopathy and APOL1 risk alleles (G1/G1) revealed similar levels of APOL1 and angiotensin-converting enzyme 2 (ACE2) messenger RNA transcripts as compared with 12 control kidney samples downloaded from the GTEx (Genotype-Tissue Expression) Portal. Whole-genome sequencing of the COVID-19-associated collapsing glomerulopathy kidney sample identified 4 indel gene variants, 3 of which are of unknown significance with respect to chronic kidney disease and/or focal segmental glomerulosclerosis. Molecular profiling of the kidney demonstrated activation of COVID-19-associated cell injury pathways such as inflammation and coagulation. Evidence for direct severe acute respiratory syndrome coronavirus 2 infection of kidney cells was lacking, which is consistent with the findings of several recent studies. Interestingly, immunostaining of kidney biopsy sections revealed increased expression of phospho-STAT3 (signal transducer and activator of transcription 3) in both COVID-19-associated collapsing glomerulopathy and HIV-associated nephropathy as compared with control kidney tissue. Importantly, interleukin 6-induced activation of STAT3 may be a targetable mechanism driving COVID-19-associated acute kidney injury.

15.
Hum Mol Genet ; 30(10): 952-960, 2021 May 29.
Article in English | MEDLINE | ID: mdl-33704450

ABSTRACT

Diabetic retinopathy (DR) is a common consequence in type 2 diabetes (T2D) and a leading cause of blindness in working-age adults. Yet, its genetic predisposition is largely unknown. Here, we examined the polygenic architecture underlying DR by deriving and assessing a genome-wide polygenic risk score (PRS) for DR. We evaluated the PRS in 6079 individuals with T2D of European, Hispanic, African and other ancestries from a large-scale multi-ethnic biobank. Main outcomes were PRS association with DR diagnosis, symptoms and complications, and time to diagnosis, and transferability to non-European ancestries. We observed that PRS was significantly associated with DR. A standard deviation increase in PRS was accompanied by an adjusted odds ratio (OR) of 1.12 [95% confidence interval (CI) 1.04-1.20; P = 0.001] for DR diagnosis. When stratified by ancestry, PRS was associated with the highest OR in European ancestry (OR = 1.22, 95% CI 1.02-1.41; P = 0.049), followed by African (OR = 1.15, 95% CI 1.03-1.28; P = 0.028) and Hispanic ancestries (OR = 1.10, 95% CI 1.00-1.10; P = 0.050). Individuals in the top PRS decile had a 1.8-fold elevated risk for DR versus the bottom decile (P = 0.002). Among individuals without DR diagnosis, the top PRS decile had more DR symptoms than the bottom decile (P = 0.008). The PRS was associated with retinal hemorrhage (OR = 1.44, 95% CI 1.03-2.02; P = 0.03) and earlier DR presentation (10% probability of DR by 4 years in the top PRS decile versus 8 years in the bottom decile). These results establish the significant polygenic underpinnings of DR and indicate the need for more diverse ancestries in biobanks to develop multi-ancestral PRS.


Subjects
Diabetes Mellitus, Type 2/epidemiology; Diabetic Retinopathy/epidemiology; Genetic Predisposition to Disease; Genome-Wide Association Study; Adult; Aged; Black People/genetics; Diabetes Mellitus, Type 2/complications; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/pathology; Diabetic Retinopathy/complications; Diabetic Retinopathy/genetics; Diabetic Retinopathy/pathology; Hispanic or Latino/genetics; Humans; Middle Aged; Multifactorial Inheritance/genetics; Risk Assessment; Risk Factors; White People/genetics
17.
PLoS Genet ; 17(1): e1009337, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33493176

ABSTRACT

Understanding the relationship between natural selection and phenotypic variation has been a long-standing challenge in human population genetics. With the emergence of biobank-scale datasets, along with new statistical metrics to approximate strength of purifying selection at the variant level, it is now possible to correlate a proxy of individual relative fitness with a range of medical phenotypes. We calculated a per-individual deleterious load score by summing the total number of derived alleles per individual after incorporating a weight that approximates strength of purifying selection. We assessed four methods for the weight, including GERP, phyloP, CADD, and fitcons. By quantitatively tracking each of these scores with the site frequency spectrum, we identified phyloP as the most appropriate weight. The phyloP-weighted load score was then calculated across 15,129,142 variants in 335,161 individuals from the UK Biobank and tested for association on 1,380 medical phenotypes. After accounting for multiple test correction, we observed a strong association of the load score amongst coding sites only on 27 traits including body mass, adiposity and metabolic rate. We further observed that the association signals were driven by common variants (derived allele frequency > 5%) with high phyloP score (phyloP > 2). Finally, through permutation analyses, we showed that the load score amongst coding sites had an excess of nominally significant associations on many medical phenotypes. These results suggest a broad impact of deleterious load on medical phenotypes and highlight the deleterious load score as a tool to disentangle the complex relationship between natural selection and medical phenotypes.
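
The per-individual deleterious load score described in this abstract is a weighted sum of derived-allele counts. The Python sketch below illustrates that idea under an explicit assumption: each variant's weight multiplies the individual's derived-allele count (0, 1 or 2) directly. The abstract does not give the exact weighting scheme, and the function name and values are hypothetical.

```python
def load_score(derived_allele_counts, weights):
    """Weighted sum of derived-allele counts for one individual.

    derived_allele_counts: 0, 1 or 2 derived alleles at each variant
    weights: per-variant weight approximating the strength of purifying
             selection (for example, a conservation score such as phyloP)
    """
    if len(derived_allele_counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    return sum(g * w for g, w in zip(derived_allele_counts, weights))


# Hypothetical genotypes and weights for four variants, for illustration
# only (these are not values from the study).
genotypes = [0, 1, 2, 1]
conservation_weights = [0.5, 2.1, 3.0, 0.2]

print(f"load score: {load_score(genotypes, conservation_weights):.2f}")
```

Setting every weight to 1 reduces the score to a plain count of derived alleles, which makes the effect of the weighting easy to compare.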


Subjects
Evolution, Molecular; Genetic Fitness/genetics; Genetics, Population; Selection, Genetic/genetics; Alleles; Biological Specimen Banks; Body Mass Index; Female; Gene Frequency; Genetic Association Studies; Genetic Predisposition to Disease; Genetic Variation/genetics; Humans; Male; United Kingdom
18.
Mol Biol Evol ; 34(11): 2792-2807, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-28981697

ABSTRACT

It remains a challenge in evolutionary genetics to elucidate how beneficial mutations arise and propagate in a population and how selective pressures on mutant alleles are structured over space and time. By identifying "sweeping haplotypes (SHs)" that putatively carry beneficial alleles and are increasing (or have increased) rapidly in frequency, and surveying the geographic distribution of SH frequencies, we can indirectly infer how selective sweeps unfold in time and thus which modes of positive selection underlie those sweeps. Using population genomic data from African Drosophila melanogaster, we identified SHs from 37 candidate loci under selection. At more than half of loci, we identify single SHs. However, many other loci harbor multiple independent SHs, namely soft selective sweeps, either due to parallel evolution across space or a high beneficial mutation rate. At about a quarter of the loci, intermediate SH frequencies are found across multiple populations, which cannot be explained unless a certain form of frequency-dependent positive selection, such as heterozygote advantage, is invoked given the reasonable range of migration rates between African populations. At one locus, many independent SHs are observed over multiple populations but always together with ancestral haplotypes. This complex pattern is compatible with a large number of mutational targets in a gene and frequency-dependent selection on new variants. We conclude that very diverse modes of positive selection are operating at different sets of loci in D. melanogaster populations.


Subjects
Drosophila melanogaster/genetics; Selection, Genetic/genetics; Africa; Alleles; Animals; Biological Evolution; Databases, Nucleic Acid; Evolution, Molecular; Gene Frequency/genetics; Genetic Variation; Genetics, Population/methods; Genome, Insect; Haplotypes/genetics; Heterozygote; Models, Genetic; Mutation
19.
Genetics ; 200(2): 633-49, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25911658

ABSTRACT

Adaptive evolution occurs as beneficial mutations arise and then increase in frequency by positive natural selection. How, when, and where in the genome such evolutionary events occur is a fundamental question in evolutionary biology. It is possible to detect ongoing positive selection or an incomplete selective sweep in species with sexual reproduction because, when a beneficial mutation is on the way to fixation, homologous chromosomes in the population are divided into two groups: one carrying the beneficial allele with very low polymorphism at nearby linked loci and the other carrying the ancestral allele with a normal pattern of sequence variation. Previous studies developed long-range haplotype tests to capture this difference between two groups as the signal of an incomplete selective sweep. In this study, we propose a composite-likelihood-ratio (CLR) test for detecting incomplete selective sweeps based on the joint sampling probabilities for allele frequencies of two groups as a function of strength of selection and recombination rate. Tested against simulated data, this method yielded statistical power and accuracy in parameter estimation that are higher than the iHS test and comparable to the more recently developed nSL test. This procedure was also applied to African Drosophila melanogaster population genomic data to detect candidate genes under ongoing positive selection. Upon visual inspection of sequence polymorphism, candidates detected by our CLR method exhibited clear haplotype structures predicted under incomplete selective sweeps. Our results suggest that different methods capture different aspects of genetic information regarding incomplete sweeps and thus are partially complementary to each other.


Subjects
Genetics, Population; Genomics; Models, Genetic; Models, Statistical; Selection, Genetic; Algorithms; Animals; Biological Evolution; Computer Simulation; Drosophila melanogaster/genetics; Genetic Loci; Genomics/methods; Haplotypes; Homozygote; Likelihood Functions; Linkage Disequilibrium; Polymorphism, Genetic