ABSTRACT
Human leukocyte antigen (HLA) class I and II loci are essential elements of innate and acquired immunity. Their functions include antigen presentation to T cells leading to cellular and humoral immune responses, and modulation of NK cells. Their exceptional influence on disease outcome has now been made clear by genome-wide association studies. The exons encoding the peptide-binding groove have been the main focus for determining HLA effects on disease susceptibility/pathogenesis. However, HLA expression levels have also been implicated in disease outcome, adding another dimension to the extreme diversity of HLA that impacts variability in immune responses across individuals. To estimate HLA expression, immunogenetic studies traditionally rely on quantitative PCR (qPCR). Adoption of alternative high-throughput technologies such as RNA-seq has been hampered by technical issues due to the extreme polymorphism at HLA genes. Recently, however, multiple bioinformatic methods have been developed to accurately estimate HLA expression from RNA-seq data. This opens an exciting opportunity to quantify HLA expression in large datasets but also brings questions on whether RNA-seq results are comparable to those by qPCR. In this study, we analyze three classes of expression data for HLA class I genes for a matched set of individuals: (a) RNA-seq, (b) qPCR, and (c) cell surface HLA-C expression. We observed a moderate correlation between expression estimates from qPCR and RNA-seq for HLA-A, -B, and -C (0.2 ≤ rho ≤ 0.53). We discuss technical and biological factors which need to be accounted for when comparing quantifications for different molecular phenotypes or using different techniques.
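The rho values quoted above are rank correlations. As a minimal sketch of how such a coefficient can be computed for paired qPCR and RNA-seq estimates, the following uses only the Python standard library. The function names and the six paired values are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical expression estimates for six individuals (not study data).
qpcr = [1.10, 0.85, 1.40, 0.95, 1.25, 1.05]
rnaseq = [210.0, 250.0, 300.0, 190.0, 240.0, 260.0]
rho = spearman_rho(qpcr, rnaseq)
```

Because rho depends only on ranks, it is insensitive to the different scales of qPCR and RNA-seq measurements, which is one reason it suits cross-technology comparisons.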
Subjects
Genome-Wide Association Study, Histocompatibility Antigens Class I, Humans, RNA-Seq, Histocompatibility Antigens Class I/genetics, HLA-C Antigens/genetics, Polymerase Chain Reaction
ABSTRACT
The HLA (Human Leukocyte Antigens) genes are well-documented targets of balancing selection, and variation at these loci is associated with many disease phenotypes. Variation in expression levels also influences disease susceptibility and resistance, but little information exists about the regulation and population-level patterns of expression. This results from the difficulty in mapping short reads originating from these highly polymorphic loci, and in accounting for the existence of several paralogues. We developed a computational pipeline to accurately estimate expression for HLA genes based on RNA-seq, improving both locus-level and allele-level estimates. First, reads are aligned to all known HLA sequences in order to infer HLA genotypes, then quantification of expression is carried out using a personalized index. We use simulations to show that expression estimates obtained in this way are not biased due to divergence from the reference genome. We applied our pipeline to the GEUVADIS dataset, and compared the quantifications to those obtained with a reference transcriptome. Although the personalized pipeline recovers more reads, we found that using the reference transcriptome produces estimates similar to the personalized pipeline (r ≥ 0.87) with the exception of HLA-DQA1. We describe the impact of the HLA-personalized approach on downstream analyses for nine classical HLA loci (HLA-A, HLA-C, HLA-B, HLA-DRA, HLA-DRB1, HLA-DQA1, HLA-DQB1, HLA-DPA1, HLA-DPB1). Although the influence of the HLA-personalized approach is modest for eQTL mapping, the p-values and the causality of the eQTLs obtained are better than when the reference transcriptome is used. We investigate how the eQTLs we identified explain variation in expression among lineages of HLA alleles. Finally, we discuss possible causes underlying differences between expression estimates obtained using RNA-seq, antibody-based approaches and qPCR.
Subjects
Chromosome Mapping, Gene Expression, HLA Antigens/genetics, Quantitative Trait Loci, Alleles, Computational Biology/methods, Gene Frequency, Genotype, Haplotypes, Humans, Transcriptome
ABSTRACT
Genome-wide association studies have repeatedly identified the major histocompatibility complex genomic region (6p21.3) as key in immune pathologies. Researchers have also aimed to extend the biological interpretation of associations by focusing directly on human leukocyte antigen (HLA) polymorphisms and their combination as haplotypes. To circumvent the effort and high costs of HLA typing, statistical solutions have been developed to infer HLA alleles from single-nucleotide polymorphism (SNP) genotyping data. Though HLA imputation methods have been developed, no unified effort has yet been undertaken to share large and diverse imputation models, or to improve methods. By training the HIBAG software on SNP + HLA data generated by the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA) to create reference panels, we highlighted the importance of (a) the number of individuals in reference panels, with a twofold increase in accuracy (from 10 to 100 individuals) and (b) the number of SNPs, with a 1.5-fold increase in accuracy (from 500 to 24,504 SNPs). Results showed improved accuracy with CAAPA compared to the African American models available in HIBAG, highlighting the need for precise population-matching. The SNP-HLA Reference Consortium is an international endeavor to gather data, enhance HLA imputation and broaden access to highly accurate imputation models for the immunogenomics community.
Subjects
Black or African American/genetics, Human Genome/genetics, HLA Antigens/genetics, Single Nucleotide Polymorphism/genetics, Alleles, Asthma/genetics, Gene Frequency/genetics, Genomics, Genotype, Haplotypes/genetics, Humans, Information Dissemination, Genetic Models, White People/genetics
ABSTRACT
Diagnosis of individuals affected by monogenic disorders was significantly improved by next-generation sequencing targeting clinically relevant genes. Whole exomes yield a large number of variants that require several filtering steps, prioritization, and pathogenicity classification. Among the criteria recommended by ACMG, those that rely on population databases critically affect analyses of individuals with underrepresented ancestries. Population-specific allelic frequencies need consideration when characterizing potential deleteriousness of variants. An orthogonal input for classification is the annotation of variants previously classified as pathogenic, a criterion that provides supporting evidence and is widely sourced from ClinVar. We used a whole-genome dataset from a census-based cohort of 1,171 elderly individuals from São Paulo, Brazil, highly admixed, and unaffected by severe monogenic disorders, to investigate if pathogenic assertions in ClinVar are enriched with higher proportions of European ancestry, indicating bias. Potential loss of function (pLOF) variants were filtered from 4,250 genes associated with Mendelian disorders and annotated with ClinVar assertions. Over 1,800 single nucleotide pLOF variants were included, of which 381 had non-benign assertions. Among carriers (N = 463), average European ancestry was significantly higher than among noncarriers (N = 708; p = .011). pLOFs in genomic contexts of non-European local ancestries were nearly three times less likely to have any ClinVar entry (OR = 0.353; p < .0001). Independent pathogenicity assertions are useful for variant classification in molecular diagnosis. However, European overrepresentation of assertions can promote distortions when classifying variants in non-European individuals, even in admixed samples with a relatively high proportion of European ancestry. The investigation and deposit of clinically relevant findings of diverse populations is fundamental to improving this scenario.
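The phrase "nearly three times less likely" follows from the reciprocal of the reported odds ratio (1 / 0.353 ≈ 2.83). The sketch below shows that arithmetic together with the usual 2x2 odds-ratio calculation. The cell counts are hypothetical, chosen only to illustrate the mechanics, and the function names are assumptions. Strictly speaking, an odds ratio compares odds rather than probabilities.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]].

    a: group 1 with the outcome,    b: group 1 without the outcome,
    c: group 2 with the outcome,    d: group 2 without the outcome.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a finite odds ratio")
    return (a * d) / (b * c)


def fold_reduction(or_value):
    """Express an odds ratio below 1 as 'x times lower odds'."""
    if not 0 < or_value < 1:
        raise ValueError("expected an odds ratio between 0 and 1")
    return 1 / or_value


# Reciprocal of the OR reported in the abstract (about 2.83).
reported_fold = fold_reduction(0.353)

# Hypothetical counts (NOT from the study):
# rows = local ancestry (non-European, European),
# columns = variant has a ClinVar entry (yes, no).
example_or = odds_ratio(30, 170, 100, 200)
```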
Subjects
Genetic Variation, Genomics, Aged, Brazil, Exome, High-Throughput Nucleotide Sequencing, Humans
ABSTRACT
The majority of aneuploid fetuses are spontaneously miscarried. Nevertheless, some aneuploid individuals survive despite the strong genetic insult. Here, we investigate if the survival probability of aneuploid fetuses is affected by the genome-wide burden of slightly deleterious variants. We analyzed two cohorts of live-born Down syndrome individuals (388 genotyped samples and 16 fibroblast transcriptomes) and observed a deficit of slightly deleterious variants on Chromosome 21 and decreased transcriptome-wide variation in the expression level of highly constrained genes. We interpret these results as signatures of embryonic selection, and propose a genetic handicap model whereby an individual bearing an extremely severe deleterious variant (such as aneuploidy) could escape embryonic lethality if the genome-wide burden of slightly deleterious variants is sufficiently low. This approach can be used to study the composition and effect of the numerous slightly deleterious variants in humans and model organisms.
Subjects
Aneuploidy, Human Chromosomes Pair 21/genetics, Down Syndrome, Genotype, Transcriptome, Spontaneous Abortion, Down Syndrome/embryology, Down Syndrome/genetics, Female, Humans, Pregnancy
ABSTRACT
Meeting the challenges brought by the COVID-19 pandemic requires an interdisciplinary approach. In this context, integrating knowledge of immune function with an understanding of how genetic variation influences the nature of immunity is a key challenge. Immunogenetics can help explain the heterogeneity of susceptibility and protection to the viral infection and disease progression. Here, we review the knowledge developed so far, discussing fundamental genes for triggering the innate and adaptive immune responses associated with a viral infection, especially with the SARS-CoV-2 mechanisms. We emphasize the role of the HLA and KIR genes, discussing what has been uncovered about their role in COVID-19 and addressing methodological challenges of studying these genes. Finally, we comment on questions that arise when studying admixed populations, highlighting the case of Brazil. We argue that the interplay between immunology and an understanding of genetic associations can provide an important contribution to our knowledge of COVID-19.
ABSTRACT
When humans moved from Asia toward the Americas over 18,000 y ago and eventually peopled the New World they encountered a new environment with extreme climate conditions and distinct dietary resources. These environmental and dietary pressures may have led to instances of genetic adaptation with the potential to influence the phenotypic variation in extant Native American populations. An example of such an event is the evolution of the fatty acid desaturases (FADS) genes, which have been claimed to harbor signals of positive selection in Inuit populations due to adaptation to the cold Greenland Arctic climate and to a protein-rich diet. Because there was evidence of intercontinental variation in this genetic region, with indications of positive selection for its variants, we decided to compare the Inuit findings with other Native American data. Here, we use several lines of evidence to show that the signal of FADS-positive selection is not restricted to the Arctic but instead is broadly observed throughout the Americas. The shared signature of selection among populations living in such a diverse range of environments is likely due to a single and strong instance of local adaptation that took place in the common ancestral population before their entrance into the New World. These first Americans peopled the whole continent and spread this adaptive variant across a diverse set of environments.
Subjects
Fatty Acid Desaturases/genetics, Human Migration/history, Central American Indians/genetics, North American Indians/genetics, South American Indians/genetics, Inuit/genetics, Genetic Selection, Asian People/genetics, Asian People/history, Black People/genetics, Black People/history, Chromosome Mapping, Human Chromosomes, Population Genetics, Ancient History, Humans, Central American Indians/history, North American Indians/history, South American Indians/history, Inuit/history, Single Nucleotide Polymorphism, White People/genetics, White People/history
ABSTRACT
Several decades of research have convincingly shown that classical human leukocyte antigen (HLA) loci bear signatures of natural selection. Despite this conclusion, many questions remain regarding the type of selective regime acting on these loci, the time frame at which selection acts, and the functional connections between genetic variability and natural selection. In this review, we argue that genomic datasets, in particular those generated by next-generation sequencing (NGS) at the population scale, are transforming our understanding of HLA evolution. We show that genomewide data can be used to perform robust and powerful tests for selection, capable of identifying both positive and balancing selection at HLA genes. Importantly, these tests have shown that natural selection can be identified at both recent and ancient timescales. We discuss how findings from genomewide association studies impact the evolutionary study of HLA genes, and how genomic data can be used to survey adaptive change involving interaction at multiple loci. We discuss the methodological developments which are necessary to correctly interpret genomic analyses involving the HLA region. These developments include adapting the NGS analysis framework so as to deal with the highly polymorphic HLA data, as well as developing tools and theory to search for signatures of selection, quantify differentiation, and measure admixture within the HLA region. Finally, we show that high throughput analysis of molecular phenotypes for HLA genes, namely transcription levels, is now a feasible approach and can add another dimension to the study of genetic variation.
Subjects
HLA Antigens/genetics, Major Histocompatibility Complex/genetics, Alleles, Molecular Evolution, Genetic Variation/genetics, Genome-Wide Association Study, Genomics, High-Throughput Nucleotide Sequencing/methods, Histocompatibility Antigens Class I/genetics, Histocompatibility Antigens Class II/genetics, Histocompatibility Testing/methods, Humans, Genetic Polymorphism/genetics, Genetic Selection/genetics
ABSTRACT
OBJECTIVES: Quilombo remnants are relics of communities founded by runaway or abandoned African slaves, but often with subsequent extensive and complex admixture patterns with Europeans and Native Americans. We combine a genetic study of Y-chromosome markers with anthropological surveys in order to obtain a portrait of quilombo structure and history in the region that has the largest number of quilombo remnants in the state of São Paulo. METHODS: Samples from 289 individuals from quilombo remnants were genotyped using a set of 17 microsatellites on the Y chromosome (AmpFlSTR-Yfiler). A subset of 82 samples was also genotyped using a SNP array (Axiom Human Origins-Affymetrix). We estimated haplotype and haplogroup frequencies, haplotype diversity and sharing, and pairwise genetic distances through FST and RST indexes. RESULTS: We identified 95 Y chromosome haplotypes, classified into 15 haplogroups. About 63% are European, 32% are African, and 6% Native American. The most common were: R1b (European, 34.2%), E1b1a (African, 32.3%), J1 (European, 6.9%), and Q (Native American, 6.2%). Genetic differentiation among communities was low (FST = 0.0171; RST = 0.0161), and haplotype sharing was extensive. Genetic, genealogical and oral surveys allowed us to detect five main founder haplotypes, which explained a total of 27.7% of the Y chromosome lineages. CONCLUSIONS: Our results showed a high European patrilineal genetic contribution among the founders of quilombos, high amounts of gene flow, and a recent common origin of these populations. Common haplotypes and genealogical data indicate the origin of quilombos from a few male individuals. Our study reinforces the importance of a dual approach, involving the analysis of both anthropological and genetic data.
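Haplotype diversity, one of the statistics mentioned in the methods, is commonly computed with Nei's estimator, H = n/(n-1) · (1 − Σ p_i²), where p_i is the frequency of haplotype i among n sampled chromosomes. The abstract does not state which estimator was used, so the sketch below is an assumption, and the example haplotypes (tuples of repeat counts at three loci) are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of hashable haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical Y-STR haplotypes: each tuple holds repeat counts at three loci.
sample = [
    (14, 13, 29),
    (14, 13, 29),
    (15, 12, 30),
    (13, 14, 28),
]
h = haplotype_diversity(sample)
```

A value near 1 means almost every sampled chromosome carries a different haplotype; extensive sharing of founder haplotypes pulls the value down.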
Subjects
Human Y Chromosomes/genetics, Haplotypes, Paternal Inheritance, Single Nucleotide Polymorphism, Black People/genetics, Brazil, Humans, Microsatellite Repeats, Rural Population
ABSTRACT
The classical class I HLA loci of humans show an excess of nonsynonymous with respect to synonymous substitutions at codons of the antigen recognition site (ARS), a hallmark of adaptive evolution. Additionally, high polymorphism, linkage disequilibrium, and disease associations suggest that one or more balancing selection regimes have acted upon these genes. However, several questions about these selective regimes remain open. First, it is unclear if stronger evidence for selection on deep timescales is due to changes in the intensity of selection over time or to a lack of power of most methods to detect selection on recent timescales. Another question concerns the functional entities which define the selected phenotype. While most analyses focus on selection acting on individual alleles, it is also plausible that phylogenetically defined groups of alleles ("lineages") are targets of selection. To address these questions, we analyzed how dN/dS (ω) varies with respect to divergence times between alleles and phylogenetic placement (position of branches). We find that ω for ARS codons of class I HLA genes increases with divergence time and is higher for inter-lineage branches. Throughout our analyses, we used non-selected codons to control for possible effects of inflation of ω associated with intra-specific analysis, and showed that our results are not artifactual. Our findings indicate the importance of considering the timescale effect when analysing ω over a wide spectrum of divergences. Finally, our results support the divergent allele advantage model, whereby heterozygotes with more divergent alleles have higher fitness than those carrying similar alleles.
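To make the quantity ω concrete, the sketch below computes dN/dS for a pair of sequences from already-counted sites and differences, applying the Jukes-Cantor correction used in counting approaches such as Nei-Gojobori. This is a simplified stand-in and not the branch-based estimation described in the abstract. The function names and the counts are hypothetical.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return omega = dN/dS from counts of differences and of sites."""
    d_n = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    d_s = jukes_cantor(syn_diffs / syn_sites)
    if d_s == 0:
        raise ValueError("omega is undefined when dS is zero")
    return d_n / d_s


# Hypothetical counts for a set of codons (not values from the study):
# 20 nonsynonymous differences over 100 nonsynonymous sites,
# 5 synonymous differences over 50 synonymous sites.
omega = dn_ds(20, 100, 5, 50)
```

Values of ω above 1 indicate an excess of nonsynonymous change, the pattern the abstract describes for ARS codons. For closely related alleles dS is small, so ω becomes noisy, which is one reason the comparison across divergence times needs the control codons mentioned above.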
Subjects
Alleles, Molecular Evolution, MHC Class I Genes, Humans, Genetic Models
ABSTRACT
The killer immunoglobulin-like receptor (KIR) gene cluster shows extensive genetic diversity, as do the HLA class I loci, which encode ligands for KIR molecules. We genotyped 1,642 individuals from 30 geographically distinct populations to examine population-level evidence for coevolution of these two functionally related but unlinked gene clusters. We observed strong negative correlations between the presence of activating KIR genes and their corresponding HLA ligand groups across populations, especially KIR3DS1 and its putative HLA-B Bw4-80I ligands (r = -0.66, P = 0.038). In contrast, we observed weak positive relationships between the various inhibitory KIR genes and their ligands. We observed a negative correlation between distance from East Africa and frequency of activating KIR genes and their corresponding ligands, suggesting a balance between selection on HLA and KIR loci. Most KIR-HLA genetic association studies indicate a primary influence of activating KIR-HLA genotypes in disease risk; concomitantly, activating receptor-ligand pairs in this study show the strongest signature of coevolution of these two complex genetic systems as compared with inhibitory receptor-ligand pairs.
Subjects
Molecular Evolution, Genetic Variation, HLA Antigens/genetics, KIR Receptors/genetics, Alleles, Gene Frequency, Population Genetics, Genotype, HLA-B Antigens/genetics, Haplotypes, Humans, Linkage Disequilibrium, Genetic Polymorphism, KIR3DS1 Receptors/genetics
ABSTRACT
Supertypes are groups of human leukocyte antigen (HLA) alleles which bind overlapping sets of peptides due to sharing specific residues at the anchor positions (the B and F pockets) of the peptide-binding region (PBR). HLA alleles within the same supertype are expected to be functionally similar, while those from different supertypes are expected to be functionally distinct, presenting different sets of peptides. In this study, we applied the supertype classification to the HLA-A and HLA-B data of 55 worldwide populations in order to investigate the effect of natural selection on supertype rather than allelic variation at these loci. We compared the nucleotide diversity of the B and F pockets with that of the other PBR regions through a resampling procedure and compared the patterns of within-population heterozygosity (He) and between-population differentiation (GST) observed when using the supertype definition to those estimated when using randomized groups of alleles. At HLA-A, low levels of variation are observed at B and F pockets and randomized He and GST do not differ from the observed data. By contrast, HLA-B concentrates most of the differences between supertypes, the B pocket showing a particularly high level of variation. Moreover, at HLA-B, the reassignment of alleles into random groups does not reproduce the patterns of population differentiation observed with supertypes. We thus conclude that, unlike HLA-A, for which supertype and allelic variation show similar patterns of nucleotide diversity within and between populations, HLA-B has likely evolved through specific adaptations of its B pocket to local pathogens.
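The two summary statistics named above can be illustrated as follows: expected heterozygosity is He = 1 − Σ p_i², and Nei's GST = (HT − HS) / HT compares the heterozygosity of the pooled frequencies (HT) with the mean within-population heterozygosity (HS). The sketch assumes equal population weights and omits sample-size corrections, so it is not necessarily the estimator used in the study. The supertype frequencies are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2) for frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def g_st(populations):
    """Nei's G_ST for a list of {category: frequency} dicts, equally weighted."""
    if not populations:
        raise ValueError("need at least one population")
    categories = sorted({c for pop in populations for c in pop})
    k = len(populations)
    # H_S: mean heterozygosity within populations.
    h_s = sum(
        expected_heterozygosity([pop.get(c, 0.0) for c in categories])
        for pop in populations
    ) / k
    # H_T: heterozygosity of the average frequencies across populations.
    mean_freqs = [sum(pop.get(c, 0.0) for pop in populations) / k for c in categories]
    h_t = expected_heterozygosity(mean_freqs)
    if h_t == 0:
        raise ValueError("G_ST is undefined when there is no variation at all")
    return (h_t - h_s) / h_t


# Hypothetical supertype frequencies in two populations (not study data).
pops = [
    {"B07": 0.5, "B27": 0.3, "B44": 0.2},
    {"B07": 0.2, "B27": 0.3, "B44": 0.5},
]
differentiation = g_st(pops)
```

Regrouping alleles into random categories and recomputing both statistics, as the abstract describes, would reuse the same two functions with relabelled frequency dictionaries.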
Subjects
Biological Evolution, Population Genetics, HLA-A Antigens/genetics, HLA-B Antigens/genetics, Genetic Polymorphism/genetics, Genetic Selection/genetics, Computer Simulation, Factual Databases, HLA-A Antigens/classification, HLA-A Antigens/immunology, HLA-B Antigens/classification, HLA-B Antigens/immunology, Histocompatibility Testing, Humans, Immunodominant Epitopes, International Agencies
ABSTRACT
This article deals with the estimation of inbreeding and substructure levels in a set of 10 (later regrouped as eight) African-derived quilombo communities from the Ribeira River Valley in the southern portion of the state of São Paulo, Brazil. Inbreeding levels were assessed through F-values estimated from the direct analysis of genealogical data and from the statistical analysis of a large set of 30 molecular markers. The levels of population substructure found were modest, as was the degree of inbreeding: in the set of all communities considered together, F-values were 0.00136 and 0.00248 when using raw and corrected data from their complete genealogical structures, respectively, and 0.022 and 0.036 when using the information taken from the statistical analysis of all 30 loci and of 14 single-nucleotide polymorphic loci, respectively. The overall frequency of consanguineous marriages in the set of all communities considered together was ~2%. Although modest, the values of the estimated parameters are much larger than those obtained for the overall Brazilian population and in general much smaller than the ones recorded for other Brazilian isolates. To circumvent problems related to heterogeneous sampling and virtual absence of reliable records of biological relationships, we had to develop or adapt several methods for making valid estimates of the prescribed parameters.
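Genealogical F-values are conventionally obtained with Wright's path-counting formula, F = Σ (1/2)^(n1+n2+1) · (1 + F_A), summed over the common ancestors of an individual's parents, where n1 and n2 are the numbers of generations from each parent to that ancestor. The abstract does not detail the adapted methods, so the sketch below shows only this textbook formula. The function names and the population composition are hypothetical.

```python
def path_inbreeding(paths):
    """Wright's path-counting inbreeding coefficient.

    paths: iterable of (n1, n2, f_ancestor), one entry per common ancestor,
    where n1 and n2 are the generations from each parent to that ancestor
    and f_ancestor is the ancestor's own inbreeding coefficient.
    """
    return sum(0.5 ** (n1 + n2 + 1) * (1 + f_a) for n1, n2, f_a in paths)


def mean_inbreeding(f_values):
    """Average F over all individuals, including non-inbred ones (F = 0)."""
    if not f_values:
        raise ValueError("need at least one individual")
    return sum(f_values) / len(f_values)


# Offspring of first cousins: the parents share two grandparents,
# each two generations above them, giving 2 * (1/2)^5 = 1/16.
f_first_cousins = path_inbreeding([(2, 2, 0.0), (2, 2, 0.0)])

# Hypothetical population: 2 of 100 individuals are offspring of first cousins.
population_f = mean_inbreeding([f_first_cousins] * 2 + [0.0] * 98)
```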
Subjects
Black People, Consanguinity, Phylogeny, Brazil/epidemiology, Gene Frequency, Genetic Variation, Population Genetics, Humans, Prevalence
ABSTRACT
The SNP-HLA Reference Consortium (SHLARC), a component of the 18th International HLA and Immunogenetics Workshop, is aimed at collecting diverse and extensive human leukocyte antigen (HLA) data to create custom reference panels and enhance HLA imputation techniques. Genome-wide association studies (GWAS) have significantly contributed to identifying genetic associations with various diseases. The HLA genomic region has emerged as the top locus in GWAS, particularly in immune-related disorders. However, the limited information provided by single nucleotide polymorphisms (SNPs), the hallmark of GWAS, poses challenges, especially in the HLA region, where strong linkage disequilibrium (LD) spans several megabases. HLA imputation techniques have been developed using statistical inference in response to these challenges. These techniques enable the prediction of HLA alleles from genotyped GWAS SNPs. Here we present the SHLARC activities, a collaborative effort to create extensive and multi-ethnic reference panels to enhance HLA imputation accuracy.
Subjects
Genome-Wide Association Study, Single Nucleotide Polymorphism, Humans, Immunogenetics, Alleles, HLA Antigens/genetics, Genotype
ABSTRACT
The MHC class I region contains crucial genes for the innate and adaptive immune response, playing a key role in susceptibility to many autoimmune and infectious diseases. Genome-wide association studies have identified numerous disease-associated SNPs within this region. However, these associations do not fully capture the immune-biological relevance of specific HLA alleles. HLA imputation techniques may leverage available SNP arrays by predicting allele genotypes based on the linkage disequilibrium between SNPs and specific HLA alleles. Successful imputation requires diverse and large reference panels, especially for admixed populations. This study employed a bioinformatics approach to call SNPs and HLA alleles in multi-ethnic samples from the 1000 Genomes (1KG) dataset and admixed individuals from Brazil (SABE), utilising 30X whole-genome sequencing data. Using HIBAG, we created three reference panels: 1KG (n = 2504), SABE (n = 1171), and the full model (n = 3675) encompassing all samples. In extensive cross-validation of these reference panels, the multi-ethnic 1KG reference performed better overall than the reference with only Brazilian samples. However, the best results were achieved with the full model. Additionally, we expanded the scope of imputation by developing reference panels for non-classical, MICA, MICB and HLA-H genes, previously unavailable for multi-ethnic populations. Validation in an independent Brazilian dataset showcased the superiority of our reference panels over the Michigan Imputation Server, particularly in predicting HLA-B alleles among Brazilians. Our investigations underscored the need to enhance or adapt reference panels to encompass the target population's genetic diversity, emphasising the significance of multiethnic references for accurate imputation across different populations.
Subjects
Alleles, Ethnicity, Gene Frequency, Single Nucleotide Polymorphism, Humans, Brazil, Ethnicity/genetics, HLA Antigens/genetics, Linkage Disequilibrium, Genome-Wide Association Study/methods, Genotype, Population Genetics/methods, Histocompatibility Antigens Class I/genetics, Computational Biology/methods
ABSTRACT
Despite evidence that at the interspecific scale, exonic splicing silencers (ESSs) are under negative selection in constitutive exons, little is known about the effects of slightly deleterious polymorphisms on these splicing regulators. Through the application of a modified version of the McDonald-Kreitman test, we compared the normalized proportions of human polymorphisms and human/rhesus substitutions affecting exonic splicing regulators (ESRs) on sequences of constitutive and alternative exons. Our results show a depletion of substitutions and an enrichment of SNPs associated with ESS gain in constitutive exons. Moreover, we show that this evolutionary pattern is also present in a set of ESRs previously involved in the transition from constitutive to skipped exons in the mammalian lineage. The similarity between these two sets of ESRs suggests that the transition from constitutive to skipped exons in mammals is more frequently associated with the inhibition than with the promotion of splicing signals. This is in accordance with the hypothesis of a constitutive origin of exon skipping and corroborates previous findings about the antagonistic role of certain exonic splicing enhancers.
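For orientation, the standard McDonald-Kreitman test contrasts polymorphism and divergence counts in two site classes using a 2x2 table, often summarised by the neutrality index NI = (Pn/Ps) / (Dn/Ds) and tested with Fisher's exact test. The abstract describes a modified version based on normalized proportions, which is not reproduced here. The sketch below shows only the standard form, with hypothetical counts and function names.

```python
from math import comb


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); NI > 1 suggests an excess of polymorphism."""
    if min(ps, dn, ds) <= 0:
        raise ValueError("Ps, Dn and Ds must be positive")
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical counts (not from the study): polymorphisms and substitutions
# in a focal class (n) versus a reference class (s) of sites.
pn, ps, dn, ds = 20, 10, 10, 10
ni = neutrality_index(pn, ps, dn, ds)
p_value = fisher_exact_two_sided(pn, ps, dn, ds)
```

An enrichment of polymorphisms together with a depletion of substitutions, the pattern reported for ESS gain in constitutive exons, corresponds to NI above 1 and is the usual signature of slightly deleterious variation.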
Subjects
Biological Evolution, Exons, Single Nucleotide Polymorphism, RNA Splicing, Nucleic Acid Regulatory Sequences, Genetic Selection, Animals, Genetic Enhancer Elements, Humans, Mammals/genetics, Genetic Models
ABSTRACT
With the availability of a large amount of genomic data it is expected that the influence of single nucleotide variations (SNVs) in many biological phenomena will be elucidated. Here, we approached the problem of how SNVs affect alternative splicing. First, we observed that SNVs and exonic splicing regulators (ESRs) independently show a biased distribution in alternative exons. More importantly, SNVs map more frequently in ESRs located in alternative exons than in ESRs located in constitutive exons. By looking at SNVs associated with alternative exon/intron borders (by their common presence in the same cDNA molecule), we observed that a specific type of ESR, the exonic splicing silencers (ESSs), are more frequently modified by SNVs. Our results establish a clear association between genetic diversity and alternative splicing involving ESSs.
Subjects
Alternative Splicing, Exons, Single Nucleotide Polymorphism, Ribonucleic Acid Regulatory Sequences, Humans, Introns
ABSTRACT
The identification of genomic regions and genes that have evolved under natural selection is a fundamental objective in the field of evolutionary genetics. While various approaches have been established for the detection of targets of positive selection, methods for identifying targets of balancing selection, a form of natural selection that preserves genetic and phenotypic diversity within populations, have yet to be fully developed. Despite this, balancing selection is increasingly acknowledged as a significant driver of diversity within populations, and the identification of its signatures in genomes is essential for understanding its role in evolution. In recent years, a plethora of sophisticated methods has been developed for the detection of patterns of linked variation produced by balancing selection, such as high levels of polymorphism, altered allele-frequency distributions, and polymorphism sharing across divergent populations. In this review, we provide a comprehensive overview of classical and contemporary methods, offer guidance on the choice of appropriate methods, and discuss the importance of avoiding artifacts and of considering alternative evolutionary processes. The increasing availability of genome-scale datasets holds the potential to assist in the identification of new targets and the quantification of the prevalence of balancing selection, thus enhancing our understanding of its role in natural populations.
Subjects
Genetic Variation, Genetic Polymorphism, Gene Frequency, Genome, Genetic Selection, Population Genetics
ABSTRACT
Genes involved in host-pathogen interactions are often strongly affected by positive natural selection. The Duffy antigen, coded by the Duffy antigen receptor for chemokines (DARC) gene, serves as a receptor for Plasmodium vivax in humans and for Plasmodium knowlesi in some nonhuman primates. In the majority of sub-Saharan Africans, a nucleic acid variant in GATA-1 of the gene promoter is responsible for the nonexpression of the Duffy antigen on red blood cells and consequently resistance to invasion by P. vivax. The Duffy antigen also acts as a receptor for chemokines and is expressed in red blood cells and many other tissues of the body. Because of this dual role, we sequenced a ~3,000-bp region encompassing the entire DARC gene as well as part of its 5' and 3' flanking regions in a phylogenetic sample of primates and used statistical methods to evaluate the nature of selection pressures acting on the gene during its evolution. We analyzed both coding and regulatory regions of the DARC gene. The regulatory analysis showed accelerated rates of substitution at several sites near known motifs. Our tests of positive selection in the coding region using maximum likelihood by branch sites and maximum likelihood by codon sites did not yield statistically significant evidence for the action of positive selection. However, the maximum likelihood test in which the gene was subdivided into different structural regions showed that the known binding region for P. vivax/P. knowlesi is under very different selective pressures than the remainder of the gene. In fact, most of the gene appears to be under strong purifying selection, but this is not evident in the binding region. We suggest that the binding region is under the influence of two opposing selective pressures, positive selection possibly exerted by the parasite and purifying selection exerted by chemokines.
Subjects
Disease Resistance/genetics, Duffy Blood-Group System/genetics, Molecular Evolution, Plasmodium vivax/pathogenicity, Primates/genetics, Cell Surface Receptors/genetics, Amino Acid Sequence, Animals, Binding Sites, Duffy Blood-Group System/metabolism, GATA1 Transcription Factor/metabolism, Humans, Malaria/genetics, Malaria/parasitology, Molecular Sequence Data, Phylogeny, Cell Surface Receptors/metabolism, Nucleic Acid Regulatory Sequences, Genetic Selection, Sequence Alignment
ABSTRACT
Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is a complex disorder with a worldwide incidence estimated at 1:700. Among the putative susceptibility loci, the IRF6 gene and a region at 8q24.21 have been corroborated in different populations. To test the role of IRF6 in NSCL/P predisposition in the Brazilian population, we conducted a structured association study with the SNPs rs642961 and rs590223, respectively, located at 5' and 3' of the IRF6 gene and not in strong linkage disequilibrium (LD), in patients from five different Brazilian locations. We also evaluated the effect of these SNPs in IRF6 expression in mesenchymal stem cells (MSC). We observed association between rs642961 and cleft lip only (CLO) (P=0.009; odds ratio (OR) for AA genotype=1.83 [95% Confidence interval (CI), 0.64-5.31]; OR for AG genotype=1.72 [95% CI, 1.03-2.84]). This association seems to be driven by the affected patients from Barbalha, a location which presents the highest heritability estimate (H2=0.85), and the A allele at rs642961 is acting through a dominant model. No association was detected for the SNP rs590223. We did not find any correlation between expression levels and genotypes of the two loci, and it is possible that these SNPs have a functional role in some specific period of embryogenesis.