ABSTRACT
Gene flow between previously differentiated populations during the founding of an admixed or hybrid population has the potential to introduce adaptive alleles into the new population. If the adaptive allele is common in one source population, but not the other, then as the adaptive allele rises in frequency in the admixed population, genetic ancestry from the source containing the adaptive allele will increase nearby as well. Patterns of genetic ancestry have therefore been used to identify post-admixture positive selection in humans and other animals, including examples in immunity, metabolism, and animal coloration. A common method identifies regions of the genome that have local ancestry "outliers" compared with the distribution across the rest of the genome, considering each locus independently. However, we lack theoretical models for expected distributions of ancestry under various demographic scenarios, resulting in potential false positives and false negatives. Further, ancestry patterns between distant sites are often not independent. As a result, current methods tend to infer wide genomic regions containing many genes as under selection, limiting biological interpretation. Instead, we develop a deep learning object detection method applied to images generated from local ancestry-painted genomes. This approach preserves information from the surrounding genomic context and avoids potential pitfalls of user-defined summary statistics. We find the method is robust to a variety of demographic misspecifications using simulated data. Applied to human genotype data from Cabo Verde, we localize a known adaptive locus to a single narrow region compared with multiple or long windows obtained using two other ancestry-based methods.
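The per-locus "outlier" approach that the abstract contrasts with its own method can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `ancestry_outliers`, the z-score cutoff, and the toy ancestry values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def ancestry_outliers(local_ancestry, z_cutoff=3.0):
    """Return indices of loci whose ancestry proportion deviates from the
    genome-wide mean by more than z_cutoff standard deviations.

    Each locus is scored independently, which is the simplification the
    abstract criticizes: neighboring loci are in fact correlated.
    """
    mu = mean(local_ancestry)
    sd = pstdev(local_ancestry)
    if sd == 0:
        return []
    return [i for i, a in enumerate(local_ancestry) if abs(a - mu) / sd > z_cutoff]


# Hypothetical data: 200 loci with ancestry near 0.60 and one elevated locus.
ancestry = [0.60 + 0.01 * ((i % 5) - 2) for i in range(200)]
ancestry[120] = 0.85

outliers = ancestry_outliers(ancestry)
print(outliers)
```

Because the cutoff is taken from the empirical genome-wide distribution rather than from a demographic model, the same scan could flag different loci under a different admixture history, which is one source of the false positives and false negatives mentioned above.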
Subjects
Population Genetics , Genomics , Animals , Humans , Genomics/methods , Genotype , Gene Flow , Chromosomes
ABSTRACT
Throughout human history, large-scale migrations have facilitated the formation of populations with ancestry from multiple previously separated populations. This process leads to subsequent shuffling of genetic ancestry through recombination, producing variation in ancestry between populations, among individuals in a population, and along the genome within an individual. Recent methodological and empirical developments have elucidated the genomic signatures of this admixture process, bringing previously understudied admixed populations to the forefront of population and medical genetics. Under this theme, we present a collection of recent PLOS Genetics publications that exemplify recent progress in human genetic admixture studies, and we discuss potential areas for future work.
Subjects
Genetic Variation , Population Genetics , Human Genetics , Genetic Models , Alleles , Human Genome , Geography , Haplotypes , Humans , Quantitative Trait Loci , Genetic Selection
ABSTRACT
We hypothesized that typical tissue and clinical chemistry (ClinChem) end points measured in rat toxicity studies exhibit chemical-independent biological thresholds beyond which cancer occurs. Using the rat in vivo TG-GATES study, 75 chemicals were examined across chemical-dose-time comparisons that could be linked to liver tumor outcomes. Thresholds for liver weight to body weight (LW/BW) and 21 serum ClinChem end points were defined as the maximum and minimum values for those exposures that did not lead to liver tumors in rats. Upper thresholds were identified for LW/BW (117%), aspartate aminotransferase (195%), alanine aminotransferase (141%), alkaline phosphatase (152%), and total bilirubin (115%), and lower thresholds were identified for phospholipids (82%), relative albumin (93%), total cholesterol (82%), and total protein (94%). Thresholds derived from the TG-GATES data set were consistent across other acute and subchronic rat studies. ClinChem and LW/BW thresholds derived from a 38-chemical training set from TG-GATES were predictive of liver tumor outcomes for a test set of 37 independent TG-GATES chemicals (91%). The thresholds were most predictive when applied to 7-day treatments (98%). These findings provide support that biological thresholds for common end points in rodent studies can be used to predict chemical tumorigenic potential.
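The threshold definition in this abstract (maximum and minimum end point values among exposures that did not lead to liver tumors) can be illustrated with a short sketch. The function names, the dictionary layout, and all numeric values below are hypothetical; the toy values were picked only so that the maxima coincide with the LW/BW and alanine aminotransferase (ALT) upper thresholds quoted above, and they are not data from the study.

```python
def derive_thresholds(exposures, endpoint):
    """Return (lower, upper) thresholds for one end point, taken as the
    min and max values (% of control) among non-tumorigenic exposures."""
    safe = [e[endpoint] for e in exposures if not e["tumor"]]
    return min(safe), max(safe)


def predict_tumor(profile, thresholds):
    """Flag a profile as tumorigenic if any end point falls outside its range."""
    return any(
        profile[ep] < lower or profile[ep] > upper
        for ep, (lower, upper) in thresholds.items()
    )


# Hypothetical training exposures (values are % of control, invented).
training = [
    {"LW/BW": 104.0, "ALT": 120.0, "tumor": False},
    {"LW/BW": 117.0, "ALT": 141.0, "tumor": False},
    {"LW/BW": 98.0, "ALT": 95.0, "tumor": False},
    {"LW/BW": 131.0, "ALT": 160.0, "tumor": True},
]

thresholds = {ep: derive_thresholds(training, ep) for ep in ("LW/BW", "ALT")}
flagged = predict_tumor({"LW/BW": 125.0, "ALT": 110.0}, thresholds)
not_flagged = predict_tumor({"LW/BW": 105.0, "ALT": 110.0}, thresholds)
print(thresholds, flagged, not_flagged)
```

In this sketch every end point gets both a lower and an upper bound, whereas the abstract reports only one direction per end point; a closer imitation would keep just the relevant bound for each.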
Subjects
Carcinogenesis , Liver Neoplasms , Alanine Transaminase , Animals , Aspartate Aminotransferases , Liver , Liver Neoplasms/chemically induced , Rats
ABSTRACT
Chromosomal inversions shape recombination landscapes, and species differing by inversions may exhibit reduced gene flow in these regions of the genome. Though single crossovers within inversions are not usually recovered from inversion heterozygotes, the recombination barrier imposed by inversions is nuanced by noncrossover gene conversion. Here, we provide a genomewide empirical analysis of gene conversion rates both within species and in species hybrids. We estimate that gene conversion occurs at a rate of 1 × 10⁻⁵ to 2.5 × 10⁻⁵ converted sites per bp per generation in experimental crosses within Drosophila pseudoobscura and between D. pseudoobscura and its naturally hybridizing sister species D. persimilis. This analysis is the first direct empirical assessment of gene conversion rates within inversions of a species hybrid. Our data show that gene conversion rates in interspecies hybrids are at least as high as within-species estimates of gene conversion rates, and gene conversion occurs regularly within and around inverted regions of species hybrids, even near inversion breakpoints. We also found that several gene conversion events appeared to be mitotic rather than meiotic in origin. Finally, we observed that gene conversion rates are higher in regions of lower local sequence divergence, yet our observed gene conversion rates in more divergent inverted regions were at least as high as in less divergent collinear regions. Given our observed high rates of gene conversion despite the sequence differentiation between species, especially in inverted regions, gene conversion has the potential to reduce the efficacy of inversions as barriers to recombination over evolutionary time.
Subjects
Chromosome Inversion/genetics , Molecular Evolution , Gene Conversion/genetics , Genetic Recombination , Animals , Drosophila/genetics , Female , Insect Genome/genetics , Heterozygote , Male , X Chromosome/genetics
ABSTRACT
With the large collections of gene and genome sequences, there is a need to generate curated comparative genomic databases that enable interpretation of results in an evolutionary context. Such resources can facilitate an understanding of the co-evolution of genes in the context of a genome mapped onto a phylogeny, of a protein structure, and of interactions within a pathway. A phylogenetically indexed gene family database, the adaptive evolution database (TAED), is presented that organizes gene families and their evolutionary histories in a species tree context. Gene families include alignments, phylogenetic trees, lineage-specific dN/dS ratios, reconciliation with the species tree to enable both the mapping and the identification of duplication events, mapping of gene families onto pathways, and mapping of amino acid substitutions onto protein structures. In addition to organization of the data, new phylogenetic visualization tools have been developed to aid in interpreting the data that are also available, including TreeThrasher and TAED Tree Viewer. A new resource of gene families organized by species and taxonomic lineage promises to be a valuable comparative genomics database for molecular biologists, evolutionary biologists, and ecologists. The new visualization tools and database framework will be of interest to both evolutionary biologists and bioinformaticians.
Subjects
Chordata/genetics , Genetic Databases , Molecular Evolution , Genomics/methods , Multigene Family , Animals , Phylogeny , DNA Sequence Analysis/methods , Software
ABSTRACT
Crossing over is well known to have profound effects on patterns of genetic diversity and genome evolution. Far less direct attention has been paid to another distinct outcome of meiotic recombination: noncrossover gene conversion (NCGC). Crossing over and NCGC both shuffle combinations of alleles, and this degradation of linkage disequilibrium (LD) has major evolutionary consequences, ranging from immediate effects on nucleotide diversity to long-term consequences that shape genome evolution, species formation and species persistence. Unlike simple crossing over, NCGC has the potential to alter allele frequencies. Gene conversion can also occur in genomic regions where crossing over does not, and it purportedly exhibits more uniform rates across genomes. Considerable progress has been made towards understanding the mechanisms of gene conversion, and this progress enables us to begin exploring how gene conversion affects processes such as molecular evolution and interspecies gene flow. These topics are timely with the recent shift in focus from a primarily neutral null model of molecular evolution and speciation to one incorporating base levels of selection, making it all the more crucial to understand the basis and evolutionary implications of linkage. Here, we discuss the impact of gene conversion on genome structure and evolution and the current methods for detecting these events. We provide a comprehensive review of how gene conversion breaks down LD and affects both short- and long-term evolutionary processes, and we contrast its impact to that expected from crossing over alone.
Subjects
Molecular Evolution , Gene Conversion , Genetic Speciation , Genetic Crossing Over , Genome , Linkage Disequilibrium , Genetic Recombination
ABSTRACT
The nuclear factor-kappa B (NF-κB) is a transcription factor with important roles in inflammation, immune response, and oncogenesis. Dysregulation of NF-κB signaling is associated with inflammation and certain cancers. We developed a gene expression biomarker predictive of NF-κB modulation and used the biomarker to screen a large compendium of gene expression data. The biomarker consists of 108 genes responsive to tumor necrosis factor α in the absence but not the presence of IκB, an inhibitor of NF-κB. Using a set of 450 profiles from cells treated with immunomodulatory factors with known NF-κB activity, the balanced accuracy for prediction of NF-κB activation was > 90%. The biomarker was used to screen a microarray compendium consisting of 12,061 microarray comparisons from human cells exposed to 2,672 individual chemicals to identify chemicals that could cause toxic effects through NF-κB. There were 215 and 49 chemicals that were identified as putative or known NF-κB activators or suppressors, respectively. NF-κB activators were also identified using two high-throughput screening assays; 165 out of the ~3,800 chemicals (ToxCast assay) and 55 out of ~7,500 unique compounds (Tox21 assay) were identified as potential activators. A set of 32 chemicals not previously associated with NF-κB activation and which partially overlapped between the different screens was selected for validation in wild-type and NFKB1-null HeLa cells. Using RT-qPCR and targeted RNA-Seq, 31 of the 32 chemicals were confirmed to be NF-κB activators. These results comprehensively identify a set of chemicals that could cause toxic effects through NF-κB.
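For readers unfamiliar with the metric, balanced accuracy is the mean of sensitivity and specificity, so it is not inflated when one class dominates the test set. The sketch below uses invented confusion-matrix counts purely for illustration; they are not the counts behind the > 90% figure reported above.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts: 50 true activators, 100 true non-activators.
score = balanced_accuracy(tp=45, fn=5, tn=90, fp=10)

# A classifier that never predicts activation looks good on plain accuracy
# (90 of 100 correct here) but scores only 0.5 on balanced accuracy.
trivial = balanced_accuracy(tp=0, fn=10, tn=90, fp=0)
print(score, trivial)
```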
Subjects
Biomarkers/metabolism , Gene Expression Regulation/genetics , NF-kappa B/metabolism , Cell Line , Chemical Databases , Gene Expression Regulation/drug effects , High-Throughput Screening Assays , Humans , I-kappa B Proteins/genetics , I-kappa B Proteins/metabolism , NF-kappa B/agonists , NF-kappa B/antagonists & inhibitors , NF-kappa B p50 Subunit/deficiency , NF-kappa B p50 Subunit/genetics , Small Molecule Libraries/chemistry , Small Molecule Libraries/pharmacology , Tumor Necrosis Factor-alpha/pharmacology
ABSTRACT
Genetic data can provide insights into population history, but first, we must understand the patterns that complex histories leave in genomes. Here, we consider the admixed human population of Cabo Verde to understand the patterns of genetic variation left by social and demographic processes. First settled in the late 1400s, Cabo Verdeans are admixed descendants of Portuguese colonizers and enslaved West African people. We consider Cabo Verde's well-studied historical record alongside genome-wide SNP data from 563 individuals from 4 regions within the archipelago. We use genetic ancestry to test for patterns of nonrandom mating and sex-specific gene flow, and we examine the consequences of these processes for common demographic inference methods and genetic patterns. Notably, multiple population genetic tools that assume random mating underestimate the timing of admixture, but incorporating nonrandom mating produces estimates more consistent with historical records. We consider how admixture interrupts common summaries of genomic variation such as runs of homozygosity. While summaries of runs of homozygosity may be difficult to interpret in admixed populations, differentiating runs of homozygosity by length class shows that runs of homozygosity reflect historical differences between the islands in their contributions from the source populations and postadmixture population dynamics. Finally, we find higher African ancestry on the X chromosome than on the autosomes, consistent with an excess of European males and African females contributing to the gene pool. Considering these genomic insights into population history in the context of Cabo Verde's historical record, we can identify how assumptions in genetic models impact inference of population history more broadly.
Subjects
Black People , Population Genetics , Black People/genetics , Cabo Verde , Demography , Female , Genetic Variation , Humans , Male
ABSTRACT
Population genetic analyses often use summary statistics to describe patterns of genetic variation and provide insight into evolutionary processes. Among the most fundamental of these summary statistics are π and dXY, which are used to describe genetic diversity within and between populations, respectively. Here, we address a widespread issue in π and dXY calculation: systematic bias generated by missing data of various types. Many popular methods for calculating π and dXY operate on data encoded in the variant call format (VCF), which condenses genetic data by omitting invariant sites. When calculating π and dXY using a VCF, it is often implicitly assumed that missing genotypes (including those at sites not represented in the VCF) are homozygous for the reference allele. Here, we show how this assumption can result in substantial downward bias in estimates of π and dXY that is directly proportional to the amount of missing data. We discuss the pervasive nature and importance of this problem in population genetics, and introduce a user-friendly UNIX command line utility, pixy, that solves this problem via an algorithm that generates unbiased estimates of π and dXY in the face of missing data. We compare pixy to existing methods using both simulated and empirical data, and show that pixy alone produces unbiased estimates of π and dXY regardless of the form or amount of missing data. In summary, our software solves a long-standing problem in applied population genetics and highlights the importance of properly accounting for missing data in population genetic analyses.
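The downward bias described in this abstract can be reproduced on a toy example. The sketch below is not pixy's code; it assumes one plausible missing-data-aware estimator (total pairwise differences divided by total pairwise comparisons, counting only called alleles) and compares it with the implicit "missing means reference" treatment. All function names and genotype values are hypothetical.

```python
def site_counts(alleles):
    """Return (differences, comparisons) among called alleles at one site.

    alleles: list with 0 (reference), 1 (alternate), or None (missing),
    one entry per sampled chromosome copy.
    """
    called = [a for a in alleles if a is not None]
    n = len(called)
    n_alt = sum(called)
    return n_alt * (n - n_alt), n * (n - 1) // 2


def pi_missing_aware(sites):
    """Pi as summed differences over summed comparisons, skipping missing calls."""
    diffs = comps = 0
    for alleles in sites:
        d, c = site_counts(alleles)
        diffs += d
        comps += c
    return diffs / comps if comps else float("nan")


def pi_missing_as_reference(sites):
    """Pi when every missing call is silently treated as the reference allele."""
    filled = [[0 if a is None else a for a in alleles] for alleles in sites]
    return pi_missing_aware(filled)


# Hypothetical sample of 4 chromosome copies at 4 sites.
sites = [
    [0, 1, 1, 0],              # fully called, variable
    [0, 0, 0, 0],              # fully called, invariant
    [1, None, None, 0],        # variable, half the calls missing
    [None, None, None, None],  # site with no data at all
]

pi_aware = pi_missing_aware(sites)
pi_naive = pi_missing_as_reference(sites)
print(pi_aware, pi_naive)
```

In this toy case the naive estimate is lower because the uncalled site still adds comparisons to the denominator and the partially called site is diluted with assumed reference alleles.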
Subjects
Population Genetics , Nucleotides , Algorithms , Alleles , Population Genetics/methods , Genotype , Nucleotides/genetics , Software
ABSTRACT
By shaping meiotic recombination, chromosomal inversions can influence genetic exchange between hybridizing species. Despite the recognized importance of inversions in evolutionary processes such as divergence and speciation, teasing apart the effects of inversions over time remains challenging. For example, are their effects on sequence divergence primarily generated through creating blocks of linkage disequilibrium prespeciation or through preventing gene flux after speciation? We provide a comprehensive look into the influence of inversions on gene flow throughout the evolutionary history of a classic system: Drosophila pseudoobscura and Drosophila persimilis. We use extensive whole-genome sequence data to report patterns of introgression and divergence with respect to chromosomal arrangements. Overall, we find evidence that inversions have contributed to divergence patterns between D. pseudoobscura and D. persimilis over three distinct timescales: (1) segregation of ancestral polymorphism early in the speciation process, (2) gene flow after the split of D. pseudoobscura and D. persimilis, but prior to the split of D. pseudoobscura subspecies, and (3) recent gene flow between sympatric D. pseudoobscura and D. persimilis, after the split of D. pseudoobscura subspecies. We discuss these results in terms of our understanding of evolution in this classic system and provide cautions for interpreting divergence measures in other systems.
Subjects
Chromosome Inversion , Drosophila , Animals , Chromosomes , Drosophila/genetics , Gene Flow , Genome
ABSTRACT
Humans have undergone large migrations over the past hundreds to thousands of years, exposing ourselves to new environments and selective pressures. Yet, evidence of ongoing or recent selection in humans is difficult to detect. Many of these migrations also resulted in gene flow between previously separated populations. These recently admixed populations provide unique opportunities to study rapid evolution in humans. Developing methods based on distributions of local ancestry, we demonstrate that this sort of genetic exchange has facilitated detectable adaptation to a malaria parasite in the admixed population of Cabo Verde within the last ~20 generations. We estimate that the selection coefficient is approximately 0.08, one of the highest inferred in humans. Notably, we show that this strong selection at a single locus has likely affected patterns of ancestry genome-wide, potentially biasing demographic inference. Our study provides evidence of adaptation in a human population on historical timescales.
Subjects
Physiological Adaptation/genetics , Gene Flow , Malaria/parasitology , Genetic Selection , Cabo Verde , Humans
ABSTRACT
Drosophila pseudoobscura is a classic model system for the study of evolutionary genetics and genomics. Given this long-standing interest, many genome sequences have accumulated for D. pseudoobscura and closely related species D. persimilis, D. miranda, and D. lowei. To facilitate the exploration of genetic variation within species and comparative genomics across species, we present PseudoBase, a database that couples extensive publicly available genomic data with simple visualization and query tools via an intuitive graphical interface, amenable for use in both research and educational settings. All genetic variation (SNPs and indels) within the database is derived from the same workflow, so variants are easily comparable across data sets. Features include an embedded JBrowse interface, ability to pull out alignments of individual genes/regions, and batch access for gene lists. Here, we introduce PseudoBase, and we demonstrate how this resource facilitates use of extensive genomic data from flies of the Drosophila pseudoobscura subgroup.
Subjects
Genetic Databases , Drosophila/classification , Drosophila/genetics , Genomics , Animals , Genome , Species Specificity
ABSTRACT
Genetic studies of secondary sexual traits provide insights into whether and how selection drove their divergence among populations, and these studies often focus on the fraction of variation attributable to genes on the X-chromosome. However, such studies may sometimes misinterpret the amount of variation attributable to the X-chromosome if using only simple reciprocal F1 crosses, or they may presume sexual selection has affected the observed phenotypic variation. We examined the genetics of a secondary sexual trait, male sex comb size, in Drosophila subobscura. This species bears unusually large sex combs for its species group, and therefore, this trait may be a good candidate for having been affected by natural or sexual selection. We observed significant heritable variation in number of teeth of the distal sex comb across strains. While reciprocal F1 crosses seemed to implicate a disproportionate X-chromosome effect, further examination in the F2 progeny showed that transgressive autosomal effects inflated the estimate of variation associated with the X-chromosome in the F1. Instead, the X-chromosome appears to confer the smallest contribution of all major chromosomes to the observed phenotypic variation. Further, we failed to detect effects on copulation latency or duration associated with the observed phenotypic variation. Overall, this study presents an examination of the genetics underlying segregating phenotypic variation within species and illustrates two common pitfalls associated with some past studies of the genetic basis of secondary sexual traits.