Abstract
Primate genomics holds the key to understanding fundamental aspects of human evolution and disease. However, genetic diversity and functional genomics data sets are currently available for only a few of the more than 500 extant primate species. Concerted efforts are under way to characterize primate genomes, genetic polymorphism and divergence, and functional landscapes across the primate phylogeny. The resulting data sets will enable the connection of genotypes to phenotypes and provide new insight into aspects of the genetics of primate traits, including human diseases. In this Review, we describe the existing genome assemblies as well as genetic variation and functional genomic data sets. We highlight some of the challenges with sample acquisition. Finally, we explore how technological advances in single-cell functional genomics and induced pluripotent stem cell-derived organoids will facilitate our understanding of the molecular foundations of primate biology.
Subjects
Molecular Evolution, Genomics, Animals, Humans, Genomics/methods, Primates/genetics, Genome, Phylogeny, Genetic Variation
Abstract
The relative importance of genetic drift and local adaptation in facilitating speciation remains unclear. This is particularly true for seabirds, which can disperse over large geographic distances, providing opportunities for intermittent gene flow among distant colonies that span the temperature and salinity gradients of the oceans. Here, we delve into the genomic basis of adaptation and speciation of banded penguins, Galápagos (Spheniscus mendiculus), Humboldt (Spheniscus humboldti), Magellanic (Spheniscus magellanicus), and African penguins (Spheniscus demersus), by analyzing 114 genomes from the main 16 breeding colonies. We aim to identify the molecular mechanism and genomic adaptive traits that have facilitated their diversifications. Through positive selection and gene family expansion analyses, we identified candidate genes that may be related to reproductive isolation processes mediated by ecological thermal niche divergence. We recover signals of positive selection on key loci associated with spermatogenesis, especially during the recent peripatric divergence of the Galápagos penguin from the Humboldt penguin. High temperatures in tropical habitats may have favored selection on loci associated with spermatogenesis to maintain sperm viability, leading to reproductive isolation among young species. Our results suggest that genome-wide selection on loci associated with molecular pathways that underpin thermoregulation, osmoregulation, hypoxia, and social behavior appears to have been crucial in local adaptation of banded penguins. Overall, these results contribute to our understanding of how the complexity of biotic, but especially abiotic, factors, along with the high dispersal capabilities of these marine species, may promote both neutral and adaptive lineage divergence even in the presence of gene flow.
Subjects
Genetic Selection, Spheniscidae, Animals, Spheniscidae/genetics, Genomics, Genetic Speciation, Gene Flow, Genome, Reproductive Isolation
Abstract
Hibernation in bears involves a suite of metabolic and physiological changes, including the onset of insulin resistance, that are driven in part by sweeping changes in gene expression in multiple tissues. Feeding bears glucose during hibernation partially restores active season physiological phenotypes, including partial resensitization to insulin, but the molecular mechanisms underlying this transition remain poorly understood. Here, we analyze tissue-level gene expression in adipose, liver, and muscle to identify genes that respond to midhibernation glucose feeding and thus potentially drive postfeeding metabolic and physiological shifts. We show that midhibernation feeding stimulates differential expression in all analyzed tissues of hibernating bears and that a subset of these genes responds specifically by shifting expression toward levels typical of the active season. Inferences of upstream regulatory molecules potentially driving these postfeeding responses implicate peroxisome proliferator-activated receptor gamma (PPARG) and other known regulators of insulin sensitivity, providing new insight into high-level regulatory mechanisms involved in shifting metabolic phenotypes between hibernation and active states.
Subjects
Hibernation, Insulin Resistance, Ursidae, Animals, Ursidae/genetics, Ursidae/metabolism, Hibernation/genetics, Seasons, Glucose/metabolism, Insulin Resistance/genetics, Gene Expression
Abstract
Predicting the potential fate of a species in the face of climate change requires knowing the distribution of molecular adaptations across the geographic range of the species. In this work, we analysed 79 genomes of Theobroma cacao, an Amazonian tree known for the fruit from which chocolate is produced, to evaluate how local and regional molecular signatures of adaptation are distributed across the natural range of the species. We implemented novel techniques that incorporate summary statistics from multiple selection scans to infer selective sweeps. The majority of the molecular adaptations in the genome are not shared among populations. We show that ~71.5% of genes under selection also show significant associations with changes in environmental variables. Our results support the interpretation that these genes contribute to local adaptation of the populations in response to abiotic factors. We also found strong patterns of molecular adaptation in a diverse array of disease resistance genes (6.5% of selective sweeps), suggesting that differential adaptation to pathogens also contributes significantly to local adaptations. Our results are consistent with the interpretation that local selective pressures are more important than regional selective pressures in explaining adaptation across the range of a species.
Subjects
Cacao, Chocolate, Acclimatization, Cacao/genetics, Climate Change, Genetic Selection, Trees
Abstract
BACKGROUND: Recombination plays an important evolutionary role by breaking up haplotypes and shuffling genetic variation. This process impacts the ability of selection to eliminate deleterious mutations or increase the frequency of beneficial mutations in a population. To understand the role of recombination in generating and maintaining haplotypic variation in a population, we can construct fine-scale recombination maps. Such maps have been used to study a variety of model organisms and have proven informative about how selection and demographics shape species-wide variation. Here we present a fine-scale recombination map for ten populations of Theobroma cacao, a non-model, long-lived, woody crop. We use this map to elucidate the dynamics of recombination rates in distinct populations of the same species, one of which is domesticated. RESULTS: Mean recombination rates range between 2.5 and 8.6 cM/Mb for most populations of T. cacao, with the exception of the domesticated Criollo (525 cM/Mb) and Guianna, a more recently established population (46.5 cM/Mb). We found little overlap in the location of hotspots of recombination across populations. We also found that hotspot regions contained fewer known retroelement sequences than expected and were overrepresented near transcription start and termination sites. We find mutations in FIGL-1, a protein shown to downregulate cross-over frequency in Arabidopsis, statistically associated with higher recombination rates in domesticated Criollo. CONCLUSIONS: We generated fine-scale recombination maps for ten populations of Theobroma cacao and used them to understand what processes are associated with population-level variation in this species. Our results provide support for the hypothesis of increased recombination rates in domesticated plants (Criollo population). We propose a testable mechanistic hypothesis for the change in recombination rate in domesticated populations in the form of mutations to a previously identified recombination-suppressing protein. Finally, we establish a number of possible correlates of recombination hotspots that help explain general patterns of recombination in this species.
Subjects
Cacao/genetics, Genetic Variation, Genetic Recombination, Domestication, Molecular Evolution, Population Genetics, Plant Genome, Linkage Disequilibrium, Genetic Models, Mutation, Nucleotide Motifs, Plant Proteins/genetics
Abstract
Aneuploidy is prevalent in human embryos and is the leading cause of pregnancy loss. Many aneuploidies arise during oogenesis, increasing with maternal age. Superimposed on these meiotic aneuploidies are frequent errors occurring during early mitotic divisions, contributing to widespread chromosomal mosaicism. Here we reanalyzed a published dataset comprising preimplantation genetic testing for aneuploidy in 24 653 blastomere biopsies from day-3 cleavage-stage embryos, as well as 17 051 trophectoderm biopsies from day-5 blastocysts. We focused on complex abnormalities that affected multiple chromosomes simultaneously, seeking insights into their formation. In addition to well-described patterns such as triploidy and haploidy, we identified 4.7% of blastomeres possessing characteristic hypodiploid karyotypes. We inferred this signature to have arisen from tripolar chromosome segregation in normally fertilized diploid zygotes or their descendant diploid cells. This could occur via segregation on a tripolar mitotic spindle or by rapid sequential bipolar mitoses without an intervening S-phase. Both models are consistent with time-lapse data from an intersecting set of 77 cleavage-stage embryos, which were enriched for the tripolar signature among embryos exhibiting abnormal cleavage. The tripolar signature was strongly associated with common maternal genetic variants spanning the centrosomal regulator PLK4, driving the association we previously reported with overall mitotic errors. Our findings are consistent with the known capacity of PLK4 to induce tripolar mitosis or precocious M-phase upon dysregulation. Together, our data support tripolar chromosome segregation as a key mechanism generating complex aneuploidy in cleavage-stage embryos and implicate maternal genotype at a quantitative trait locus spanning PLK4 as a factor influencing its occurrence.
Subjects
Aneuploidy, Oogenesis/genetics, Protein Serine-Threonine Kinases/genetics, Spindle Apparatus/genetics, Adolescent, Adult, Blastocyst/pathology, Blastomeres/pathology, Chromosome Segregation/genetics, Female, Genetic Testing, Genetic Variation, Genotype, Humans, Karyotype, Maternal Age, Middle Aged, Mitosis/genetics, Pregnancy, Spindle Apparatus/pathology
Abstract
Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 ¹⁴C years before present (bp) (13,000 to 12,600 calendar years bp). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North American ice sheets from an ancestral technology. However, both the origins and the genetic legacy of the people who manufactured Clovis tools remain under debate. It is generally believed that these people ultimately derived from Asia and were directly related to contemporary Native Americans. An alternative, Solutrean, hypothesis posits that the Clovis predecessors emigrated from southwestern Europe during the Last Glacial Maximum. Here we report the genome sequence of a male infant (Anzick-1) recovered from the Anzick burial site in western Montana. The human bones date to 10,705 ± 35 ¹⁴C years bp (approximately 12,707-12,556 calendar years bp) and were directly associated with Clovis tools. We sequenced the genome to an average depth of 14.4× and show that the gene flow from the Siberian Upper Palaeolithic Mal'ta population into Native American ancestors is also shared by the Anzick-1 individual and thus happened before 12,600 years bp. We also show that the Anzick-1 individual is more closely related to all indigenous American populations than to any other group. Our data are compatible with the hypothesis that Anzick-1 belonged to a population directly ancestral to many contemporary Native Americans. Finally, we find evidence of a deep divergence in Native American populations that predates the Anzick-1 individual.
Subjects
Human Genome/genetics, North American Indians/genetics, Phylogeny, Archaeology, Asia/ethnology, Bone and Bones, Burial, Human Y Chromosomes/genetics, Mitochondrial DNA/genetics, Emigration and Immigration/history, Europe/ethnology, Gene Flow/genetics, Haplotypes/genetics, Ancient History, Humans, Infant, Male, Genetic Models, Molecular Sequence Data, Montana, Population Dynamics, Radiometric Dating
Abstract
The human DARC (Duffy antigen receptor for chemokines) gene encodes a membrane-bound chemokine receptor crucial for the infection of red blood cells by Plasmodium vivax, a major causative agent of malaria. Of the three major allelic classes segregating in human populations, the FY*O allele has been shown to protect against P. vivax infection and is at near fixation in sub-Saharan Africa, while FY*B and FY*A are common in Europe and Asia, respectively. Due to the combination of strong geographic differentiation and association with malaria resistance, DARC is considered a canonical example of positive selection in humans. Despite this, details of the timing and mode of selection at DARC remain poorly understood. Here, we use sequencing data from over 1,000 individuals in twenty-one human populations, as well as ancient human genomes, to perform a fine-scale investigation of the evolutionary history of DARC. We estimate the time to most recent common ancestor (TMRCA) of the most common FY*O haplotype to be 42 kya (95% CI: 34-49 kya). We infer the FY*O null mutation swept to fixation in Africa from standing variation with very low initial frequency (0.1%) and a selection coefficient of 0.043 (95% CI: 0.011-0.18), which is among the strongest estimated in the human genome. We estimate the TMRCA of the FY*A mutation in non-Africans to be 57 kya (95% CI: 48-65 kya) and infer that, prior to the sweep of FY*O, all three alleles were segregating in Africa, as highly diverged populations from Asia and ǂKhomani San hunter-gatherers share the same FY*A haplotypes. We test multiple models of admixture that may account for this observation and reject recent Asian or European admixture as the cause.
Subjects
Disease Resistance/genetics, Duffy Blood-Group System/genetics, Population Genetics, Vivax Malaria/genetics, Cell Surface Receptors/genetics, Africa, Alleles, Animals, Asia, Duffy Blood-Group System/metabolism, Gene Frequency, Human Genome, Geography, Gorilla gorilla, Haplotypes, Humans, Mutation, Pan paniscus, Pan troglodytes, Single Nucleotide Polymorphism, Pongo, Genetic Promoter Regions, Cell Surface Receptors/metabolism
Abstract
BACKGROUND: Carbonic anhydrase (CA) catalyzes the hydration of CO2 in the first biochemical step of C4 photosynthesis, and has been considered a potentially rate-limiting step when CO2 availability within a leaf is low. Previous work in Zea mays (maize) with a double knockout of the two highest-expressed β-CA genes, CA1 and CA2, reduced total leaf CA activity to less than 3% of wild-type. Surprisingly, this did not limit photosynthesis in maize at ambient or higher CO2 concentrations. However, the ca1ca2 mutants exhibited reduced rates of photosynthesis at sub-ambient CO2, and accumulated less biomass when grown under sub-ambient CO2 (9.2 Pa). To further clarify the importance of CA for C4 photosynthesis, we assessed gene expression changes in wild-type, ca1 and ca1ca2 mutants in response to changes in pCO2 from 920 to 9.2 Pa. RESULTS: Leaf samples from each genotype were collected for RNA-seq analysis at high CO2 and at two time points after the low CO2 transition, in order to identify early and longer-term responses to CO2 deprivation. Despite the existence of multiple isoforms of CA, no other CA genes were upregulated in CA mutants. Although photosynthetic genes were downregulated in response to low CO2, differential expression was not observed between genotypes. However, multiple indicators of carbon starvation were present in the mutants, including amino acid synthesis, carbohydrate metabolism, and sugar signaling. In particular, multiple genes previously implicated in low carbon stress such as asparagine synthetase, amino acid transporters, trehalose-6-phosphate synthase, as well as many transcription factors, were strongly upregulated. Furthermore, genes in the CO2 stomatal signaling pathway were differentially expressed in the CA mutants under low CO2. CONCLUSIONS: Using a transcriptomic approach, we showed that carbonic anhydrase mutants do not compensate for the lack of CA activity by upregulating other CA or photosynthetic genes, but rather experienced extreme carbon stress when grown under low CO2. Our results also support a role for CA in the CO2 stomatal signaling pathway. This study provides insight into the importance of CA for C4 photosynthesis and its role in stomatal signaling.
Subjects
Carbon Dioxide/metabolism, Carbonic Anhydrases/genetics, Plant Genes, Photosynthesis/genetics, Plant Stomata/metabolism, Zea mays/enzymology, Zea mays/genetics, Alleles, Aquaporins/metabolism, Base Sequence, Carbohydrate Metabolism, Carbonic Anhydrases/physiology, Cell Wall/metabolism, Gene Expression Profiling, Plant Gene Expression Regulation, Gene Knockout Techniques, Genotype, Isoenzymes/genetics, Isoenzymes/physiology, Nitric Oxide/metabolism, Plant Leaves/metabolism, Nucleic Acid Sequence Homology, Signal Transduction
Abstract
BACKGROUND: The genus Elaeis has two species of economic importance for the oil palm agroindustry: Elaeis oleifera (O), native to the Americas, and Elaeis guineensis (G), native to Africa. This work provides, to our knowledge, the first association mapping study in an interspecific OxG oil palm population, which shows tolerance to pests and diseases, high oil quality, and acceptable fruit bunch production. RESULTS: Using genotyping-by-sequencing (GBS), we identified a total of 3776 single nucleotide polymorphisms (SNPs) that were used to perform a genome-wide association analysis (GWAS) in a population of 378 OxG hybrids for 10 agronomic traits. Twelve genomic regions (SNPs) were located near candidate genes implicated in multiple functional categories, such as tissue growth, cellular trafficking, and physiological processes. CONCLUSIONS: We provide new insights on genomic regions that mapped on candidate genes involved in plant architecture and yield. These potential candidate genes need to be confirmed for future targeted functional analyses. Markers associated with the traits of interest may be valuable resources for the development of marker-assisted selection in oil palm breeding.
Subjects
Arecaceae/genetics, Crop Production, Agricultural Crops/genetics, Genotype, Arecaceae/anatomy & histology, Arecaceae/physiology, Agricultural Crops/anatomy & histology, Agricultural Crops/physiology, Genome-Wide Association Study, Genetic Hybridization, Plant Breeding
Abstract
Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
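For readers unfamiliar with the baseline filter mentioned above, the following is a minimal sketch of a Hardy-Weinberg equilibrium (HWE) p-value filter. It is not GBStools and does not reproduce its statistical model; the function names, the chi-square form of the test, and the threshold `alpha` are all assumptions chosen for illustration. The rationale is that allele dropout at a polymorphic restriction site makes heterozygotes look like homozygotes, so affected sites tend to show a heterozygote deficit relative to HWE expectations.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 d.f.) p-value for departure from Hardy-Weinberg proportions.

    Illustrative helper, not part of GBStools. Inputs are observed counts of
    the three genotypes at one biallelic site.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed at this site")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sites give chi2 = 0).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe_filter(n_aa: int, n_ab: int, n_bb: int, alpha: float = 1e-3) -> bool:
    """Keep a site unless HWE is rejected at the (assumed) threshold alpha."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= alpha


if __name__ == "__main__":
    print(passes_hwe_filter(25, 50, 25))  # genotype counts match HWE exactly
    print(passes_hwe_filter(50, 0, 50))   # no heterozygotes at all
```

The chi-square approximation is unreliable when expected counts are small, so an exact test would usually be preferred for rare variants or small samples.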
Subjects
Alleles, Genotyping Techniques, High-Throughput Nucleotide Sequencing/methods, Software, Statistics as Topic, Population Genetics, Humans, Single Nucleotide Polymorphism/genetics
Abstract
Puccinia striiformis f. sp. tritici causes devastating stripe (yellow) rust on wheat and P. striiformis f. sp. hordei causes stripe rust on barley. Several P. striiformis f. sp. tritici genomes are available, but no P. striiformis f. sp. hordei genome is available. More genomes of P. striiformis f. sp. tritici and P. striiformis f. sp. hordei are needed to understand the genome evolution and molecular mechanisms of their pathogenicity. We sequenced P. striiformis f. sp. tritici isolate 93-210 and P. striiformis f. sp. hordei isolate 93TX-2, using PacBio and Illumina technologies and RNA sequencing. Their genomic sequences were assembled to contigs with high continuity and showed significant structural differences. The circular mitochondria genomes of both were complete. These genomes provide high-quality resources for deciphering the genomic basis of rapid evolution and host adaptation, identifying genes for avirulence and other important traits, and studying host-pathogen interactions.
Subjects
Basidiomycota/genetics, Fungal Genome/genetics, Genomics, Hordeum/microbiology, Plant Diseases/microbiology, Triticum/microbiology, Genotype, High-Throughput Nucleotide Sequencing, Phenotype, RNA Sequence Analysis
Abstract
BACKGROUND: Plant fungal pathogens can rapidly evolve and adapt to new environmental conditions in response to sudden changes of host populations in agro-ecosystems. However, the genomic basis of their host adaptation, especially at the forma specialis level, remains unclear. RESULTS: We sequenced two isolates each representing Puccinia striiformis f. sp. tritici (Pst) and P. striiformis f. sp. hordei (Psh), different formae speciales of the stripe rust fungus P. striiformis highly adapted to wheat and barley, respectively. The divergence of Pst and Psh, estimated to have started 8.12 million years ago, has been driven by high nucleotide mutation rates. The high genomic variation within dikaryotic urediniospores of P. striiformis has provided raw genetic material for genome evolution. No specific gene families were enriched in either isolate, but extensive gene loss events have occurred in both Pst and Psh after the divergence from their most recent common ancestor. A large number of isolate-specific genes were identified, with unique genomic features compared to the conserved genes: they were 1) significantly shorter in length; 2) significantly less expressed; 3) significantly closer to transposable elements; and 4) redundant in pathways. The presence of specific genes in one isolate (or forma specialis) resulted from the loss of the homologues in the other isolate (or forma specialis) through replacement by transposable elements or loss of genomic fragments. In addition, different patterns and numbers of telomeric repeats were observed between the isolates. CONCLUSIONS: Host adaptation of P. striiformis at the forma specialis level is a complex pathogenic trait, involving not only virulence-related genes but also other genes. Gene loss, which might be adaptive and driven by transposable element activities, provides a genomic basis for host adaptation of different formae speciales of P. striiformis.
Subjects
Physiological Adaptation/genetics, Basidiomycota/genetics, Basidiomycota/physiology, Genomics, Host-Pathogen Interactions/genetics, Plant Diseases/microbiology, Molecular Evolution, Hordeum/microbiology, Repetitive Nucleic Acid Sequences/genetics, Telomere/genetics, Triticum/microbiology
Abstract
Frosty pod rot (FPR) disease on cocoa, caused by Moniliophthora roreri, is one of the most devastating cocoa diseases in the Western Hemisphere. In Colombia, the disease is particularly severe in the Magdalena Valley, which is considered the possible center of origin for the pathogen species. We analyzed the genetic diversity of isolates from the departments of Santander, Antioquia, Tolima, and Huila in Colombia using 23 simple-sequence repeat (SSR) markers. In total, 117 different multilocus genotypes were found among 120 isolates, each one representing a unique haplotype. High mutation rates in the SSR and gene flow can explain the high levels of diversity. Also, the observed and standardized indexes of association (IA and rd) indicate that the populations of M. roreri are clonal. Furthermore, given the high haplotype diversity and the significant linkage disequilibrium observed, we hypothesize that M. roreri could be a primarily asexual species undergoing sporadic recombination or partial recombination through parasexuality. A Bayesian clustering analysis implemented in STRUCTURE showed that the most probable number of genetic groups in the data was three, confirming the geographical differentiation among isolates. Similar results were obtained by a discriminant analysis of principal components, a principal coordinate analysis, and a neighbor-joining tree from microsatellite loci based on Nei's distance. Cacao genotypes and environmental variables did contribute to the genetic differentiation of the groups. We discuss how this information could be used to improve the management of FPR at the regional level.
Abstract
Mycobacterium tuberculosis (M.tb), the cause of tuberculosis (TB), is estimated to infect a new host every second. While analyses of genetic data from natural populations of M.tb have emphasized the role of genetic drift in shaping patterns of diversity, the influence of natural selection on this successful pathogen is less well understood. We investigated the effects of natural selection on patterns of diversity in 63 globally extant genomes of M.tb and related pathogenic mycobacteria. We found evidence of strong purifying selection, with an estimated genome-wide selection coefficient equal to -9.5 × 10⁻⁴ (95% CI -1.1 × 10⁻³ to -6.8 × 10⁻⁴); this is several orders of magnitude higher than recent estimates for eukaryotic and prokaryotic organisms. We also identified different patterns of variation across categories of gene function. Genes involved in transport and metabolism of inorganic ions exhibited very low levels of non-synonymous polymorphism, equivalent to categories under strong purifying selection (essential and translation-associated genes). The highest levels of non-synonymous variation were seen in a group of transporter genes, likely due to either diversifying selection or local selective sweeps. In addition to selection, we identified other important influences on M.tb genetic diversity, such as a 25-fold expansion of global M.tb populations coincident with explosive growth in human populations (estimated timing 1684 C.E., 95% CI 1620-1713 C.E.). These results emphasize the parallel demographic histories of this obligate pathogen and its human host, and suggest that the dominant effect of selection on M.tb is removal of novel variants, with exceptions in an interesting group of genes involved in transportation and defense. We speculate that the hostile environment within a host imposes strict demands on M.tb physiology, and thus a substantial fitness cost for most new mutations. In this respect, obligate bacterial pathogens may differ from other host-associated microbes such as symbionts.
Subjects
Molecular Evolution, Mycobacterium tuberculosis/genetics, Genetic Polymorphism/genetics, Genetic Selection/genetics, Tuberculosis/microbiology, Bacterial Genome, Humans, Mycobacterium tuberculosis/classification, Phylogeny, Genetic Recombination, Tuberculosis/genetics
Abstract
Streptococcus mutans is widely recognized as one of the key etiological agents of human dental caries. Despite its role in this important disease, our present knowledge of gene content variability across the species and its relationship to adaptation is minimal. Estimates of its demographic history are not available. In this study, we generated genome sequences of 57 S. mutans isolates, as well as representative strains of the most closely related species to S. mutans (S. ratti, S. macacae, and S. criceti), to identify the overall structure and potential adaptive features of the dispensable and core components of the genome. We also performed population genetic analyses on the core genome of the species aimed at understanding its demographic history and the impact of selection in shaping its genetic variation. The maximum gene content divergence among strains was approximately 23%, with the majority of strains diverging by 5-15%. The core genome consisted of 1,490 genes and the pan-genome approximately 3,296. Maximum likelihood analysis of the synonymous site frequency spectrum (SFS) suggested that the S. mutans population started expanding exponentially approximately 10,000 years ago (95% confidence interval [CI]: 3,268-14,344 years ago), coincidental with the onset of human agriculture. Analysis of the replacement SFS indicated that a majority of these substitutions are under strong negative selection, and the remainder evolved neutrally. A set of 14 genes was identified as being under positive selection, most of which were involved in either sugar metabolism or acid tolerance. Analysis of the core genome suggested that among 73 genes present in all isolates of S. mutans but absent in other species of the mutans taxonomic group, the majority can be associated with metabolic processes that could have contributed to the successful adaptation of S. mutans to its new niche, the human mouth, and with the dietary changes that accompanied the origin of agriculture.
Subjects
Molecular Evolution, Metagenomics, Streptococcus mutans/genetics, Biological Adaptation/genetics, Carbohydrate Metabolism/genetics, Dental Caries/microbiology, Gene Frequency, Bacterial Genome, Humans, Likelihood Functions, Linkage Disequilibrium, Genetic Models, Single Nucleotide Polymorphism, Genetic Recombination, Genetic Selection
Abstract
Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.
Subjects
DNA Mutational Analysis/methods , Synthetic Genes , Genetic Variation , Genome-Wide Association Study/methods , Thrombophilia/genetics , Alleles , Base Sequence , Female , Genetic Predisposition to Disease , Human Genome , Genotype , Haplotypes , Humans , Male , Pedigree , Reference Standards , Risk Assessment , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
Malaria parasites are known to infect a variety of vertebrate hosts, including ungulates. However, ungulates of Amazonia have not been investigated. We report, for the first time, the presence of parasite lineages closely related to Plasmodium odocoilei clade 1 and clade 2 in free-ranging South American red-brocket deer (Mazama americana; 44.4%, 4/9) and gray-brocket deer (Mazama nemorivaga; 50.0%, 1/2). We performed PCR-based analysis of blood samples from 47 ungulates of five different species collected during subsistence hunting by an indigenous community in the Peruvian Amazon. We detected a Plasmodium malariae/brasilianum lineage in a sample from a red-brocket deer. However, no parasite DNA was detected in collared peccary (Pecari tajacu; 0.0%, 0/10), white-lipped peccary (Tayassu pecari; 0.0%, 0/15), or tapir (Tapirus terrestris; 0.0%, 0/11). Concordant phylogenetic analyses suggested a possible co-evolutionary relationship between the Plasmodium lineages found in American deer and their hosts.
Subjects
Deer , Plasmodium , Animals , Phylogeny , Peru/epidemiology , Plasmodium/genetics , Perissodactyla
ABSTRACT
BACKGROUND: We address the task of extracting accurate haplotypes from genotype data of individuals of large F1 populations for mapping studies. While methods for inferring parental haplotype assignments on large F1 populations exist in theory, these approaches do not work in practice at high levels of accuracy. RESULTS: We have designed iXora (Identifying crossovers and recombining alleles), a robust method for extracting reliable haplotypes of a mapping population, as well as parental haplotypes, that runs in linear time. Each allele in the progeny is assigned not just to a parent, but more precisely to a haplotype inherited from the parent. iXora shows an improvement of at least 15% in accuracy over similar systems in the literature. Furthermore, iXora provides an easy-to-use, comprehensive environment for association studies and hypothesis checking in populations of related individuals. CONCLUSIONS: iXora provides detailed resolution in parental inheritance, along with the capability of handling very large populations, which allows for accurate haplotype extraction and trait association. iXora is available for non-commercial use from http://researcher.ibm.com/project/3430.
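Once each progeny allele has been assigned to one of a parent's two haplotypes, crossovers show up as switches in that assignment along the chromosome. The sketch below illustrates only this final step with a single linear pass; it is not the iXora algorithm, whose internals the abstract does not describe. The 0/1 haplotype labels, the use of None for uninformative markers, and the function name are assumptions made for illustration.

```python
def crossover_intervals(positions, hap_labels):
    """Return (left, right) marker positions that bracket each haplotype switch.

    positions:  marker coordinates in ascending order
    hap_labels: inherited parental haplotype at each marker (0 or 1),
                or None where the marker is uninformative
    """
    intervals = []
    last_pos, last_hap = None, None
    for pos, hap in zip(positions, hap_labels):
        if hap is None:
            continue  # skip markers that cannot be assigned
        if last_hap is not None and hap != last_hap:
            # The crossover lies somewhere between the two informative markers.
            intervals.append((last_pos, pos))
        last_pos, last_hap = pos, hap
    return intervals


# Toy progeny: the inherited haplotype switches twice along the chromosome.
positions = [100, 250, 400, 900, 1500, 2100]
labels = [0, 0, None, 1, 1, 0]
found = crossover_intervals(positions, labels)
```

Real genotype data contain errors, so a practical method would need to smooth isolated label flips before counting them as crossovers.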
Subjects
Haplotypes , Quantitative Trait Loci , Genetic Crossing Over , Humans , Genetic Recombination
ABSTRACT
OBJECTIVES: Complex physiological adaptations often involve the coordination of molecular responses across multiple tissues. Establishing transcriptomic resources for non-traditional model organisms with phenotypes of interest can provide a foundation for understanding the genomic basis of these phenotypes, and the degree to which these resemble, or contrast with, those of traditional model organisms. Here, we present a one-of-a-kind gene expression dataset generated from multiple tissues of two hibernating brown bears (Ursus arctos). DATA DESCRIPTION: This dataset comprises 26 samples collected from 13 tissues of two hibernating brown bears. These samples were collected opportunistically and are typically not possible to obtain, resulting in a unique and valuable gene expression dataset. In combination with previously published datasets, this new transcriptomic resource will facilitate detailed investigation of hibernation physiology in bears, and the potential to translate aspects of this biology to treat human disease.