Results 1 - 20 of 30
1.
Development ; 150(2), 2023 01 15.
Article in English | MEDLINE | ID: mdl-36621005

ABSTRACT

Gene duplication events can drive evolution by providing genetic material for new gene functions, and they create opportunities for diverse developmental strategies to emerge between species. To study the contribution of duplicated genes to human early development, we examined the evolution and function of NANOGP1, a tandem duplicate of the transcription factor NANOG. We found that NANOGP1 and NANOG have overlapping but distinct expression profiles, with high NANOGP1 expression restricted to early epiblast cells and naïve-state pluripotent stem cells. Sequence analysis and epitope-tagging revealed that NANOGP1 is protein coding with an intact homeobox domain. The duplication that created NANOGP1 occurred earlier in primate evolution than previously thought and has been retained only in great apes, whereas Old World monkeys have disabled the gene in different ways, including homeodomain point mutations. NANOGP1 is a strong inducer of naïve pluripotency; however, unlike NANOG, it is not required to maintain the undifferentiated status of human naïve pluripotent cells. By retaining expression, sequence and partial functional conservation with its ancestral copy, NANOGP1 exemplifies how gene duplication and subfunctionalisation can contribute to transcription factor activity in human pluripotency and development.


Subject(s)
Genes, Homeobox , Pluripotent Stem Cells , Animals , Humans , Homeodomain Proteins/genetics , Homeodomain Proteins/metabolism , Nanog Homeobox Protein/genetics , Nanog Homeobox Protein/metabolism , Pluripotent Stem Cells/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
2.
PLoS Genet ; 17(3): e1009221, 2021 03.
Article in English | MEDLINE | ID: mdl-33651813

ABSTRACT

Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes' genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.


Subject(s)
DNA Replication , Genome , Hominidae/genetics , Markov Chains , Models, Genetic , Templates, Genetic , Algorithms , Animals , Genomics/methods , Humans , Poly A-U , Quantitative Trait Loci
3.
PLoS Genet ; 14(9): e1007641, 2018 09.
Article in English | MEDLINE | ID: mdl-30226838

ABSTRACT

Human populations outside of Africa have experienced at least two bouts of introgression from archaic humans, from Neanderthals and Denisovans. In Papuans there is prior evidence of both these introgressions. Here we present a new approach to detect segments of individual genomes of archaic origin without using an archaic reference genome. The approach is based on a hidden Markov model that identifies genomic regions with a high density of single nucleotide variants (SNVs) not seen in unadmixed populations. We show using simulations that this provides a powerful approach to identifying segments of archaic introgression with a low rate of false detection, provided that data are available from a suitable outgroup population that lacks the archaic introgression but contains a majority of the variation that arose since initial separation from the archaic lineage. Furthermore, our approach is able to infer admixture proportions and the times both of admixture and of initial divergence between the human and archaic populations. We apply the model to detect archaic introgression in 89 Papuans and show how the identified segments can be assigned to likely Neanderthal or Denisovan origin. We report more Denisovan admixture than previous studies and find a shift in size distribution of fragments of Neanderthal and Denisovan origin that is compatible with a difference in admixture time. Furthermore, we identify small amounts of Denisovan ancestry in South East Asians and South Asians.


Subject(s)
Genome, Human/genetics , Hominidae/genetics , Hybridization, Genetic/genetics , Neanderthals/genetics , Animals , Asian People/genetics , Black People/genetics , Fossils , Humans , Native Hawaiian or Other Pacific Islander/genetics , Phylogeny , White People/genetics
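
Illustrative sketch for entry 3 (not the authors' implementation): the abstract describes a hidden Markov model that flags genomic regions with a high density of SNVs not seen in unadmixed populations. One way to picture this is a two-state HMM over fixed-size windows, decoded with the Viterbi algorithm. The Poisson emission model, the rates lam_human and lam_archaic, the transition probability stay, and the prior are all assumptions chosen for illustration; they are not values taken from the paper.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing k events under a Poisson(lam) model."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def viterbi_two_state(counts, lam_human=1.0, lam_archaic=8.0,
                      stay=0.99, prior_archaic=0.02):
    """Most likely state path for per-window counts of 'private' SNVs.

    State 0 = non-archaic window, state 1 = putatively archaic window.
    All parameter values are hypothetical defaults for illustration only.
    """
    if not counts:
        return []
    lams = (lam_human, lam_archaic)
    log_trans = [[math.log(stay), math.log(1 - stay)],
                 [math.log(1 - stay), math.log(stay)]]
    log_init = [math.log(1 - prior_archaic), math.log(prior_archaic)]

    # Best log-score of any path ending in each state at the first window.
    score = [log_init[s] + poisson_logpmf(counts[0], lams[s]) for s in (0, 1)]
    back = []  # back[i][s] = best predecessor of state s at window i + 1

    for k in counts[1:]:
        new_score, ptr = [], []
        for s in (0, 1):
            best_prev = max((0, 1), key=lambda p: score[p] + log_trans[p][s])
            ptr.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + poisson_logpmf(k, lams[s]))
        back.append(ptr)
        score = new_score

    # Trace back from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Toy input: five low-density windows, five high-density, five low-density.
    toy_counts = [0, 1, 0, 1, 0, 9, 7, 10, 8, 9, 1, 0, 0, 1, 0]
    print(viterbi_two_state(toy_counts))
```

On the toy input, the decoded path marks the middle run of high-count windows as state 1 and the flanking windows as state 0. A real analysis would additionally need to estimate the parameters from data and account for callability and local mutation-rate variation, which this sketch omits.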
4.
PLoS Genet ; 13(1): e1006549, 2017 01.
Article in English | MEDLINE | ID: mdl-28095480

ABSTRACT

The rate of germline mutation varies widely between species but little is known about the extent of variation in the germline mutation rate between individuals of the same species. Here we demonstrate that an allele that increases the rate of germline mutation can result in a distinctive signature in the genomic region linked to the affected locus, characterized by a number of haplotypes with a locally high proportion of derived alleles, against a background of haplotypes carrying a typical proportion of derived alleles. We searched for this signature in human haplotype data from phase 3 of the 1000 Genomes Project and report a number of candidate mutator loci, several of which are located close to or within genes involved in DNA repair or the DNA damage response. To investigate whether mutator alleles remained active at any of these loci, we used de novo mutation counts from human parent-offspring trios in the 1000 Genomes and Genome of the Netherlands cohorts, looking for an elevated number of de novo mutations in the offspring of parents carrying a candidate mutator haplotype at each of these loci. We found some support for two of the candidate loci, including one locus just upstream of the BRSK2 gene, which is expressed in the testis and has been reported to be involved in the response to DNA damage.


Subject(s)
Gene Frequency , Genome, Human , Germ-Line Mutation/genetics , Haplotypes , DNA Repair/genetics , Genetic Loci , Humans , Mutation Rate , Pedigree , Protein Serine-Threonine Kinases/genetics
5.
Nature ; 559(7714): 336-338, 2018 07.
Article in English | MEDLINE | ID: mdl-30006623
6.
Am J Hum Genet ; 96(6): 986-91, 2015 Jun 04.
Article in English | MEDLINE | ID: mdl-26027499

ABSTRACT

The predominantly African origin of all modern human populations is well established, but the route taken out of Africa is still unclear. Two alternative routes, via Egypt and Sinai or across the Bab el Mandeb strait into Arabia, have traditionally been proposed as feasible gateways in light of geographic, paleoclimatic, archaeological, and genetic evidence. Distinguishing among these alternatives has been difficult. We generated 225 whole-genome sequences (225 at 8× depth, of which 8 were increased to 30×; Illumina HiSeq 2000) from six modern Northeast African populations (100 Egyptians and five Ethiopian populations each represented by 25 individuals). West Eurasian components were masked out, and the remaining African haplotypes were compared with a panel of sub-Saharan African and non-African genomes. We showed that masked Northeast African haplotypes overall were more similar to non-African haplotypes and more frequently present outside Africa than were any sets of haplotypes derived from a West African population. Furthermore, the masked Egyptian haplotypes showed these properties more markedly than the masked Ethiopian haplotypes, pointing to Egypt as the more likely gateway in the exodus to the rest of the world. Using five Ethiopian and three Egyptian high-coverage masked genomes and the multiple sequentially Markovian coalescent (MSMC) approach, we estimated the genetic split times of Egyptians and Ethiopians from non-African populations at 55,000 and 65,000 years ago, respectively, whereas that of West Africans was estimated to be 75,000 years ago. Both the haplotype and MSMC analyses thus suggest a predominant northern route out of Africa via Egypt.


Subject(s)
Biological Evolution , Black People/genetics , Genome, Human/genetics , Human Migration/history , Base Sequence , Egypt, Ancient , Ethiopia , Geography , Haplotypes/genetics , High-Throughput Nucleotide Sequencing/methods , History, Ancient , Humans , Markov Chains , Models, Genetic , Molecular Sequence Data , Principal Component Analysis
7.
Nat Rev Genet ; 13(10): 745-53, 2012 10.
Article in English | MEDLINE | ID: mdl-22965354

ABSTRACT

It is now possible to make direct measurements of the mutation rate in modern humans using next-generation sequencing. These measurements reveal a value that is approximately half of that previously derived from fossil calibration, and this has implications for our understanding of demographic events in human evolution and other aspects of population genetics. Here, we discuss the implications of a lower-than-expected mutation rate in relation to the timescale of human evolution.


Subject(s)
Evolution, Molecular , Mutation Rate , Animals , Comprehension , DNA, Mitochondrial/genetics , Genetic Speciation , Geography , Hominidae/genetics , Humans/genetics , Models, Biological , Neanderthals/genetics , Phylogeny
8.
Nature ; 483(7388): 169-75, 2012 Mar 07.
Article in English | MEDLINE | ID: mdl-22398555

ABSTRACT

Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.


Subject(s)
Evolution, Molecular , Genetic Speciation , Genome/genetics , Gorilla gorilla/genetics , Animals , Female , Gene Expression Regulation , Genetic Variation/genetics , Genomics , Humans , Macaca mulatta/genetics , Molecular Sequence Data , Pan troglodytes/genetics , Phylogeny , Pongo/genetics , Proteins/genetics , Sequence Alignment , Species Specificity , Transcription, Genetic
9.
10.
Bioinformatics ; 32(11): 1749-51, 2016 06 01.
Article in English | MEDLINE | ID: mdl-26826718

ABSTRACT

Runs of homozygosity (RoHs) are genomic stretches of a diploid genome that show identical alleles on both chromosomes. Longer RoHs are unlikely to have arisen by chance but are likely to denote autozygosity, whereby both copies of the genome descend from the same recent ancestor. Early tools to detect RoH used genotype array data, but substantially more information is available from sequencing data. Here, we present and evaluate BCFtools/RoH, an extension to the BCFtools software package, that detects regions of autozygosity in sequencing data, in particular exome data, using a hidden Markov model. By applying it to simulated data and real data from the 1000 Genomes Project we estimate its accuracy and show that it has higher sensitivity and specificity than existing methods under a range of sequencing error rates and levels of autozygosity. AVAILABILITY AND IMPLEMENTATION: BCFtools/RoH and its associated binary/source files are freely available from https://github.com/samtools/BCFtools. CONTACT: vn2@sanger.ac.uk or pd3@sanger.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
High-Throughput Nucleotide Sequencing , Exome , Genomics , Genotype , Homozygote , Software
11.
Nature ; 470(7332): 59-65, 2011 Feb 03.
Article in English | MEDLINE | ID: mdl-21293372

ABSTRACT

Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.


Subject(s)
DNA Copy Number Variations/genetics , Genetics, Population , Genome, Human/genetics , Genomics , Gene Duplication/genetics , Genetic Predisposition to Disease/genetics , Genotype , Humans , Mutagenesis, Insertional/genetics , Reproducibility of Results , Sequence Analysis, DNA , Sequence Deletion/genetics
12.
Evol Anthropol ; 24(4): 149-64, 2015.
Article in English | MEDLINE | ID: mdl-26267436

ABSTRACT

Current fossil, genetic, and archeological data indicate that Homo sapiens originated in Africa in the late Middle Pleistocene. By the end of the Late Pleistocene, our species was distributed across every continent except Antarctica, setting the foundations for the subsequent demographic and cultural changes of the Holocene. The intervening processes remain intensely debated and a key theme in hominin evolutionary studies. We review archeological, fossil, environmental, and genetic data to evaluate the current state of knowledge on the dispersal of Homo sapiens out of Africa. The emerging picture of the dispersal process suggests dynamic behavioral variability, complex interactions between populations, and an intricate genetic and cultural legacy. This evolutionary and historical complexity challenges simple narratives and suggests that hybrid models and the testing of explicit hypotheses are required to understand the expansion of Homo sapiens into Eurasia.


Subject(s)
Biological Evolution , Fossils , Human Migration , Africa , Asia , Australia , DNA, Mitochondrial , Female , Humans , Male , Paleontology , Technology
13.
PLoS Genet ; 8(12): e1003125, 2012.
Article in English | MEDLINE | ID: mdl-23284294

ABSTRACT

We present a hidden Markov model (HMM) for inferring gradual isolation between two populations during speciation, modelled as a time interval with restricted gene flow. The HMM describes the history of adjacent nucleotides in two genomic sequences, such that the nucleotides can be separated by recombination, can migrate between populations, or can coalesce at variable time points, all dependent on the parameters of the model, which are the effective population sizes, splitting times, recombination rate, and migration rate. We show by extensive simulations that the HMM can accurately infer all parameters except the recombination rate, which is biased downwards. Inference is robust to variation in the mutation rate and the recombination rate over the sequence and also robust to unknown phase of genomes unless they are very closely related. We provide a test for whether divergence is gradual or instantaneous, and we apply the model to three key divergence processes in great apes: (a) the bonobo and common chimpanzee, (b) the eastern and western gorilla, and (c) the Sumatran and Bornean orang-utan. We find that the bonobo and chimpanzee appear to have undergone a clear split, whereas the divergence processes of the gorilla and orang-utan species occurred over several hundred thousand years with gene flow stopping quite recently. We also apply the model to the Homo/Pan speciation event and find that the most likely scenario involves an extended period of gene flow during speciation.


Subject(s)
Evolution, Molecular , Genetic Speciation , Genetic Variation , Genome , Animals , Gene Flow , Genetics, Population , Gorilla gorilla/genetics , Humans , Markov Chains , Models, Theoretical , Pan paniscus/genetics , Pan troglodytes/genetics , Phylogeny , Pongo/genetics , Population Density
14.
Nature ; 456(7218): 53-9, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18987734

ABSTRACT

DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.


Subject(s)
Genome, Human/genetics , Genomics/methods , Sequence Analysis, DNA/methods , Chromosomes, Human, X/genetics , Consensus Sequence/genetics , Genomics/economics , Genotype , Humans , Male , Nigeria , Polymorphism, Single Nucleotide/genetics , Sensitivity and Specificity , Sequence Analysis, DNA/economics
15.
Trends Mol Med ; 30(6): 541-551, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38677980

ABSTRACT

Population differences in cardiometabolic disease remain unexplained. Misleading assumptions over genetic explanations are partly due to terminology used to distinguish populations, specifically ancestry, race, and ethnicity. These terms differentially implicate environmental and biological causal pathways, which should inform their use. Genetic variation alone accounts for a limited fraction of population differences in cardiometabolic disease. Research effort should focus on societally driven, lifelong environmental determinants of population differences in disease. Rather than pursuing population stratifiers to personalize medicine, we advocate removing socioeconomic barriers to receipt of and adherence to healthcare interventions, which will have markedly greater impact on improving cardiometabolic outcomes. This requires multidisciplinary collaboration and public and policymaker engagement to address inequalities driven by society rather than biology per se.


Subject(s)
Cardiovascular Diseases , Ethnicity , Racial Groups , Humans , Cardiovascular Diseases/epidemiology , Metabolic Diseases/epidemiology , Metabolic Diseases/genetics , Genetic Predisposition to Disease , Socioeconomic Factors , Healthcare Disparities/ethnology
16.
Nat Methods ; 5(12): 1005-10, 2008 Dec.
Article in English | MEDLINE | ID: mdl-19034268

ABSTRACT

The Wellcome Trust Sanger Institute is one of the world's largest genome centers, and a substantial amount of our sequencing is performed with 'next-generation' massively parallel sequencing technologies: in June 2008 the quantity of purity-filtered sequence data generated by our Genome Analyzer (Illumina) platforms reached 1 terabase, and our average weekly Illumina production output is currently 64 gigabases. Here we describe a set of improvements we have made to the standard Illumina protocols to make the library preparation more reliable in a high-throughput environment, to reduce bias, tighten insert size distribution and reliably obtain high yields of data.


Subject(s)
Academies and Institutes , Chromosome Mapping/instrumentation , Genomics/instrumentation , Polymerase Chain Reaction/instrumentation , Sequence Analysis, DNA/instrumentation , Equipment Design
17.
Science ; 367(6484), 2020 03 20.
Article in English | MEDLINE | ID: mdl-32193295

ABSTRACT

Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.


Subject(s)
Genetic Variation , Genetics, Population , Genome, Human , Whole Genome Sequencing , Africa , Americas , Animals , Asia , DNA Copy Number Variations , Haplotypes , Hominidae/genetics , Humans , INDEL Mutation , Neanderthals/genetics , Oceania , Phylogeny , Polymorphism, Single Nucleotide , Population Density , Racial Groups/genetics
18.
Genome Biol ; 19(1): 193, 2018 11 15.
Article in English | MEDLINE | ID: mdl-30428903

ABSTRACT

BACKGROUND: Integrating demography and adaptive evolution is pivotal to understanding the evolutionary history and conservation of great apes. However, little is known about the adaptive evolution of our closest relatives, in particular if and to what extent adaptations to environmental differences have occurred. Here, we used whole-genome sequencing data from critically endangered orangutans from North Sumatra (Pongo abelii) and Borneo (P. pygmaeus) to investigate adaptive responses of each species to environmental differences during the Pleistocene. RESULTS: Taking into account the markedly disparate demographic histories of each species after their split ~1 Ma ago, we show that persistent environmental differences on each island had a strong impact on the adaptive evolution of the genus Pongo. Across a range of tests for positive selection, we find a consistent pattern of between-island and species differences. In the more productive Sumatran environment, the most notable signals of positive selection involve genes linked to brain and neuronal development, learning, and glucose metabolism. On Borneo, however, positive selection comprised genes involved in lipid metabolism, as well as cardiac and muscle activities. CONCLUSIONS: We find strikingly different sets of genes appearing to have evolved under strong positive selection in each species. In Sumatran orangutans, selection patterns were congruent with well-documented cognitive and behavioral differences between the species, such as a larger and more complex cultural repertoire and higher degrees of sociality. However, in Bornean orangutans, selective responses to fluctuating environmental conditions appear to have produced physiological adaptations to generally lower and temporally more unpredictable food supplies.


Subject(s)
Adaptation, Biological , Biological Evolution , Genetic Variation , Genetics, Population , Genome , Pongo/genetics , Animals , Genetic Speciation , Phylogeny , Pongo/classification
19.
Trends Ecol Evol ; 33(8): 582-594, 2018 08.
Article in English | MEDLINE | ID: mdl-30007846

ABSTRACT

We challenge the view that our species, Homo sapiens, evolved within a single population and/or region of Africa. The chronology and physical diversity of Pleistocene human fossils suggest that morphologically varied populations pertaining to the H. sapiens clade lived throughout Africa. Similarly, the African archaeological record demonstrates the polycentric origin and persistence of regionally distinct Pleistocene material culture in a variety of paleoecological settings. Genetic studies also indicate that present-day population structure within Africa extends to deep times, paralleling a paleoenvironmental record of shifting and fractured habitable zones. We argue that these fields support an emerging view of a highly structured African prehistory that should be considered in human evolutionary inferences, prompting new interpretations, questions, and interdisciplinary research directions.


Subject(s)
Biological Evolution , Hominidae/classification , Africa , Animals , Archaeology , Ecosystem , Fossils , Genetics, Population , Geography , Hominidae/anatomy & histology , Hominidae/genetics , Humans
20.
Science ; 360(6392): 1024-1027, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29853687

ABSTRACT

Little is known regarding the first people to enter the Americas and their genetic legacy. Genomic analysis of the oldest human remains from the Americas showed a direct relationship between a Clovis-related ancestral population and all modern Central and South Americans as well as a deep split separating them from North Americans in Canada. We present 91 ancient human genomes from California and Southwestern Ontario and demonstrate the existence of two distinct ancestries in North America, which possibly split south of the ice sheets. A contribution from both of these ancestral populations is found in all modern Central and South Americans. The proportions of these two ancestries in ancient and modern populations are consistent with a coastal dispersal and multiple admixture events.


Subject(s)
Biological Evolution , Emigration and Immigration , Genome, Human , Population/genetics , California , Humans , Ontario