Results 1 - 19 of 19
1.
BMC Bioinformatics ; 22(1): 459, 2021 Sep 25.
Article in English | MEDLINE | ID: mdl-34563119

ABSTRACT

BACKGROUND: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry. RESULTS: The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPUs). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture. CONCLUSIONS: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.


Subjects
Population Genetics , Human Genome , Haplotypes , Humans , Single Nucleotide Polymorphism
2.
Nature ; 464(7289): 713-20, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20360734

ABSTRACT

Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed approximately 19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated 50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease (IRGM for Crohn's disease; HLA for Crohn's disease, rheumatoid arthritis and type 1 diabetes; and TSPAN8 for type 2 diabetes), although in each case the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.


Subjects
DNA Copy Number Variations/genetics , Disease , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study , Rheumatoid Arthritis/genetics , Case-Control Studies , Crohn Disease/genetics , Diabetes Mellitus/genetics , Gene Frequency/genetics , Humans , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Pilot Projects , Single Nucleotide Polymorphism/genetics , Quality Control
3.
PLoS Genet ; 9(12): e1004023, 2013.
Article in English | MEDLINE | ID: mdl-24385924

ABSTRACT

There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low-coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the Southern American ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL ancestors split 12.2 kya, with a subsequent split of the ancestors to CLM and PUR 11.7 kya. The model also features effective populations of 62,000 in Mexico, 8,700 in Colombia, and 1,900 in Puerto Rico. Modeling identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earlier migration from Europe. Finally, we compare IBD and ancestry assignments to find evidence for relatedness among European founders to the three populations.


Subjects
Gene Frequency/genetics , Population Genetics , Human Migration , North American Indians/genetics , Black People/genetics , Chromosome Mapping , Exome , Human Genome , Hispanic or Latino/genetics , Human Genome Project , Humans , Mexican Americans/genetics , Mexico , Puerto Rico , Racial Groups/genetics , White People/genetics
4.
PLoS Genet ; 9(11): e1003925, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24244192

ABSTRACT

The Caribbean basin is home to some of the most complex interactions in recent history among previously diverged human populations. Here, we investigate the population genetic history of this region by characterizing patterns of genome-wide variation among 330 individuals from three of the Greater Antilles (Cuba, Puerto Rico, Hispaniola), two mainland (Honduras, Colombia), and three Native South American (Yukpa, Bari, and Warao) populations. We combine these data with a unique database of genomic variation in over 3,000 individuals from diverse European, African, and Native American populations. We use local ancestry inference and tract length distributions to test different demographic scenarios for the pre- and post-colonial history of the region. We develop a novel ancestry-specific PCA (ASPCA) method to reconstruct the sub-continental origin of Native American, European, and African haplotypes from admixed genomes. We find that the most likely source of the indigenous ancestry in Caribbean islanders is a Native South American component shared among inland Amazonian tribes, Central America, and the Yucatan peninsula, suggesting extensive gene flow across the Caribbean in pre-Columbian times. We find evidence of two pulses of African migration. The first pulse, which today is reflected by shorter, older ancestry tracts, consists of a genetic component more similar to coastal West African regions involved in early stages of the trans-Atlantic slave trade. The second pulse, reflected by longer, younger tracts, is more similar to present-day West-Central African populations, supporting historical records of later transatlantic deportation. Surprisingly, we also identify a Latino-specific European component that has significantly diverged from its parental Iberian source populations, presumably as a result of small European founder population size. We demonstrate that the ancestral components in admixed genomes can be traced back to distinct sub-continental source populations with far greater resolution than previously thought, even when limited pre-Columbian Caribbean haplotypes have survived.


Subjects
Black People/genetics , Gene Flow , Population Genetics , North American Indians/genetics , White People/genetics , Caribbean Region , Mitochondrial DNA/genetics , Demography , Genomics , Haplotypes , Hispanic or Latino/genetics , Humans
5.
Am J Hum Genet ; 91(4): 660-71, 2012 Oct 05.
Article in English | MEDLINE | ID: mdl-23040495

ABSTRACT

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.
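
The 1.75-fold range quoted in this abstract follows directly from the two heterozygosity extremes it reports. The sketch below is only an arithmetic check of those published figures, not the authors' analysis; the function name `fold_range` is a hypothetical helper introduced for illustration.

```python
def fold_range(highest, lowest):
    """Ratio of the highest to the lowest heterozygosity (sites per kilobase)."""
    if lowest <= 0:
        raise ValueError("lowest heterozygosity must be positive")
    return highest / lowest

# Figures taken from the abstract: 1 site/kb (west African genomes) and
# 0.57 sites/kb (segments with diploid Native American ancestry).
ratio = fold_range(1.0, 0.57)
print(f"fold range = {ratio:.2f}")
```

Rounded to two decimals, the ratio matches the 1.75-fold range stated in the abstract.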


Subjects
Human Genome , Haplotypes/genetics , Population/genetics , Racial Groups/genetics , Population Genetics/methods , Heterozygote , Humans , Single Nucleotide Polymorphism
6.
PLoS Genet ; 8(1): e1002397, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22253600

ABSTRACT

North African populations are distinct from sub-Saharan Africans based on cultural, linguistic, and phenotypic attributes; however, the time and the extent of genetic divergence between populations north and south of the Sahara remain poorly understood. Here, we interrogate the multilayered history of North Africa by characterizing the effect of hypothesized migrations from the Near East, Europe, and sub-Saharan Africa on current genetic diversity. We present dense, genome-wide SNP genotyping array data (730,000 sites) from seven North African populations, spanning from Egypt to Morocco, and one Spanish population. We identify a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa; this ancestry is likely derived from "back-to-Africa" gene flow more than 12,000 years ago (ya), prior to the Holocene. The indigenous North African ancestry is more frequent in populations with historical Berber ethnicity. In most North African populations we also see substantial shared ancestry with the Near East, and to a lesser extent sub-Saharan Africa and Europe. To estimate the time of migration from sub-Saharan populations into North Africa, we implement a maximum likelihood dating method based on the distribution of migrant tracts. In order to first identify migrant tracts, we assign local ancestry to haplotypes using a novel, principal component-based analysis of three ancestral populations. We estimate that a migration of western African origin into Morocco began about 40 generations ago (approximately 1,200 ya); a migration of individuals with Nilotic ancestry into Egypt occurred about 25 generations ago (approximately 750 ya). Our genomic data reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa.
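
The two migration dates in this abstract pair a generation count with a calendar estimate (40 generations with about 1,200 years, 25 generations with about 750 years), which implies a generation time of 30 years. The sketch below only reproduces that conversion; the function names are hypothetical, and the 30-year default is inferred from the abstract's own numbers rather than stated by the authors.

```python
def implied_generation_time(generations, years):
    """Years per generation implied by a (generations, years) pair."""
    return years / generations

def generations_to_years(generations, generation_time=30.0):
    """Convert a generation count to years; 30 years/generation is an assumption
    inferred from the abstract's figures."""
    return generations * generation_time

# Pairs quoted in the abstract: (generations, approximate years ago)
for gens, years in [(40, 1200), (25, 750)]:
    print(gens, "generations ->", generations_to_years(gens), "years;",
          "implied generation time:", implied_generation_time(gens, years))
```

Both pairs give the same implied generation time, so the two date estimates in the abstract are internally consistent.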


Subjects
Black People/genetics , Gene Flow/genetics , Genetic Variation , Population Dynamics , Population , Sub-Saharan Africa/ethnology , Northern Africa , Black People/history , Mitochondrial DNA/genetics , Ancient Egypt , Emigration and Immigration , Europe , Gene Pool , Genomics , Genotype , Haplotypes , Ancient History , Humans , Middle East , Morocco , Single Nucleotide Polymorphism , White People/genetics , White People/history
7.
PLoS Genet ; 7(9): e1002280, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21935354

ABSTRACT

Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.


Subjects
DNA Mutational Analysis/methods , Synthetic Genes , Genetic Variation , Genome-Wide Association Study/methods , Thrombophilia/genetics , Alleles , Base Sequence , Female , Genetic Predisposition to Disease , Human Genome , Genotype , Haplotypes , Humans , Male , Pedigree , Reference Standards , Risk Assessment , Sequence Alignment , DNA Sequence Analysis
8.
J Neurol Neurosurg Psychiatry ; 83(8): 793-5, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22626946

ABSTRACT

OBJECTIVE: Pregnancy has a well-documented effect on relapse risk in multiple sclerosis (MS). Prospective studies have reported a significant decline by two-thirds in the rate of relapses during the third trimester of pregnancy and a significant increase by two-thirds during the first 3 months postpartum. However, it is unclear whether there are any long-term effects on disability. METHODS: Data were collated from clinical records and family histories systematically collected from the University of British Columbia MS Clinic. RESULTS: Clinical and term pregnancy data were available from 2105 female MS patients. MS patients having children after MS onset took the longest time to reach an Expanded Disability Status Scale (EDSS) score of 6 (mean 22.9 years) and patients having children before MS onset were the quickest (mean 13.2 years). However, these effects were not related to term pregnancy and were fully accounted for by age of MS onset. CONCLUSIONS: Pregnancy had no effect on the time to reach an EDSS score of 6. As MS predominantly affects women of childbearing age, women with MS can be reassured that term pregnancies do not appear to have any long-term effects on disability.


Subjects
Multiple Sclerosis/etiology , Pregnancy Complications/epidemiology , Activities of Daily Living , Adult , Age of Onset , Cohort Studies , Female , Humans , Maternal Age , Multiple Sclerosis/pathology , Parity , Pregnancy , Pregnancy Complications/pathology , Pregnancy Outcome , Young Adult
9.
Hum Biol ; 84(4): 343-64, 2012 Aug.
Article in English | MEDLINE | ID: mdl-23249312

ABSTRACT

Identifying ancestry along each chromosome in admixed individuals provides a wealth of information for understanding the population genetic history of admixture events and is valuable for admixture mapping and identifying recent targets of selection. We present PCAdmix (available at https://sites.google.com/site/pcadmix/home ), a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals. We compare our method to HAPMIX on simulated data from two ancestral populations, and we find high concordance between the methods. Our method also has better accuracy than LAMP when applied to three-population admixture, a situation as yet unaddressed by HAPMIX. Finally, we apply our method to a data set of four Latino populations with European, African, and Native American ancestry. We find evidence of assortative mating in each of the four populations, and we identify regions of shared ancestry that may be recent targets of selection and could serve as candidate regions for admixture-based association mapping.


Subjects
Human Chromosomes , Genotype , Genetic Models , Single Nucleotide Polymorphism , Population Dynamics , Principal Component Analysis/methods , Racial Groups/genetics , Algorithms , Computer Simulation , Genomics , Humans , Phylogeography , United States
10.
Nat Commun ; 12(1): 6442, 2021 Nov 08.
Article in English | MEDLINE | ID: mdl-34750360

ABSTRACT

The genetic architecture of atrial fibrillation (AF) encompasses low impact, common genetic variants and high impact, rare variants. Here, we characterize a high impact AF-susceptibility allele, KCNQ1 R231H, and describe its transcontinental geographic distribution and history. Induced pluripotent stem cell-derived cardiomyocytes procured from risk allele carriers exhibit abbreviated action potential duration, consistent with a gain-of-function effect. Using identity-by-descent (IBD) networks, we estimate the broad- and fine-scale population ancestry of risk allele carriers and their relatives. Analysis of ancestral migration routes reveals ancestors who inhabited Denmark in the 1700s, migrated to the Northeastern United States in the early 1800s, and traveled across the Midwest to arrive in Utah in the late 1800s. IBD/coalescent-based allele dating analysis reveals a relatively recent origin of the AF risk allele (~5000 years). Thus, our approach broadens the scope of study for disease susceptibility alleles to the context of human migration and ancestral origins.


Subjects
Atrial Fibrillation/genetics , Genetic Predisposition to Disease/genetics , KCNQ1 Potassium Channel/genetics , Missense Mutation , Single Nucleotide Polymorphism , Action Potentials , Alleles , Denmark , Emigrants and Immigrants , Female , Genotype , Geography , Humans , Induced Pluripotent Stem Cells/cytology , Induced Pluripotent Stem Cells/metabolism , Male , Middle Aged , Cardiac Myocytes/cytology , Cardiac Myocytes/metabolism , Cardiac Myocytes/physiology , Pedigree , Risk Factors , Utah
11.
J Hum Genet ; 54(9): 547-9, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19629136

ABSTRACT

Multiple sclerosis (MS) is a complex neurological trait. Allelic variation in the MHC class II region exerts the single strongest effect on MS genetic risk. The clinical onset of the disease is extremely variable, and can range from the first to the ninth decade of life. Epidemiological studies have suggested a modest genetic component to the age of onset (AO) of MS. Previous studies have shown that HLA-DRB1*1501 may be associated with a younger AO. Here, we sought to uncover any effect of HLA-DRB1*1501 on the AO of MS in a large Canadian cohort. A total of 1816 MS patients were genotyped for HLA-DRB1. Patients carrying HLA-DRB1*1501 were shown to have a small, but significantly lower, AO than patients without the allele (P=0.03). HLA-DRB1*1501 was also shown to reduce the mean AO in both progressive and relapsing forms of the disease. An investigation of parent-of-origin effects indicated that the lower AO for HLA-DRB1*1501 patients arises from maternally transmitted HLA-DRB1*1501 haplotypes (maternal HLA-DRB1*1501 mean AO=28.4 years, paternal=30.3 years; P=0.009). HLA-DRB1*1501 exerts a modest, but significant effect on the AO of all forms of MS. Parent-of-origin effects at the MHC are further implicated in MS disease pathogenesis.


Subjects
HLA-DR Antigens/genetics , Haplotypes/genetics , Multiple Sclerosis/genetics , Adult , Age of Onset , Alleles , Canada , Female , Genetic Predisposition to Disease , Genotype , HLA-DR Antigens/immunology , HLA-DRB1 Chains , Humans , Male , Parents , Phenotype , Risk Factors
12.
G3 (Bethesda) ; 9(9): 2863-2878, 2019 Sep 04.
Article in English | MEDLINE | ID: mdl-31484785

ABSTRACT

We present a massive investigation into the genetic basis of human lifespan. Beginning with a genome-wide association (GWA) study using a de-identified snapshot of the unique AncestryDNA database (more than 300,000 genotyped individuals linked to pedigrees of over 400,000,000 people), we mapped six genome-wide significant loci associated with parental lifespan. We compared these results to a GWA analysis of the traditional lifespan proxy trait, age, and found only one locus, APOE, to be associated with both age and lifespan. By combining the AncestryDNA results with those of an independent UK Biobank dataset, we conducted a meta-analysis of more than 650,000 individuals and identified fifteen parental lifespan-associated loci. Beyond just those significant loci, our genome-wide set of polymorphisms accounts for up to 8% of the variance in human lifespan; this value represents a large fraction of the heritability estimated from phenotypic correlations between relatives.


Subjects
Genome-Wide Association Study/methods , Longevity/genetics , Aged , Aged 80 and over , Apolipoproteins E/genetics , Carrier Proteins/genetics , Genetic Databases , Female , Humans , Male , Nuclear Proteins/genetics , Pedigree , Single Nucleotide Polymorphism , Prospective Studies , Proto-Oncogene Proteins/genetics
13.
Genetics ; 210(3): 1109-1124, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30401766

ABSTRACT

Human life span is a phenotype that integrates many aspects of health and environment into a single ultimate quantity: the elapsed time between birth and death. Though it is widely believed that long life runs in families for genetic reasons, estimates of life span "heritability" are consistently low (∼15-30%). Here, we used pedigree data from Ancestry public trees, including hundreds of millions of historical persons, to estimate the heritability of human longevity. Although "nominal heritability" estimates based on correlations among genetic relatives agreed with prior literature, the majority of that correlation was also captured by correlations among nongenetic (in-law) relatives, suggestive of highly assortative mating around life span-influencing factors (genetic and/or environmental). We used structural equation modeling to account for assortative mating, and concluded that the true heritability of human longevity for birth cohorts across the 1800s and early 1900s was well below 10%, and that it has been generally overestimated due to the effect of assortative mating.


Subjects
Longevity/genetics , Reproduction , Female , Humans , Male , Genetic Models , Pedigree
14.
Nat Commun ; 8: 14238, 2017 Feb 07.
Article in English | MEDLINE | ID: mdl-28169989

ABSTRACT

Despite strides in characterizing human history from genetic polymorphism data, progress in identifying genetic signatures of recent demography has been limited. Here we identify very recent fine-scale population structure in North America from a network of over 500 million genetic (identity-by-descent, IBD) connections among 770,000 genotyped individuals of US origin. We detect densely connected clusters within the network and annotate these clusters using a database of over 20 million genealogical records. Recent population patterns captured by IBD clustering include immigrants such as Scandinavians and French Canadians; groups with continental admixture such as Puerto Ricans; settlers such as the Amish and Appalachians who experienced geographic or cultural isolation; and broad historical trends, including reduced north-south gene flow. Our results yield a detailed historical portrait of North America after European settlement and support substantial genetic heterogeneity in the United States beyond that uncovered by previous studies.


Subjects
Demography/statistics & numerical data , Population Genetics/methods , Population Dynamics/trends , Population/genetics , Cluster Analysis , Demography/methods , Emigrants and Immigrants , Gene Flow/genetics , Genotyping Techniques , Haplotypes/genetics , Humans , Single Nucleotide Polymorphism , Population Dynamics/statistics & numerical data , DNA Sequence Analysis , United States/ethnology
15.
Nat Genet ; 44(12): 1294-301, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23104008

ABSTRACT

To further investigate susceptibility loci identified by genome-wide association studies, we genotyped 5,500 SNPs across 14 associated regions in 8,000 samples from a control group and 3 diseases: type 2 diabetes (T2D), coronary artery disease (CAD) and Graves' disease. We defined, using Bayes theorem, credible sets of SNPs that were 95% likely, based on posterior probability, to contain the causal disease-associated SNPs. In 3 of the 14 regions, TCF7L2 (T2D), CTLA4 (Graves' disease) and CDKN2A-CDKN2B (T2D), much of the posterior probability rested on a single SNP, and, in 4 other regions (CDKN2A-CDKN2B (CAD) and CDKAL1, FTO and HHEX (T2D)), the 95% sets were small, thereby excluding most SNPs as potentially causal. Very few SNPs in our credible sets had annotated functions, illustrating the limitations in understanding the mechanisms underlying susceptibility to common diseases. Our results also show the value of more detailed mapping to target sequences for functional studies.
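
The 95% credible sets mentioned in this abstract can be illustrated with a minimal sketch: rank SNPs by posterior probability and keep adding them until the cumulative posterior reaches the target coverage. This is a generic construction under the simplifying assumption of one causal SNP per region, not necessarily the authors' exact procedure; the function name, SNP labels, and posterior values below are all hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Smallest set of SNPs whose normalized posterior probabilities sum to >= coverage.

    `posteriors` maps SNP id -> unnormalized posterior weight.
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior weights must sum to a positive value")
    # Rank SNPs from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, weight in ranked:
        chosen.append(snp)
        cumulative += weight / total
        if cumulative >= coverage:
            break
    return chosen

# Hypothetical posterior weights for five SNPs in one region (illustrative only).
example = {"snp_a": 0.80, "snp_b": 0.12, "snp_c": 0.05, "snp_d": 0.02, "snp_e": 0.01}
print(credible_set(example))
```

A region where one SNP carries most of the posterior yields a small set, which is the situation the abstract describes for TCF7L2, CTLA4 and CDKN2A-CDKN2B.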


Subjects
Coronary Artery Disease/genetics , Type 2 Diabetes Mellitus/genetics , Genetic Loci , Genetic Predisposition to Disease , Genome-Wide Association Study , Graves Disease/genetics , Alpha-Ketoglutarate-Dependent Dioxygenase FTO , Bayes Theorem , CTLA-4 Antigen/genetics , Cyclin-Dependent Kinase 5/genetics , Cyclin-Dependent Kinase Inhibitor p15/genetics , p16 Genes , Homeodomain Proteins/genetics , Humans , Single Nucleotide Polymorphism , Proteins/genetics , Transcription Factor 7-Like 2 Protein/genetics , Transcription Factors/genetics , tRNA Methyltransferases
16.
Genome Biol ; 9(11): R165, 2008.
Article in English | MEDLINE | ID: mdl-19025653

ABSTRACT

Whole genome tiling arrays are a key tool for profiling global genetic and expression variation. In this study we present our methods for detecting transcript level variation, splicing variation and allele-specific expression in Arabidopsis thaliana. We also developed a generalized hidden Markov model for profiling transcribed fragment variation de novo. Our study demonstrates that whole genome tiling arrays are a powerful platform for dissecting natural transcriptome variation in multiple dimensions and at high resolution.


Subjects
Arabidopsis/genetics , Gene Expression Profiling , Plant Genome , Genetic Polymorphism , Alternative Splicing , Arabidopsis/metabolism , Plant Gene Expression Regulation , Markov Chains
17.
Mol Biol Evol ; 23(6): 1136-43, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16527865

ABSTRACT

In Saccharomyces, an ancient whole-genome duplication (WGD) and widespread duplicate gene deletion resulted in extensive reorganization of adjacent gene relationships. We have studied the evolution of adjacent gene pairs' identity, orientation, and spacing following whole-genome duplication and deletion (WGD-D) using comparative genomic analyses and simulations. Surveying adjacent gene organization across the Saccharomyces species complex, we find a genome-wide bias toward divergently and convergently transcribed gene pairs in all species but a reduction in this bias in the species that underwent WGD-D. Among neutral models of WGD-D, only single-gene deletion can produce the appropriate reduction in orientation bias and recapitulate the pattern of short, highly dispersed deletions we observe in Saccharomyces cerevisiae. To characterize the dynamics of WGD-D, we trace the conservation and creation of adjacent gene pairs along the S. cerevisiae lineage. We find that newly created adjacencies have a tandem orientation bias, while adjacencies conserved from prior to WGD-D have the same divergent-convergent bias as found in the species that diverged before WGD. We also find that adjacent gene pairs produced by WGD-D gained greater intergenic spacing but that this is reduced in the older adjacencies. Given this, and the preponderance of short deleted blocks, we argue that the deletion phase of WGD-D occurred primarily by small inactivating mutations followed by numerous small deletions. Newly created adjacent gene pairs also have an initial increase in mean log2 expression ratios and maximal expression levels, suggesting that increased intergenic spacing caused a genome-wide reduction in transcriptional interference.


Subjects
Gene Deletion , Gene Duplication , Fungal Genome , Saccharomyces cerevisiae/genetics , Saccharomyces/genetics , Molecular Evolution , Gene Expression , Fungal Genes , Oligonucleotide Array Sequence Analysis
18.
Proc Natl Acad Sci U S A ; 103(39): 14412-6, 2006 Sep 26.
Article in English | MEDLINE | ID: mdl-16971485

ABSTRACT

Many Saccharomyces cerevisiae duplicate genes that were derived from an ancient whole-genome duplication (WGD) unexpectedly show a small synonymous divergence (K(S)), a higher sequence similarity to each other than to orthologues in Saccharomyces bayanus, or slow evolution compared with the orthologue in Kluyveromyces waltii, a non-WGD species. This decelerated evolution was attributed to gene conversion between duplicates. Using approximately 300 WGD gene pairs in four species and their orthologues in non-WGD species, we show that codon-usage bias and protein-sequence conservation are two important causes for decelerated evolution of duplicate genes, whereas gene conversion is effective only in the presence of strong codon-usage bias or protein-sequence conservation. Furthermore, we find that change in mutation pattern or in tDNA copy number changed codon-usage bias and increased the K(S) distance between K. waltii and S. cerevisiae. Intriguingly, some proteins showed fast evolution before the radiation of WGD species but little or no sequence divergence between orthologues and paralogues thereafter, indicating that functional conservation after the radiation may also be responsible for decelerated evolution in duplicates.


Subjects
Codon/genetics , Gene Duplication , Phylogeny , Yeasts/genetics , Fungal DNA/genetics , Fungal Genes/genetics
19.
Proc Natl Acad Sci U S A ; 103(7): 2232-6, 2006 Feb 14.
Article in English | MEDLINE | ID: mdl-16461903

ABSTRACT

The question of how duplicate genes are retained in a population remains controversial. The duplication-degeneration-complementation model, which involves no positive selection, stipulates a higher retention rate of duplicate genes in a small population than in a large one. This model has been accepted by many evolutionists. However, we found considerably more retentions and fewer losses of duplicate genes in the mouse genome than in the human genome, although the population size of rodents is in general larger than that of primates. Indeed, in nearly every interval of synonymous divergence between duplicate genes, the number of gene retentions in mouse is larger than that in human. Our findings suggest a more important role of positive selection in duplicate retention than duplication-degeneration-complementation. In addition, certain functional categories show a higher tendency of lineage-specific expansion than expected, suggesting lineage-specific selection or functional bias in retained duplicates.


Subjects
Gene Duplication , Duplicate Genes/genetics , Human Genome/genetics , Genome/genetics , Genetic Selection , Animals , Cell Lineage/genetics , Humans , Mice , Genetic Models