1.
BMC Bioinformatics ; 22(1): 459, 2021 Sep 25.
Article in English | MEDLINE | ID: mdl-34563119

ABSTRACT

BACKGROUND: We present ARCHes, a fast and accurate haplotype-based approach for inferring an individual's ancestry composition. Our approach works by modeling haplotype diversity from a large, admixed cohort of hundreds of thousands, then annotating those models with population information from reference panels of known ancestry. RESULTS: The running time of ARCHes does not depend on the size of a reference panel because training and testing are separate processes, and the inferred population-annotated haplotype models can be written to disk and reused to label large test sets in parallel (in our experiments, it averages less than one minute to assign ancestry from 32 populations using 10 CPUs). We test ARCHes on public data from the 1000 Genomes Project and the Human Genome Diversity Project (HGDP) as well as simulated examples of known admixture. CONCLUSIONS: Our results demonstrate that ARCHes outperforms RFMix at correctly assigning both global and local ancestry at finer population scales regardless of the amount of population admixture.


Subject(s)
Genetics, Population; Genome, Human; Haplotypes; Humans; Polymorphism, Single Nucleotide
2.
Nature ; 464(7289): 713-20, 2010 Apr 01.
Article in English | MEDLINE | ID: mdl-20360734

ABSTRACT

Copy number variants (CNVs) account for a major proportion of human genetic polymorphism and have been predicted to have an important role in genetic susceptibility to common disease. To address this we undertook a large, direct genome-wide study of association between CNVs and eight common human diseases. Using a purpose-designed array we typed approximately 19,000 individuals into distinct copy-number classes at 3,432 polymorphic CNVs, including an estimated 50% of all common CNVs larger than 500 base pairs. We identified several biological artefacts that lead to false-positive associations, including systematic CNV differences between DNAs derived from blood and cell lines. Association testing and follow-up replication analyses confirmed three loci where CNVs were associated with disease: IRGM for Crohn's disease; HLA for Crohn's disease, rheumatoid arthritis and type 1 diabetes; and TSPAN8 for type 2 diabetes. In each case, however, the locus had previously been identified in single nucleotide polymorphism (SNP)-based studies, reflecting our observation that most common CNVs that are well-typed on our array are well tagged by SNPs and so have been indirectly explored through SNP studies. We conclude that common CNVs that can be typed on existing platforms are unlikely to contribute greatly to the genetic basis of common human diseases.


Subject(s)
DNA Copy Number Variations/genetics; Disease; Genetic Predisposition to Disease/genetics; Genome-Wide Association Study; Arthritis, Rheumatoid/genetics; Case-Control Studies; Crohn Disease/genetics; Diabetes Mellitus/genetics; Gene Frequency/genetics; Humans; Nucleic Acid Hybridization; Oligonucleotide Array Sequence Analysis; Pilot Projects; Polymorphism, Single Nucleotide/genetics; Quality Control
3.
PLoS Genet ; 9(12): e1004023, 2013.
Article in English | MEDLINE | ID: mdl-24385924

ABSTRACT

There is great scientific and popular interest in understanding the genetic history of populations in the Americas. We wish to understand when different regions of the continent were inhabited, where settlers came from, and how current inhabitants relate genetically to earlier populations. Recent studies unraveled parts of the genetic history of the continent using genotyping arrays and uniparental markers. The 1000 Genomes Project provides a unique opportunity for improving our understanding of population genetic history by providing over a hundred sequenced low coverage genomes and exomes from Colombian (CLM), Mexican-American (MXL), and Puerto Rican (PUR) populations. Here, we explore the genomic contributions of African, European, and especially Native American ancestry to these populations. Estimated Native American ancestry is 48% in MXL, 25% in CLM, and 13% in PUR. Native American ancestry in PUR is most closely related to populations surrounding the Orinoco River basin, confirming the Southern American ancestry of the Taíno people of the Caribbean. We present new methods to estimate the allele frequencies in the Native American fraction of the populations, and model their distribution using a demographic model for three ancestral Native American populations. These ancestral populations likely split in close succession: the most likely scenario, based on a peopling of the Americas 16 thousand years ago (kya), supports that the MXL ancestors split 12.2 kya, with a subsequent split of the ancestors to CLM and PUR 11.7 kya. The model also features effective populations of 62,000 in Mexico, 8,700 in Colombia, and 1,900 in Puerto Rico. Modeling identity-by-descent (IBD) and ancestry tract length, we show that post-contact populations also differ markedly in their effective sizes and migration patterns, with Puerto Rico showing the smallest effective size and the earlier migration from Europe. Finally, we compare IBD and ancestry assignments to find evidence for relatedness among European founders to the three populations.


Subject(s)
Gene Frequency/genetics; Genetics, Population; Human Migration; Indians, North American/genetics; Black People/genetics; Chromosome Mapping; Exome; Genome, Human; Hispanic or Latino/genetics; Human Genome Project; Humans; Mexican Americans/genetics; Mexico; Puerto Rico; Racial Groups/genetics; White People/genetics
4.
PLoS Genet ; 9(11): e1003925, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24244192

ABSTRACT

The Caribbean basin is home to some of the most complex interactions in recent history among previously diverged human populations. Here, we investigate the population genetic history of this region by characterizing patterns of genome-wide variation among 330 individuals from three of the Greater Antilles (Cuba, Puerto Rico, Hispaniola), two mainland (Honduras, Colombia), and three Native South American (Yukpa, Bari, and Warao) populations. We combine these data with a unique database of genomic variation in over 3,000 individuals from diverse European, African, and Native American populations. We use local ancestry inference and tract length distributions to test different demographic scenarios for the pre- and post-colonial history of the region. We develop a novel ancestry-specific PCA (ASPCA) method to reconstruct the sub-continental origin of Native American, European, and African haplotypes from admixed genomes. We find that the most likely source of the indigenous ancestry in Caribbean islanders is a Native South American component shared among inland Amazonian tribes, Central America, and the Yucatan peninsula, suggesting extensive gene flow across the Caribbean in pre-Columbian times. We find evidence of two pulses of African migration. The first pulse, which today is reflected by shorter, older ancestry tracts, consists of a genetic component more similar to coastal West African regions involved in early stages of the trans-Atlantic slave trade. The second pulse, reflected by longer, younger tracts, is more similar to present-day West-Central African populations, supporting historical records of later transatlantic deportation. Surprisingly, we also identify a Latino-specific European component that has significantly diverged from its parental Iberian source populations, presumably as a result of small European founder population size. We demonstrate that the ancestral components in admixed genomes can be traced back to distinct sub-continental source populations with far greater resolution than previously thought, even when limited pre-Columbian Caribbean haplotypes have survived.


Subject(s)
Black People/genetics; Gene Flow; Genetics, Population; Indians, North American/genetics; White People/genetics; Caribbean Region; DNA, Mitochondrial/genetics; Demography; Genomics; Haplotypes; Hispanic or Latino/genetics; Humans
5.
Am J Hum Genet ; 91(4): 660-71, 2012 Oct 05.
Article in English | MEDLINE | ID: mdl-23040495

ABSTRACT

Full sequencing of individual human genomes has greatly expanded our understanding of human genetic variation and population history. Here, we present a systematic analysis of 50 human genomes from 11 diverse global populations sequenced at high coverage. Our sample includes 12 individuals who have admixed ancestry and who have varying degrees of recent (within the last 500 years) African, Native American, and European ancestry. We found over 21 million single-nucleotide variants that contribute to a 1.75-fold range in nucleotide heterozygosity across diverse human genomes. This heterozygosity ranged from a high of one heterozygous site per kilobase in west African genomes to a low of 0.57 heterozygous sites per kilobase in segments inferred to have diploid Native American ancestry from the genomes of Mexican and Puerto Rican individuals. We show evidence of all three continental ancestries in the genomes of Mexican, Puerto Rican, and African American populations, and the genome-wide statistics are highly consistent across individuals from a population once ancestry proportions have been accounted for. Using a generalized linear model, we identified subtle variations across populations in the proportion of neutral versus deleterious variation and found that genome-wide statistics vary in admixed populations even once ancestry proportions have been factored in. We further infer that multiple periods of gene flow shaped the diversity of admixed populations in the Americas: 70% of the European ancestry in today's African Americans dates back to European gene flow happening only 7-8 generations ago.


Subject(s)
Genome, Human; Haplotypes/genetics; Population/genetics; Racial Groups/genetics; Genetics, Population/methods; Heterozygote; Humans; Polymorphism, Single Nucleotide
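
The "1.75-fold range" in the abstract of entry 5 follows directly from the two heterozygosity figures it quotes (one heterozygous site per kilobase versus 0.57 per kilobase). The short check below reproduces that arithmetic; the constant and function names are illustrative assumptions, not taken from the paper.

```python
# Heterozygous sites per kilobase, as quoted in the abstract of entry 5.
HET_PER_KB_WEST_AFRICAN = 1.0
HET_PER_KB_NATIVE_AMERICAN_SEGMENTS = 0.57


def fold_range(high: float, low: float) -> float:
    """Return how many times larger `high` is than `low`."""
    if low <= 0:
        raise ValueError("low must be positive")
    return high / low


ratio = fold_range(HET_PER_KB_WEST_AFRICAN, HET_PER_KB_NATIVE_AMERICAN_SEGMENTS)
print(f"fold range: {ratio:.2f}")
```

The two extremes alone account for the reported range, so no additional data are needed to verify the figure.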
6.
PLoS Genet ; 8(1): e1002397, 2012 Jan.
Article in English | MEDLINE | ID: mdl-22253600

ABSTRACT

North African populations are distinct from sub-Saharan Africans based on cultural, linguistic, and phenotypic attributes; however, the time and the extent of genetic divergence between populations north and south of the Sahara remain poorly understood. Here, we interrogate the multilayered history of North Africa by characterizing the effect of hypothesized migrations from the Near East, Europe, and sub-Saharan Africa on current genetic diversity. We present dense, genome-wide SNP genotyping array data (730,000 sites) from seven North African populations, spanning from Egypt to Morocco, and one Spanish population. We identify a gradient of likely autochthonous Maghrebi ancestry that increases from east to west across northern Africa; this ancestry is likely derived from "back-to-Africa" gene flow more than 12,000 years ago (ya), prior to the Holocene. The indigenous North African ancestry is more frequent in populations with historical Berber ethnicity. In most North African populations we also see substantial shared ancestry with the Near East, and to a lesser extent sub-Saharan Africa and Europe. To estimate the time of migration from sub-Saharan populations into North Africa, we implement a maximum likelihood dating method based on the distribution of migrant tracts. In order to first identify migrant tracts, we assign local ancestry to haplotypes using a novel, principal component-based analysis of three ancestral populations. We estimate that a migration of western African origin into Morocco began about 40 generations ago (approximately 1,200 ya); a migration of individuals with Nilotic ancestry into Egypt occurred about 25 generations ago (approximately 750 ya). Our genomic data reveal an extraordinarily complex history of migrations, involving at least five ancestral populations, into North Africa.


Subject(s)
Black People/genetics; Gene Flow/genetics; Genetic Variation; Population Dynamics; Population; Africa South of the Sahara/ethnology; Africa, Northern; Black People/history; DNA, Mitochondrial/genetics; Egypt, Ancient; Emigration and Immigration; Europe; Gene Pool; Genomics; Genotype; Haplotypes; History, Ancient; Humans; Middle East; Morocco; Polymorphism, Single Nucleotide; White People/genetics; White People/history
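
The abstract of entry 6 dates migrations from the distribution of migrant tract lengths and converts generations to years (40 generations to about 1,200 years, 25 generations to about 750 years, which implies roughly 30 years per generation). The sketch below illustrates the general idea only. It assumes a textbook first-order approximation in which tract lengths, measured in Morgans, are exponentially distributed with rate equal to the number of generations since a single migration pulse; this is not claimed to be the estimator used in the paper, and the function names and toy tract lengths are hypothetical.

```python
from statistics import fmean

# Assumption inferred from the abstract's own pairs (40 gen ~ 1,200 ya; 25 gen ~ 750 ya).
YEARS_PER_GENERATION = 30


def estimate_generations(tract_lengths_morgans):
    """Maximum-likelihood pulse time T, assuming lengths ~ Exponential(rate=T)."""
    lengths = list(tract_lengths_morgans)
    if not lengths or any(x <= 0 for x in lengths):
        raise ValueError("need at least one positive tract length")
    # For an exponential distribution the MLE of the rate is 1 / sample mean.
    return 1.0 / fmean(lengths)


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a time in generations to years."""
    return generations * years_per_generation


# Hypothetical migrant tract lengths in Morgans (mean 0.025).
toy_tracts = [0.010, 0.020, 0.030, 0.040]
t_hat = estimate_generations(toy_tracts)
print(round(t_hat), round(generations_to_years(t_hat)))
```

Shorter tracts push the estimate further back in time, because each generation of recombination breaks migrant segments into smaller pieces.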
7.
PLoS Genet ; 7(9): e1002280, 2011 Sep.
Article in English | MEDLINE | ID: mdl-21935354

ABSTRACT

Whole-genome sequencing harbors unprecedented potential for characterization of individual and family genetic variation. Here, we develop a novel synthetic human reference sequence that is ethnically concordant and use it for the analysis of genomes from a nuclear family with history of familial thrombophilia. We demonstrate that the use of the major allele reference sequence results in improved genotype accuracy for disease-associated variant loci. We infer recombination sites to the lowest median resolution demonstrated to date (< 1,000 base pairs). We use family inheritance state analysis to control sequencing error and inform family-wide haplotype phasing, allowing quantification of genome-wide compound heterozygosity. We develop a sequence-based methodology for Human Leukocyte Antigen typing that contributes to disease risk prediction. Finally, we advance methods for analysis of disease and pharmacogenomic risk across the coding and non-coding genome that incorporate phased variant data. We show these methods are capable of identifying multigenic risk for inherited thrombophilia and informing the appropriate pharmacological therapy. These ethnicity-specific, family-based approaches to interpretation of genetic variation are emblematic of the next generation of genetic risk assessment using whole-genome sequencing.


Subject(s)
DNA Mutational Analysis/methods; Genes, Synthetic; Genetic Variation; Genome-Wide Association Study/methods; Thrombophilia/genetics; Alleles; Base Sequence; Female; Genetic Predisposition to Disease; Genome, Human; Genotype; Haplotypes; Humans; Male; Pedigree; Reference Standards; Risk Assessment; Sequence Alignment; Sequence Analysis, DNA
8.
J Neurol Neurosurg Psychiatry ; 83(8): 793-5, 2012 Aug.
Article in English | MEDLINE | ID: mdl-22626946

ABSTRACT

OBJECTIVE: Pregnancy has a well-documented effect on relapse risk in multiple sclerosis (MS). Prospective studies have reported a significant decline by two-thirds in the rate of relapses during the third trimester of pregnancy and a significant increase by two-thirds during the first 3 months postpartum. However, it is unclear whether there are any long-term effects on disability. METHODS: Data were collated from clinical records and family histories systematically collected from the University of British Columbia MS Clinic. RESULTS: Clinical and term pregnancy data were available from 2105 female MS patients. MS patients having children after MS onset took the longest time to reach an Expanded Disability Status Scale (EDSS) score of 6 (mean 22.9 years) and patients having children before MS onset were the quickest (mean 13.2 years). However, these effects were not related to term pregnancy and were fully accounted for by age of MS onset. CONCLUSIONS: Pregnancy had no effect on the time to reach an EDSS score of 6. As MS predominantly affects women of childbearing age, women with MS can be reassured that term pregnancies do not appear to have any long-term effects on disability.


Subject(s)
Multiple Sclerosis/etiology; Pregnancy Complications/epidemiology; Activities of Daily Living; Adult; Age of Onset; Cohort Studies; Female; Humans; Maternal Age; Multiple Sclerosis/pathology; Parity; Pregnancy; Pregnancy Complications/pathology; Pregnancy Outcome; Young Adult
9.
Hum Biol ; 84(4): 343-64, 2012 Aug.
Article in English | MEDLINE | ID: mdl-23249312

ABSTRACT

Identifying ancestry along each chromosome in admixed individuals provides a wealth of information for understanding the population genetic history of admixture events and is valuable for admixture mapping and identifying recent targets of selection. We present PCAdmix (available at https://sites.google.com/site/pcadmix/home), a Principal Components-based algorithm for determining ancestry along each chromosome from a high-density, genome-wide set of phased single-nucleotide polymorphism (SNP) genotypes of admixed individuals. We compare our method to HAPMIX on simulated data from two ancestral populations, and we find high concordance between the methods. Our method also has better accuracy than LAMP when applied to three-population admixture, a situation as yet unaddressed by HAPMIX. Finally, we apply our method to a data set of four Latino populations with European, African, and Native American ancestry. We find evidence of assortative mating in each of the four populations, and we identify regions of shared ancestry that may be recent targets of selection and could serve as candidate regions for admixture-based association mapping.


Subject(s)
Chromosomes, Human; Genotype; Models, Genetic; Polymorphism, Single Nucleotide; Population Dynamics; Principal Component Analysis/methods; Racial Groups/genetics; Algorithms; Computer Simulation; Genomics; Humans; Phylogeography; United States
10.
Nat Commun ; 12(1): 6442, 2021 Nov 08.
Article in English | MEDLINE | ID: mdl-34750360

ABSTRACT

The genetic architecture of atrial fibrillation (AF) encompasses low impact, common genetic variants and high impact, rare variants. Here, we characterize a high impact AF-susceptibility allele, KCNQ1 R231H, and describe its transcontinental geographic distribution and history. Induced pluripotent stem cell-derived cardiomyocytes procured from risk allele carriers exhibit abbreviated action potential duration, consistent with a gain-of-function effect. Using identity-by-descent (IBD) networks, we estimate the broad- and fine-scale population ancestry of risk allele carriers and their relatives. Analysis of ancestral migration routes reveals ancestors who inhabited Denmark in the 1700s, migrated to the Northeastern United States in the early 1800s, and traveled across the Midwest to arrive in Utah in the late 1800s. IBD/coalescent-based allele dating analysis reveals a relatively recent origin of the AF risk allele (~5000 years). Thus, our approach broadens the scope of study for disease susceptibility alleles to the context of human migration and ancestral origins.


Subject(s)
Atrial Fibrillation/genetics; Genetic Predisposition to Disease/genetics; KCNQ1 Potassium Channel/genetics; Mutation, Missense; Polymorphism, Single Nucleotide; Action Potentials; Alleles; Denmark; Emigrants and Immigrants; Female; Genotype; Geography; Humans; Induced Pluripotent Stem Cells/cytology; Induced Pluripotent Stem Cells/metabolism; Male; Middle Aged; Myocytes, Cardiac/cytology; Myocytes, Cardiac/metabolism; Myocytes, Cardiac/physiology; Pedigree; Risk Factors; Utah
11.
J Hum Genet ; 54(9): 547-9, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19629136

ABSTRACT

Multiple sclerosis (MS) is a complex neurological trait. Allelic variation in the MHC class II region exerts the single strongest effect on MS genetic risk. The clinical onset of the disease is extremely variable, and can range from the first to the ninth decade of life. Epidemiological studies have suggested a modest genetic component to the age of onset (AO) of MS. Previous studies have shown that HLA-DRB1*1501 may be associated with a younger AO. Here, we sought to uncover any effect of HLA-DRB1*1501 on the AO of MS in a large Canadian cohort. A total of 1816 MS patients were genotyped for HLA-DRB1. Patients carrying HLA-DRB1*1501 were shown to have a small, but significantly lower, AO than patients without the allele (P=0.03). HLA-DRB1*1501 was also shown to reduce the mean AO in both progressive and relapsing forms of the disease. An investigation of parent-of-origin effects indicated that the lower AO for HLA-DRB1*1501 patients arises from maternally transmitted HLA-DRB1*1501 haplotypes (maternal HLA-DRB1*1501 mean AO=28.4 years, paternal=30.3 years; P=0.009). HLA-DRB1*1501 exerts a modest, but significant effect on the AO of all forms of MS. Parent-of-origin effects at the MHC are further implicated in MS disease pathogenesis.


Subject(s)
HLA-DR Antigens/genetics; Haplotypes/genetics; Multiple Sclerosis/genetics; Adult; Age of Onset; Alleles; Canada; Female; Genetic Predisposition to Disease; Genotype; HLA-DR Antigens/immunology; HLA-DRB1 Chains; Humans; Male; Parents; Phenotype; Risk Factors
12.
G3 (Bethesda) ; 9(9): 2863-2878, 2019 Sep 04.
Article in English | MEDLINE | ID: mdl-31484785

ABSTRACT

We present a massive investigation into the genetic basis of human lifespan. Beginning with a genome-wide association (GWA) study using a de-identified snapshot of the unique AncestryDNA database - more than 300,000 genotyped individuals linked to pedigrees of over 400,000,000 people - we mapped six genome-wide significant loci associated with parental lifespan. We compared these results to a GWA analysis of the traditional lifespan proxy trait, age, and found only one locus, APOE, to be associated with both age and lifespan. By combining the AncestryDNA results with those of an independent UK Biobank dataset, we conducted a meta-analysis of more than 650,000 individuals and identified fifteen parental lifespan-associated loci. Beyond just those significant loci, our genome-wide set of polymorphisms accounts for up to 8% of the variance in human lifespan; this value represents a large fraction of the heritability estimated from phenotypic correlations between relatives.


Subject(s)
Genome-Wide Association Study/methods; Longevity/genetics; Aged; Aged, 80 and over; Apolipoproteins E/genetics; Carrier Proteins/genetics; Databases, Genetic; Female; Humans; Male; Nuclear Proteins/genetics; Pedigree; Polymorphism, Single Nucleotide; Prospective Studies; Proto-Oncogene Proteins/genetics
13.
Genetics ; 210(3): 1109-1124, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30401766

ABSTRACT

Human life span is a phenotype that integrates many aspects of health and environment into a single ultimate quantity: the elapsed time between birth and death. Though it is widely believed that long life runs in families for genetic reasons, estimates of life span "heritability" are consistently low (∼15-30%). Here, we used pedigree data from Ancestry public trees, including hundreds of millions of historical persons, to estimate the heritability of human longevity. Although "nominal heritability" estimates based on correlations among genetic relatives agreed with prior literature, the majority of that correlation was also captured by correlations among nongenetic (in-law) relatives, suggestive of highly assortative mating around life span-influencing factors (genetic and/or environmental). We used structural equation modeling to account for assortative mating, and concluded that the true heritability of human longevity for birth cohorts across the 1800s and early 1900s was well below 10%, and that it has been generally overestimated due to the effect of assortative mating.


Subject(s)
Longevity/genetics; Reproduction; Female; Humans; Male; Models, Genetic; Pedigree
14.
Nat Commun ; 8: 14238, 2017 Feb 07.
Article in English | MEDLINE | ID: mdl-28169989

ABSTRACT

Despite strides in characterizing human history from genetic polymorphism data, progress in identifying genetic signatures of recent demography has been limited. Here we identify very recent fine-scale population structure in North America from a network of over 500 million genetic (identity-by-descent, IBD) connections among 770,000 genotyped individuals of US origin. We detect densely connected clusters within the network and annotate these clusters using a database of over 20 million genealogical records. Recent population patterns captured by IBD clustering include immigrants such as Scandinavians and French Canadians; groups with continental admixture such as Puerto Ricans; settlers such as the Amish and Appalachians who experienced geographic or cultural isolation; and broad historical trends, including reduced north-south gene flow. Our results yield a detailed historical portrait of North America after European settlement and support substantial genetic heterogeneity in the United States beyond that uncovered by previous studies.


Subject(s)
Demography/statistics & numerical data; Genetics, Population/methods; Population Dynamics/trends; Population/genetics; Cluster Analysis; Demography/methods; Emigrants and Immigrants; Gene Flow/genetics; Genotyping Techniques; Haplotypes/genetics; Humans; Polymorphism, Single Nucleotide; Population Dynamics/statistics & numerical data; Sequence Analysis, DNA; United States/ethnology
15.
Nat Genet ; 44(12): 1294-301, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23104008

ABSTRACT

To further investigate susceptibility loci identified by genome-wide association studies, we genotyped 5,500 SNPs across 14 associated regions in 8,000 samples from a control group and 3 diseases: type 2 diabetes (T2D), coronary artery disease (CAD) and Graves' disease. We defined, using Bayes theorem, credible sets of SNPs that were 95% likely, based on posterior probability, to contain the causal disease-associated SNPs. In 3 of the 14 regions, TCF7L2 (T2D), CTLA4 (Graves' disease) and CDKN2A-CDKN2B (T2D), much of the posterior probability rested on a single SNP, and, in 4 other regions (CDKN2A-CDKN2B (CAD) and CDKAL1, FTO and HHEX (T2D)), the 95% sets were small, thereby excluding most SNPs as potentially causal. Very few SNPs in our credible sets had annotated functions, illustrating the limitations in understanding the mechanisms underlying susceptibility to common diseases. Our results also show the value of more detailed mapping to target sequences for functional studies.


Subject(s)
Coronary Artery Disease/genetics; Diabetes Mellitus, Type 2/genetics; Genetic Loci; Genetic Predisposition to Disease; Genome-Wide Association Study; Graves Disease/genetics; Alpha-Ketoglutarate-Dependent Dioxygenase FTO; Bayes Theorem; CTLA-4 Antigen/genetics; Cyclin-Dependent Kinase 5/genetics; Cyclin-Dependent Kinase Inhibitor p15/genetics; Genes, p16; Homeodomain Proteins/genetics; Humans; Polymorphism, Single Nucleotide; Proteins/genetics; Transcription Factor 7-Like 2 Protein/genetics; Transcription Factors/genetics; tRNA Methyltransferases
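
The abstract of entry 15 describes 95% credible sets of SNPs built from posterior probabilities via Bayes' theorem. A minimal sketch of that construction follows, under two simplifying assumptions that the abstract does not spell out: exactly one causal SNP per region and an equal prior for every SNP, so that each posterior is the SNP's Bayes factor divided by the regional total. The function name, SNP labels, and Bayes factor values are hypothetical.

```python
def credible_set(bayes_factors, coverage=0.95):
    """Return (snps, cumulative_posterior) for the smallest top-ranked set
    whose summed posterior probability reaches `coverage`.

    Assumes one causal SNP per region and a flat prior across SNPs.
    """
    total = sum(bayes_factors.values())
    if total <= 0:
        raise ValueError("Bayes factors must sum to a positive value")
    # With a flat prior, the posterior is the normalized Bayes factor.
    posteriors = {snp: bf / total for snp, bf in bayes_factors.items()}
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, posterior in ranked:
        chosen.append(snp)
        cumulative += posterior
        if cumulative >= coverage:
            break
    return chosen, cumulative


# Hypothetical Bayes factors for four SNPs in one region.
toy_region = {"snp_a": 90.0, "snp_b": 6.0, "snp_c": 3.0, "snp_d": 1.0}
snps, mass = credible_set(toy_region)
print(snps, round(mass, 2))
```

In this toy region most of the posterior rests on a single SNP, so the 95% set is small and excludes the remaining SNPs as likely causal, which mirrors the pattern the abstract reports for several regions.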
16.
Genome Biol ; 9(11): R165, 2008.
Article in English | MEDLINE | ID: mdl-19025653

ABSTRACT

Whole genome tiling arrays are a key tool for profiling global genetic and expression variation. In this study we present our methods for detecting transcript level variation, splicing variation and allele-specific expression in Arabidopsis thaliana. We also developed a generalized hidden Markov model for profiling transcribed fragment variation de novo. Our study demonstrates that whole genome tiling arrays are a powerful platform for dissecting natural transcriptome variation in multiple dimensions and at high resolution.


Subject(s)
Arabidopsis/genetics; Gene Expression Profiling; Genome, Plant; Polymorphism, Genetic; Alternative Splicing; Arabidopsis/metabolism; Gene Expression Regulation, Plant; Markov Chains
17.
Mol Biol Evol ; 23(6): 1136-43, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16527865

ABSTRACT

In Saccharomyces, an ancient whole-genome duplication (WGD) and widespread duplicate gene deletion resulted in extensive reorganization of adjacent gene relationships. We have studied the evolution of adjacent gene pairs' identity, orientation, and spacing following whole-genome duplication and deletion (WGD-D) using comparative genomic analyses and simulations. Surveying adjacent gene organization across the Saccharomyces species complex, we find a genome-wide bias toward divergently and convergently transcribed gene pairs in all species but a reduction in this bias in the species that underwent WGD-D. Among neutral models of WGD-D, only single-gene deletion can produce the appropriate reduction in orientation bias and recapitulate the pattern of short, highly dispersed deletions we observe in Saccharomyces cerevisiae. To characterize the dynamics of WGD-D, we trace the conservation and creation of adjacent gene pairs along the S. cerevisiae lineage. We find that newly created adjacencies have a tandem orientation bias, while adjacencies conserved from prior to WGD-D have the same divergent-convergent bias as found in the species that diverged before WGD. We also find that adjacent gene pairs produced by WGD-D gained greater intergenic spacing but that this is reduced in the older adjacencies. Given this, and the preponderance of short deleted blocks, we argue that the deletion phase of WGD-D occurred primarily by small inactivating mutations followed by numerous small deletions. Newly created adjacent gene pairs also have an initial increase in mean log2 expression ratios and maximal expression levels, suggesting that increased intergenic spacing caused a genome-wide reduction in transcriptional interference.


Subject(s)
Gene Deletion; Gene Duplication; Genome, Fungal; Saccharomyces cerevisiae/genetics; Saccharomyces/genetics; Evolution, Molecular; Gene Expression; Genes, Fungal; Oligonucleotide Array Sequence Analysis
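
The abstract of entry 17 compares divergently, convergently, and tandemly transcribed adjacent gene pairs. The sketch below shows one way such pairs could be tallied from an ordered list of gene strands along a chromosome. It uses the usual definitions (divergent: the left gene is on the minus strand and the right gene on the plus strand; convergent: the reverse; tandem: same strand); the function names and the toy strand list are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def classify_pair(left_strand, right_strand):
    """Classify two adjacent genes by their relative transcription direction."""
    for strand in (left_strand, right_strand):
        if strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {strand!r}")
    if left_strand == right_strand:
        return "tandem"       # both transcribed in the same direction
    if left_strand == "-" and right_strand == "+":
        return "divergent"    # transcribed away from each other
    return "convergent"       # transcribed toward each other


def orientation_counts(strands):
    """Count orientation classes over all adjacent pairs in chromosome order."""
    return Counter(classify_pair(a, b) for a, b in zip(strands, strands[1:]))


# Hypothetical strands of five consecutive genes on one chromosome.
toy_strands = ["+", "-", "+", "+", "-"]
counts = orientation_counts(toy_strands)
print(dict(counts))
```

Comparing these counts with the expectation under random strand assignment (half tandem, a quarter each divergent and convergent) gives a simple measure of orientation bias.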
18.
Proc Natl Acad Sci U S A ; 103(39): 14412-6, 2006 Sep 26.
Article in English | MEDLINE | ID: mdl-16971485

ABSTRACT

Many Saccharomyces cerevisiae duplicate genes that were derived from an ancient whole-genome duplication (WGD) unexpectedly show a small synonymous divergence (K(S)), a higher sequence similarity to each other than to orthologues in Saccharomyces bayanus, or slow evolution compared with the orthologue in Kluyveromyces waltii, a non-WGD species. This decelerated evolution was attributed to gene conversion between duplicates. Using approximately 300 WGD gene pairs in four species and their orthologues in non-WGD species, we show that codon-usage bias and protein-sequence conservation are two important causes for decelerated evolution of duplicate genes, whereas gene conversion is effective only in the presence of strong codon-usage bias or protein-sequence conservation. Furthermore, we find that change in mutation pattern or in tDNA copy number changed codon-usage bias and increased the K(S) distance between K. waltii and S. cerevisiae. Intriguingly, some proteins showed fast evolution before the radiation of WGD species but little or no sequence divergence between orthologues and paralogues thereafter, indicating that functional conservation after the radiation may also be responsible for decelerated evolution in duplicates.


Subject(s)
Codon/genetics; Gene Duplication; Phylogeny; Yeasts/genetics; DNA, Fungal/genetics; Genes, Fungal/genetics
19.
Proc Natl Acad Sci U S A ; 103(7): 2232-6, 2006 Feb 14.
Article in English | MEDLINE | ID: mdl-16461903

ABSTRACT

The question of how duplicate genes are retained in a population remains controversial. The duplication-degeneration-complementation model, which involves no positive selection, stipulates a higher retention rate of duplicate genes in a small population than in a large one. This model has been accepted by many evolutionists. However, we found considerably more retentions and fewer losses of duplicate genes in the mouse genome than in the human genome, although the population size of rodents is in general larger than that of primates. Indeed, in nearly every interval of synonymous divergence between duplicate genes, the number of gene retentions in mouse is larger than that in human. Our findings suggest a more important role of positive selection in duplicate retention than duplication-degeneration-complementation. In addition, certain functional categories show a higher tendency of lineage-specific expansion than expected, suggesting lineage-specific selection or functional bias in retained duplicates.


Subject(s)
Gene Duplication; Genes, Duplicate/genetics; Genome, Human/genetics; Genome/genetics; Selection, Genetic; Animals; Cell Lineage/genetics; Humans; Mice; Models, Genetic