Results 1 - 9 of 9
1.
PLoS One ; 5(12): e14338, 2010 Dec 31.
Article in English | MEDLINE | ID: mdl-21217814

ABSTRACT

The prevalence of common chronic non-communicable diseases (CNCDs) far overshadows the prevalence of both monogenic and infectious diseases combined. All CNCDs, also called complex genetic diseases, have a heritable genetic component that can be used for pre-symptomatic risk assessment. Common single nucleotide polymorphisms (SNPs) that tag risk haplotypes across the genome currently account for a non-trivial portion of the germ-line genetic risk and we will likely continue to identify the remaining missing heritability in the form of rare variants, copy number variants and epigenetic modifications. Here, we describe a novel measure for calculating the lifetime risk of a disease, called the genetic composite index (GCI), and demonstrate its predictive value as a clinical classifier. The GCI only considers summary statistics of the effects of genetic variation and hence does not require the results of large-scale studies simultaneously assessing multiple risk factors. Combining GCI scores with environmental risk information provides an additional tool for clinical decision-making. The GCI can be populated with heritable risk information of any type, and thus represents a framework for CNCD pre-symptomatic risk assessment that can be populated as additional risk information is identified through next-generation technologies.


Subject(s)
Genetic Diseases, Inborn/genetics; Polymorphism, Single Nucleotide; Arthritis, Rheumatoid/genetics; Chronic Disease; Crohn Disease/genetics; Diabetes Mellitus, Type 2/genetics; Gene Frequency; Genetic Predisposition to Disease; Genome-Wide Association Study; Genotype; Humans; ROC Curve; Reproducibility of Results; Risk; Risk Assessment
2.
Genome Res ; 17(6): 760-74, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17567995

ABSTRACT

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy). We describe the quantitative and qualitative trade-offs concomitant with alignment method choice and the levels of technical error that need to be accounted for in applications that require multisequence alignments. Using the generated alignments, we identified constrained regions using three different methods. While the different constraint-detecting methods are in general agreement, there are important discrepancies relating to both the underlying alignments and the specific algorithms. However, by integrating the results across the alignments and constraint-detecting methods, we produced constraint annotations that were found to be robust based on multiple independent measures. Analyses of these annotations illustrate that most classes of experimentally annotated functional elements are enriched for constrained sequences; however, large portions of each class (with the exception of protein-coding sequences) do not overlap constrained regions. The latter elements might not be under primary sequence constraint, might not be constrained across all mammals, or might have expendable molecular functions. Conversely, 40% of the constrained sequences do not overlap any of the functional elements that have been experimentally identified. Together, these findings demonstrate and quantify how many genomic functional elements await basic molecular characterization.


Subject(s)
Evolution, Molecular; Genome, Human; Mammals/genetics; Open Reading Frames; Phylogeny; Sequence Alignment; Animals; Human Genome Project; Humans
3.
Genome Biol ; 8(6): R118, 2007.
Article in English | MEDLINE | ID: mdl-17578567

ABSTRACT

BACKGROUND: Gene regulation is considered one of the driving forces of evolution. Although protein-coding DNA sequences and RNA genes have been subject to recent evolutionary events in the human lineage, it has been hypothesized that the large phenotypic divergence between humans and chimpanzees has been driven mainly by changes in gene regulation rather than altered protein-coding gene sequences. Comparative analysis of vertebrate genomes has revealed an abundance of evolutionarily conserved but noncoding sequences. These conserved noncoding (CNC) sequences may well harbor critical regulatory variants that have driven recent human evolution. RESULTS: Here we identify 1,356 CNC sequences that appear to have undergone dramatic human-specific changes in selective pressures, at least 15% of which have substitution rates significantly above that expected under neutrality. The 1,356 'accelerated CNC' (ANC) sequences are enriched in recent segmental duplications, suggesting a recent change in selective constraint following duplication. In addition, single nucleotide polymorphisms within ANC sequences have a significant excess of high frequency derived alleles and high F_ST values relative to controls, indicating that acceleration and positive selection are recent in human populations. Finally, a significant number of single nucleotide polymorphisms within ANC sequences are associated with changes in gene expression. The probability of variation in an ANC sequence being associated with a gene expression phenotype is fivefold higher than variation in a control CNC sequence. CONCLUSION: Our analysis suggests that ANC sequences have until very recently played a role in human evolution, potentially through lineage-specific changes in gene regulation.


Subject(s)
Evolution, Molecular; Gene Expression Regulation; Genome, Human; Regulatory Sequences, Nucleic Acid; Animals; Base Sequence; Conserved Sequence; Genome; Humans; Macaca; Pan troglodytes; Polymorphism, Single Nucleotide; Selection, Genetic; Sequence Analysis, DNA
4.
Nucleic Acids Res ; 35(Database issue): D716-20, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17151077

ABSTRACT

The variation resources within the University of California Santa Cruz Genome Browser include polymorphism data drawn from public collections and analyses of these data, along with their display in the context of other genomic annotations. Primary data from dbSNP is included for many organisms, with added information including genomic alleles and orthologous alleles for closely related organisms. Display filtering and coloring is available by variant type, functional class or other annotations. Annotation of potential errors is highlighted and a genomic alignment of the variant's flanking sequence is displayed. HapMap allele frequencies and linkage disequilibrium (LD) are available for each HapMap population, along with non-human primate alleles. The browsing and analysis tools, downloadable data files and links to documentation and other information can be found at http://genome.ucsc.edu/.


Subject(s)
Databases, Nucleic Acid; Polymorphism, Single Nucleotide; Alleles; Animals; Gene Frequency; Genomics; Genotype; Humans; Internet; Linkage Disequilibrium; Mice; Rats; Recombination, Genetic; Sequence Alignment; User-Computer Interface
5.
Nucleic Acids Res ; 35(Database issue): D663-7, 2007 Jan.
Article in English | MEDLINE | ID: mdl-17166863

ABSTRACT

The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.


Subject(s)
Databases, Nucleic Acid; Genome, Human; Genomics; Base Sequence; Humans; Internet; Sequence Alignment; Software; User-Computer Interface
6.
Nat Genet ; 38(2): 223-7, 2006 Feb.
Article in English | MEDLINE | ID: mdl-16380714

ABSTRACT

Noncoding genetic variants are likely to influence human biology and disease, but recognizing functional noncoding variants is difficult. Approximately 3% of noncoding sequence is conserved among distantly related mammals, suggesting that these evolutionarily conserved noncoding regions (CNCs) are selectively constrained and contain functional variation. However, CNCs could also merely represent regions with lower local mutation rates. Here we address this issue and show that CNCs are selectively constrained in humans by analyzing HapMap genotype data. Specifically, new (derived) alleles of SNPs within CNCs are rarer than new alleles in nonconserved regions (P = 3 × 10^-18), indicating that evolutionary pressure has suppressed CNC-derived allele frequencies. Intronic CNCs and CNCs near genes show greater allele frequency shifts, with magnitudes comparable to those for missense variants. Thus, conserved noncoding variants are more likely to be functional. Allele frequency distributions highlight selectively constrained genomic regions that should be intensively surveyed for functionally important variation.


Subject(s)
Conserved Sequence/genetics; Mutation/genetics; Selection, Genetic; Gene Frequency/genetics; Humans; Polymorphism, Single Nucleotide/genetics; Population Groups/genetics
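
The comparison at the heart of entry 6 (derived alleles being rarer inside conserved noncoding regions than outside them) can be illustrated with a minimal sketch. It assumes biallelic SNPs whose ancestral allele is already known; the function names and the input layout are hypothetical, and the abstract does not say which statistical test produced the reported P value, so none is implemented here.

```python
def derived_allele_frequency(allele_counts, ancestral):
    """Return the frequency of the derived (non-ancestral) allele.

    allele_counts: dict mapping each of the two alleles to its observed count.
    ancestral: the allele assumed to be ancestral (e.g. inferred from an outgroup).
    """
    if len(allele_counts) != 2 or ancestral not in allele_counts:
        raise ValueError("expected a biallelic SNP that includes the ancestral allele")
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no observed chromosomes")
    derived = total - allele_counts[ancestral]
    return derived / total


def mean_daf(snps):
    """Mean derived allele frequency over a list of (allele_counts, ancestral) pairs."""
    freqs = [derived_allele_frequency(counts, anc) for counts, anc in snps]
    return sum(freqs) / len(freqs)


# Toy data (invented for illustration only, not from the paper).
conserved = [({"A": 95, "G": 5}, "A"), ({"C": 90, "T": 10}, "C")]
nonconserved = [({"A": 60, "G": 40}, "A"), ({"C": 70, "T": 30}, "C")]
shift = mean_daf(nonconserved) - mean_daf(conserved)
```

A positive `shift` in real data would point in the direction the abstract describes; judging significance would require a proper test on the full frequency distributions.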
7.
Genome Res ; 15(11): 1519-34, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16251462

ABSTRACT

We use genotype data generated by the International HapMap Project to dissect the relationship between sequence features and the degree of linkage disequilibrium in the genome. We show that variation in linkage disequilibrium is broadly similar across populations and examine sequence landscape in regions of strong and weak disequilibrium. Linkage disequilibrium is generally low within approximately 15 Mb of the telomeres of each chromosome and noticeably elevated in large, duplicated regions of the genome as well as within approximately 5 Mb of centromeres and other heterochromatic regions. At a broad scale (100-1000 kb resolution), our results show that regions of strong linkage disequilibrium are typically GC poor and have reduced polymorphism. In addition, these regions are enriched for LINE repeats, but have fewer SINE, DNA, and simple repeats than the rest of the genome. At a fine scale, we examine the sequence composition of "hotspots" for the rapid breakdown of linkage disequilibrium and show that they are enriched in SINEs, in simple repeats, and in sequences that are conserved between species. Regions of high and low linkage disequilibrium (the top and bottom quartiles of the genome) have a higher density of genes and coding bases than the rest of the genome. Closer examination of the data shows that whereas some types of genes (including genes involved in immune response and sensory perception) are typically located in regions of low linkage disequilibrium, other genes (including those involved in DNA and RNA metabolism, response to DNA damage, and the cell cycle) are preferentially located in regions of strong linkage disequilibrium. Our results provide a detailed analysis of the relationship between sequence features and linkage disequilibrium and suggest an evolutionary justification for the heterogeneity in linkage disequilibrium in the genome.


Subject(s)
Chromosomes, Human/genetics; Genetic Variation; Genome, Human/genetics; Genomics/methods; Linkage Disequilibrium/genetics; Models, Genetic; Base Composition; Computational Biology/methods; Gene Frequency; Haplotypes/genetics; Humans; Multivariate Analysis; Short Interspersed Nucleotide Elements/genetics; Statistics, Nonparametric
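
Entry 7 concerns the degree of linkage disequilibrium between markers. As a rough illustration, the standard pairwise measures D, D' and r² can be computed from phased haplotype counts at two biallelic loci. The abstract does not state which measure the authors used, and the function name and argument layout below are assumptions made for this sketch.

```python
def ld_stats(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D_prime, r_squared) from the four haplotype counts.

    n_AB, n_Ab, n_aB, n_ab: counts of haplotypes carrying alleles A/a at the
    first locus and B/b at the second. D_prime and r_squared are None when
    either locus is monomorphic in the sample.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes observed")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total

    # D: deviation of the AB haplotype frequency from independence.
    d = p_AB - p_A * p_B

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        return d, None, None

    r_squared = d * d / denom

    # D': D scaled by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max
    return d, d_prime, r_squared
```

For example, `ld_stats(50, 0, 0, 50)` describes two loci in complete disequilibrium, while `ld_stats(25, 25, 25, 25)` describes two independent loci.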
8.
Genome Res ; 15(11): 1553-65, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16251465

ABSTRACT

The allele frequency spectrum of polymorphisms in DNA sequences can be used to test for signatures of natural selection that depart from the expected frequency spectrum under the neutral theory. We observed a significant (P = 0.001) correlation between the Tajima's D test statistic in full resequencing data and Tajima's D in a dense, genome-wide data set of genotyped polymorphisms for a set of 179 genes. Based on this, we used a sliding window analysis of Tajima's D across the human genome to identify regions putatively subject to strong, recent, selective sweeps. This survey identified seven Contiguous Regions of Tajima's D Reduction (CRTRs) in an African-descent population (AD), 23 in a European-descent population (ED), and 29 in a Chinese-descent population (XD). Only four CRTRs overlapped between populations: three between ED and XD and one between AD and ED. Full resequencing of eight genes within six CRTRs demonstrated frequency spectra inconsistent with neutral expectations for at least one gene within each CRTR. Identification of the functional polymorphism (and/or haplotype) responsible for the selective sweeps within each CRTR may provide interesting insights into the strongest selective pressures experienced by the human genome over recent evolutionary history.


Subject(s)
Evolution, Molecular; Genes/genetics; Genome, Human/genetics; Polymorphism, Genetic; Selection, Genetic; Black or African American/genetics; Asian/genetics; Base Sequence; DNA Primers; Gene Frequency; Genomics/methods; Genotype; Humans; Molecular Sequence Data; Sequence Analysis, DNA; White People/genetics
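
The sliding-window scan described in entry 8 can be sketched as follows, using the standard Tajima's D formula (Tajima, 1989). This is a minimal illustration, not the authors' pipeline: the function names, the `(position, derived_count)` input layout, and the `window` and `step` parameters are assumptions, and no correction is made for SNP ascertainment in genotyped data.

```python
from math import sqrt


def tajimas_d(n, segregating_sites, pi):
    """Tajima's D for n sampled chromosomes, S segregating sites, and
    pi = mean number of pairwise differences. Returns None if undefined."""
    s = segregating_sites
    if n < 4 or s == 0:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    # Numerator: pi minus Watterson's estimator S / a1.
    return (pi - s / a1) / sqrt(variance)


def sliding_tajimas_d(sites, n, window, step):
    """Scan sorted (position, derived_count) pairs in half-open windows.

    Returns a list of (window_start, D); D is None for windows without
    segregating sites.
    """
    if not sites:
        return []
    results = []
    start, last = sites[0][0], sites[-1][0]
    while start <= last:
        end = start + window
        # Keep only sites that are polymorphic in the sample.
        counts = [k for pos, k in sites if start <= pos < end and 0 < k < n]
        # Per-site pairwise diversity for a biallelic site with k derived copies.
        pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)
        results.append((start, tajimas_d(n, len(counts), pi)))
        start += step
    return results
```

An excess of rare variants within a window gives a negative D, which is the kind of reduction the abstract's CRTRs refer to; an excess of intermediate-frequency variants gives a positive D.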
9.
Bioinformatics ; 21(12): 2814-20, 2005 Jun 15.
Article in English | MEDLINE | ID: mdl-15827081

ABSTRACT

MOTIVATION: The NCBI dbSNP database lists over 9 million single nucleotide polymorphisms (SNPs) in the human genome, but currently contains limited annotation information. SNPs that result in amino acid residue changes (nsSNPs) are of critical importance in variation between individuals, including disease and drug sensitivity. RESULTS: We have developed LS-SNP, a genomic scale software pipeline to annotate nsSNPs. LS-SNP comprehensively maps nsSNPs onto protein sequences, functional pathways and comparative protein structure models, and predicts positions where nsSNPs destabilize proteins, interfere with the formation of domain-domain interfaces, have an effect on protein-ligand binding or severely impact human health. It currently annotates 28,043 validated SNPs that produce amino acid residue substitutions in human proteins from the SwissProt/TrEMBL database. Annotations can be viewed via a web interface either in the context of a genomic region or by selecting sets of SNPs, genes, proteins or pathways. These results are useful for identifying candidate functional SNPs within a gene, haplotype or pathway and in probing molecular mechanisms responsible for functional impacts of nsSNPs. AVAILABILITY: http://www.salilab.org/LS-SNP CONTACT: rachelk@salilab.org SUPPLEMENTARY INFORMATION: http://salilab.org/LS-SNP/supp-info.pdf.


Subject(s)
Chromosome Mapping/methods; Databases, Genetic; Polymorphism, Single Nucleotide/genetics; Proteins/chemistry; Proteins/genetics; Sequence Analysis, DNA/methods; Sequence Analysis, Protein/methods; Software; Algorithms; Database Management Systems; Information Storage and Retrieval/methods; Open Reading Frames/genetics; Proteins/analysis; Sequence Alignment/methods; Systems Integration
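
Entry 9 centers on SNPs that change an amino acid residue (nsSNPs). The basic check for whether a coding SNP is nonsynonymous can be sketched with the standard genetic code, as below. This is only an illustration of the concept and says nothing about how LS-SNP itself is implemented; the function name and its arguments are hypothetical.

```python
# Standard genetic code, built with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snp(codon, offset, alt_base):
    """Classify a single-base substitution within a codon.

    codon: reference codon on the coding strand (three DNA letters).
    offset: 0-based position of the SNP within the codon.
    alt_base: the alternative allele at that position.
    Returns 'synonymous', 'nonsynonymous', or 'nonsense'.
    """
    codon = codon.upper()
    alt_base = alt_base.upper()
    if codon not in CODON_TABLE or alt_base not in BASES or offset not in (0, 1, 2):
        raise ValueError("invalid codon, offset, or base")
    alt_codon = codon[:offset] + alt_base + codon[offset + 1:]
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"
```

For instance, `classify_coding_snp("GAG", 1, "T")` turns a glutamate codon into a valine codon and is reported as nonsynonymous.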