1.
Nature ; 477(7364): 326-9, 2011 Sep 14.
Article in English | MEDLINE | ID: mdl-21921916

ABSTRACT

Structural variation is widespread in mammalian genomes and is an important cause of disease, but just how abundant and important structural variants (SVs) are in shaping phenotypic variation remains unclear. Without knowing how many SVs there are, and how they arise, it is difficult to discover what they do. Combining experimental with automated analyses, we identified 711,920 SVs at 281,243 sites in the genomes of thirteen classical and four wild-derived inbred mouse strains. The majority of SVs are less than 1 kilobase in size and 98% are deletions or insertions. The breakpoints of 160,000 SVs were mapped to base pair resolution, allowing us to infer that insertion of retrotransposons causes more than half of SVs. Yet, despite their prevalence, SVs are less likely than other sequence variants to cause gene expression or quantitative phenotypic variation. We identified 24 SVs that disrupt coding exons, acting as rare variants of large effect on gene function. One-third of the genes so affected have immunological functions.


Subject(s)
Genetic Variation/genetics , Genome/genetics , Mice, Inbred Strains/genetics , Phenotype , Animals , Chromosome Breakpoints , Exons/genetics , Female , Gene Expression , Genomics , Genotype , Male , Mice , Mice, Inbred Strains/immunology , Mutagenesis, Insertional/genetics , Quantitative Trait Loci/genetics , Rats , Retroelements/genetics , Sequence Deletion/genetics
2.
Nature ; 477(7364): 289-94, 2011 Sep 14.
Article in English | MEDLINE | ID: mdl-21921910

ABSTRACT

We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.


Subject(s)
Gene Expression Regulation/genetics , Genetic Variation/genetics , Genome/genetics , Mice, Inbred Strains/genetics , Mice/genetics , Phenotype , Alleles , Animals , Animals, Laboratory/genetics , Genomics , Mice/classification , Mice, Inbred C57BL/genetics , Phylogeny , Quantitative Trait Loci/genetics
3.
PLoS Genet ; 8(10): e1002970, 2012.
Article in English | MEDLINE | ID: mdl-23055942

ABSTRACT

The genes involved in conferring susceptibility to anxiety remain obscure. We developed a new method to identify genes at quantitative trait loci (QTLs) in a population of heterogeneous stock mice descended from known progenitor strains. QTLs were partitioned into intervals that can be summarized by a single phylogenetic tree among progenitors and intervals tested for consistency with alleles influencing anxiety at each QTL. By searching for common Gene Ontology functions in candidate genes positioned within those intervals, we identified actin depolymerizing factors (ADFs), including cofilin-1 (Cfl1), as genes involved in regulating anxiety in mice. There was no enrichment for function in the totality of genes under each QTL, indicating the importance of phylogenetic filtering. We confirmed experimentally that forebrain-specific inactivation of Cfl1 decreased anxiety in knockout mice. Our results indicate that similarity of function of mammalian genes can be used to recognize key genetic regulators of anxiety and potentially of other emotional behaviours.


Subject(s)
Anxiety/genetics , Cofilin 1/genetics , Animals , Male , Maze Learning , Mice , Mice, Knockout , Molecular Sequence Annotation , Mutation , Phylogeny , Prosencephalon/metabolism , Quantitative Trait Loci
4.
Genome Res ; 21(6): 936-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-20980556

ABSTRACT

High-volume sequencing of DNA and RNA is now within reach of any research laboratory and is quickly becoming established as a key research tool. In many workflows, each of the short sequences ("reads") resulting from a sequencing run is first "mapped" (aligned) to a reference sequence to infer the genomic location from which the read derived, a challenging task because of the high data volumes and often large genomes. Existing read mapping software excels in either speed (e.g., BWA, Bowtie, ELAND) or sensitivity (e.g., Novoalign), but not in both. In addition, performance often deteriorates in the presence of sequence variation, particularly so for short insertions and deletions (indels). Here, we present a read mapper, Stampy, which uses a hybrid mapping algorithm and a detailed statistical model to achieve both speed and sensitivity, particularly when reads include sequence variation. This results in a higher useable sequence yield and improved accuracy compared to that of existing software.


Subject(s)
Algorithms , Models, Statistical , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Sensitivity and Specificity
5.
Bioinformatics ; 29(16): 2046-8, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23782611

ABSTRACT

MOTIVATION: A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. This question arises, for example, when interpreting ChIP-Seq or RNA-Seq data in functional terms. Because genome organization is complex, answering this question is non-trivial. SUMMARY: We present Genomic Association Test (GAT), a tool for estimating the significance of overlap between multiple sets of genomic intervals. GAT implements a null model that the two sets of intervals are placed independently of one another, but allows each set's density to depend on external variables, for example, isochore structure or chromosome identity. GAT estimates statistical significance based on simulation and controls for multiple tests using the false discovery rate. AVAILABILITY: GAT's source code, documentation and tutorials are available at http://code.google.com/p/genomic-association-tester.


Subject(s)
Genomics/methods , Software , Binding Sites , Chromatin Immunoprecipitation , Computer Simulation , Deoxyribonuclease I , Sequence Analysis, DNA , Transcription Factors/metabolism
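
The simulation-based significance estimate described in the abstract above can be illustrated with a minimal sketch. This is not GAT's code or its API: the function names, the single-chromosome workspace, and the uniform placement of segments are all assumptions made for illustration, and the sketch omits GAT's density adjustment for external variables (such as isochore structure) and its false discovery rate control.

```python
import random


def overlap_bases(segments, annotations):
    """Sum the bases shared between two lists of half-open (start, end) intervals.

    Overlap is counted per segment/annotation pair, so if intervals within
    one list overlap each other, shared bases are counted more than once.
    """
    total = 0
    for s_start, s_end in segments:
        for a_start, a_end in annotations:
            total += max(0, min(s_end, a_end) - max(s_start, a_start))
    return total


def simulate_p_value(segments, annotations, workspace_length, n_sim=1000, seed=0):
    """Empirical p-value for the observed overlap under a simple null model.

    Null model (an assumption of this sketch): each segment keeps its length
    and is placed uniformly at random, independently of the annotations,
    inside a workspace spanning [0, workspace_length).
    """
    rng = random.Random(seed)
    observed = overlap_bases(segments, annotations)
    lengths = [end - start for start, end in segments]
    hits = 0
    for _ in range(n_sim):
        sampled = []
        for length in lengths:
            start = rng.randint(0, workspace_length - length)
            sampled.append((start, start + length))
        if overlap_bases(sampled, annotations) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    segments = [(100, 200)]
    annotations = [(100, 200)]
    observed, p_value = simulate_p_value(segments, annotations, workspace_length=1000)
    print(observed, p_value)
```

With several interval sets, one such p-value would be computed per pair and then corrected for multiple testing, which the abstract says GAT does using the false discovery rate.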
6.
PLoS Genet ; 6(9): e1001085, 2010 Sep 02.
Article in English | MEDLINE | ID: mdl-20838427

ABSTRACT

Genome-wide association studies using commercially available outbred mice can detect genes involved in phenotypes of biomedical interest. Useful populations need high-frequency alleles to ensure high power to detect quantitative trait loci (QTLs), low linkage disequilibrium between markers to obtain accurate mapping resolution, and an absence of population structure to prevent false positive associations. We surveyed 66 colonies for inbreeding, genetic diversity, and linkage disequilibrium, and we demonstrate that some have haplotype blocks of less than 100 Kb, enabling gene-level mapping resolution. The same alleles contribute to variation in different colonies, so that when mapping progress stalls in one, another can be used in its stead. Colonies are genetically diverse: 45% of the total genetic variation is attributable to differences between colonies. However, quantitative differences in allele frequencies, rather than the existence of private alleles, are responsible for these population differences. The colonies derive from a limited pool of ancestral haplotypes resembling those found in inbred strains: over 95% of sequence variants segregating in outbred populations are found in inbred strains. Consequently it is possible to impute the sequence of any mouse from a dense SNP map combined with inbred strain sequence data, which opens up the possibility of cataloguing and testing all variants for association, a situation that has so far eluded studies in completely outbred populations. We demonstrate the colonies' potential by identifying a deletion in the promoter of H2-Ea as the molecular change that strongly contributes to setting the ratio of CD4+ and CD8+ lymphocytes.


Subject(s)
Animals, Outbred Strains/genetics , Genome-Wide Association Study , Animals , Animals, Laboratory/genetics , Chromosome Mapping , Genetic Drift , Genetic Markers , Genetic Variation/genetics , Genetics, Population , Haplotypes/genetics , Inbreeding , Linkage Disequilibrium/genetics , Mice , Phenotype , Phylogeny , Quantitative Trait Loci/genetics , Sequence Analysis, DNA
7.
PLoS Biol ; 3(1): e7, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15630479

ABSTRACT

In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development, including many transcription factors. These highly conserved non-coding sequences are likely to form part of the genomic circuitry that uniquely defines vertebrate development.


Subject(s)
Gene Expression Regulation, Developmental , Genome, Human , Regulatory Sequences, Nucleic Acid , Takifugu/genetics , Animals , Conserved Sequence , Databases, Genetic , Enhancer Elements, Genetic , Eye Proteins/metabolism , Genome , Green Fluorescent Proteins/metabolism , Hedgehog Proteins , High Mobility Group Proteins/metabolism , Homeodomain Proteins/metabolism , Humans , Molecular Sequence Data , Multigene Family , Neoplasm Proteins/metabolism , PAX6 Transcription Factor , Paired Box Transcription Factors/metabolism , Repressor Proteins/metabolism , SOXB2 Transcription Factors , Sequence Analysis, DNA , Species Specificity , Trans-Activators/metabolism , Transcription Factors/metabolism
8.
Genome Biol ; 13(3): R18, 2012.
Article in English | MEDLINE | ID: mdl-22439878

ABSTRACT

BACKGROUND: Accurate catalogs of structural variants (SVs) in mammalian genomes are necessary to elucidate the potential mechanisms that drive SV formation and to assess their functional impact. Next generation sequencing methods for SV detection are an advance on array-based methods, but are almost exclusively limited to four basic types: deletions, insertions, inversions and copy number gains. RESULTS: By visual inspection of 100 Mbp of genome to which next generation sequence data from 17 inbred mouse strains had been aligned, we identify and interpret 21 paired-end mapping patterns, which we validate by PCR. These paired-end mapping patterns reveal a greater diversity and complexity in SVs than previously recognized. In addition, Sanger-based sequence analysis of 4,176 breakpoints at 261 SV sites reveals additional complexity at approximately a quarter of structural variants analyzed. We find micro-deletions and micro-insertions at SV breakpoints, ranging from 1 to 107 bp, and SNPs that extend breakpoint micro-homology and may catalyze SV formation. CONCLUSIONS: An integrative approach using experimental analyses to train computational SV calling is essential for the accurate resolution of the architecture of SVs. We find considerable complexity in SV formation; about a quarter of SVs in the mouse are composed of a complex mixture of deletion, insertion, inversion and copy number gain. Computational methods can be adapted to identify most paired-end mapping patterns.


Subject(s)
Chromosome Mapping/methods , Genome , Mice, Inbred Strains/genetics , Animals , Base Sequence , Chromosome Breakpoints , Gene Dosage , Genetic Variation , Genomics , Mice , Molecular Sequence Data , Mutagenesis, Insertional/genetics , Polymorphism, Single Nucleotide , Sequence Analysis , Sequence Deletion/genetics , Sequence Inversion/genetics
9.
Article in English | MEDLINE | ID: mdl-20483234

ABSTRACT

We recently identified approximately 1400 conserved non-coding elements (CNEs) shared by the genomes of fugu (Takifugu rubripes) and human that appear to be associated with developmental regulation in vertebrates [Woolfe, A., Goodson, M., Goode, D.K., Snell, P., McEwen, G.K., Vavouri, T., Smith, S.F., North, P., Callaway, H., Kelly, K., Walter, K., Abnizova, I., Gilks, W., Edwards, Y.J.K., Cooke, J.E., Elgar, G., 2005. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3 (1), e7]. This study encompassed a multi-disciplinary approach using bioinformatics, statistical methods and functional assays to identify and characterise the CNEs. Using an in vivo enhancer assay, over 90% of tested CNEs up-regulate tissue-specific GFP expression. Here we review our group's research in the field of characterising non-coding sequences conserved in vertebrates. We take this opportunity to discuss our research in progress and present some results of new and additional analyses. These include a phylogenomics analysis of CNEs, sequence conservation patterns in vertebrate CNEs and the distribution of human SNPs in the CNEs. We highlight the usefulness of the CNE dataset to help correlate genetic variation in health and disease. We also discuss the functional analysis using the enhancer assay and the enrichment of predicted transcription factor binding sites for two CNEs. Public access to the CNEs plus annotation is now possible and is described. The content of this review was presented by Dr. Y.J.K. Edwards at the TODAI International Symposium on Functional Genomics of the Pufferfish, Tokyo, Japan, 3-6 November 2004.
