1.
Nature ; 477(7364): 326-9, 2011 Sep 14.
Article in English | MEDLINE | ID: mdl-21921916

ABSTRACT

Structural variation is widespread in mammalian genomes and is an important cause of disease, but just how abundant and important structural variants (SVs) are in shaping phenotypic variation remains unclear. Without knowing how many SVs there are, and how they arise, it is difficult to discover what they do. Combining experimental with automated analyses, we identified 711,920 SVs at 281,243 sites in the genomes of thirteen classical and four wild-derived inbred mouse strains. The majority of SVs are less than 1 kilobase in size and 98% are deletions or insertions. The breakpoints of 160,000 SVs were mapped to base pair resolution, allowing us to infer that insertion of retrotransposons causes more than half of SVs. Yet, despite their prevalence, SVs are less likely than other sequence variants to cause gene expression or quantitative phenotypic variation. We identified 24 SVs that disrupt coding exons, acting as rare variants of large effect on gene function. One-third of the genes so affected have immunological functions.


Subjects
Genetic Variation/genetics , Genome/genetics , Mice, Inbred Strains/genetics , Phenotype , Animals , Chromosome Breakpoints , Exons/genetics , Female , Gene Expression , Genomics , Genotype , Male , Mice , Mice, Inbred Strains/immunology , Mutagenesis, Insertional/genetics , Quantitative Trait Loci/genetics , Rats , Retroelements/genetics , Sequence Deletion/genetics
2.
Nature ; 477(7364): 289-94, 2011 Sep 14.
Article in English | MEDLINE | ID: mdl-21921910

ABSTRACT

We report genome sequences of 17 inbred strains of laboratory mice and identify almost ten times more variants than previously known. We use these genomes to explore the phylogenetic history of the laboratory mouse and to examine the functional consequences of allele-specific variation on transcript abundance, revealing that at least 12% of transcripts show a significant tissue-specific expression bias. By identifying candidate functional variants at 718 quantitative trait loci we show that the molecular nature of functional variants and their position relative to genes vary according to the effect size of the locus. These sequences provide a starting point for a new era in the functional analysis of a key model organism.


Subjects
Gene Expression Regulation/genetics , Genetic Variation/genetics , Genome/genetics , Mice, Inbred Strains/genetics , Mice/genetics , Phenotype , Alleles , Animals , Animals, Laboratory/genetics , Genomics , Mice/classification , Mice, Inbred C57BL/genetics , Phylogeny , Quantitative Trait Loci/genetics
3.
PLoS Genet ; 8(10): e1002970, 2012.
Article in English | MEDLINE | ID: mdl-23055942

ABSTRACT

The genes involved in conferring susceptibility to anxiety remain obscure. We developed a new method to identify genes at quantitative trait loci (QTLs) in a population of heterogeneous stock mice descended from known progenitor strains. QTLs were partitioned into intervals that can be summarized by a single phylogenetic tree among progenitors and intervals tested for consistency with alleles influencing anxiety at each QTL. By searching for common Gene Ontology functions in candidate genes positioned within those intervals, we identified actin depolymerizing factors (ADFs), including cofilin-1 (Cfl1), as genes involved in regulating anxiety in mice. There was no enrichment for function in the totality of genes under each QTL, indicating the importance of phylogenetic filtering. We confirmed experimentally that forebrain-specific inactivation of Cfl1 decreased anxiety in knockout mice. Our results indicate that similarity of function of mammalian genes can be used to recognize key genetic regulators of anxiety and potentially of other emotional behaviours.


Subjects
Anxiety/genetics , Cofilin 1/genetics , Animals , Male , Maze Learning , Mice , Mice, Knockout , Molecular Sequence Annotation , Mutation , Phylogeny , Prosencephalon/metabolism , Quantitative Trait Loci
4.
Genome Res ; 21(6): 936-9, 2011 Jun.
Article in English | MEDLINE | ID: mdl-20980556

ABSTRACT

High-volume sequencing of DNA and RNA is now within reach of any research laboratory and is quickly becoming established as a key research tool. In many workflows, each of the short sequences ("reads") resulting from a sequencing run is first "mapped" (aligned) to a reference sequence to infer the genomic location from which the read derived, a challenging task because of the high data volumes and often large genomes. Existing read mapping software excels in either speed (e.g., BWA, Bowtie, ELAND) or sensitivity (e.g., Novoalign), but not in both. In addition, performance often deteriorates in the presence of sequence variation, particularly so for short insertions and deletions (indels). Here, we present a read mapper, Stampy, which uses a hybrid mapping algorithm and a detailed statistical model to achieve both speed and sensitivity, particularly when reads include sequence variation. This results in a higher useable sequence yield and improved accuracy compared to that of existing software.


Subjects
Algorithms , Models, Statistical , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Sensitivity and Specificity
5.
Bioinformatics ; 29(16): 2046-8, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23782611

ABSTRACT

MOTIVATION: A common question in genomic analysis is whether two sets of genomic intervals overlap significantly. This question arises, for example, when interpreting ChIP-Seq or RNA-Seq data in functional terms. Because genome organization is complex, answering this question is non-trivial. SUMMARY: We present Genomic Association Test (GAT), a tool for estimating the significance of overlap between multiple sets of genomic intervals. GAT implements a null model that the two sets of intervals are placed independently of one another, but allows each set's density to depend on external variables, for example, isochore structure or chromosome identity. GAT estimates statistical significance based on simulation and controls for multiple tests using the false discovery rate. AVAILABILITY: GAT's source code, documentation and tutorials are available at http://code.google.com/p/genomic-association-tester.
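The simulation-based idea in this abstract can be illustrated with a minimal Python sketch. It is not GAT's implementation: it assumes a single chromosome of length `workspace_len`, places segments uniformly at random (ignoring isochores and other covariates the abstract mentions), and omits the false discovery rate step. All function and parameter names here are hypothetical.

```python
import random


def merge(intervals):
    """Sort half-open (start, end) intervals and merge any that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bp(a, b):
    """Bases shared by two sorted, internally non-overlapping interval lists."""
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def shuffle_segments(segments, workspace_len, rng):
    """Null model (assumed): each segment keeps its length, position is uniform."""
    placed = []
    for start, end in segments:
        length = end - start
        new_start = rng.randrange(0, workspace_len - length + 1)
        placed.append((new_start, new_start + length))
    return merge(placed)


def overlap_pvalue(segments, annotations, workspace_len, n_sim=1000, seed=0):
    """Return (observed overlap in bp, empirical one-sided p-value)."""
    rng = random.Random(seed)
    segments = merge(segments)
    annotations = merge(annotations)
    observed = overlap_bp(segments, annotations)
    hits = sum(
        overlap_bp(shuffle_segments(segments, workspace_len, rng), annotations)
        >= observed
        for _ in range(n_sim)
    )
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)
```

A call such as `overlap_pvalue(peaks, genes, workspace_len=1_000_000)` would return the observed overlap and an empirical p-value; when many annotation sets are tested, the resulting p-values would still need a multiple-testing correction such as the false discovery rate control the abstract describes.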


Subjects
Genomics/methods , Software , Binding Sites , Chromatin Immunoprecipitation , Computer Simulation , Deoxyribonuclease I , Sequence Analysis, DNA , Transcription Factors/metabolism
6.
PLoS Genet ; 6(9): e1001085, 2010 Sep 02.
Article in English | MEDLINE | ID: mdl-20838427

ABSTRACT

Genome-wide association studies using commercially available outbred mice can detect genes involved in phenotypes of biomedical interest. Useful populations need high-frequency alleles to ensure high power to detect quantitative trait loci (QTLs), low linkage disequilibrium between markers to obtain accurate mapping resolution, and an absence of population structure to prevent false positive associations. We surveyed 66 colonies for inbreeding, genetic diversity, and linkage disequilibrium, and we demonstrate that some have haplotype blocks of less than 100 Kb, enabling gene-level mapping resolution. The same alleles contribute to variation in different colonies, so that when mapping progress stalls in one, another can be used in its stead. Colonies are genetically diverse: 45% of the total genetic variation is attributable to differences between colonies. However, quantitative differences in allele frequencies, rather than the existence of private alleles, are responsible for these population differences. The colonies derive from a limited pool of ancestral haplotypes resembling those found in inbred strains: over 95% of sequence variants segregating in outbred populations are found in inbred strains. Consequently it is possible to impute the sequence of any mouse from a dense SNP map combined with inbred strain sequence data, which opens up the possibility of cataloguing and testing all variants for association, a situation that has so far eluded studies in completely outbred populations. We demonstrate the colonies' potential by identifying a deletion in the promoter of H2-Ea as the molecular change that strongly contributes to setting the ratio of CD4+ and CD8+ lymphocytes.


Subjects
Animals, Outbred Strains/genetics , Genome-Wide Association Study , Animals , Animals, Laboratory/genetics , Chromosome Mapping , Genetic Drift , Genetic Markers , Genetic Variation/genetics , Genetics, Population , Haplotypes/genetics , Inbreeding , Linkage Disequilibrium/genetics , Mice , Phenotype , Phylogeny , Quantitative Trait Loci/genetics , Sequence Analysis, DNA
7.
PLoS Biol ; 3(1): e7, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15630479

ABSTRACT

In addition to protein coding sequence, the human genome contains a significant amount of regulatory DNA, the identification of which is proving somewhat recalcitrant to both in silico and functional methods. An approach that has been used with some success is comparative sequence analysis, whereby equivalent genomic regions from different organisms are compared in order to identify both similarities and differences. In general, similarities in sequence between highly divergent organisms imply functional constraint. We have used a whole-genome comparison between humans and the pufferfish, Fugu rubripes, to identify nearly 1,400 highly conserved non-coding sequences. Given the evolutionary divergence between these species, it is likely that these sequences are found in, and furthermore are essential to, all vertebrates. Most, and possibly all, of these sequences are located in and around genes that act as developmental regulators. Some of these sequences are over 90% identical across more than 500 bases, being more highly conserved than coding sequence between these two species. Despite this, we cannot find any similar sequences in invertebrate genomes. In order to begin to functionally test this set of sequences, we have used a rapid in vivo assay system using zebrafish embryos that allows tissue-specific enhancer activity to be identified. Functional data is presented for highly conserved non-coding sequences associated with four unrelated developmental regulators (SOX21, PAX6, HLXB9, and SHH), in order to demonstrate the suitability of this screen to a wide range of genes and expression patterns. Of 25 sequence elements tested around these four genes, 23 show significant enhancer activity in one or more tissues. We have identified a set of non-coding sequences that are highly conserved throughout vertebrates. They are found in clusters across the human genome, principally around genes that are implicated in the regulation of development, including many transcription factors. These highly conserved non-coding sequences are likely to form part of the genomic circuitry that uniquely defines vertebrate development.


Subjects
Gene Expression Regulation, Developmental , Genome, Human , Regulatory Sequences, Nucleic Acid , Takifugu/genetics , Animals , Conserved Sequence , Databases, Genetic , Enhancer Elements, Genetic , Eye Proteins/metabolism , Genome , Green Fluorescent Proteins/metabolism , Hedgehog Proteins , High Mobility Group Proteins/metabolism , Homeodomain Proteins/metabolism , Humans , Molecular Sequence Data , Multigene Family , Neoplasm Proteins/metabolism , PAX6 Transcription Factor , Paired Box Transcription Factors/metabolism , Repressor Proteins/metabolism , SOXB2 Transcription Factors , Sequence Analysis, DNA , Species Specificity , Trans-Activators/metabolism , Transcription Factors/metabolism
8.
Genome Biol ; 13(3): R18, 2012.
Article in English | MEDLINE | ID: mdl-22439878

ABSTRACT

BACKGROUND: Accurate catalogs of structural variants (SVs) in mammalian genomes are necessary to elucidate the potential mechanisms that drive SV formation and to assess their functional impact. Next-generation sequencing methods for SV detection are an advance on array-based methods, but are almost exclusively limited to four basic types: deletions, insertions, inversions and copy number gains. RESULTS: By visual inspection of 100 Mbp of genome to which next-generation sequence data from 17 inbred mouse strains had been aligned, we identify and interpret 21 paired-end mapping patterns, which we validate by PCR. These paired-end mapping patterns reveal a greater diversity and complexity in SVs than previously recognized. In addition, Sanger-based sequence analysis of 4,176 breakpoints at 261 SV sites reveals additional complexity at approximately a quarter of structural variants analyzed. We find micro-deletions and micro-insertions at SV breakpoints, ranging from 1 to 107 bp, and SNPs that extend breakpoint micro-homology and may catalyze SV formation. CONCLUSIONS: An integrative approach using experimental analyses to train computational SV calling is essential for the accurate resolution of the architecture of SVs. We find considerable complexity in SV formation; about a quarter of SVs in the mouse are composed of a complex mixture of deletion, insertion, inversion and copy number gain. Computational methods can be adapted to identify most paired-end mapping patterns.
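As a rough illustration of what a paired-end mapping pattern is, the Python sketch below sorts a single read pair into the basic SV signatures named in the abstract. It does not reproduce the 21 patterns the authors describe. It assumes a library in which concordant mates face each other (forward mate upstream of reverse mate), and the `ReadPair` type, the `classify_pair` function and the insert-size thresholds are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int       # leftmost aligned base of mate 1
    strand1: str    # "+" or "-"
    chrom2: str
    pos2: int       # leftmost aligned base of mate 2
    strand2: str
    read_len: int = 100


def classify_pair(pair, min_insert=200, max_insert=600):
    """Label one read pair by orientation and apparent insert size.

    min_insert and max_insert are illustrative bounds; in practice they
    would be estimated from the library's insert-size distribution.
    """
    if pair.chrom1 != pair.chrom2:
        return "inter-chromosomal"
    if pair.strand1 == pair.strand2:
        # Both mates on the same strand: one end lies in inverted sequence.
        return "inversion candidate"
    left, right = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left[1] == "-":
        # Mates point away from each other: typical of a tandem copy gain.
        return "copy number gain candidate"
    span = right[0] + pair.read_len - left[0]
    if span > max_insert:
        # Mates map too far apart: sequence is missing from the sample.
        return "deletion candidate"
    if span < min_insert:
        # Mates map too close together: the sample carries extra sequence.
        return "insertion candidate"
    return "concordant"
```

In a real caller, individual labels like these would be clustered across many read pairs before a variant is reported, and, as the abstract notes, complex SVs can produce combinations of signatures that a single-pair classifier cannot resolve.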


Subjects
Chromosome Mapping/methods , Genome , Mice, Inbred Strains/genetics , Animals , Base Sequence , Chromosome Breakpoints , Gene Dosage , Genetic Variation , Genomics , Mice , Molecular Sequence Data , Mutagenesis, Insertional/genetics , Polymorphism, Single Nucleotide , Sequence Analysis , Sequence Deletion/genetics , Sequence Inversion/genetics
9.
Article in English | MEDLINE | ID: mdl-20483234

ABSTRACT

We recently identified approximately 1400 conserved non-coding elements (CNEs) shared by the genomes of fugu (Takifugu rubripes) and human that appear to be associated with developmental regulation in vertebrates [Woolfe, A., Goodson, M., Goode, D.K., Snell, P., McEwen, G.K., Vavouri, T., Smith, S.F., North, P., Callaway, H., Kelly, K., Walter, K., Abnizova, I., Gilks, W., Edwards, Y.J.K., Cooke, J.E., Elgar, G., 2005. Highly conserved non-coding sequences are associated with vertebrate development. PLoS Biol. 3 (1), e7]. This study encompassed a multi-disciplinary approach using bioinformatics, statistical methods and functional assays to identify and characterise the CNEs. Using an in vivo enhancer assay, over 90% of tested CNEs up-regulate tissue-specific GFP expression. Here we review our group's research in the field of characterising non-coding sequences conserved in vertebrates. We take this opportunity to discuss our research in progress and present some results of new and additional analyses. These include a phylogenomics analysis of CNEs, sequence conservation patterns in vertebrate CNEs and the distribution of human SNPs in the CNEs. We highlight the usefulness of the CNE dataset to help correlate genetic variation in health and disease. We also discuss the functional analysis using the enhancer assay and the enrichment of predicted transcription factor binding sites for two CNEs. Public access to the CNEs plus annotation is now possible and is described. The content of this review was presented by Dr. Y.J.K. Edwards at the TODAI International Symposium on Functional Genomics of the Pufferfish, Tokyo, Japan, 3-6 November 2004.
