Results 1 - 20 of 33
1.
Cell ; 164(5): 999-1014, 2016 Feb 25.
Article in English | MEDLINE | ID: mdl-26875865

ABSTRACT

Transcription factors (TFs) are thought to function with partners to achieve specificity and precise quantitative outputs. In the developing heart, heterotypic TF interactions, such as between the T-box TF TBX5 and the homeodomain TF NKX2-5, have been proposed as a mechanism for human congenital heart defects. We report extensive and complex interdependent genomic occupancy of TBX5, NKX2-5, and the zinc finger TF GATA4 coordinately controlling cardiac gene expression, differentiation, and morphogenesis. Interdependent binding serves not only to co-regulate gene expression but also to prevent TFs from distributing to ectopic loci and activating lineage-inappropriate genes. We define preferential motif arrangements for TBX5 and NKX2-5 cooperative binding sites, supported at the atomic level by their co-crystal structure bound to DNA, revealing a direct interaction between the two factors and induced DNA bending. Complex interdependent binding mechanisms reveal tightly regulated TF genomic distribution and define a combinatorial logic for heterotypic TF regulation of differentiation.


Subjects
GATA4 Transcription Factor/metabolism; Homeodomain Proteins/metabolism; Myocardium/cytology; Organogenesis; T-Box Domain Proteins/metabolism; Transcription Factors/metabolism; Animals; Cell Differentiation; Crystallography, X-Ray; Embryo, Mammalian/metabolism; Homeobox Protein Nkx-2.5; Homeodomain Proteins/genetics; Mice; Mice, Transgenic; Models, Molecular; Myocardium/metabolism; Promoter Regions, Genetic; Protein Interaction Domains and Motifs; T-Box Domain Proteins/genetics; Transcription Factors/genetics
2.
Cell ; 163(1): 21-3, 2015 Sep 24.
Article in English | MEDLINE | ID: mdl-26406364

ABSTRACT

We propose that data mining and network analysis utilizing public databases can identify and quantify relationships between scientific discoveries and major advances in medicine (cures). Further development of such approaches could help to increase public understanding and governmental support for life science research and could enhance decision making in the quest for cures.


Subjects
Biomedical Research/economics; Data Mining; Publications; Animals; Biological Science Disciplines/economics; Clinical Trials as Topic; Decision Making; Drug Discovery; Humans; National Institutes of Health (U.S.)/economics; United States; United States Food and Drug Administration/economics
3.
Cell ; 151(1): 206-20, 2012 Sep 28.
Article in English | MEDLINE | ID: mdl-22981692

ABSTRACT

Heart development is exquisitely sensitive to the precise temporal regulation of thousands of genes that govern developmental decisions during differentiation. However, we currently lack a detailed understanding of how chromatin and gene expression patterns are coordinated during developmental transitions in the cardiac lineage. Here, we interrogated the transcriptome and several histone modifications across the genome during defined stages of cardiac differentiation. We find distinct chromatin patterns that are coordinated with stage-specific expression of functionally related genes, including many human disease-associated genes. Moreover, we discover a novel preactivation chromatin pattern at the promoters of genes associated with heart development and cardiac function. We further identify stage-specific distal enhancer elements and find enriched DNA binding motifs within these regions that predict sets of transcription factors that orchestrate cardiac differentiation. Together, these findings form a basis for understanding developmentally regulated chromatin transitions during lineage commitment and the molecular etiology of congenital heart disease.


Subjects
Epigenesis, Genetic; Gene Regulatory Networks; Myocardium/cytology; Animals; Cell Differentiation; Chromatin/metabolism; Embryonic Stem Cells/metabolism; Enhancer Elements, Genetic; Heart/embryology; Humans; Mice; Transcription Factors/metabolism; Transcriptome
4.
Mol Biol Evol ; 35(8): 2034-2045, 2018 Aug 01.
Article in English | MEDLINE | ID: mdl-29897475

ABSTRACT

Some of the fastest evolving regions of the human genome are conserved noncoding elements with many human-specific DNA substitutions. These human accelerated regions (HARs) are enriched nearby regulatory genes, and several HARs function as developmental enhancers. To investigate if this evolutionary signature is unique to humans, we quantified evidence of accelerated substitutions in conserved genomic elements across multiple lineages and applied this approach simultaneously to the genomes of five apes: human, chimpanzee, gorilla, orangutan, and gibbon. We find roughly similar numbers and genomic distributions of lineage-specific accelerated regions (linARs) in all five apes. In particular, apes share an enrichment of linARs in regulatory DNA nearby genes involved in development, especially transcription factors and other regulators. Many developmental loci harbor clusters of nonoverlapping linARs from multiple apes, suggesting that accelerated evolution in each species affected distinct regulatory elements that control a shared set of developmental pathways. Our statistical tests distinguish between GC-biased and unbiased accelerated substitution rates, allowing us to quantify the roles of different evolutionary forces in creating linARs. We find evidence of GC-biased gene conversion in each ape, but unbiased acceleration consistent with positive selection or loss of constraint is more common in all five lineages. It therefore appears that similar evolutionary processes created independent accelerated regions in the genomes of different apes, and that these lineage-specific changes to conserved noncoding sequences may have differentially altered expression of a core set of developmental genes across ape evolution.


Subjects
Evolution, Molecular; Hominidae/genetics; Algorithms; Animals; Computer Simulation; Gene Conversion; Hominidae/growth & development; Humans; Models, Genetic; Selection, Genetic
5.
Brief Bioinform ; 18(3): 441-450, 2017 May 01.
Article in English | MEDLINE | ID: mdl-27169896

ABSTRACT

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is an important tool for studying gene regulatory proteins, such as transcription factors and histones. Peak calling is one of the first steps in the analysis of these data. Peak calling consists of two sub-problems: identifying candidate peaks and testing candidate peaks for statistical significance. We surveyed 30 methods and identified 12 features of the two sub-problems that distinguish methods from each other. We picked six methods [GEM, MACS2, MUSIC, BCP, Threshold-based method (TM) and ZINBA] that span this feature space and used a combination of 300 simulated ChIP-seq data sets, 3 real data sets and mathematical analyses to identify features of methods that allow some to perform better than the others. We prove that methods that explicitly combine the signals from ChIP and input samples are less powerful than methods that do not. Methods that use windows of different sizes are more powerful than the ones that do not. For statistical testing of candidate peaks, methods that use a Poisson test to rank their candidate peaks are more powerful than those that use a Binomial test. BCP and MACS2 have the best operating characteristics on simulated transcription factor binding data. GEM has the highest fraction of the top 500 peaks containing the binding motif of the immunoprecipitated factor, with 50% of its peaks within 10 base pairs of a motif. BCP and MUSIC perform best on histone data. These findings provide guidance and rationale for selecting the best peak caller for a given application.
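
The Poisson and Binomial tests mentioned in this abstract can be illustrated for a single candidate peak. The sketch below is a minimal illustration only: the function names, the read counts, and the way the null parameters are chosen (input reads as the Poisson rate, an even ChIP/input split as the Binomial null) are assumptions made for this example and are not taken from any of the surveyed peak callers.

```python
import math

def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing the lower tail term by term."""
    term = math.exp(-lam)      # P(X = 0)
    below = 0.0
    for i in range(k):
        below += term          # add P(X = i)
        term *= lam / (i + 1)  # advance to P(X = i + 1)
    return max(0.0, 1.0 - below)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical candidate peak: 30 ChIP reads and 10 input reads in one window.
chip_reads, input_reads = 30, 10

# Poisson test: are the ChIP reads surprising given the input-derived rate?
p_poisson = poisson_upper_tail(chip_reads, lam=float(input_reads))

# Binomial test: is the ChIP share of all reads larger than an even split?
p_binomial = binomial_upper_tail(chip_reads, n=chip_reads + input_reads, p=0.5)
```

Either p-value can then be used to rank candidate peaks; which ranking is more powerful is the question the abstract addresses.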


Subjects
Sequence Analysis, DNA; Algorithms; Binding Sites; Chromatin Immunoprecipitation; High-Throughput Nucleotide Sequencing; Histones; Oligonucleotide Array Sequence Analysis; Transcription Factors
6.
Mol Biol Evol ; 33(4): 1008-18, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26715627

ABSTRACT

Mammals have evolved remarkably different sensory, reproductive, metabolic, and skeletal systems. To explore the genetic basis for these differences, we developed a comparative genomics approach to scan whole-genome multiple sequence alignments to identify regions that evolved rapidly in an ancestral lineage but are conserved within extant species. This pattern suggests that ancestral changes in function were maintained in descendants. After applying this test to therian mammals, we identified 4,797 accelerated regions, many of which are noncoding and located near developmental transcription factors. We then used mouse transgenic reporter assays to test if noncoding accelerated regions are enhancers and to determine how therian-specific substitutions affect their activity in vivo. We discovered enhancers with expression specific to the therian version in brain regions involved in the hormonal control of milk ejection, uterine contractions, blood pressure, temperature, and visual processing. This work underscores the idea that changes in developmental gene expression are important for mammalian evolution, and it pinpoints candidate genes for unique aspects of mammalian biology.


Subjects
Enhancer Elements, Genetic; Evolution, Molecular; Homeodomain Proteins/genetics; Mammals/genetics; Animals; Brain/metabolism; Conserved Sequence/genetics; Gene Expression Regulation, Developmental; Genomics; Mice
7.
Nature ; 478(7370): 476-82, 2011 Oct 12.
Article in English | MEDLINE | ID: mdl-21993624

ABSTRACT

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions of overlapping synonymous constraint within protein-coding exons. We find 220 candidate RNA structural families, and nearly a million elements overlapping potential promoter, enhancer and insulator regions. We report specific amino acid residues that have undergone positive selection, 280,000 non-coding elements exapted from mobile elements and more than 1,000 primate- and human-accelerated elements. Overlap with disease-associated variants indicates that our findings will be relevant for studies of human biology, health and disease.


Subjects
Evolution, Molecular; Genome, Human/genetics; Genome/genetics; Mammals/genetics; Animals; Disease; Exons/genetics; Genomics; Health; Humans; Molecular Sequence Annotation; Phylogeny; RNA/classification; RNA/genetics; Selection, Genetic/genetics; Sequence Alignment; Sequence Analysis, DNA
8.
Am J Hum Genet ; 89(3): 382-97, 2011 Sep 09.
Article in English | MEDLINE | ID: mdl-21855840

ABSTRACT

Assignment of alleles to haplotypes for nearly all the variants on all chromosomes can be performed by genetic analysis of a nuclear family with three or more children. Whole-genome sequence data enable deterministic phasing of nearly all sequenced alleles by permitting assignment of recombinations to precise chromosomal positions and specific meioses. We demonstrate this process of genetic phasing on two families each with four children. We generate haplotypes for all of the children and their parents; these haplotypes span all genotyped positions, including rare variants. Misassignments of phase between variants (switch errors) are nearly absent. Our algorithm can also produce multimegabase haplotypes for nuclear families with just two children and can handle families with missing individuals. We implement our algorithm in a suite of software scripts (Haploscribe). Haplotypes and family genome sequences will become increasingly important for personalized medicine and for fundamental biology.


Subjects
Algorithms; Chromosomes, Human/genetics; Genetic Variation; Haplotypes/genetics; Inheritance Patterns/genetics; Models, Genetic; Software; Humans; Mutation/genetics; Pedigree; Sequence Analysis, DNA/methods
9.
PLoS Genet ; 7(4): e1002053, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21572512

ABSTRACT

Transcription factor binding site(s) (TFBS) gain and loss (i.e., turnover) is a well-documented feature of cis-regulatory module (CRM) evolution, yet little attention has been paid to the evolutionary force(s) driving this turnover process. The predominant view, motivated by its widespread occurrence, emphasizes the importance of compensatory mutation and genetic drift. Positive selection, in contrast, although it has been invoked in specific instances of adaptive gene expression evolution, has not been considered as a general alternative to neutral compensatory evolution. In this study we evaluate the two hypotheses by analyzing patterns of single nucleotide polymorphism in the TFBS of well-characterized CRM in two closely related Drosophila species, Drosophila melanogaster and Drosophila simulans. An important feature of the analysis is classification of TFBS mutations according to the direction of their predicted effect on binding affinity, which allows gains and losses to be evaluated independently along the two phylogenetic lineages. The observed patterns of polymorphism and divergence are not compatible with neutral evolution for either class of mutations. Instead, multiple lines of evidence are consistent with contributions of positive selection to TFBS gain and loss as well as purifying selection in its maintenance. In discussion, we propose a model to reconcile the finding of selection driving TFBS turnover with constrained CRM function over long evolutionary time.


Subjects
Binding Sites/genetics; Drosophila Proteins/metabolism; Drosophila melanogaster/metabolism; Drosophila/metabolism; Protein Binding/genetics; Selection, Genetic; Transcription Factors/metabolism; Animals; Biological Evolution; Databases, Genetic; Drosophila/genetics; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Gene Expression Regulation; Models, Genetic; Mutation; Phylogeny; Polymorphism, Genetic; Sequence Analysis, DNA; Species Specificity; Transcription Factors/genetics
10.
Ann Surg Oncol ; 18(4): 1158-65, 2011 Apr.
Article in English | MEDLINE | ID: mdl-21086055

ABSTRACT

BACKGROUND: The molecular factors that control parathyroid tumorigenesis are poorly understood. In the absence of local invasion or metastasis, distinguishing benign from malignant parathyroid neoplasm is difficult on histologic examination. We studied the microRNA (miRNA) profile in normal, hyperplastic, and benign and malignant parathyroid tumors to better understand the molecular factors that may play a role in parathyroid tumorigenesis and that may serve as diagnostic markers for parathyroid carcinoma. METHODS: miRNA arrays containing 825 human microRNAs with four duplicate probes per miRNA were used to profile parathyroid tumor (12 adenomas, 9 carcinomas, and 15 hyperplastic) samples normalized to four reference normal parathyroid glands. Differentially expressed miRNAs were validated by real-time quantitative TaqMan polymerase chain reaction (PCR). RESULTS: One hundred fifty-six miRNAs in parathyroid hyperplasia, 277 microRNAs in parathyroid adenoma, and 167 microRNAs in parathyroid carcinomas were significantly dysregulated as compared with normal parathyroid glands [false discovery rate (FDR) < 0.05]. By supervised clustering analysis, all parathyroid carcinomas clustered together. Three miRNAs (miR-26b, miR-30b, and miR-126*) were significantly dysregulated between parathyroid carcinoma and parathyroid adenoma. Receiver-operating characteristic curve analysis showed miR-126* was the best diagnostic marker, with area under the curve of 0.776. CONCLUSIONS: Most miRNAs are downregulated in parathyroid carcinoma, while in parathyroid hyperplasia most miRNAs are upregulated. miRNA profiling shows distinct differentially expressed miRNAs by tumor type, which may serve as a helpful adjunct to distinguish parathyroid adenoma from carcinoma.
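
For readers unfamiliar with the area-under-the-curve figure quoted in the RESULTS, the following is a minimal sketch of how an AUC can be computed from two groups of marker values. It uses the rank-based definition (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one). The function name and the example values are hypothetical, and nothing here reproduces the study's data or its software.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as P(positive score > negative score); ties count as one half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical marker levels in two groups (not data from the study).
group_a = [2.1, 3.4, 1.8, 2.9]
group_b = [1.2, 2.5, 1.9]
auc = roc_auc(group_a, group_b)
```

An AUC of 0.5 corresponds to a marker that does not separate the groups, and 1.0 to perfect separation.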


Subjects
Adenoma/genetics; Biomarkers, Tumor/genetics; Hyperplasia/metabolism; MicroRNAs/genetics; Parathyroid Glands/metabolism; Parathyroid Neoplasms/genetics; Adenoma/metabolism; Biomarkers, Tumor/metabolism; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Hyperplasia/pathology; Oligonucleotide Array Sequence Analysis; Parathyroid Glands/pathology; Parathyroid Neoplasms/metabolism; RNA, Messenger/genetics; ROC Curve; Reverse Transcriptase Polymerase Chain Reaction
11.
PLoS Biol ; 5(11): e310, 2007 Nov 06.
Article in English | MEDLINE | ID: mdl-17988176

ABSTRACT

The population genetic perspective is that the processes shaping genomic variation can be revealed only through simultaneous investigation of sequence polymorphism and divergence within and between closely related species. Here we present a population genetic analysis of Drosophila simulans based on whole-genome shotgun sequencing of multiple inbred lines and comparison of the resulting data to genome assemblies of the closely related species, D. melanogaster and D. yakuba. We discovered previously unknown, large-scale fluctuations of polymorphism and divergence along chromosome arms, and significantly less polymorphism and faster divergence on the X chromosome. We generated a comprehensive list of functional elements in the D. simulans genome influenced by adaptive evolution. Finally, we characterized genomic patterns of base composition for coding and noncoding sequence. These results suggest several new hypotheses regarding the genetic and biological mechanisms controlling polymorphism and divergence across the Drosophila genome, and provide a rich resource for the investigation of adaptive evolution and functional variation in D. simulans.


Subjects
Drosophila/genetics; Genetic Variation; Genetics, Population; Genome, Insect; Polymorphism, Genetic; Animals; Chromosome Mapping; Drosophila/classification; Drosophila Proteins/genetics; Evolution, Molecular; Genomics; Linkage Disequilibrium; Models, Genetic; Molecular Sequence Data; X Chromosome
12.
PLoS Genet ; 3(10): 2007-13, 2007 Oct.
Article in English | MEDLINE | ID: mdl-17967066

ABSTRACT

Detailed studies of individual genes have shown that gene expression divergence often results from adaptive evolution of regulatory sequence. Genome-wide analyses, however, have yet to unite patterns of gene expression with polymorphism and divergence to infer population genetic mechanisms underlying expression evolution. Here, we combined genomic expression data--analyzed in a phylogenetic context--with whole genome light-shotgun sequence data from six Drosophila simulans lines and reference sequences from D. melanogaster and D. yakuba. These data allowed us to use molecular population genetics to test for neutral versus adaptive gene expression divergence on a genomic scale. We identified recent and recurrent adaptive evolution along the D. simulans lineage by contrasting sequence polymorphism within D. simulans to divergence from D. melanogaster and D. yakuba. Genes that evolved higher levels of expression in D. simulans have experienced adaptive evolution of the associated 3' flanking and amino acid sequence. Concomitantly, these genes are also decelerating in their rates of protein evolution, which is in agreement with the finding that highly expressed genes evolve slowly. Interestingly, adaptive evolution in 5' cis-regulatory regions did not correspond strongly with expression evolution. Our results provide a genomic view of the intimate link between selection acting on a phenotype and associated genic evolution.
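
Contrasting polymorphism within a species to divergence between species is often summarized with a McDonald-Kreitman-style 2x2 table. The abstract does not state which test the authors applied, so the sketch below is only a generic illustration of that kind of contrast: the class name, field names, and counts are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class MKTable:
    dn: int  # fixed differences at the putatively selected class of sites
    ds: int  # fixed differences at the putatively neutral class of sites
    pn: int  # polymorphisms at the putatively selected class of sites
    ps: int  # polymorphisms at the putatively neutral class of sites

def neutrality_index(t: MKTable) -> float:
    """NI = (pn/ps) / (dn/ds); values below 1 indicate excess divergence."""
    return (t.pn / t.ps) / (t.dn / t.ds)

def alpha(t: MKTable) -> float:
    """Estimated fraction of divergence attributed to adaptive fixation."""
    return 1.0 - neutrality_index(t)

# Hypothetical counts for one gene (not taken from the paper).
table = MKTable(dn=20, ds=40, pn=5, ps=40)
ni = neutrality_index(table)
a = alpha(table)
```

An excess of divergence relative to polymorphism in the selected class (NI below 1, alpha above 0) is the pattern usually read as evidence of adaptive evolution.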


Subjects
Adaptation, Biological/genetics; Drosophila/genetics; Gene Expression Regulation; Genetic Variation; Genetics, Population; Genomics; Animals; Codon; Evolution, Molecular; Genes, Insect; Genome, Insect; Heterozygote; Open Reading Frames; Regulatory Sequences, Nucleic Acid
13.
Nat Ecol Evol ; 3(8): 1241-1252, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31358948

ABSTRACT

Monitor lizards are unique among ectothermic reptiles in that they have high aerobic capacity and distinctive cardiovascular physiology resembling that of endothermic mammals. Here, we sequence the genome of the Komodo dragon Varanus komodoensis, the largest extant monitor lizard, and generate a high-resolution de novo chromosome-assigned genome assembly for V. komodoensis using a hybrid approach of long-range sequencing and single-molecule optical mapping. Comparing the genome of V. komodoensis with those of related species, we find evidence of positive selection in pathways related to energy metabolism, cardiovascular homoeostasis, and haemostasis. We also show species-specific expansions of a chemoreceptor gene family related to pheromone and kairomone sensing in V. komodoensis and other lizard lineages. Together, these evolutionary signatures of adaptation reveal the genetic underpinnings of the unique Komodo dragon sensory and cardiovascular systems, and suggest that selective pressure altered haemostasis genes to help Komodo dragons evade the anticoagulant effects of their own saliva. The Komodo dragon genome is an important resource for understanding the biology of monitor lizards and reptiles worldwide.


Subjects
Cardiovascular System; Lizards; Acclimatization; Animals; Chromosomes
14.
Genetics ; 177(3): 1959-62, 2007 Nov.
Article in English | MEDLINE | ID: mdl-18039888

ABSTRACT

Dosage compensation refers to the equalization of X-linked gene transcription among heterogametic and homogametic sexes. In Drosophila, the dosage compensation complex (DCC) mediates the twofold hypertranscription of the single male X chromosome. Loss-of-function mutations at any DCC protein-coding gene are male lethal. Here we report a population genetic analysis suggesting that four of the five core DCC proteins--MSL1, MSL2, MSL3, and MOF--are evolving under positive selection in D. melanogaster. Within these four proteins, several domains that range in function from X chromosome localization to protein-protein interactions have elevated, D. melanogaster-specific, amino acid divergence.


Subjects
Dosage Compensation, Genetic; Drosophila Proteins/genetics; Drosophila melanogaster/genetics; Evolution, Molecular; Adaptation, Physiological; Animals; Drosophila/genetics; Drosophila/physiology; Drosophila Proteins/physiology; Drosophila melanogaster/physiology; Female; Male; Polymorphism, Genetic; Selection, Genetic; Species Specificity; X Chromosome/genetics
15.
Genetics ; 172(3): 1675-81, 2006 Mar.
Article in English | MEDLINE | ID: mdl-16361246

ABSTRACT

The fraction of the genome associated with male reproduction in Drosophila may be unusually dynamic. For example, male reproduction-related genes show higher-than-average rates of protein divergence and gene expression evolution compared to most Drosophila genes. Drosophila male reproduction may also be enriched for novel genetic functions. Our earlier work, based on accessory gland protein genes (Acp's) in D. simulans and D. melanogaster, suggested that the melanogaster subgroup Acp's may be lost and/or gained on a relatively rapid timescale. Here we investigate this possibility more thoroughly through description of the accessory gland transcriptome in two melanogaster subgroup species, D. yakuba and D. erecta. A genomic analysis of previously unknown genes isolated from cDNA libraries of these species revealed several cases of genes present in one or both species, yet absent from ingroup and outgroup species. We found no evidence that these novel genes are attributable primarily to duplication and divergence, which suggests the possibility that Acp's or other genes coding for small proteins may originate from ancestrally noncoding DNA.


Subjects
Drosophila Proteins/genetics; Drosophila/genetics; Evolution, Molecular; Expressed Sequence Tags; Genitalia, Male/metabolism; Animals; Cell Lineage/genetics; Drosophila Proteins/metabolism; Female; Genetics, Population; Male; Species Specificity
16.
Am Nat ; 167(4): E88-101, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16670990

ABSTRACT

Polyploidization is one of the few mechanisms that can produce instantaneous speciation. Multiple origins of tetraploid lineages from the same two diploid progenitors are common, but here we report the first known instance of a single tetraploid species that originated repeatedly from at least three diploid ancestors. Parallel evolution of advertisement calls in tetraploid lineages of gray tree frogs has allowed these lineages to interbreed, resulting in a single sexually interacting polyploid species despite the separate origins of polyploids from different diploids. Speciation by polyploidization in these frogs has been the source of considerable debate, but the various published hypotheses have assumed that polyploids arose through either autopolyploidy or allopolyploidy of extant diploid species. We utilized molecular markers and advertisement calls to infer the origins of tetraploid gray tree frogs. Previous hypotheses did not sufficiently account for the observed data. Instead, we found that tetraploids originated multiple times from extant diploid gray tree frogs and two other, apparently extinct, lineages of tree frogs. Tetraploid lineages then merged through interbreeding to result in a single species. Thus, polyploid species may have complex origins, especially in systems in which isolating mechanisms (such as advertisement calls) are affected directly through hybridization and polyploidy.


Subjects
Anura/genetics; Genetic Speciation; Polyploidy; Animals; Anura/classification; Biological Evolution; Cytochromes b/genetics; Extinction, Biological; Geography; Haplotypes; Hybridization, Genetic; Phylogeny
17.
Stat Interface ; 8(4): 463-476, 2015.
Article in English | MEDLINE | ID: mdl-26709360

ABSTRACT

Next-generation sequencing technology enables the identification of thousands of gene regulatory sequences in many cell types and organisms. We consider the problem of testing if two such sequences differ in their number of binding site motifs for a given transcription factor (TF) protein. Binding site motifs impart regulatory function by providing TFs the opportunity to bind to genomic elements and thereby affect the expression of nearby genes. Evolutionary changes to such functional DNA are hypothesized to be major contributors to phenotypic diversity within and between species; but despite the importance of TF motifs for gene expression, no method exists to test for motif loss or gain. Assuming that motif counts are Binomially distributed, and allowing for dependencies between motif instances in evolutionarily related sequences, we derive the probability mass function of the difference in motif counts between two nucleotide sequences. We provide a method to numerically estimate this distribution from genomic data and show through simulations that our estimator is accurate. Finally, we introduce the R package motifDiverge that implements our methodology and illustrate its application to gene regulatory enhancers identified by a mouse developmental time course experiment. While this study was motivated by analysis of regulatory motifs, our results can be applied to any problem involving two correlated Bernoulli trials.
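
The quantity at the center of this abstract, the distribution of the difference between two Binomially distributed motif counts, can be sketched for the simplest case. The code below assumes the two counts are independent, which is a simplification that the paper explicitly goes beyond by allowing dependencies between related sequences. It is not the motifDiverge API; the function names and parameter values are illustrative.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def diff_pmf(n1, p1, n2, p2):
    """Return {d: P(X - Y = d)} for independent X ~ Bin(n1, p1), Y ~ Bin(n2, p2)."""
    pmf = {}
    for x in range(n1 + 1):
        px = binom_pmf(x, n1, p1)
        for y in range(n2 + 1):
            pmf[x - y] = pmf.get(x - y, 0.0) + px * binom_pmf(y, n2, p2)
    return pmf

def upper_tail(pmf, observed):
    """P(X - Y >= observed): a one-sided p-value for motif gain in sequence 1."""
    return sum(prob for d, prob in pmf.items() if d >= observed)

# Hypothetical setup: 200 candidate positions per sequence, 1% motif probability,
# and an observed difference of 4 motif instances.
null = diff_pmf(200, 0.01, 200, 0.01)
p_value = upper_tail(null, 4)
```

Under independence the difference is a simple convolution. Modeling the correlation between evolutionarily related sequences, as the abstract describes, changes the joint probabilities inside the double loop.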

18.
Curr Protoc Hum Genet ; 83: 11.13.1-20, 2014 Oct 01.
Article in English | MEDLINE | ID: mdl-25271838

ABSTRACT

RNA-seq is widely used to determine differential expression of genes or transcripts as well as identify novel transcripts, identify allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Common analysis steps in all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two-fold: to make recommendations for common components of experimental design and assess tool capabilities for each of these steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. We hope that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development.


Subjects
Sequence Analysis, RNA; Polymerase Chain Reaction; Quality Control; RNA Splicing; RNA, Messenger/genetics
19.
PLoS One ; 9(4): e94650, 2014.
Article in English | MEDLINE | ID: mdl-24736250

ABSTRACT

The chicken has long served as an important model organism in many fields, and continues to aid our understanding of animal development. Functional genomics studies aimed at probing the mechanisms that regulate development require high-quality genomes and transcript annotations. The quality of these resources has improved dramatically over the last several years, but many isoforms and genes have yet to be identified. We hope to contribute to the process of improving these resources with the data presented here: a set of long cDNA sequencing reads, and a curated set of new genes and transcript isoforms not currently represented in the most up-to-date genome annotation currently available to the community of researchers who rely on the chicken genome.


Subjects
Chickens/genetics; Genomics/methods; Animals; Chick Embryo; DNA, Complementary/genetics; Heart/embryology; RNA, Messenger/genetics
20.
Genome Biol ; 14(7): R72, 2013 Jul 18.
Article in English | MEDLINE | ID: mdl-23867016

ABSTRACT

BACKGROUND: Large-scale annotation efforts have improved our ability to coarsely predict regulatory elements throughout vertebrate genomes. However, it is unclear how complex spatiotemporal patterns of gene expression driven by these elements emerge from the activity of short, transcription factor binding sequences. RESULTS: We describe a comprehensive promoter extension assay in which the regulatory potential of all 6 base-pair (bp) sequences was tested in the context of a minimal promoter. To enable this large-scale screen, we developed algorithms that use a reverse-complement aware decomposition of the de Bruijn graph to design a library of DNA oligomers incorporating every 6-bp sequence exactly once. Our library multiplexes all 4,096 unique 6-mers into 184 double-stranded 15-bp oligomers, which is sufficiently compact for in vivo testing. We injected each multiplexed construct into zebrafish embryos and scored GFP expression in 15 tissues at two developmental time points. Twenty-seven constructs produced consistent expression patterns, with the majority doing so in only one tissue. Functional sequences are enriched near biologically relevant genes, match motifs for developmental transcription factors, and are required for enhancer activity. By concatenating tissue-specific functional sequences, we generated completely synthetic enhancers for the notochord, epidermis, spinal cord, forebrain and otic lateral line, and show that short regulatory sequences do not always function modularly. CONCLUSIONS: This work introduces a unique in vivo catalog of short, functional regulatory sequences and demonstrates several important principles of regulatory element organization. Furthermore, we provide resources for designing compact, reverse-complement aware k-mer libraries.
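
The phrase "reverse-complement aware" can be made concrete with a short sketch. On double-stranded DNA a 6-mer and its reverse complement occupy the same site, so a library only needs to cover one member of each pair. The code below enumerates those pairs; it does not reproduce the authors' de Bruijn graph decomposition or their oligomer design, and the function names are assumptions for illustration.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def canonical(seq):
    """Pick one representative of {seq, reverse_complement(seq)}."""
    return min(seq, reverse_complement(seq))

def canonical_kmers(k):
    """All k-mers, collapsed so each reverse-complement pair appears once."""
    return {canonical("".join(p)) for p in product("ACGT", repeat=k)}

six_mers = canonical_kmers(6)
# 6-mers that equal their own reverse complement (for example, GAATTC).
palindromes = [s for s in six_mers if s == reverse_complement(s)]
```

Of the 4,096 possible 6-mers, 64 are their own reverse complement, so collapsing pairs leaves 2,080 distinct double-stranded 6-bp sites to cover.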


Subjects
Gene Expression Regulation, Developmental; Oligonucleotides/genetics; Organ Specificity/genetics; Regulatory Sequences, Nucleic Acid/genetics; Synthetic Biology/methods; Zebrafish/genetics; Animals; Base Sequence; Dissection; Embryo, Nonmammalian/metabolism; Enhancer Elements, Genetic; Gene Ontology; Molecular Sequence Data; Nucleotide Motifs/genetics; Zebrafish/embryology