1.
Cell ; 166(5): 1269-1281.e19, 2016 Aug 25.
Article in English | MEDLINE | ID: mdl-27565349

ABSTRACT

The glucocorticoid receptor (GR) binds the human genome at >10,000 sites but only regulates the expression of hundreds of genes. To determine the functional effect of each site, we measured the glucocorticoid (GC) responsive activity of nearly all GR binding sites (GBSs) captured using chromatin immunoprecipitation (ChIP) in A549 cells. 13% of GBSs assayed had GC-induced activity. The responsive sites were defined by direct GR binding via a GC response element (GRE) and exclusively increased reporter-gene expression. Meanwhile, most GBSs lacked GC-induced reporter activity. The non-responsive sites had epigenetic features of steady-state enhancers and clustered around direct GBSs. Together, our data support a model in which clusters of GBSs observed with ChIP-seq reflect interactions between direct and tethered GBSs over tens of kilobases. We further show that those interactions can synergistically modulate the activity of direct GBSs and may therefore play a major role in driving gene activation in response to GCs.


Subject(s)
Human Genome , Glucocorticoids/metabolism , Glucocorticoid Receptors/metabolism , Transcription Factors/metabolism , Transcriptional Activation , A549 Cells , Binding Sites/drug effects , Chromatin Immunoprecipitation , Dexamethasone/metabolism , Dexamethasone/pharmacology , Reporter Genes , Glucocorticoids/pharmacology , Humans , Protein Binding/drug effects , Response Elements
2.
Am J Hum Genet ; 108(8): 1436-1449, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34216551

ABSTRACT

Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. We performed targeted long-read sequencing (T-LRS) using adaptive sampling on the Oxford Nanopore platform on 40 individuals, 10 of whom lacked a complete molecular diagnosis. We computationally targeted up to 151 Mbp of sequence per individual and searched for pathogenic substitutions, structural variants, and methylation differences using a single data source. We detected all genomic aberrations (including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences) identified by prior clinical testing. In 8/8 individuals with complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, leading to changes in clinical management in one case. In ten individuals with suspected Mendelian conditions lacking a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in six and variants of uncertain significance in two others. T-LRS accurately identifies pathogenic structural variants, resolves complex rearrangements, and identifies Mendelian variants not detected by other technologies. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority genes and regions or complex clinical testing results.


Subject(s)
Chromosome Aberrations , Cytogenetic Analysis/methods , Inborn Genetic Diseases/diagnosis , Inborn Genetic Diseases/genetics , Genetic Predisposition to Disease , Human Genome , Mutation , DNA Copy Number Variations , Female , Genetic Testing , High-Throughput Nucleotide Sequencing , Humans , Karyotyping , Male , DNA Sequence Analysis
3.
Genome Res ; 31(5): 877-889, 2021 05.
Article in English | MEDLINE | ID: mdl-33722938

ABSTRACT

High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. This approach substantially improves precision and recall over current methods, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.


Subject(s)
Genetic Enhancer Elements , Human Genome , Bias , High-Throughput Nucleotide Sequencing/methods , Humans
4.
Mol Ther ; 29(11): 3243-3257, 2021 11 03.
Article in English | MEDLINE | ID: mdl-34509668

ABSTRACT

Targeted gene-editing strategies have emerged as promising therapeutic approaches for the permanent treatment of inherited genetic diseases. However, precise gene correction and insertion approaches using homology-directed repair are still limited by low efficiencies. Consequently, many gene-editing strategies have focused on removal or disruption, rather than repair, of genomic DNA. In contrast, homology-independent targeted integration (HITI) has been reported to effectively insert DNA sequences at targeted genomic loci. This approach could be particularly useful for restoring full-length sequences of genes affected by a spectrum of mutations that are also too large to deliver by conventional adeno-associated virus (AAV) vectors. Here, we utilize an AAV-based, HITI-mediated approach for correction of full-length dystrophin expression in a humanized mouse model of Duchenne muscular dystrophy (DMD). We co-deliver CRISPR-Cas9 and a donor DNA sequence to insert the missing human exon 52 into its corresponding position within the DMD gene and achieve full-length dystrophin correction in skeletal and cardiac muscle. Additionally, as a proof-of-concept strategy to correct genetic mutations characterized by diverse patient mutations, we deliver a superexon donor encoding the last 28 exons of the DMD gene as a therapeutic strategy to restore full-length dystrophin in >20% of the DMD patient population. This work highlights the potential of HITI-mediated gene correction for diverse DMD mutations and advances genome editing toward realizing the promise of full-length gene restoration to treat genetic disease.


Subject(s)
CRISPR-Cas Systems , Dependovirus/genetics , Dystrophin/genetics , Exons , Gene Editing , Genetic Vectors/genetics , Duchenne Muscular Dystrophy/genetics , Duchenne Muscular Dystrophy/therapy , Animals , Animal Disease Models , Gene Expression , Gene Order , Gene Transfer Techniques , Genetic Engineering , Genetic Therapy/methods , Humans , Mice , Transgenic Mice , Skeletal Muscle/metabolism , Mutation , Myocardium/metabolism , Virus Integration
5.
Genome Res ; 28(9): 1272-1284, 2018 09.
Article in English | MEDLINE | ID: mdl-30097539

ABSTRACT

Glucocorticoids are potent steroid hormones that regulate immunity and metabolism by activating the transcription factor (TF) activity of glucocorticoid receptor (GR). Previous models have proposed that DNA binding motifs and sites of chromatin accessibility predetermine GR binding and activity. However, there are vast excesses of both features relative to the number of GR binding sites. Thus, these features alone are unlikely to account for the specificity of GR binding and activity. To identify genomic and epigenetic contributions to GR binding specificity and the downstream changes resultant from GR binding, we performed hundreds of genome-wide measurements of TF binding, epigenetic state, and gene expression across a 12-h time course of glucocorticoid exposure. We found that glucocorticoid treatment induces GR to bind to nearly all pre-established enhancers within minutes. However, GR binds to only a small fraction of the set of accessible sites that lack enhancer marks. Once GR is bound to enhancers, a combination of enhancer motif composition and interactions between enhancers then determines the strength and persistence of GR binding, which consequently correlates with dramatic shifts in enhancer activation. Over the course of several hours, highly coordinated changes in TF binding and histone modification occupancy occur specifically within enhancers, and these changes correlate with changes in the expression of nearby genes. Following GR binding, changes in the binding of other TFs precede changes in chromatin accessibility, suggesting that other TFs are also sensitive to genomic features beyond that of accessibility.


Subject(s)
Genetic Enhancer Elements , Histone Code , Nucleotide Motifs , Glucocorticoid Receptors/metabolism , Transcriptional Activation , Tumor Cell Line , Genetic Epigenesis , Humans , Protein Binding , Transcription Factors/metabolism
6.
Bioinformatics ; 36(2): 331-338, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31368479

ABSTRACT

MOTIVATION: High-throughput reporter assays dramatically improve our ability to assign function to noncoding genetic variants, by measuring allelic effects on gene expression in the controlled setting of a reporter gene. Unlike genetic association tests, such assays are not confounded by linkage disequilibrium when loci are independently assayed. These methods can thus improve the identification of causal disease mutations. While work continues on improving experimental aspects of these assays, less effort has gone into developing methods for assessing the statistical significance of assay results, particularly in the case of rare variants captured from patient DNA. RESULTS: We describe a Bayesian hierarchical model, called Bayesian Inference of Regulatory Differences, which integrates prior information and explicitly accounts for variability between experimental replicates. The model produces substantially more accurate predictions than existing methods when allele frequencies are low, which is of clear advantage in the search for disease-causing variants in DNA captured from patient cohorts. Using the model, we demonstrate a clear tradeoff between variant sequencing coverage and numbers of biological replicates, and we show that the use of additional biological replicates decreases variance in estimates of effect size, due to the properties of the Poisson-binomial distribution. We also provide a power and sample size calculator, which facilitates decision making in experimental design parameters. AVAILABILITY AND IMPLEMENTATION: The software is freely available from www.geneprediction.org/bird. The experimental design web tool can be accessed at http://67.159.92.22:8080. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Software , Alleles , Bayes Theorem , Gene Frequency , Humans , Linkage Disequilibrium
7.
Bioinformatics ; 34(21): 3616-3623, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29701825

ABSTRACT

Motivation: Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed and produce functional proteins. Results: We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and non-coding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or non-coding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products and we propose that they may commonly act as cryptic factors in disease. Availability and implementation: The software is available from geneprediction.org/SGRF. Supplementary information: Supplementary information is available at Bioinformatics online.


Subject(s)
Exons , RNA Splicing , Software , Humans , Molecular Sequence Annotation , RNA Sequence Analysis
8.
Genome Res ; 25(8): 1206-14, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26084464

ABSTRACT

We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter gene expression assay. As demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. In agreement with previous reports, we found that most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify causal regulatory haplotypes that likely contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses.


Subject(s)
Genetic Variation , High-Throughput Nucleotide Sequencing/methods , Nucleic Acid Regulatory Sequences , Computational Biology/methods , Human Genome , Haplotypes , Humans , Patient-Specific Modeling , Quantitative Trait Loci
9.
Bioinformatics ; 33(10): 1437-1446, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28011790

ABSTRACT

MOTIVATION: The accurate interpretation of genetic variants is critical for characterizing genotype-phenotype associations. Because the effects of genetic variants can depend strongly on their local genomic context, accurate genome annotations are essential. Furthermore, as some variants have the potential to disrupt or alter gene structure, variant interpretation efforts stand to gain from the use of individualized annotations that account for differences in gene structure between individuals or strains. RESULTS: We describe a suite of software tools for identifying possible functional changes in gene structure that may result from sequence variants. ACE ('Assessing Changes to Exons') converts phased genotype calls to a collection of explicit haplotype sequences, maps transcript annotations onto them, detects gene-structure changes and their possible repercussions, and identifies several classes of possible loss of function. Novel transcripts predicted by ACE are commonly supported by spliced RNA-seq reads, and can be used to improve read alignment and transcript quantification when an individual-specific genome sequence is available. Using publicly available RNA-seq data, we show that ACE predictions confirm earlier results regarding the quantitative effects of nonsense-mediated decay, and we show that predicted loss-of-function events are highly concordant with patterns of intolerance to mutations across the human population. ACE can be readily applied to diverse species including animals and plants, making it a broadly useful tool for use in eukaryotic population-based resequencing projects, particularly for assessing the joint impact of all variants at a locus. AVAILABILITY AND IMPLEMENTATION: ACE is written in open-source C++ and Perl and is available from geneprediction.org/ACE. CONTACT: myandell@genetics.utah.edu or tim.reddy@duke.edu. SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.


Subject(s)
Genomics/methods , Genetic Polymorphism , RNA Sequence Analysis/methods , Software , Animals , Eukaryota/genetics , Exons , Haplotypes , Humans , Mutation , RNA Splicing
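
The first step named in the abstract for this record, converting phased genotype calls into explicit haplotype sequences, can be illustrated with a minimal Python sketch. This is not ACE's implementation (the abstract states that ACE is written in C++ and Perl). The function name `build_haplotypes`, the variant tuple layout, and the toy sequences are assumptions made for illustration, and only single-nucleotide substitutions are handled.

```python
from typing import List, Tuple

# Hypothetical variant record (an assumption for this sketch):
# (0-based position, reference base, alternate base, phased genotype "a|b")
Variant = Tuple[int, str, str, str]


def build_haplotypes(reference: str, variants: List[Variant]) -> Tuple[str, str]:
    """Apply phased single-nucleotide variants to a reference sequence.

    Returns the two haplotype sequences implied by the phased calls.
    """
    haplotypes = [list(reference), list(reference)]
    for pos, ref, alt, genotype in variants:
        if reference[pos] != ref:
            raise ValueError(f"reference mismatch at position {pos}")
        # In "0|1", the first haplotype keeps the reference allele
        # and the second haplotype carries the alternate allele.
        alleles = genotype.split("|")
        for hap, allele in zip(haplotypes, alleles):
            if allele == "1":
                hap[pos] = alt
    return "".join(haplotypes[0]), "".join(haplotypes[1])


if __name__ == "__main__":
    toy_reference = "ACGTACGT"  # toy sequence, not real data
    toy_calls = [(1, "C", "T", "0|1"), (4, "A", "G", "1|1")]
    print(build_haplotypes(toy_reference, toy_calls))
```

Once each haplotype exists as an explicit sequence, annotations can be mapped onto it and checked for gene-structure changes, which is the later stage the abstract describes. Insertions and deletions would additionally require coordinate shifting, which this sketch omits.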
10.
Nat Methods ; 10(7): 630-3, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23708386

ABSTRACT

High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute proteins can pinpoint a microRNA (miRNA) target site within tens of bases but leaves the identity of the miRNA unresolved. A flexible computational framework, microMUMMIE, integrates sequence with cross-linking features and reliably identifies the miRNA family involved in each binding event. It considerably outperforms sequence-only approaches and quantifies the prevalence of noncanonical binding modes.


Subject(s)
Algorithms , Protein Interaction Mapping/methods , RNA-Binding Proteins/genetics , RNA/genetics , RNA/metabolism , RNA Sequence Analysis/methods , Systems Integration
11.
Mol Ther ; 23(3): 523-32, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25492562

ABSTRACT

Duchenne muscular dystrophy (DMD) is caused by genetic mutations that result in the absence of dystrophin protein expression. Oligonucleotide-induced exon skipping can restore the dystrophin reading frame and protein production. However, this requires continuous drug administration and may not generate complete skipping of the targeted exon. In this study, we apply genome editing with zinc finger nucleases (ZFNs) to permanently remove essential splicing sequences in exon 51 of the dystrophin gene and thereby exclude exon 51 from the resulting dystrophin transcript. This approach can restore the dystrophin reading frame in ~13% of DMD patient mutations. Transfection of two ZFNs targeted to sites flanking the exon 51 splice acceptor into DMD patient myoblasts led to deletion of this genomic sequence. A clonal population was isolated with this deletion and following differentiation we confirmed loss of exon 51 from the dystrophin mRNA transcript and restoration of dystrophin protein expression. Furthermore, transplantation of corrected cells into immunodeficient mice resulted in human dystrophin expression localized to the sarcolemmal membrane. Finally, we quantified ZFN toxicity in human cells and mutagenesis at predicted off-target sites. This study demonstrates a powerful method to restore the dystrophin reading frame and protein expression by permanently deleting exons.


Subject(s)
Dystrophin/genetics , Exons , Genetic Therapy/methods , RNA Editing , Messenger RNA/genetics , Zinc Fingers/genetics , Animals , Base Sequence , Dystrophin/biosynthesis , Dystrophin/chemistry , Electroporation , Endonucleases/genetics , Endonucleases/metabolism , Humans , Mice , Inbred NOD Mice , SCID Mice , Molecular Sequence Data , Duchenne Muscular Dystrophy/genetics , Duchenne Muscular Dystrophy/metabolism , Duchenne Muscular Dystrophy/pathology , Duchenne Muscular Dystrophy/therapy , Myoblasts/metabolism , Myoblasts/pathology , Open Reading Frames , Plasmids/chemistry , Plasmids/genetics , RNA Splicing , Messenger RNA/chemistry , Messenger RNA/metabolism , Sequence Deletion
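
The reading-frame argument in the abstract for this record rests on simple arithmetic: a transcript stays in frame after exons are removed only when the total number of removed nucleotides is a multiple of three. A minimal Python sketch of that check follows. The exon lengths in `EXON_LENGTHS` and the function name are illustrative assumptions, not values or methods taken from the abstract.

```python
from typing import Dict, Iterable

# Illustrative exon lengths in base pairs. These numbers are assumptions
# made for this example and are not taken from the abstract.
EXON_LENGTHS: Dict[int, int] = {50: 109, 51: 233, 52: 118}


def preserves_reading_frame(
    removed_exons: Iterable[int],
    lengths: Dict[int, int] = EXON_LENGTHS,
) -> bool:
    """Return True if removing the given exons keeps the downstream frame.

    The frame is preserved when the removed length is divisible by 3.
    """
    removed_length = sum(lengths[exon] for exon in removed_exons)
    return removed_length % 3 == 0


if __name__ == "__main__":
    # A patient deletion of one exon that shifts the frame ...
    print(preserves_reading_frame([52]))
    # ... and the same deletion combined with exclusion of a neighbouring exon.
    print(preserves_reading_frame([51, 52]))
```

Under these assumed lengths, a frame-shifting deletion becomes in-frame once a neighbouring exon is also excluded, which is the logic behind excluding exon 51 from the transcript.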
12.
Bioinformatics ; 30(14): 1958-64, 2014 Jul 15.
Article in English | MEDLINE | ID: mdl-24659106

ABSTRACT

MOTIVATION: High-throughput sequencing of RNA in vivo facilitates many applications, not the least of which is the cataloging of variant splice isoforms of protein-coding messenger RNAs. Although many solutions have been proposed for reconstructing putative isoforms from deep sequencing data, these generally take as their substrate the collective alignment structure of RNA-seq reads and ignore the biological signals present in the actual nucleotide sequence. The majority of these solutions are graph-theoretic, relying on a splice graph representing the splicing patterns and exon expression levels indicated by the spliced-alignment process. RESULTS: We show how to augment splice graphs with additional information reflecting the biology of transcription, splicing and translation, to produce what we call an ORF (open reading frame) graph. We then show how ORF graphs can be used to produce isoform predictions with higher accuracy than current state-of-the-art approaches. AVAILABILITY AND IMPLEMENTATION: RSVP is available as C++ source code under an open-source licence: http://ohlerlab.mdc-berlin.de/software/RSVP/.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Open Reading Frames , RNA Isoforms/chemistry , RNA Sequence Analysis/methods , Arabidopsis/genetics , Exons , Humans , RNA Isoforms/metabolism , RNA Splicing , Software
13.
Bioinformatics ; 29(13): i27-35, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812993

ABSTRACT

MOTIVATION: Computational approaches for the annotation of phenotypes from image data have shown promising results across many applications, and provide rich and valuable information for studying gene function and interactions. While data are often available both at high spatial resolution and across multiple time points, phenotypes are frequently annotated independently, for individual time points only. In particular, for the analysis of developmental gene expression patterns, it is biologically sensible when images across multiple time points are jointly accounted for, such that spatial and temporal dependencies are captured simultaneously. METHODS: We describe a discriminative undirected graphical model to label gene-expression time-series image data, with an efficient training and decoding method based on the junction tree algorithm. The approach is based on an effective feature selection technique, consisting of a non-parametric sparse Bayesian factor analysis model. The result is a flexible framework, which can handle large-scale data with noisy incomplete samples, i.e. it can tolerate data missing from individual time points. RESULTS: Using the annotation of gene expression patterns across stages of Drosophila embryonic development as an example, we demonstrate that our method achieves superior accuracy, gained by jointly annotating phenotype sequences, when compared with previous models that annotate each stage in isolation. The experimental results on missing data indicate that our joint learning method successfully annotates genes for which no expression data are available for one or more stages.


Subject(s)
Gene Expression Profiling/methods , Computer-Assisted Image Processing/methods , Statistical Models , Algorithms , Animals , Bayes Theorem , Drosophila/embryology , Drosophila/genetics , Embryonic Development/genetics , Statistical Factor Analysis , In Situ Hybridization , Messenger RNA/analysis , Messenger RNA/chemistry , Nonparametric Statistics , Controlled Vocabulary
14.
Nature ; 450(7172): 1096-9, 2007 Dec 13.
Article in English | MEDLINE | ID: mdl-18075594

ABSTRACT

All metazoan eukaryotes express microRNAs (miRNAs), roughly 22-nucleotide regulatory RNAs that can repress the expression of messenger RNAs bearing complementary sequences. Several DNA viruses also express miRNAs in infected cells, suggesting a role in viral replication and pathogenesis. Although specific viral miRNAs have been shown to autoregulate viral mRNAs or downregulate cellular mRNAs, the function of most viral miRNAs remains unknown. Here we report that the miR-K12-11 miRNA encoded by Kaposi's-sarcoma-associated herpes virus (KSHV) shows significant homology to cellular miR-155, including the entire miRNA 'seed' region. Using a range of assays, we show that expression of physiological levels of miR-K12-11 or miR-155 results in the downregulation of an extensive set of common mRNA targets, including genes with known roles in cell growth regulation. Our findings indicate that viral miR-K12-11 functions as an orthologue of cellular miR-155 and probably evolved to exploit a pre-existing gene regulatory pathway in B cells. Moreover, the known aetiological role of miR-155 in B-cell transformation suggests that miR-K12-11 may contribute to the induction of KSHV-positive B-cell tumours in infected patients.


Subject(s)
Gene Expression Regulation , Human Herpesvirus 8/genetics , MicroRNAs/genetics , Viral RNA/genetics , Nucleic Acid Sequence Homology , 3' Untranslated Regions/genetics , 3' Untranslated Regions/metabolism , B-Lymphocytes/metabolism , B-Lymphocytes/pathology , Basic-Leucine Zipper Transcription Factors/genetics , Basic-Leucine Zipper Transcription Factors/metabolism , Cell Line , Viral Cell Transformation/genetics , Fanconi Anemia Complementation Group Proteins/genetics , Fanconi Anemia Complementation Group Proteins/metabolism , Gene Expression Profiling , Humans , MicroRNAs/metabolism , Proto-Oncogene Proteins c-fos/genetics , Proto-Oncogene Proteins c-fos/metabolism , Viral RNA/metabolism , Substrate Specificity
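
The abstract for this record argues that two miRNAs sharing the same 'seed' region can repress a common set of mRNA targets. A minimal Python sketch of why a shared seed implies shared candidate sites follows. It assumes the seed is nucleotides 2-8 of the miRNA (a common convention, not stated in the abstract) and that a target site is an exact reverse-complement match. The function names are hypothetical, the sequences in the usage example are arbitrary toy strings rather than real miRNAs, and this is not the assay-based approach used in the study.

```python
from typing import List

# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed(mirna: str) -> str:
    """Return nucleotides 2-8 of the miRNA (assumed seed definition)."""
    return mirna[1:8]


def seed_site(mirna: str) -> str:
    """Return the mRNA sequence that pairs with the seed (reverse complement)."""
    return seed(mirna).translate(COMPLEMENT)[::-1]


def find_seed_sites(mirna: str, utr: str) -> List[int]:
    """Return 0-based start positions of exact seed-match sites in a 3' UTR."""
    site = seed_site(mirna)
    return [
        i
        for i in range(len(utr) - len(site) + 1)
        if utr[i:i + len(site)] == site
    ]


if __name__ == "__main__":
    # Toy sequences (assumptions): identical seeds, different 3' ends.
    mirna_a = "UGGCAUCAGAACCUUGAGGCUA"
    mirna_b = "AGGCAUCAUUCGGAACUCCAAG"
    toy_utr = "AAAUGAUGCCAAACCC"
    print(find_seed_sites(mirna_a, toy_utr), find_seed_sites(mirna_b, toy_utr))
```

Because only the seed is compared, any two miRNAs with identical seeds return the same site positions here, regardless of how the rest of their sequences differ.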
15.
PLoS Comput Biol ; 6(12): e1001037, 2010 Dec 16.
Article in English | MEDLINE | ID: mdl-21187896

ABSTRACT

The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.


Subject(s)
Computational Biology/methods , Molecular Evolution , Markov Chains , Transcriptional Regulatory Elements/genetics , Sequence Alignment/methods , Animals , Base Sequence , Computer Simulation , Drosophila melanogaster/genetics , Gene Expression Regulation , Molecular Sequence Data , Phylogeny , ROC Curve , DNA Sequence Analysis
16.
Bioinformatics ; 25(2): 175-82, 2009 Jan 15.
Article in English | MEDLINE | ID: mdl-19017657

ABSTRACT

MOTIVATION: The modeling of conservation patterns in genomic DNA has become increasingly popular for a number of bioinformatic applications. While several systems developed to date incorporate context-dependence in their substitution models, the impact on computational complexity and generalization ability of the resulting higher order models invites the question of whether simpler approaches to context modeling might permit appreciable reductions in model complexity and computational cost, without sacrificing prediction accuracy. RESULTS: We formulate several alternative methods for context modeling based on windowed Bayesian networks, and compare their effects on both accuracy and computational complexity for the task of discriminating functionally distinct segments in vertebrate DNA. Our results show that substantial reductions in the complexity of both the model and the associated inference algorithm can be achieved without reducing predictive accuracy.


Subject(s)
DNA Sequence Analysis/methods , Algorithms , Bayes Theorem , Computer Simulation , DNA/chemistry , Genome , Genetic Models , Software
17.
PLoS Biol ; 4(9): e286, 2006 Sep.
Article in English | MEDLINE | ID: mdl-16933976

ABSTRACT

The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.


Subject(s)
Protozoan Genome , Macronucleus/genetics , Biological Models , Tetrahymena thermophila/genetics , Animals , Cultured Cells , Chromosome Mapping/methods , Chromosomes , Genetic Databases , Eukaryotic Cells/physiology , Molecular Evolution , Germline Micronucleus/genetics , Animal Models , Phylogeny , Signal Transduction
18.
Genome Biol Evol ; 11(10): 3035-3053, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31599933

ABSTRACT

Changes in transcriptional regulation are thought to be a major contributor to the evolution of phenotypic traits, but the contribution of changes in chromatin accessibility to the evolution of gene expression remains almost entirely unknown. To address this important gap in knowledge, we developed a new method to identify DNase I Hypersensitive (DHS) sites with differential chromatin accessibility between species using a joint modeling approach. Our method overcomes several limitations inherent to conventional threshold-based pairwise comparisons that become increasingly apparent as the number of species analyzed rises. Our approach employs a single quantitative test which is more sensitive than existing pairwise methods. To illustrate, we applied our joint approach to DHS sites in fibroblast cells from five primates (human, chimpanzee, gorilla, orangutan, and rhesus macaque). We identified 89,744 DHS sites, of which 41% are identified as differential between species using the joint model compared with 33% using the conventional pairwise approach. The joint model provides a principled approach to distinguishing single from multiple chromatin accessibility changes among species. We found that nondifferential DHS sites are enriched for nucleotide conservation. Differential DHS sites with decreased chromatin accessibility relative to rhesus macaque occur more commonly near transcription start sites (TSS), while those with increased chromatin accessibility occur more commonly distal to TSS. Further, differential DHS sites near TSS are less cell type-specific than more distal regulatory elements. Taken together, these results point to distinct classes of DHS sites, each with distinct characteristics of selection, genomic location, and cell type specificity.


Subject(s)
Chromatin/chemistry , Molecular Evolution , Animals , Cell Line , Deoxyribonuclease I , Genomics , Gorilla gorilla/genetics , Humans , Macaca mulatta/genetics , Genetic Models , Pan troglodytes/genetics , Pongo/genetics , Transcription Initiation Site
20.
Nat Commun ; 9(1): 5317, 2018 12 21.
Article in English | MEDLINE | ID: mdl-30575722

ABSTRACT

Environmental stimuli commonly act via changes in gene regulation. Human-genome-scale assays to measure such responses are indirect or require knowledge of the transcription factors (TFs) involved. Here, we present the use of human genome-wide high-throughput reporter assays to measure environmentally-responsive regulatory element activity. We focus on responses to glucocorticoids (GCs), an important class of pharmaceuticals and a paradigmatic genomic response model. We assay GC-responsive regulatory activity across >10^8 unique DNA fragments, covering the human genome at >50×. Those assays directly detected thousands of GC-responsive regulatory elements genome-wide. We then validate those findings with measurements of transcription factor occupancy, histone modifications, chromatin accessibility, and gene expression. We also detect allele-specific environmental responses. Notably, the assays did not require knowledge of GC response mechanisms. Thus, this technology can be used to agnostically quantify genomic responses for which the underlying mechanism remains unknown.


Subject(s)
Gene Expression Regulation/drug effects , Human Genome , Glucocorticoids/pharmacology , Transcriptional Regulatory Elements/drug effects , Gene-Environment Interaction , High-Throughput Screening Assays , Humans