Results 1 - 20 of 33
1.
Mol Ther ; 29(11): 3243-3257, 2021 Nov 03.
Article in English | MEDLINE | ID: mdl-34509668

ABSTRACT

Targeted gene-editing strategies have emerged as promising therapeutic approaches for the permanent treatment of inherited genetic diseases. However, precise gene correction and insertion approaches using homology-directed repair are still limited by low efficiencies. Consequently, many gene-editing strategies have focused on removal or disruption, rather than repair, of genomic DNA. In contrast, homology-independent targeted integration (HITI) has been reported to effectively insert DNA sequences at targeted genomic loci. This approach could be particularly useful for restoring full-length sequences of genes affected by a spectrum of mutations that are also too large to deliver by conventional adeno-associated virus (AAV) vectors. Here, we utilize an AAV-based, HITI-mediated approach for correction of full-length dystrophin expression in a humanized mouse model of Duchenne muscular dystrophy (DMD). We co-deliver CRISPR-Cas9 and a donor DNA sequence to insert the missing human exon 52 into its corresponding position within the DMD gene and achieve full-length dystrophin correction in skeletal and cardiac muscle. Additionally, as a proof-of-concept strategy to correct genetic mutations characterized by diverse patient mutations, we deliver a superexon donor encoding the last 28 exons of the DMD gene as a therapeutic strategy to restore full-length dystrophin in >20% of the DMD patient population. This work highlights the potential of HITI-mediated gene correction for diverse DMD mutations and advances genome editing toward realizing the promise of full-length gene restoration to treat genetic disease.

2.
Am J Hum Genet ; 108(8): 1436-1449, 2021 08 05.
Article in English | MEDLINE | ID: mdl-34216551

ABSTRACT

Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. We performed targeted long-read sequencing (T-LRS) using adaptive sampling on the Oxford Nanopore platform on 40 individuals, 10 of whom lacked a complete molecular diagnosis. We computationally targeted up to 151 Mbp of sequence per individual and searched for pathogenic substitutions, structural variants, and methylation differences using a single data source. We detected all genomic aberrations (including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences) identified by prior clinical testing. In 8/8 individuals with complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, leading to changes in clinical management in one case. In ten individuals with suspected Mendelian conditions lacking a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in six and variants of uncertain significance in two others. T-LRS accurately identifies pathogenic structural variants, resolves complex rearrangements, and identifies Mendelian variants not detected by other technologies. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority genes and regions or complex clinical testing results.


Subjects
Chromosome Aberrations; Cytogenetic Analysis/methods; Genetic Diseases, Inborn/diagnosis; Genetic Diseases, Inborn/genetics; Genetic Predisposition to Disease; Genome, Human; Mutation; DNA Copy Number Variations; Female; Genetic Testing; High-Throughput Nucleotide Sequencing; Humans; Karyotyping; Male; Sequence Analysis, DNA
3.
Genome Res ; 31(5): 877-889, 2021 May.
Article in English | MEDLINE | ID: mdl-33722938

ABSTRACT

High-throughput reporter assays such as self-transcribing active regulatory region sequencing (STARR-seq) have made it possible to measure regulatory element activity across the entire human genome at once. The resulting data, however, present substantial analytical challenges. Here, we identify technical biases that explain most of the variance in STARR-seq data. We then develop a statistical model to correct those biases and to improve detection of regulatory elements. This approach substantially improves precision and recall over current methods, improves detection of both activating and repressive regulatory elements, and controls for false discoveries despite strong local correlations in signal.

4.
Bioinformatics ; 36(2): 331-338, 2020 01 15.
Article in English | MEDLINE | ID: mdl-31368479

ABSTRACT

MOTIVATION: High-throughput reporter assays dramatically improve our ability to assign function to noncoding genetic variants, by measuring allelic effects on gene expression in the controlled setting of a reporter gene. Unlike genetic association tests, such assays are not confounded by linkage disequilibrium when loci are independently assayed. These methods can thus improve the identification of causal disease mutations. While work continues on improving experimental aspects of these assays, less effort has gone into developing methods for assessing the statistical significance of assay results, particularly in the case of rare variants captured from patient DNA. RESULTS: We describe a Bayesian hierarchical model, called Bayesian Inference of Regulatory Differences, which integrates prior information and explicitly accounts for variability between experimental replicates. The model produces substantially more accurate predictions than existing methods when allele frequencies are low, which is of clear advantage in the search for disease-causing variants in DNA captured from patient cohorts. Using the model, we demonstrate a clear tradeoff between variant sequencing coverage and numbers of biological replicates, and we show that the use of additional biological replicates decreases variance in estimates of effect size, due to the properties of the Poisson-binomial distribution. We also provide a power and sample size calculator, which facilitates decision making in experimental design parameters. AVAILABILITY AND IMPLEMENTATION: The software is freely available from www.geneprediction.org/bird. The experimental design web tool can be accessed at http://67.159.92.22:8080. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Software; Alleles; Bayes Theorem; Gene Frequency; Humans; Linkage Disequilibrium
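
The abstract above reports that adding biological replicates decreases the variance of effect-size estimates. The following is only a toy simulation of that qualitative point, not the Bayesian model described in the paper: it assumes each replicate has its own allele fraction drawn from a Beta distribution, pools reads across replicates, and keeps the total read budget fixed. The function names, the Beta noise model, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def simulate_estimate(rng, n_replicates, total_reads,
                      mean_fraction=0.5, concentration=10.0):
    """Simulate one experiment and return the pooled alternate-allele fraction.

    Assumption: replicate-level allele fractions follow a Beta distribution
    centred on mean_fraction; reads are then sampled within each replicate.
    """
    a = mean_fraction * concentration
    b = (1.0 - mean_fraction) * concentration
    reads_per_replicate = total_reads // n_replicates
    alt_reads = 0
    for _ in range(n_replicates):
        replicate_fraction = rng.betavariate(a, b)
        # Each read supports the alternate allele with the replicate's fraction.
        alt_reads += sum(rng.random() < replicate_fraction
                         for _ in range(reads_per_replicate))
    return alt_reads / (reads_per_replicate * n_replicates)


def estimate_variance(n_replicates, total_reads=240, trials=1000, seed=0):
    """Variance of the pooled estimate across many simulated experiments."""
    rng = random.Random(seed)
    estimates = [simulate_estimate(rng, n_replicates, total_reads)
                 for _ in range(trials)]
    return statistics.pvariance(estimates)


# Same total read budget, split across more and more replicates.
for replicates in (2, 4, 8):
    print(replicates, estimate_variance(replicates))
```

Under these assumptions the between-replicate noise is averaged over more draws as the replicate count grows, so the printed variance should shrink even though the total number of reads is unchanged.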
5.
Genome Biol Evol ; 11(10): 3035-3053, 2019 10 01.
Article in English | MEDLINE | ID: mdl-31599933

ABSTRACT

Changes in transcriptional regulation are thought to be a major contributor to the evolution of phenotypic traits, but the contribution of changes in chromatin accessibility to the evolution of gene expression remains almost entirely unknown. To address this important gap in knowledge, we developed a new method to identify DNase I Hypersensitive (DHS) sites with differential chromatin accessibility between species using a joint modeling approach. Our method overcomes several limitations inherent to conventional threshold-based pairwise comparisons that become increasingly apparent as the number of species analyzed rises. Our approach employs a single quantitative test which is more sensitive than existing pairwise methods. To illustrate, we applied our joint approach to DHS sites in fibroblast cells from five primates (human, chimpanzee, gorilla, orangutan, and rhesus macaque). We identified 89,744 DHS sites, of which 41% are identified as differential between species using the joint model compared with 33% using the conventional pairwise approach. The joint model provides a principled approach to distinguishing single from multiple chromatin accessibility changes among species. We found that nondifferential DHS sites are enriched for nucleotide conservation. Differential DHS sites with decreased chromatin accessibility relative to rhesus macaque occur more commonly near transcription start sites (TSS), while those with increased chromatin accessibility occur more commonly distal to TSS. Further, differential DHS sites near TSS are less cell type-specific than more distal regulatory elements. Taken together, these results point to distinct classes of DHS sites, each with distinct characteristics of selection, genomic location, and cell type specificity.


Subjects
Chromatin/chemistry; Evolution, Molecular; Animals; Cell Line; Deoxyribonuclease I; Genomics; Gorilla gorilla/genetics; Humans; Macaca mulatta/genetics; Models, Genetic; Pan troglodytes/genetics; Pongo/genetics; Transcription Initiation Site
6.
Nat Commun ; 9(1): 5317, 2018 12 21.
Article in English | MEDLINE | ID: mdl-30575722

ABSTRACT

Environmental stimuli commonly act via changes in gene regulation. Human-genome-scale assays to measure such responses are indirect or require knowledge of the transcription factors (TFs) involved. Here, we present the use of human genome-wide high-throughput reporter assays to measure environmentally-responsive regulatory element activity. We focus on responses to glucocorticoids (GCs), an important class of pharmaceuticals and a paradigmatic genomic response model. We assay GC-responsive regulatory activity across >10^8 unique DNA fragments, covering the human genome at >50×. Those assays directly detected thousands of GC-responsive regulatory elements genome-wide. We then validate those findings with measurements of transcription factor occupancy, histone modifications, chromatin accessibility, and gene expression. We also detect allele-specific environmental responses. Notably, the assays did not require knowledge of GC response mechanisms. Thus, this technology can be used to agnostically quantify genomic responses for which the underlying mechanism remains unknown.


Subjects
Gene Expression Regulation/drug effects; Genome, Human; Glucocorticoids/pharmacology; Regulatory Elements, Transcriptional/drug effects; Gene-Environment Interaction; High-Throughput Screening Assays; Humans
7.
Genome Res ; 28(9): 1272-1284, 2018 09.
Article in English | MEDLINE | ID: mdl-30097539

ABSTRACT

Glucocorticoids are potent steroid hormones that regulate immunity and metabolism by activating the transcription factor (TF) activity of glucocorticoid receptor (GR). Previous models have proposed that DNA binding motifs and sites of chromatin accessibility predetermine GR binding and activity. However, there are vast excesses of both features relative to the number of GR binding sites. Thus, these features alone are unlikely to account for the specificity of GR binding and activity. To identify genomic and epigenetic contributions to GR binding specificity and the downstream changes resultant from GR binding, we performed hundreds of genome-wide measurements of TF binding, epigenetic state, and gene expression across a 12-h time course of glucocorticoid exposure. We found that glucocorticoid treatment induces GR to bind to nearly all pre-established enhancers within minutes. However, GR binds to only a small fraction of the set of accessible sites that lack enhancer marks. Once GR is bound to enhancers, a combination of enhancer motif composition and interactions between enhancers then determines the strength and persistence of GR binding, which consequently correlates with dramatic shifts in enhancer activation. Over the course of several hours, highly coordinated changes in TF binding and histone modification occupancy occur specifically within enhancers, and these changes correlate with changes in the expression of nearby genes. Following GR binding, changes in the binding of other TFs precede changes in chromatin accessibility, suggesting that other TFs are also sensitive to genomic features beyond that of accessibility.


Subjects
Enhancer Elements, Genetic; Histone Code; Nucleotide Motifs; Receptors, Glucocorticoid/metabolism; Transcriptional Activation; Cell Line, Tumor; Epigenesis, Genetic; Humans; Protein Binding; Transcription Factors/metabolism
8.
Cell Syst ; 7(2): 146-160.e7, 2018 08 22.
Article in English | MEDLINE | ID: mdl-30031775

ABSTRACT

The glucocorticoid receptor (GR) is a hormone-inducible transcription factor involved in metabolic and anti-inflammatory gene expression responses. To investigate what controls interactions between GR binding sites and their target genes, we used in situ Hi-C to generate high-resolution, genome-wide maps of chromatin interactions before and after glucocorticoid treatment. We found that GR binding to the genome typically does not cause new chromatin interactions to target genes but instead acts through chromatin interactions that already exist prior to hormone treatment. Both glucocorticoid-induced and glucocorticoid-repressed genes increased interactions with distal GR binding sites. In addition, while glucocorticoid-induced genes increased interactions with transcriptionally active chromosome compartments, glucocorticoid-repressed genes increased interactions with transcriptionally silent compartments. Lastly, while the architectural DNA-binding proteins CTCF and RAD21 were bound to most chromatin interactions, we found that glucocorticoid-responsive chromatin interactions were depleted for CTCF binding but enriched for RAD21. Together, these findings offer new insights into the mechanisms underlying GC-mediated gene activation and repression.


Subjects
Chromatin/metabolism; Gene Expression Regulation; Glucocorticoids/metabolism; Receptors, Glucocorticoid/metabolism; Binding Sites; CCCTC-Binding Factor/metabolism; Cell Cycle Proteins; Cell Line; Chromatin/genetics; DNA-Binding Proteins; Genome, Human; Humans; Nuclear Proteins/metabolism; Phosphoproteins/metabolism; Protein Binding
9.
Bioinformatics ; 34(21): 3616-3623, 2018 11 01.
Article in English | MEDLINE | ID: mdl-29701825

ABSTRACT

Motivation: Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed and produce functional proteins. Results: We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and non-coding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or non-coding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products and we propose that they may commonly act as cryptic factors in disease. Availability and implementation: The software is available from geneprediction.org/SGRF. Supplementary information: Supplementary information is available at Bioinformatics online.


Subjects
Exons; RNA Splicing; Software; Humans; Molecular Sequence Annotation; Sequence Analysis, RNA
11.
PLoS One ; 12(8): e0181604, 2017.
Article in English | MEDLINE | ID: mdl-28797091

ABSTRACT

There is broad agreement that genetic mutations occurring outside of the protein-coding regions play a key role in human disease. Despite this consensus, we are not yet capable of discerning which portions of non-coding sequence are important in the context of human disease. Here, we present Orion, an approach that detects regions of the non-coding genome that are depleted of variation, suggesting that the regions are intolerant of mutations and subject to purifying selection in the human lineage. We show that Orion is highly correlated with known intolerant regions as well as regions that harbor putatively pathogenic variation. This approach provides a mechanism to identify pathogenic variation in the human non-coding genome and will have immediate utility in the diagnostic interpretation of patient genomes and in large case control studies using whole-genome sequences.


Subjects
Genetic Variation; Genome, Human; Genetic Predisposition to Disease; Genetics, Population; Humans; Models, Genetic; Mutation; Open Reading Frames; Selection, Genetic
12.
Bioinformatics ; 33(10): 1437-1446, 2017 May 15.
Article in English | MEDLINE | ID: mdl-28011790

ABSTRACT

Motivation: The accurate interpretation of genetic variants is critical for characterizing genotype-phenotype associations. Because the effects of genetic variants can depend strongly on their local genomic context, accurate genome annotations are essential. Furthermore, as some variants have the potential to disrupt or alter gene structure, variant interpretation efforts stand to gain from the use of individualized annotations that account for differences in gene structure between individuals or strains. Results: We describe a suite of software tools for identifying possible functional changes in gene structure that may result from sequence variants. ACE ('Assessing Changes to Exons') converts phased genotype calls to a collection of explicit haplotype sequences, maps transcript annotations onto them, detects gene-structure changes and their possible repercussions, and identifies several classes of possible loss of function. Novel transcripts predicted by ACE are commonly supported by spliced RNA-seq reads, and can be used to improve read alignment and transcript quantification when an individual-specific genome sequence is available. Using publicly available RNA-seq data, we show that ACE predictions confirm earlier results regarding the quantitative effects of nonsense-mediated decay, and we show that predicted loss-of-function events are highly concordant with patterns of intolerance to mutations across the human population. ACE can be readily applied to diverse species including animals and plants, making it a broadly useful tool for use in eukaryotic population-based resequencing projects, particularly for assessing the joint impact of all variants at a locus. Availability and Implementation: ACE is written in open-source C++ and Perl and is available from geneprediction.org/ACE. Contact: myandell@genetics.utah.edu or tim.reddy@duke.edu. Supplementary information: Supplementary information is available at Bioinformatics online.


Subjects
Genomics/methods; Polymorphism, Genetic; Sequence Analysis, RNA/methods; Software; Animals; Eukaryota/genetics; Exons; Haplotypes; Humans; Mutation; RNA Splicing
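
The abstract above says that ACE converts phased genotype calls into explicit haplotype sequences. A minimal sketch of that first step might look as follows. It is not ACE's code (ACE is written in C++ and Perl); it handles only single-nucleotide substitutions, and the function name, the tuple layout, and the toy reference sequence are assumptions made for illustration. The "0|1" notation follows the usual phased-genotype convention.

```python
def build_haplotypes(reference, phased_snvs):
    """Return the two haplotype sequences implied by phased SNV calls.

    reference:   reference sequence as a string.
    phased_snvs: list of (zero_based_position, ref_base, alt_base, genotype),
                 where genotype is a phased call such as "0|1" or "1|1".
    Only substitutions are supported in this sketch; indels would shift
    coordinates and need extra bookkeeping.
    """
    haplotypes = [list(reference), list(reference)]
    for position, ref_base, alt_base, genotype in phased_snvs:
        if reference[position] != ref_base:
            raise ValueError(f"reference mismatch at position {position}")
        # The allele before "|" belongs to haplotype 0, the one after to haplotype 1.
        for haplotype, allele in zip(haplotypes, genotype.split("|")):
            if allele == "1":
                haplotype[position] = alt_base
    return "".join(haplotypes[0]), "".join(haplotypes[1])


# Toy input: one heterozygous and one homozygous-alternate substitution.
toy_reference = "ACGTACGT"
toy_calls = [(1, "C", "T", "0|1"), (4, "A", "G", "1|1")]
print(build_haplotypes(toy_reference, toy_calls))
# -> ('ACGTGCGT', 'ATGTGCGT')
```

Transcript annotations could then be projected onto each haplotype separately, which is where an individualized annotation would begin.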
13.
Cell ; 166(5): 1269-1281.e19, 2016 Aug 25.
Article in English | MEDLINE | ID: mdl-27565349

ABSTRACT

The glucocorticoid receptor (GR) binds the human genome at >10,000 sites but only regulates the expression of hundreds of genes. To determine the functional effect of each site, we measured the glucocorticoid (GC) responsive activity of nearly all GR binding sites (GBSs) captured using chromatin immunoprecipitation (ChIP) in A549 cells. 13% of GBSs assayed had GC-induced activity. The responsive sites were defined by direct GR binding via a GC response element (GRE) and exclusively increased reporter-gene expression. Meanwhile, most GBSs lacked GC-induced reporter activity. The non-responsive sites had epigenetic features of steady-state enhancers and clustered around direct GBSs. Together, our data support a model in which clusters of GBSs observed with ChIP-seq reflect interactions between direct and tethered GBSs over tens of kilobases. We further show that those interactions can synergistically modulate the activity of direct GBSs and may therefore play a major role in driving gene activation in response to GCs.


Subjects
Genome, Human; Glucocorticoids/metabolism; Receptors, Glucocorticoid/metabolism; Transcription Factors/metabolism; Transcriptional Activation; A549 Cells; Binding Sites/drug effects; Chromatin Immunoprecipitation; Dexamethasone/metabolism; Dexamethasone/pharmacology; Genes, Reporter; Glucocorticoids/pharmacology; Humans; Protein Binding/drug effects; Response Elements
14.
Genetics ; 203(2): 699-714, 2016 06.
Article in English | MEDLINE | ID: mdl-27098910

ABSTRACT

Research on the genetics of natural populations was revolutionized in the 1990s by methods for genotyping noninvasively collected samples. However, these methods have remained largely unchanged for the past 20 years and lag far behind the genomics era. To close this gap, here we report an optimized laboratory protocol for genome-wide capture of endogenous DNA from noninvasively collected samples, coupled with a novel computational approach to reconstruct pedigree links from the resulting low-coverage data. We validated both methods using fecal samples from 62 wild baboons, including 48 from an independently constructed extended pedigree. We enriched fecal-derived DNA samples up to 40-fold for endogenous baboon DNA and reconstructed near-perfect pedigree relationships even with extremely low-coverage sequencing. We anticipate that these methods will be broadly applicable to the many research systems for which only noninvasive samples are available. The lab protocol and software ("WHODAD") are freely available at www.tung-lab.org/protocols-and-software.html and www.xzlab.org/software.html, respectively.


Subjects
Genome; Genotyping Techniques/methods; Papio/genetics; Pedigree; Sequence Analysis, DNA/methods; Animals; Feces/chemistry; Software
15.
Genome Res ; 25(8): 1206-14, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26084464

ABSTRACT

We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter gene expression assay. As demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. In agreement with previous reports, we found that most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify causal regulatory haplotypes that likely contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses.


Subjects
Genetic Variation; High-Throughput Nucleotide Sequencing/methods; Regulatory Sequences, Nucleic Acid; Computational Biology/methods; Genome, Human; Haplotypes; Humans; Patient-Specific Modeling; Quantitative Trait Loci
16.
Nat Commun ; 6: 6244, 2015 Feb 18.
Article in English | MEDLINE | ID: mdl-25692716

ABSTRACT

The CRISPR/Cas9 genome-editing platform is a promising technology to correct the genetic basis of hereditary diseases. The versatility, efficiency and multiplexing capabilities of the CRISPR/Cas9 system enable a variety of otherwise challenging gene correction strategies. Here, we use the CRISPR/Cas9 system to restore the expression of the dystrophin gene in cells carrying dystrophin mutations that cause Duchenne muscular dystrophy (DMD). We design single or multiplexed sgRNAs to restore the dystrophin reading frame by targeting the mutational hotspot at exons 45-55 and introducing shifts within exons or deleting one or more exons. Following gene editing in DMD patient myoblasts, dystrophin expression is restored in vitro. Human dystrophin is also detected in vivo after transplantation of genetically corrected patient cells into immunodeficient mice. Importantly, the unique multiplex gene-editing capabilities of the CRISPR/Cas9 system facilitate the generation of a single large deletion that can correct up to 62% of DMD mutations.


Subjects
CRISPR-Cas Systems/genetics; Dystrophin/genetics; Genome; Muscular Dystrophy, Duchenne/genetics; Mutation; Animals; Cell Separation; Disease Models, Animal; Exons; Flow Cytometry; Gene Deletion; Genetic Therapy/methods; HEK293 Cells; High-Throughput Nucleotide Sequencing; Humans; Male; Mice; Mice, SCID; Plasmids/metabolism; Polymerase Chain Reaction
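
The abstract above describes restoring the dystrophin reading frame by deleting one or more exons. The underlying arithmetic is that the downstream frame is preserved when the total number of coding bases removed is a multiple of three. The sketch below illustrates only that check; the exon numbers and lengths are hypothetical values chosen for the example, not real dystrophin exon sizes, and the function name is an assumption.

```python
def frame_preserved(exon_lengths, deleted_exons):
    """Return True if removing the given exons keeps the downstream frame.

    exon_lengths:  dict mapping exon number -> coding length in base pairs.
    deleted_exons: exon numbers missing from the transcript in total
                   (the patient's deletion plus any exons removed by editing).
    """
    removed_bp = sum(exon_lengths[exon] for exon in deleted_exons)
    return removed_bp % 3 == 0


# Hypothetical exon lengths, for illustration only.
toy_lengths = {1: 90, 2: 100, 3: 110, 4: 120}

patient_deletion = [2]  # 100 bp lost, so the frame is shifted
print(frame_preserved(toy_lengths, patient_deletion))        # False
print(frame_preserved(toy_lengths, patient_deletion + [3]))  # 210 bp lost: True
```

In this toy case, additionally removing exon 3 brings the total loss to a multiple of three, which is the same logic by which a single large deletion can correct many different patient mutations.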
17.
Mol Ther ; 23(3): 523-32, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25492562

ABSTRACT

Duchenne muscular dystrophy (DMD) is caused by genetic mutations that result in the absence of dystrophin protein expression. Oligonucleotide-induced exon skipping can restore the dystrophin reading frame and protein production. However, this requires continuous drug administration and may not generate complete skipping of the targeted exon. In this study, we apply genome editing with zinc finger nucleases (ZFNs) to permanently remove essential splicing sequences in exon 51 of the dystrophin gene and thereby exclude exon 51 from the resulting dystrophin transcript. This approach can restore the dystrophin reading frame in ~13% of DMD patient mutations. Transfection of two ZFNs targeted to sites flanking the exon 51 splice acceptor into DMD patient myoblasts led to deletion of this genomic sequence. A clonal population was isolated with this deletion and following differentiation we confirmed loss of exon 51 from the dystrophin mRNA transcript and restoration of dystrophin protein expression. Furthermore, transplantation of corrected cells into immunodeficient mice resulted in human dystrophin expression localized to the sarcolemmal membrane. Finally, we quantified ZFN toxicity in human cells and mutagenesis at predicted off-target sites. This study demonstrates a powerful method to restore the dystrophin reading frame and protein expression by permanently deleting exons.


Subjects
Dystrophin/genetics; Exons; Genetic Therapy/methods; RNA Editing; RNA, Messenger/genetics; Zinc Fingers/genetics; Animals; Base Sequence; Dystrophin/biosynthesis; Dystrophin/chemistry; Electroporation; Endonucleases/genetics; Endonucleases/metabolism; Humans; Mice; Mice, Inbred NOD; Mice, SCID; Molecular Sequence Data; Muscular Dystrophy, Duchenne/genetics; Muscular Dystrophy, Duchenne/metabolism; Muscular Dystrophy, Duchenne/pathology; Muscular Dystrophy, Duchenne/therapy; Myoblasts/metabolism; Myoblasts/pathology; Open Reading Frames; Plasmids/chemistry; Plasmids/genetics; RNA Splicing; RNA, Messenger/chemistry; RNA, Messenger/metabolism; Sequence Deletion
18.
Bioinformatics ; 30(14): 1958-64, 2014 Jul 15.
Article in English | MEDLINE | ID: mdl-24659106

ABSTRACT

MOTIVATION: High-throughput sequencing of RNA in vivo facilitates many applications, not the least of which is the cataloging of variant splice isoforms of protein-coding messenger RNAs. Although many solutions have been proposed for reconstructing putative isoforms from deep sequencing data, these generally take as their substrate the collective alignment structure of RNA-seq reads and ignore the biological signals present in the actual nucleotide sequence. The majority of these solutions are graph-theoretic, relying on a splice graph representing the splicing patterns and exon expression levels indicated by the spliced-alignment process. RESULTS: We show how to augment splice graphs with additional information reflecting the biology of transcription, splicing and translation, to produce what we call an ORF (open reading frame) graph. We then show how ORF graphs can be used to produce isoform predictions with higher accuracy than current state-of-the-art approaches. AVAILABILITY AND IMPLEMENTATION: RSVP is available as C++ source code under an open-source licence: http://ohlerlab.mdc-berlin.de/software/RSVP/.


Subjects
High-Throughput Nucleotide Sequencing/methods; Open Reading Frames; RNA Isoforms/chemistry; Sequence Analysis, RNA/methods; Arabidopsis/genetics; Exons; Humans; RNA Isoforms/metabolism; RNA Splicing; Software
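
The abstract above describes augmenting a splice graph with sequence-level information to obtain an ORF graph. The sketch below is a much simpler stand-in, not the RSVP algorithm: it enumerates exon paths through a toy splice graph and keeps only the paths whose concatenated sequence forms a complete open reading frame. The graph, the exon sequences, and every function name are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def enumerate_paths(edges, start, end):
    """Yield every path from start to end in an acyclic splice graph."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        for successor in edges.get(node, []):
            stack.append((successor, path + [successor]))


def has_complete_orf(sequence):
    """True if the sequence is ATG...stop with no earlier in-frame stop codon."""
    if len(sequence) % 3 != 0 or not sequence.startswith("ATG"):
        return False
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])


def coding_isoforms(exon_sequences, edges, start, end):
    """Return the exon paths whose spliced sequence is a complete ORF."""
    kept = []
    for path in enumerate_paths(edges, start, end):
        spliced = "".join(exon_sequences[exon] for exon in path)
        if has_complete_orf(spliced):
            kept.append(path)
    return sorted(kept)


# Toy splice graph: e2 is an optional 2 bp cassette exon that breaks the frame.
toy_exons = {"e1": "ATGGCT", "e2": "GA", "e3": "AAATAA"}
toy_edges = {"e1": ["e2", "e3"], "e2": ["e3"]}
print(coding_isoforms(toy_exons, toy_edges, "e1", "e3"))
# -> [['e1', 'e3']]
```

Filtering after enumeration is the simplest way to show the idea; a method that builds reading-frame information into the graph itself can avoid enumerating paths that are already known to be non-coding.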
19.
Bioinformatics ; 29(13): i27-35, 2013 Jul 01.
Article in English | MEDLINE | ID: mdl-23812993

ABSTRACT

MOTIVATION: Computational approaches for the annotation of phenotypes from image data have shown promising results across many applications, and provide rich and valuable information for studying gene function and interactions. While data are often available both at high spatial resolution and across multiple time points, phenotypes are frequently annotated independently, for individual time points only. In particular, for the analysis of developmental gene expression patterns, it is biologically sensible when images across multiple time points are jointly accounted for, such that spatial and temporal dependencies are captured simultaneously. METHODS: We describe a discriminative undirected graphical model to label gene-expression time-series image data, with an efficient training and decoding method based on the junction tree algorithm. The approach is based on an effective feature selection technique, consisting of a non-parametric sparse Bayesian factor analysis model. The result is a flexible framework, which can handle large-scale data with noisy incomplete samples, i.e. it can tolerate data missing from individual time points. RESULTS: Using the annotation of gene expression patterns across stages of Drosophila embryonic development as an example, we demonstrate that our method achieves superior accuracy, gained by jointly annotating phenotype sequences, when compared with previous models that annotate each stage in isolation. The experimental results on missing data indicate that our joint learning method successfully annotates genes for which no expression data are available for one or more stages.


Subjects
Gene Expression Profiling/methods; Image Processing, Computer-Assisted/methods; Models, Statistical; Algorithms; Animals; Bayes Theorem; Drosophila/embryology; Drosophila/genetics; Embryonic Development/genetics; Factor Analysis, Statistical; In Situ Hybridization; RNA, Messenger/analysis; RNA, Messenger/chemistry; Statistics, Nonparametric; Vocabulary, Controlled
20.
Nat Methods ; 10(7): 630-3, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23708386

ABSTRACT

High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute proteins can pinpoint a microRNA (miRNA) target site within tens of bases but leaves the identity of the miRNA unresolved. A flexible computational framework, microMUMMIE, integrates sequence with cross-linking features and reliably identifies the miRNA family involved in each binding event. It considerably outperforms sequence-only approaches and quantifies the prevalence of noncanonical binding modes.


Subjects
Algorithms; Protein Interaction Mapping/methods; RNA-Binding Proteins/genetics; RNA/genetics; RNA/metabolism; Sequence Analysis, RNA/methods; Systems Integration
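
The abstract above contrasts microMUMMIE with sequence-only approaches. For orientation, a bare-bones sequence-only baseline can be sketched as a scan for canonical seed matches, where a target site is taken to be the reverse complement of miRNA nucleotides 2 to 8. This is not microMUMMIE, which additionally models cross-linking features; the miRNA and target sequences below are invented, and the function names are assumptions.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the target-side motif complementary to miRNA nucleotides 2-8."""
    seed = mirna[1:8]
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, target):
    """Return zero-based start positions of seed-match sites in the target RNA."""
    site = seed_site(mirna)
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Invented sequences, for illustration only.
toy_mirna = "UACGGAUCCGAUUGCAAGCUAA"
toy_target = "AAGAUCCGUAAAGAUCCGUCC"
print(seed_site(toy_mirna))                      # GAUCCGU
print(find_seed_matches(toy_mirna, toy_target))  # [2, 12]
```

A scan like this cannot distinguish between miRNAs from families with similar seeds and misses noncanonical binding modes, which is the gap that combining sequence with cross-linking evidence is meant to close.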