Results 1 - 9 of 9
1.
Genome Biol ; 15(3): R53, 2014 Mar 25.
Article in English | MEDLINE | ID: mdl-24667040

ABSTRACT

BACKGROUND: There is tremendous potential for genome sequencing to improve clinical diagnosis and care once it becomes routinely accessible, but this will require formalizing research methods into clinical best practices in the areas of sequence data generation, analysis, interpretation and reporting. The CLARITY Challenge was designed to spur convergence in methods for diagnosing genetic disease starting from clinical case history and genome sequencing data. DNA samples were obtained from three families with heritable genetic disorders and genomic sequence data were donated by sequencing platform vendors. The challenge was to analyze and interpret these data with the goals of identifying disease-causing variants and reporting the findings in a clinically useful format. Participating contestant groups were solicited broadly, and an independent panel of judges evaluated their performance. RESULTS: A total of 30 international groups were engaged. The entries reveal a general convergence of practices on most elements of the analysis and interpretation process. However, even given this commonality of approach, only two groups identified the consensus candidate variants in all disease cases, demonstrating a need for consistent fine-tuning of the generally accepted methods. There was greater diversity in the final clinical report content and in the patient consenting process, demonstrating that these areas require additional exploration and standardization. CONCLUSIONS: The CLARITY Challenge provides a comprehensive assessment of current practices for using genome sequencing to diagnose and report genetic diseases. There is remarkable convergence in bioinformatic techniques, but medical interpretation and reporting are areas that require further development by many groups.


Subjects
Databases, Genetic/standards , Genetic Testing/methods , Genomics/methods , Peer Review, Research , Sequence Analysis, DNA/methods , Child , Female , Financing, Organized , Genetic Testing/economics , Genetic Testing/standards , Genomics/economics , Genomics/standards , Heart Defects, Congenital/diagnosis , Heart Defects, Congenital/genetics , Humans , Male , Myopathies, Structural, Congenital/diagnosis , Myopathies, Structural, Congenital/genetics , Sequence Analysis, DNA/economics , Sequence Analysis, DNA/standards
2.
Methods ; 59(1): S24-8, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23036331

ABSTRACT

In recent years, gene fusions have gained significant recognition as biomarkers. They can assist treatment decisions, are seldom found in normal tissue and are detectable through next-generation sequencing (NGS) of the transcriptome (RNA-seq). To transform the data provided by the sequencer into robust gene fusion detection, several analysis steps are needed. Usually the first step is to map the sequenced transcript fragments (RNA-seq) to a reference genome. One standard application of this approach is to estimate expression and detect variants within known genes, e.g. SNPs and indels. In the case of gene fusions, however, completely novel gene structures have to be detected. Here, we describe the detection of such gene fusion events based on our comprehensive transcript annotation (ElDorado). To demonstrate the utility of our approach, we extract gene fusion candidates from eight breast cancer cell lines, which we compare to experimentally verified gene fusions. We discuss several gene fusion events, such as BCAS3-BCAS4, which was detected only in the breast cancer cell line MCF7. As supporting evidence, we show that gene fusions occur more frequently in copy number enriched regions (CNV analysis). In addition, we present the Transcriptome Viewer (TViewer), a tool for interactively visualizing gene fusions. Finally, we support detected gene fusions through literature-mining-based annotations and network analyses. In conclusion, we present a platform that allows detecting gene fusions and supporting them through literature knowledge as well as rich visualization capabilities. This enables scientists to better understand molecular processes, biological functions and disease associations, which will ultimately lead to better biomedical knowledge for the development of biomarkers for diagnostics and therapies.
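
One simple way to picture fusion-candidate extraction from mapped RNA-seq data is to count read pairs whose two mates fall in different annotated genes. The sketch below is only an illustration under that assumption: the function name, the input layout (one gene label per mate, `None` for unannotated mates) and the `min_support` cutoff are hypothetical and do not reproduce the ElDorado-based method described in the abstract. Sorting the two gene names discards which gene is the 5' partner.

```python
from collections import Counter

def fusion_candidates(read_pairs, min_support=2):
    """Count read pairs whose mates map to two different genes.

    read_pairs: iterable of (gene_of_mate1, gene_of_mate2); a mate that
    maps outside any annotated gene is given as None.
    Returns {(gene_a, gene_b): supporting_pairs} for pairs of genes
    supported by at least min_support read pairs.
    """
    support = Counter()
    for gene1, gene2 in read_pairs:
        # Skip unannotated mates and pairs that stay within one gene.
        if gene1 and gene2 and gene1 != gene2:
            support[tuple(sorted((gene1, gene2)))] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}
```

In practice a cutoff such as `min_support` would have to be tuned against experimentally verified fusions, as the abstract does for its own candidates.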


Subjects
Chromosome Mapping/methods , Oncogene Proteins, Fusion/genetics , Biomarkers, Tumor/genetics , Cell Line, Tumor , DNA Copy Number Variations , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Annotation/methods , Sequence Analysis, DNA
3.
Nucleic Acids Res ; 40(6): 2668-82, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22121224

ABSTRACT

TDP-43 is linked to neurodegenerative diseases including frontotemporal dementia and amyotrophic lateral sclerosis. Mostly localized in the nucleus, TDP-43 acts in conjunction with other ribonucleoproteins as a splicing co-factor. Several RNA targets of TDP-43 have been identified so far, but its role(s) in pathogenesis remains unclear. Using Affymetrix exon arrays, we have screened for the first time for splicing events upon TDP-43 knockdown. We found alternative splicing of the ribosomal S6 kinase 1 (S6K1) Aly/REF-like target (SKAR) upon TDP-43 knockdown in non-neuronal and neuronal cell lines. Alternative SKAR splicing depended on the first RNA recognition motif (RRM1) of TDP-43 and on 5'-GA-3' and 5'-UG-3' repeats within the SKAR pre-mRNA. SKAR is a component of the exon junction complex, which recruits S6K1, thereby facilitating the pioneer round of translation and promoting cell growth. Indeed, we found that expression of the alternatively spliced SKAR enhanced S6K1-dependent signaling pathways and the translational yield of a splice-dependent reporter. Consistent with this, TDP-43 knockdown also increased translational yield and significantly increased cell size. This indicates a novel mechanism of deregulated translational control upon TDP-43 deficiency, which might contribute to pathogenesis of the protein aggregation diseases frontotemporal dementia and amyotrophic lateral sclerosis.


Subjects
Alternative Splicing , DNA-Binding Proteins/physiology , Nuclear Proteins/genetics , Protein Biosynthesis , RNA-Binding Proteins/physiology , Cell Line , DNA-Binding Proteins/antagonists & inhibitors , DNA-Binding Proteins/metabolism , Exons , Humans , Nuclear Proteins/metabolism , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA-Binding Proteins/antagonists & inhibitors , RNA-Binding Proteins/genetics , RNA-Binding Proteins/metabolism , Repetitive Sequences, Nucleic Acid , Transfection
4.
PLoS One ; 5(11): e13876, 2010 Nov 30.
Article in English | MEDLINE | ID: mdl-21152420

ABSTRACT

Today, annotated amino acid sequences of more and more transcription factors (TFs) are readily available. Quantitative information about their DNA-binding specificities, however, is hard to obtain. Position frequency matrices (PFMs), the most widely used models to represent binding specificities, are experimentally characterized only for a small fraction of all TFs. Even for some of the most intensively studied eukaryotic organisms (i.e., human, rat and mouse), only roughly one-sixth of all proteins with an annotated DNA-binding domain have been characterized experimentally. Here, we present a new method based on support vector regression for predicting quantitative DNA-binding specificities of TFs in different eukaryotic species. This approach estimates a quantitative measure for the PFM similarity of two proteins, based on various features derived from their protein sequences. The method is trained and tested on a dataset containing 1,239 TFs with known DNA-binding specificity, and used to predict specific DNA target motifs for 645 TFs with high accuracy.
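
To make the notion of a PFM and of a "PFM similarity" concrete, the following sketch builds count matrices from aligned binding sites and compares two matrices by their mean column-wise Pearson correlation. This is an assumption made for illustration: the abstract does not state which similarity measure is used, the function names are hypothetical, and the support vector regression step that predicts the similarity from protein sequence features is not shown.

```python
from math import sqrt

BASES = "ACGT"

def build_pfm(sites):
    """Count base occurrences per column of equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    return [[sum(1 for site in sites if site[i] == base) for base in BASES]
            for i in range(length)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def pfm_similarity(pfm_a, pfm_b):
    """Mean column-wise Pearson correlation of two equal-width PFMs."""
    if len(pfm_a) != len(pfm_b):
        raise ValueError("PFMs must have the same number of columns")
    return sum(pearson(a, b) for a, b in zip(pfm_a, pfm_b)) / len(pfm_a)
```

Real motif comparison usually also has to handle matrices of different widths and shifted alignments, which this sketch deliberately leaves out.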


Subjects
Algorithms , DNA-Binding Proteins/metabolism , DNA/metabolism , Transcription Factors/metabolism , Amino Acid Motifs/genetics , Amino Acid Sequence , Animals , Binding Sites/genetics , Binding, Competitive , Computational Biology/methods , DNA-Binding Proteins/genetics , Humans , Mice , Molecular Sequence Data , Protein Binding , Rats , Reproducibility of Results , Transcription Factors/genetics
5.
Algorithms Mol Biol ; 5: 28, 2010 Jun 25.
Article in English | MEDLINE | ID: mdl-20579369

ABSTRACT

BACKGROUND: Mass spectrometry (MS) based protein profiling has become one of the key technologies in biomedical research and biomarker discovery. One bottleneck in MS-based protein analysis is sample preparation and an efficient fractionation step to reduce the complexity of the biological samples, which are too complex to be analyzed directly with MS. Sample preparation strategies that reduce the complexity of tryptic digests by using immunoaffinity-based methods have been shown to lead to a substantial increase in throughput and sensitivity in the proteomic mass spectrometry approach. The limitation of using such immunoaffinity-based approaches is the availability of the appropriate peptide-specific capture antibodies. Recent developments in these approaches, where subsets of peptides with short identical terminal sequences can be enriched using antibodies directed against short terminal epitopes, promise a significant gain in efficiency. RESULTS: We show that the minimal set of terminal epitopes for the coverage of a target protein list can be found by formulating the task as a set cover problem, preceded by a filtering pipeline for the exclusion of peptides and target epitopes with undesirable properties. CONCLUSIONS: For small datasets (a few hundred proteins) it is possible to solve the problem to optimality with moderate computational effort using commercial or free solvers. Larger datasets, like full proteomes, require the use of heuristics.
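
The set cover formulation can be sketched with the standard greedy heuristic, which repeatedly picks the epitope that covers the most still-uncovered target proteins. The abstract does not name the heuristic used for large datasets, so the greedy rule here is an assumption; the function name and the epitope and protein labels in the usage note are likewise hypothetical, and the filtering pipeline is not shown.

```python
def greedy_epitope_cover(epitope_to_proteins, targets):
    """Greedy set cover over terminal epitopes.

    epitope_to_proteins: {epitope: set of proteins having a peptide
    that ends in this epitope}.
    targets: proteins that must be covered.
    Returns the chosen epitopes in selection order.
    """
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Sorting the keys makes ties resolve deterministically.
        best = max(sorted(epitope_to_proteins),
                   key=lambda e: len(epitope_to_proteins[e] & uncovered),
                   default=None)
        gain = epitope_to_proteins[best] & uncovered if best is not None else set()
        if not gain:
            raise ValueError(f"targets cannot be covered: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gain
    return chosen
```

For example, with `{"AGLK": {"P1", "P2", "P3"}, "DVTR": {"P3", "P4"}, "LSEK": {"P4", "P5"}, "GFNR": {"P5"}}` and targets `P1` to `P5`, the heuristic selects `AGLK` first and then `LSEK`. Greedy selection is not guaranteed to be minimal, which is why exact solvers remain preferable for small inputs.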

6.
BMC Syst Biol ; 3: 67, 2009 Jun 30.
Article in English | MEDLINE | ID: mdl-19566957

ABSTRACT

BACKGROUND: Sensory proteins react to changing environmental conditions by transducing signals into the cell. These signals are integrated into core proteins that activate downstream target proteins such as transcription factors (TFs). This structure is referred to as a bow tie, and allows cells to respond appropriately to complex environmental conditions. Understanding this cellular processing of information, from sensory proteins (e.g., cell-surface proteins) to target proteins (e.g., TFs), is important, yet for many processes the signaling pathways remain unknown. RESULTS: Here, we present BowTieBuilder for inferring signal transduction pathways from multiple source and target proteins. Given protein-protein interaction (PPI) data, signaling pathways are assembled without knowledge of the intermediate signaling proteins while maximizing the overall probability of the pathway. To assess the inference quality, BowTieBuilder and three alternative heuristics are applied to several pathways, and the resulting pathways are compared to reference pathways taken from KEGG. In addition, BowTieBuilder is used to infer a signaling pathway of the innate immune response in humans and a signaling pathway that potentially regulates an underlying gene regulatory network. CONCLUSION: We show that BowTieBuilder, given multiple source and/or target proteins, infers pathways with satisfactory recall and precision rates and detects the core proteins of each pathway.
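
The idea of maximizing the overall probability of a pathway can be illustrated for a single source and a single target: if each interaction carries a probability in (0, 1], the most probable path is the shortest path under edge weights of -log(p). The sketch below applies Dijkstra's algorithm under that assumption. It is not the BowTieBuilder algorithm, which handles multiple sources and targets; the function name and the graph layout are hypothetical.

```python
import heapq
from math import exp, log

def most_probable_path(edges, source, target):
    """Return (path, probability) of the most probable source-target path.

    edges: {u: {v: probability}} with probabilities in (0, 1].
    Returns (None, 0.0) if the target cannot be reached.
    """
    dist = {source: 0.0}      # accumulated -log(probability)
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in edges.get(u, {}).items():
            if p <= 0.0:
                continue  # an impossible interaction is never used
            nd = d - log(p)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, exp(-dist[target])
```

Because every weight -log(p) is non-negative for p in (0, 1], Dijkstra's algorithm remains valid here, and multiplying probabilities along a path corresponds to adding the weights.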


Subjects
Computational Biology/methods , Models, Biological , Signal Transduction , Cell Cycle , Databases, Genetic , Gene Regulatory Networks , Humans , Immunity, Innate , MAP Kinase Signaling System , Models, Molecular , Protein Conformation , Protein Interaction Mapping , Proteins/chemistry , Proteins/metabolism , Reproducibility of Results , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/metabolism
7.
BMC Syst Biol ; 3: 5, 2009 Jan 14.
Article in English | MEDLINE | ID: mdl-19144170

ABSTRACT

BACKGROUND: To understand the dynamic behavior of cellular systems, mathematical modeling is often necessary and comprises three steps: (1) experimental measurement of participating molecules, (2) assignment of rate laws to each reaction, and (3) parameter calibration with respect to the measurements. In each of these steps the modeler is confronted with a plethora of alternative approaches, e.g., the selection of approximate rate laws in step two as specific equations are often unknown, or the choice of an estimation procedure with its specific settings in step three. This overall process with its numerous choices and the mutual influence between them makes it hard to single out the best modeling approach for a given problem. RESULTS: We investigate the modeling process using multiple kinetic equations together with various parameter optimization methods for a well-characterized example network, the biosynthesis of valine and leucine in C. glutamicum. For this purpose, we derive seven dynamic models based on generalized mass action, Michaelis-Menten and convenience kinetics as well as the stochastic Langevin equation. In addition, we introduce two modeling approaches for feedback inhibition to the mass action kinetics. The parameters of each model are estimated using eight optimization strategies. To determine the most promising modeling approaches together with the best optimization algorithms, we carry out a two-step benchmark: (1) coarse-grained comparison of the algorithms on all models and (2) fine-grained tuning of the best optimization algorithms and models. To analyze the space of the best parameters found for each model, we apply clustering, variance, and correlation analysis. CONCLUSION: A mixed model based on the convenience rate law and the Michaelis-Menten equation, in which all reactions are assumed to be reversible, is the most suitable deterministic modeling approach, followed by a reversible generalized mass action kinetics model. A Langevin model is advisable to take stochastic effects into account. To estimate the model parameters, three algorithms are particularly useful: For first attempts, the settings-free Tribes algorithm yields valuable results. Particle swarm optimization and differential evolution provide significantly better results with appropriate settings.
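
As a small illustration of steps (2) and (3), the sketch below writes two of the named rate laws in their textbook irreversible forms and a sum-of-squared-errors objective of the kind an optimizer would minimize during calibration. These are assumptions for illustration only: the study's models are reversible and include convenience kinetics, and the function names and parameter layout are hypothetical.

```python
def michaelis_menten(substrate, vmax, km):
    """Irreversible Michaelis-Menten rate: vmax * S / (km + S)."""
    return vmax * substrate / (km + substrate)

def generalized_mass_action(k, concentrations, orders):
    """Generalized mass action rate: k * prod(c_i ** g_i)."""
    rate = k
    for conc, order in zip(concentrations, orders):
        rate *= conc ** order
    return rate

def sum_squared_error(params, substrates, measured_rates):
    """Calibration objective for the Michaelis-Menten law.

    params: (vmax, km); an optimizer would search for the pair that
    minimizes this value against the measurements.
    """
    vmax, km = params
    return sum((michaelis_menten(s, vmax, km) - v) ** 2
               for s, v in zip(substrates, measured_rates))
```

Any of the optimization strategies mentioned in the abstract could in principle be pointed at an objective of this shape; the choice of algorithm and its settings is exactly what the benchmark compares.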


Subjects
Algorithms , Corynebacterium glutamicum/metabolism , Leucine/biosynthesis , Metabolic Networks and Pathways/physiology , Models, Biological , Valine/biosynthesis , Corynebacterium glutamicum/physiology , Kinetics
8.
BMC Syst Biol ; 2: 39, 2008 Apr 30.
Article in English | MEDLINE | ID: mdl-18447902

ABSTRACT

BACKGROUND: The development of complex biochemical models has been facilitated through the standardization of machine-readable representations like SBML (Systems Biology Markup Language). This effort is accompanied by the ongoing development of the human-readable diagrammatic representation SBGN (Systems Biology Graphical Notation). The graphical SBML editor CellDesigner allows direct translation of SBGN into SBML, and vice versa. For the assignment of kinetic rate laws, however, this process is not straightforward, as it often requires manual assembly and specific knowledge of kinetic equations. RESULTS: SBMLsqueezer facilitates exactly this modeling step via automated equation generation, overcoming the highly error-prone and cumbersome process of manually assigning kinetic equations. For each reaction the kinetic equation is derived from the stoichiometry, the participating species (e.g., proteins, mRNA or simple molecules) as well as the regulatory relations (activation, inhibition or other modulations) of the SBGN diagram. Such information allows distinctions between, for example, translation, phosphorylation or state transitions. The types of kinetics considered are numerous, for instance generalized mass-action, Hill, convenience and several Michaelis-Menten-based kinetics, each including activation and inhibition. These kinetics allow SBMLsqueezer to cover metabolic, gene regulatory, signal transduction and mixed networks. Whenever multiple kinetics are applicable to one reaction, parameter settings allow for user-defined specifications. After invoking SBMLsqueezer, the kinetic formulas are generated and assigned to the model, which can then be simulated in CellDesigner or with external ODE solvers. Furthermore, the equations can be exported to SBML, LaTeX or plain text format. CONCLUSION: SBMLsqueezer considers the annotation of all participating reactants, products and regulators when generating rate laws for reactions. Thus, for each reaction, only applicable kinetic formulas are considered. This modeling scheme creates kinetics in accordance with the diagrammatic representation. In contrast, most previously published tools have relied on the stoichiometry and generic modulators of a reaction, thus ignoring and potentially conflicting with the information expressed through the process diagram. Additional material and the source code can be found at the project homepage (URL found in the Availability and requirements section).
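
To give a feel for deriving a kinetic equation from stoichiometry, the sketch below assembles a plain-text mass-action formula for one reaction. It is a hypothetical stand-in, not SBMLsqueezer's API or output format: the function name, the `kf_`/`kr_` parameter naming and the dictionary input are assumptions, and regulators as well as the other kinetics types are ignored.

```python
def mass_action_formula(reaction_id, reactants, products=None, reversible=False):
    """Build a plain-text mass-action rate law from stoichiometry.

    reactants, products: {species_name: stoichiometric coefficient}.
    Parameter names (kf_<id>, kr_<id>) are illustrative only.
    """
    def term(prefix, species):
        factors = [f"{prefix}_{reaction_id}"]
        for name, stoich in species.items():
            # A coefficient of 1 needs no exponent.
            factors.append(name if stoich == 1 else f"{name}^{stoich}")
        return " * ".join(factors)

    formula = term("kf", reactants)
    if reversible:
        formula += " - " + term("kr", products or {})
    return formula
```

For the reaction 2 A + B -> C with identifier `R1`, this yields `kf_R1 * A^2 * B`, and the reversible variant appends `- kr_R1 * C`. A tool that also reads the regulatory relations of the diagram would extend each term with activation or inhibition factors.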


Subjects
Chemistry, Organic/methods , Database Management Systems , User-Computer Interface , Algorithms , Gene Regulatory Networks , Hypermedia , Information Storage and Retrieval/methods , Kinetics , Metabolic Networks and Pathways , Models, Biological , Models, Chemical , Protein Interaction Mapping/methods , Signal Transduction , Systems Biology/methods
9.
BMC Bioinformatics ; 8: 334, 2007 Sep 12.
Article in English | MEDLINE | ID: mdl-17850657

ABSTRACT

BACKGROUND: Cells dynamically adapt their gene expression patterns in response to various stimuli. This response is orchestrated into a number of gene expression modules consisting of co-regulated genes. A growing pool of publicly available microarray datasets allows the identification of modules by monitoring expression changes over time. These time-series datasets can be searched for gene expression modules by one of the many clustering methods published to date. For an integrative analysis, several time-series datasets can be joined into a three-dimensional gene-condition-time dataset, to which standard clustering or biclustering methods are, however, not applicable. We thus devise a probabilistic clustering algorithm for gene-condition-time datasets. RESULTS: In this work, we present the EDISA (Extended Dimension Iterative Signature Algorithm), a novel probabilistic clustering approach for 3D gene-condition-time datasets. Based on mathematical definitions of gene expression modules, the EDISA samples initial modules from the dataset, which are then refined by removing genes and conditions until they comply with the module definition. A subsequent extension step ensures gene and condition maximality. We applied the algorithm to a synthetic dataset and were able to successfully recover the implanted modules over a range of background noise intensities. Analysis of microarray datasets has led us to define three biologically relevant module types: 1) We found modules with independent response profiles to be the most prevalent ones. These modules comprise genes which are co-regulated under several conditions, yet with a different response pattern under each condition. 2) Coherent modules with similar responses under all conditions occurred frequently, too, and were often contained within modules of the first type. 3) A third module type, which covers a response specific to a single condition, was also detected, but rarely. All of these modules are essentially different types of biclusters. CONCLUSION: We successfully applied the EDISA to different 3D datasets. While previous studies were mostly aimed at detecting coherent modules only, our results show that coherent responses are often part of a more general module type with independent response profiles under different conditions. Our approach thus allows for a more comprehensive view of the gene expression response. After subsequent analysis of the resulting modules, the EDISA helped to shed light on the global organization of transcriptional control. An implementation of the algorithm is available at http://www-ra.informatik.uni-tuebingen.de/software/IAGEN/.
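
The refinement idea, removing genes until the remaining set complies with a module definition, can be sketched on a toy gene-condition-time dataset. The module definition assumed here (each gene's time profile must correlate with the module's mean profile within every condition, with a hypothetical `threshold`) is an illustration only; it is not the EDISA's mathematical definition, and the sampling, condition-removal and extension steps are omitted.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def refine_module(data, genes, conditions, threshold=0.8):
    """Iteratively drop genes that do not follow the module's mean profile.

    data[gene][condition] is a list of expression values over time.
    Each condition is checked separately, so the response pattern may
    differ between conditions.
    """
    genes = list(genes)
    while genes:
        n_times = {c: len(data[genes[0]][c]) for c in conditions}
        mean = {c: [sum(data[g][c][t] for g in genes) / len(genes)
                    for t in range(n_times[c])]
                for c in conditions}
        keep = [g for g in genes
                if all(pearson(data[g][c], mean[c]) >= threshold
                       for c in conditions)]
        if len(keep) == len(genes):
            break  # every remaining gene satisfies the definition
        genes = keep
    return genes
```

Checking each condition on its own mirrors the "independent response profiles" module type; requiring the same profile across all conditions would instead correspond to a coherent module.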


Subjects
Algorithms , Gene Expression Profiling/methods , Models, Biological , Multigene Family/physiology , Pattern Recognition, Automated/methods , Proteome/metabolism , Signal Transduction/physiology , Cluster Analysis , Computer Simulation , Software