1.
Sci Rep ; 12(1): 12324, 2022 07 19.
Article in English | MEDLINE | ID: mdl-35853974

ABSTRACT

Differential gene expression normalised to a single housekeeping (HK) gene is used to identify disease mechanisms and therapeutic targets. HK gene selection is often arbitrary, potentially introducing systematic error and discordant results. Here we examine these risks in a disease model of brain hypoxia. We first identified the eight most frequently used HK genes through a systematic review. However, we observed that both ex vivo and in vivo, their expression levels varied considerably between conditions. When applying these genes to normalise expression levels of the validated stroke target gene, inducible Nox4, we obtained opposing results. Software tools for unbiased HK gene selection exist, but they are limited to individual datasets and lack genome-wide search capability and user-friendly interfaces. We therefore developed the HouseKeepR algorithm to rapidly analyse multiple gene expression datasets in a disease-specific manner and rank HK gene candidates by stability in an unbiased manner. Using a panel of de novo top-ranked HK genes for brain hypoxia, but not single genes, Nox4 induction was consistently reproduced. Thus, differential gene expression analysis is best normalised against an HK gene panel selected in an unbiased manner. HouseKeepR is the first user-friendly, bias-free, and broadly applicable tool to automatically propose suitable HK genes in a tissue- and disease-dependent manner.
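
The contrast between single-gene and panel normalisation described above can be illustrated with a minimal sketch. The abstract does not say how HouseKeepR measures stability, so the coefficient of variation used here is only a stand-in, the geometric mean of the panel is an assumed (though common) way to combine reference genes, and all gene names and expression values are hypothetical toy data.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean; lower means more stable."""
    return statistics.stdev(values) / statistics.mean(values)


def rank_by_stability(expression):
    """Sort gene names from most to least stable across all samples."""
    return sorted(expression, key=lambda g: coefficient_of_variation(expression[g]))


def panel_reference(expression, panel):
    """Per-sample geometric mean of the panel genes' expression values."""
    n_samples = len(expression[panel[0]])
    return [
        math.exp(sum(math.log(expression[g][i]) for g in panel) / len(panel))
        for i in range(n_samples)
    ]


def fold_change(target, reference, n_control):
    """Mean normalised target in treated samples over mean in controls."""
    normalised = [t / r for t, r in zip(target, reference)]
    return statistics.mean(normalised[n_control:]) / statistics.mean(normalised[:n_control])


# Hypothetical toy data: two control samples followed by two hypoxia samples.
expression = {
    "HK_A": [100.0, 102.0, 98.0, 101.0],  # stable across conditions
    "HK_B": [50.0, 49.0, 51.0, 50.0],     # stable across conditions
    "HK_C": [80.0, 82.0, 160.0, 158.0],   # itself induced: a poor reference
}
target = [10.0, 11.0, 20.0, 21.0]          # target gene, roughly doubled

ranking = rank_by_stability(expression)
panel = ranking[:2]                         # top-ranked candidates form the panel

fold_panel = fold_change(target, panel_reference(expression, panel), n_control=2)
fold_single = fold_change(target, expression["HK_C"], n_control=2)

print("stability ranking:", ranking)
print("fold change vs. panel:", round(fold_panel, 2))
print("fold change vs. unstable single gene:", round(fold_single, 2))
```

In this toy case the unstable reference gene rises together with the target and masks the induction, whereas the panel of stable genes preserves it.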


Subjects
Genes, Essential , Hypoxia, Brain , Algorithms , Gene Expression , Gene Expression Profiling , Humans
2.
Syst Med (New Rochelle) ; 2(1): 1-9, 2019.
Article in English | MEDLINE | ID: mdl-31119214

ABSTRACT

Introduction: Drug-resistant infections are becoming increasingly frequent worldwide, causing hundreds of thousands of deaths annually. This is partly due to the very limited set of protein drug targets known for human-infecting viral genomes. The eleven influenza virus proteins, for instance, exploit host cell factors for replication and suppression of the antiviral immune responses. A systems medicine approach to identify relevant and druggable host factors would dramatically expand therapeutic options. Therapeutic target identification, however, has hitherto relied on static molecular networks, whereas in reality the interactome, in particular during an infection, is subject to constant change. Methods: We developed time-course network enrichment (TiCoNE), an expert-centered approach for discovering temporal response pathways. In the first stage of TiCoNE, time-series expression data are clustered in a human-augmented manner to identify groups of biological entities with coherent temporal responses. Throughout this process, the expert can add, remove, merge, or split temporal patterns. The resulting groups can then be mapped to an interaction network to identify enriched pathways and to analyze cross-talk enrichments and depletions between groups. Finally, temporal response groups of two experiments can be intersected to identify condition-variant response patterns that represent promising drug-target candidates. Results: We applied TiCoNE to human gene expression data for influenza A virus infection and rhinovirus infection, respectively. We then identified coherent temporal response patterns and employed our cross-talk analysis to establish two potential timelines of systems-level host responses for either infection. Next, we compared the two phenotypes and unraveled condition-variant temporal groups interacting at the network level. We then validated the highest-ranking ones via literature search and wet-lab experiments. This not only confirmed many of our candidates as previously known, but also identified phospholipid scramblase 1 (encoded by PLSCR1) as a previously unrecognized host factor that is essential for influenza A virus infection. Conclusion: With TiCoNE we developed a novel approach for conjointly analyzing molecular networks with time-series expression data and demonstrated its power by identifying temporal drug targets. We provide proof of concept not only that novel targets can be identified using our approach, but also that anti-infective drug target discovery can be enhanced by investigating temporal molecular networks of the host in response to viral infection.
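
The two ideas of grouping entities by temporal pattern and then intersecting two experiments can be sketched as follows. This is a simplified illustration and not TiCoNE's procedure: the abstract describes an iterative, human-augmented clustering, whereas the sketch assigns each time course once to a fixed set of prototype patterns by Pearson correlation. The prototypes, gene names, and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def assign_to_patterns(series, prototypes):
    """Map each gene to the name of its best-correlated prototype pattern."""
    return {
        gene: max(prototypes, key=lambda name: pearson(values, prototypes[name]))
        for gene, values in series.items()
    }


def condition_variant(assign_a, assign_b):
    """Genes whose assigned temporal pattern differs between two experiments."""
    shared = assign_a.keys() & assign_b.keys()
    return sorted(g for g in shared if assign_a[g] != assign_b[g])


# Hypothetical prototype patterns; an expert could add, remove, merge, or
# split entries of this dictionary between rounds of assignment.
prototypes = {
    "up": [0.0, 1.0, 2.0, 3.0],
    "down": [3.0, 2.0, 1.0, 0.0],
    "transient": [0.0, 2.0, 2.0, 0.0],
}

# Hypothetical expression time courses (four time points) in two conditions.
experiment_a = {
    "G1": [1.0, 2.1, 2.9, 4.2],
    "G2": [5.0, 4.1, 2.8, 2.0],
    "G3": [1.0, 3.0, 3.1, 1.2],
}
experiment_b = {
    "G1": [1.1, 2.0, 3.2, 3.9],
    "G2": [2.0, 2.9, 4.1, 5.2],
    "G3": [0.9, 2.8, 3.0, 1.0],
}

patterns_a = assign_to_patterns(experiment_a, prototypes)
patterns_b = assign_to_patterns(experiment_b, prototypes)
variant_genes = condition_variant(patterns_a, patterns_b)

print(patterns_a)
print(patterns_b)
print("condition-variant:", variant_genes)
```

Genes that land in different temporal groups in the two conditions are the kind of condition-variant candidates the abstract refers to; mapping them onto an interaction network is outside the scope of this sketch.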

3.
Nat Genet ; 51(4): 716-727, 2019 04.
Article in English | MEDLINE | ID: mdl-30833796

ABSTRACT

Mesenchymal (stromal) stem cells (MSCs) constitute populations of mesodermal multipotent cells involved in tissue regeneration and homeostasis in many different organs. Here we performed comprehensive characterization of the transcriptional and epigenomic changes associated with osteoblast and adipocyte differentiation of human MSCs. We demonstrate that adipogenesis is driven by considerable remodeling of the chromatin landscape and de novo activation of enhancers, whereas osteogenesis involves activation of preestablished enhancers. Using machine learning algorithms for in silico modeling of transcriptional regulation, we identify a large and diverse transcriptional network of pro-osteogenic and antiadipogenic transcription factors. Intriguingly, binding motifs for these factors overlap with SNPs related to bone and fat formation in humans, and knockdown of single members of this network is sufficient to modulate differentiation in both directions, thus indicating that lineage determination is a delicate balance between the activities of many different transcription factors.


Subjects
Adipogenesis/genetics , Osteogenesis/genetics , Stem Cell Factor/genetics , Transcription Factors/genetics , A549 Cells , Adipocytes/physiology , Cell Differentiation/genetics , Cell Line, Tumor , Cells, Cultured , HEK293 Cells , Humans , Mesenchymal Stem Cells/physiology , Osteoblasts/physiology , Polymorphism, Single Nucleotide/genetics
4.
Nat Genet ; 51(4): 766, 2019 04.
Article in English | MEDLINE | ID: mdl-30911162

ABSTRACT

In the version of this article initially published, in the graph keys in Fig. 1i, the colors indicating 'Ob' and 'Ad' were red and blue, respectively, but should have been blue and red, respectively; the shapes indicating 'MUS' and 'BM' were a triangle and a square, respectively, but should have been a square and a triangle, respectively. The errors have been corrected in the HTML and PDF versions of the article.

5.
Nat Protoc ; 13(6): 1429-1444, 2018 06.
Article in English | MEDLINE | ID: mdl-29844526

ABSTRACT

Clustering is a popular technique for discovering groups of similar objects in large datasets. It is nowadays applied in all areas of the life sciences, from biomedicine to physics. However, designing high-quality cluster analyses is a tedious and complicated task with manifold choices along the way. As a cluster analysis is often the first step of a succeeding downstream analysis, the clustering must be reliable, reproducible, and of the highest quality. To address these challenges, we recently developed ClustEval, an integrated and extensible platform for the automated and standardized design and execution of complex cluster analyses. It allows researchers to design and carry out cluster analyses involving a large number of clustering methods applied to many large datasets. ClustEval helps to shed light on all major aspects of cluster analysis, from choosing the right similarity function to using validity indices and data preprocessing protocols. Only this high degree of automation allows the researcher to easily run a clustering task with many different tools, parameters, and settings in order to gain the best possible outcome. In this paper, we guide the user step by step through three fundamentally important and widely applicable use cases: (i) identification of the best clustering method for a new, user-given protein sequence similarity dataset; (ii) evaluation of the performance of a new, user-given clustering method (densityCut) against the state of the art; and (iii) prediction of the best method for a new protein sequence similarity dataset. This protocol guides the user through the most important features of ClustEval and takes ∼4 h to complete.
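
Use case (i), finding the best method and parameter setting for a given dataset, amounts to running every method over a parameter grid and scoring each run. The sketch below shows that loop in plain Python; it does not use ClustEval or its interfaces. The two toy clustering methods, the pairwise F1 score against a gold standard, and the data are all assumptions chosen for illustration.

```python
from itertools import combinations


def gap_clustering(points, gap):
    """Toy method 1: sort the points and start a new cluster at every jump > gap."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > gap:
            current += 1
        labels[nxt] = current
    return labels


def equal_width_bins(points, bins):
    """Toy method 2: cut the value range into `bins` equal-width intervals."""
    lo, hi = min(points), max(points)
    width = (hi - lo) / bins or 1.0
    return [min(int((p - lo) / width), bins - 1) for p in points]


def pairwise_f1(predicted, gold):
    """F1 over object pairs: a pair is positive if both objects share a cluster."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(gold)), 2):
        same_pred = predicted[i] == predicted[j]
        same_gold = gold[i] == gold[j]
        tp += same_pred and same_gold
        fp += same_pred and not same_gold
        fn += same_gold and not same_pred
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0


def best_run(points, gold, methods):
    """Score every (method, parameter) combination and return the best one."""
    runs = [
        (pairwise_f1(method(points, value), gold), name, value)
        for name, (method, grid) in methods.items()
        for value in grid
    ]
    return max(runs)


# Hypothetical one-dimensional dataset with a known gold-standard clustering.
points = [1.0, 1.2, 1.1, 5.0, 5.3, 9.0, 9.1]
gold = [0, 0, 0, 1, 1, 2, 2]
methods = {
    "gap": (gap_clustering, [0.05, 1.0, 5.0]),
    "bins": (equal_width_bins, [2, 4]),
}

score, name, value = best_run(points, gold, methods)
print(f"best: {name} with parameter {value} (pairwise F1 = {score:.2f})")
```

A real analysis would swap in actual clustering tools and several validity indices; the point here is only the exhaustive, automated sweep that makes results comparable.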


Subjects
Biomedical Research/methods , Biostatistics/methods , Cluster Analysis , Software
6.
Pac Symp Biocomput ; 22: 39-50, 2017.
Article in English | MEDLINE | ID: mdl-27896960

ABSTRACT

Over the last decades, we have observed tremendous ongoing growth of available sequencing data, fueled by advancements in wet-lab technology. The sequencing information is only the beginning of the actual understanding of how organisms survive and prosper. It is, for instance, equally important to unravel the proteomic repertoire of an organism. A classical computational approach for detecting protein families is a sequence-based similarity calculation coupled with a subsequent cluster analysis. In this work we have intensively analyzed various clustering tools on a large scale. We used the data to investigate the behavior of the tools' parameters, underlining the diversity of the protein families. Furthermore, we trained regression models for predicting the expected performance of a clustering tool for an unknown data set and aimed to also suggest optimal parameters in an automated fashion. Our analysis demonstrates the benefits and limitations of clustering proteins with low sequence similarity, indicating that each protein family requires its own distinct set of tools and parameters. All results, a tool prediction service, and additional supporting material are available online at http://proteinclustering.compbio.sdu.dk.
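
The classical pipeline mentioned above, a pairwise similarity calculation followed by cluster analysis, can be sketched with a deliberately simple clustering rule: link any two proteins whose similarity reaches a threshold and report the connected components. This rule is an illustrative assumption and not one of the tools evaluated in the study; the protein names and similarity scores are hypothetical.

```python
def cluster_by_threshold(items, similarity, threshold):
    """Connected components of the graph linking pairs with score >= threshold."""
    parent = {x: x for x in items}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical proteins and pairwise similarity scores in [0, 1];
# pairs that are not listed are treated as dissimilar.
proteins = ["P1", "P2", "P3", "P4", "P5"]
similarity = {
    ("P1", "P2"): 0.90,
    ("P2", "P3"): 0.45,
    ("P3", "P4"): 0.10,
    ("P4", "P5"): 0.80,
}

strict = cluster_by_threshold(proteins, similarity, threshold=0.5)
lenient = cluster_by_threshold(proteins, similarity, threshold=0.4)

print("threshold 0.5:", strict)
print("threshold 0.4:", lenient)
```

Lowering the threshold pulls the weakly similar protein into an existing family, which is a small-scale picture of why parameter choice matters most when sequence similarity is low.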


Subjects
Proteins/classification , Proteins/genetics , Cluster Analysis , Computational Biology , Evolution, Molecular , Proteomics , Regression Analysis , Sequence Alignment , Sequence Homology, Amino Acid
7.
Nat Methods ; 12(11): 1033-8, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26389570

ABSTRACT

Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
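
A cluster validity index turns a clustering into a single quality score that can be compared across tools and parameter sets. As one example, the sketch below computes the mean silhouette width in plain Python. The abstract does not list its 13 indices, so whether the silhouette is among them is not stated here; the one-dimensional data and both label assignments are hypothetical.

```python
def silhouette(points, labels):
    """Mean silhouette width for 1-D points using absolute distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(abs(point - q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(abs(point - q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Hypothetical data with one sensible and one arbitrary clustering.
points = [1.0, 1.2, 1.1, 5.0, 5.3, 9.0, 9.1]
good_labels = [0, 0, 0, 1, 1, 2, 2]
poor_labels = [0, 1, 2, 0, 1, 2, 0]

good_score = silhouette(points, good_labels)
poor_score = silhouette(points, poor_labels)

print("sensible clustering:", round(good_score, 3))
print("arbitrary clustering:", round(poor_score, 3))
```

Values close to 1 indicate compact, well-separated clusters, and values near or below 0 indicate clusters that overlap or are wrongly assigned.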


Subjects
Cluster Analysis , Computational Biology/methods , Gene Expression Profiling/methods , Pattern Recognition, Automated/methods , Algorithms , Animals , Automation , Gene Expression Regulation , Humans , Oligonucleotide Array Sequence Analysis/methods , Protein Structure, Tertiary , Quality Control , Reproducibility of Results , Software
8.
Sci Rep ; 4: 6837, 2014 Oct 30.
Article in English | MEDLINE | ID: mdl-25355642

ABSTRACT

With the availability of newer and cheaper sequencing methods, genomic data are being generated at an increasingly fast pace. In spite of the high degree of complexity of currently available search routines, the massive number of sequences available virtually prohibits quick and correct identification of large groups of sequences sharing common traits. Hence, there is a need for clustering tools for automatic knowledge extraction enabling the curation of large-scale databases. Current sophisticated approaches to sequence clustering are based on pairwise similarity matrices. This is impractical for databases of hundreds of thousands of sequences, as such a similarity matrix alone would exceed the available memory. In this paper, a new approach called MultiLevel Clustering (MLC) is proposed which avoids a majority of sequence comparisons and therefore significantly reduces the total runtime for clustering. An implementation of the algorithm allowed clustering of all 344,239 ITS (Internal Transcribed Spacer) fungal sequences from GenBank utilizing only a normal desktop computer within 22 CPU-hours, whereas the greedy clustering method took up to 242 CPU-hours.
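
The memory argument can be checked with a short calculation, and the idea of skipping comparisons can be shown with a generic greedy scheme that compares each sequence only with cluster representatives. The abstract does not describe MLC's algorithm, so the greedy function below is not MLC and may also differ from the greedy method benchmarked in the paper. The 4 bytes per matrix entry, the position-wise identity score, and the toy sequences are assumptions.

```python
N_SEQUENCES = 344_239   # number of ITS sequences, from the abstract
BYTES_PER_ENTRY = 4     # assumption: one 32-bit float per similarity value

all_pairs = N_SEQUENCES * (N_SEQUENCES - 1) // 2
dense_matrix_gb = N_SEQUENCES ** 2 * BYTES_PER_ENTRY / 1e9
speed_up = 242 / 22     # CPU-hours reported for greedy vs. MLC

print(f"{all_pairs:,} pairwise comparisons for a full similarity matrix")
print(f"about {dense_matrix_gb:.0f} GB for a dense matrix at 4 bytes per entry")
print(f"reported runtime ratio: {speed_up:.0f}x")


def identity(a, b):
    """Toy similarity: fraction of positions at which two sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def greedy_cluster(sequences, threshold):
    """Assign each sequence to the first representative it matches."""
    representatives, clusters, comparisons = [], [], 0
    for seq in sequences:
        for index, rep in enumerate(representatives):
            comparisons += 1
            if identity(seq, rep) >= threshold:
                clusters[index].append(seq)
                break
        else:
            # No representative was similar enough: open a new cluster.
            representatives.append(seq)
            clusters.append([seq])
    return clusters, comparisons


# Hypothetical toy sequences of equal length.
sequences = ["ACGTAC", "ACGTAA", "TTGCAA", "TTGCAT", "ACGTAC"]
clusters, comparisons = greedy_cluster(sequences, threshold=0.8)
exhaustive = len(sequences) * (len(sequences) - 1) // 2

print(clusters)
print(f"{comparisons} comparisons instead of {exhaustive}")
```

At this scale a dense matrix runs to hundreds of gigabytes, which is why methods that never materialise all pairwise similarities are needed for databases of this size.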


Subjects
Biodiversity , Cluster Analysis , Fungi/classification , Algorithms , Computational Biology/methods , Data Curation , Datasets as Topic , Fungal Proteins/chemistry , Fungal Proteins/genetics , Fungi/genetics , Molecular Sequence Annotation , Reproducibility of Results