Results 1 - 4 of 4
1.
Sci Rep ; 12(1): 21485, 2022 Dec 12.
Article in English | MEDLINE | ID: mdl-36509882

ABSTRACT

Sparse and robust classification models have the potential for revealing common predictive patterns that not only allow for categorizing objects into classes but also for generating mechanistic hypotheses. Identifying a small and informative subset of features is their main ingredient. However, the exponential search space of feature subsets and the heuristic nature of selection algorithms limit the coverage of these analyses, even for low-dimensional datasets. We present methods for reducing the computational complexity of feature selection criteria allowing for higher efficiency and coverage of screenings. We achieve this by reducing the preparation costs of high-dimensional subsets [Formula: see text] to those of one-dimensional ones [Formula: see text]. Our methods are based on a tight interaction between a parallelizable cross-validation traversal strategy and distance-based classification algorithms and can be used with any product distance or kernel. We evaluate the traversal strategy exemplarily in exhaustive feature subset selection experiments (perfect coverage). Its runtime, fitness landscape, and predictive performance are analyzed on publicly available datasets. Even in low-dimensional settings, we achieve approximately a 15-fold increase in exhaustively generating distance matrices for feature combinations bringing a new level of evaluations into reach.
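The idea of building subset distance matrices from one-dimensional ones can be illustrated with a small sketch. It is not the authors' implementation: it assumes squared Euclidean distance (which is additive over features), a leave-one-out 1-nearest-neighbour score as the selection criterion, and a depth-first traversal; all function names are hypothetical. Each child subset reuses its parent's matrix and adds a single precomputed per-feature matrix.

```python
def per_feature_sq_dists(X):
    """One n x n matrix of squared differences per feature (computed once)."""
    n, d = len(X), len(X[0])
    return [[[(X[i][f] - X[j][f]) ** 2 for j in range(n)] for i in range(n)]
            for f in range(d)]


def add_feature(D, M):
    """Extend a subset's distance matrix by one feature: elementwise sum."""
    return [[a + b for a, b in zip(row_d, row_m)] for row_d, row_m in zip(D, M)]


def loo_1nn_accuracy(D, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour rule on matrix D."""
    n = len(y)
    hits = 0
    for i in range(n):
        nearest = min((j for j in range(n) if j != i), key=lambda j: D[i][j])
        hits += y[nearest] == y[i]
    return hits / n


def exhaustive_search(X, y, max_size):
    """Score every feature subset up to max_size; return (accuracy, subset)."""
    mats = per_feature_sq_dists(X)
    n, d = len(X), len(mats)
    best = (-1.0, ())

    def visit(subset, D, start):
        nonlocal best
        if subset:
            acc = loo_1nn_accuracy(D, y)
            if acc > best[0]:  # strict: the first subset found wins ties
                best = (acc, subset)
        if len(subset) == max_size:
            return
        for f in range(start, d):
            # Assumption: the distance is additive, so the child matrix is
            # the parent matrix plus one one-dimensional matrix.
            visit(subset + (f,), add_feature(D, mats[f]), f + 1)

    visit((), [[0.0] * n for _ in range(n)], 0)
    return best
```

Calling `exhaustive_search(X, y, 2)` on a list of samples `X` and labels `y` visits all subsets of one or two features. Independent branches of the traversal share no state apart from `best`, which is one reason such a scheme lends itself to parallelization.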


Subjects
Algorithms, Research Design
2.
Bioinformatics ; 38(21): 4893-4900, 2022 Oct 31.
Article in English | MEDLINE | ID: mdl-36094334

ABSTRACT

MOTIVATION: Biological processes are complex systems with distinct behaviour. Despite the growing amount of available data, knowledge is sparse and often insufficient to investigate the complex regulatory behaviour of these systems. Moreover, different cellular phenotypes are possible under varying conditions. Mathematical models attempt to unravel these mechanisms by investigating the dynamics of regulatory networks. Therefore, a major challenge is to combine regulations and phenotypical information as well as the underlying mechanisms. To predict regulatory links in these models, we established an approach called CANTATA to support the integration of information into regulatory networks and retrieve potential underlying regulations. This is achieved by optimizing both static and dynamic properties of these networks. RESULTS: Initial results show that the algorithm predicts missing interactions by recapitulating the known phenotypes while preserving the original topology and optimizing the robustness of the model. The resulting models allow for hypothesizing about the biological impact of certain regulatory dependencies. AVAILABILITY AND IMPLEMENTATION: Source code of the application, example files and results are available at https://github.com/sysbio-bioinf/Cantata. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Gene Regulatory Networks, Software, Algorithms, Theoretical Models
3.
Article in English | MEDLINE | ID: mdl-21464514

ABSTRACT

Network inference algorithms can assist life scientists in unraveling gene-regulatory systems on a molecular level. In recent years, great attention has been drawn to the reconstruction of Boolean networks from time series. These need to be binarized, as such networks model genes as binary variables (either "expressed" or "not expressed"). Common binarization methods often cluster measurements or separate them according to statistical or information-theoretic characteristics and may require many data points to determine a robust threshold. Yet, time series measurements frequently comprise only a small number of samples. To overcome this limitation, we propose a binarization that incorporates measurements at multiple resolutions. We introduce two such binarization approaches which determine thresholds based on limited numbers of samples and additionally provide a measure of threshold validity. Thus, network reconstruction and further analysis can be restricted to genes with meaningful thresholds. This reduces the complexity of network inference. The performance of our binarization algorithms was evaluated in network reconstruction experiments using artificial data as well as real-world yeast expression time series. The new approaches yield considerably improved correct network identification rates compared to other binarization techniques by effectively reducing the number of candidate networks.
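To make the binarization step concrete, here is a deliberately simple stand-in. It is not one of the multi-resolution methods proposed in the abstract: it assumes a single threshold placed in the largest gap between sorted measurements, and it reports the relative size of that gap as a crude, hypothetical indicator of how well separated the two levels are. The function name and the score are illustrative assumptions.

```python
def binarize_by_largest_gap(series):
    """Binarize one gene's measurements with a largest-gap threshold.

    Returns (bits, threshold, score). The score is the largest gap divided
    by the value range, an assumed stand-in for a threshold validity measure.
    """
    values = sorted(series)
    if values[0] == values[-1]:
        raise ValueError("constant series has no meaningful threshold")
    # Gap between each pair of neighbouring sorted values, with its position.
    gaps = [(values[i + 1] - values[i], i) for i in range(len(values) - 1)]
    gap, i = max(gaps)
    threshold = (values[i] + values[i + 1]) / 2
    score = gap / (values[-1] - values[0])
    # 1 = "expressed", 0 = "not expressed", in the original time order.
    bits = [1 if v > threshold else 0 for v in series]
    return bits, threshold, score
```

A gene whose score is low under such a rule has no clear two-level structure and could be excluded before network reconstruction, which mirrors the filtering idea described in the abstract.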


Subjects
Algorithms, Computational Biology/methods, Gene Expression Profiling/methods, Gene Regulatory Networks/genetics, Genetic Models, Genetic Databases, Saccharomyces cerevisiae
4.
Bioinformatics ; 27(11): 1529-36, 2011 Jun 01.
Article in English | MEDLINE | ID: mdl-21471013

ABSTRACT

MOTIVATION: Accurate, context-specific regulation of gene expression is essential for all organisms. Accordingly, it is very important to understand the complex relations within cellular gene regulatory networks. Boolean models are a tool for describing and analyzing the behavior of such networks. The reconstruction of a Boolean network from biological data requires identification of dependencies within the network. This task becomes increasingly computationally demanding with large amounts of data created by recent high-throughput technologies. Thus, we developed a method that is especially suited for network structure reconstruction from large-scale data. In our approach, we took advantage of the fact that a specific transcription factor often will consistently either activate or inhibit a specific target gene, and this kind of regulatory behavior can be modeled using monotone functions. RESULTS: To detect regulatory dependencies in a network, we examined how the expression of different genes correlates with successive network states. For this purpose, we used Pearson correlation as an elementary correlation measure. Given a Boolean network containing only monotone Boolean functions, we prove that the correlation of successive states can identify the dependencies in the network. This method not only finds a very high percentage of the dependencies in randomly created artificial networks, but also reconstructs large fractions of both a published Escherichia coli regulatory network from simulated data and a yeast cell cycle network from real microarray data.
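The correlation idea can be tried on a toy example. The sketch below is an illustration under stated assumptions, not the authors' code: the three-gene network in `RULES` is hypothetical, every rule is monotone, and all eight states are used as noise-free transitions instead of a measured time series. For each gene pair it correlates the source gene at time t with the target gene at time t+1.

```python
from itertools import product
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


# Hypothetical monotone update rules: gene 0 needs genes 1 AND 2,
# gene 1 is inhibited by gene 0, gene 2 is activated by gene 0 OR gene 1.
RULES = [
    lambda s: s[1] and s[2],
    lambda s: 1 - s[0],
    lambda s: s[0] or s[1],
]


def step(state):
    """Synchronous update of all genes."""
    return tuple(int(rule(state)) for rule in RULES)


def infer_dependencies(transitions, n_genes, eps=1e-9):
    """Map (source, target) to 'activates' or 'inhibits' by correlation sign."""
    deps = {}
    for target in range(n_genes):
        nxt = [after[target] for _, after in transitions]
        for source in range(n_genes):
            cur = [before[source] for before, _ in transitions]
            r = pearson(cur, nxt)
            if abs(r) > eps:
                deps[(source, target)] = "activates" if r > 0 else "inhibits"
    return deps


# Assumption: every state is observed exactly once as a starting state.
transitions = [(s, step(s)) for s in product((0, 1), repeat=3)]
dependencies = infer_dependencies(transitions, 3)
```

With exhaustive, noise-free transitions a fixed tolerance `eps` is enough to separate zero from nonzero correlations; on real expression data a statistical cutoff would be needed instead.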


Subjects
Gene Regulatory Networks, Genetic Models, Escherichia coli/genetics, Gene Expression Profiling, Gene Expression Regulation, Saccharomycetales/genetics, Transcription Factors/metabolism