Results 1 - 20 of 93
1.
Nonlinear Dyn ; 105(4): 3819-3833, 2021.
Article in English | MEDLINE | ID: mdl-34429568

ABSTRACT

We propose a new epidemic model that considers a partial mapping relationship in a two-layered time-varying network, with the aim of studying the influence of information diffusion on epidemic spreading. In the model, one layer represents the diffusion of epidemic-related information in social networks, while the other layer represents epidemic spreading in physical networks. Mapping relationships exist only between some pairs of nodes in the two layers, which characterizes the interaction between information diffusion and epidemic spreading, while information and the epidemic each spread only within their own layer. Starting from the microscopic Markov chain (MMC) method, we establish the dynamic equation of epidemic spreading and analytically derive its epidemic threshold, which shows that the ratio of correspondence between the two layers has a significant effect on the epidemic threshold of the proposed model. Finally, the MMC method is found to agree well with Monte Carlo (MC) simulations, and the results can help to understand the properties of epidemic spreading in depth.

2.
BMC Genomics ; 21(1): 650, 2020 Sep 22.
Article in English | MEDLINE | ID: mdl-32962626

ABSTRACT

BACKGROUND: The small number of samples and the curse of dimensionality hamper the application of deep learning techniques to disease classification. Additionally, the performance of clustering-based feature selection algorithms is still far from satisfactory because they are limited to unsupervised learning methods. To enhance interpretability and overcome this problem, we developed a novel feature selection algorithm. At the same time, complex genomic data pose great challenges for the identification of biomarkers and therapeutic targets. Some current feature selection methods suffer from low sensitivity and specificity in this field. RESULTS: In this article, we designed a multi-scale clustering-based feature selection algorithm named MCBFS, which simultaneously performs feature selection and model learning for genomic data analysis. The experimental results demonstrated that MCBFS is robust and effective by comparing it with seven benchmark and six state-of-the-art supervised methods on eight data sets. The visualization results and the statistical test showed that MCBFS can capture the informative genes and improve the interpretability and visualization of tumor gene expression and single-cell sequencing data. Additionally, we developed a general framework named McbfsNW that uses gene expression data and protein interaction data to identify robust biomarkers and therapeutic targets for the diagnosis and therapy of diseases. The framework incorporates the MCBFS algorithm, a network recognition ensemble algorithm and a feature selection wrapper. McbfsNW has been applied to lung adenocarcinoma (LUAD) data sets. The preliminary results demonstrated that higher prediction performance can be attained by the identified biomarkers on the independent LUAD data set, and we also constructed a drug-target network which may be useful for LUAD therapy.
CONCLUSIONS: The proposed novel feature selection method is robust and effective for gene selection, classification, and visualization. The framework McbfsNW is practical and helpful for the identification of biomarkers and targets on genomic data. It is believed that the same methods and principles are extensible and applicable to other different kinds of data sets.


Subject(s)
Adenocarcinoma of Lung/genetics , Biomarkers, Tumor/genetics , Genomics/methods , Lung Neoplasms/genetics , Supervised Machine Learning , Adenocarcinoma of Lung/classification , Adenocarcinoma of Lung/pathology , Biomarkers, Tumor/metabolism , Cluster Analysis , Humans , Lung Neoplasms/classification , Lung Neoplasms/pathology , Software
3.
Brief Bioinform ; 19(3): 506-523, 2018 05 01.
Article in English | MEDLINE | ID: mdl-28069634

ABSTRACT

Large-scale perturbation databases, such as Connectivity Map (CMap) or Library of Integrated Network-based Cellular Signatures (LINCS), provide enormous opportunities for computational pharmacogenomics and drug design. A reason for this is that, in contrast to classical pharmacology focusing on one target at a time, the transcriptomics profiles provided by CMap and LINCS open the door for systems biology approaches on the pathway and network level. In this article, we provide a review of recent developments in computational pharmacogenomics with respect to CMap and LINCS and related applications.


Subject(s)
Computational Biology/methods , Gene Expression Profiling/methods , Pharmacogenetics , Small Molecule Libraries/pharmacology , Transcriptome , Databases, Factual , Gene Regulatory Networks , Humans
4.
BMC Cancer ; 19(1): 1176, 2019 Dec 03.
Article in English | MEDLINE | ID: mdl-31796020

ABSTRACT

BACKGROUND: Deciphering the meaning of the human DNA is an outstanding goal which would revolutionize medicine and our way of treating diseases. In recent years, non-coding RNAs have attracted much attention and have been shown to be functional in part. Yet the importance of these RNAs, especially for higher biological functions, remains under investigation. METHODS: In this paper, we analyze RNA-seq data, including non-coding and protein-coding RNAs, from patients with lung adenocarcinoma, a histologic subtype of non-small-cell lung cancer, with deep learning neural networks and other state-of-the-art classification methods. The purpose of our paper is three-fold. First, we compare the classification performance of different versions of deep belief networks with SVMs, decision trees and random forests. Second, we compare the classification capabilities of protein-coding and non-coding RNAs. Third, we study the influence of feature selection on the classification performance. RESULTS: First, we find that deep belief networks perform at least competitively with other state-of-the-art classifiers. Second, data from non-coding RNAs perform better than coding RNAs across a number of different classification methods. This demonstrates the equivalence of predictive information as captured by non-coding RNAs compared to protein-coding RNAs, conventionally used in computational diagnostics tasks. Third, we find that feature selection has in general a negative effect on the classification performance, which means that unfiltered data with all features give the best classification results. CONCLUSIONS: Our study is the first to use ncRNAs beyond miRNAs for the computational classification of cancer and to perform a direct comparison of the classification capabilities of protein-coding RNAs and non-coding RNAs.


Subject(s)
Lung Neoplasms/classification , Lung Neoplasms/genetics , RNA, Messenger/metabolism , RNA, Untranslated/genetics , Computational Biology/methods , Decision Trees , Humans , Lung Neoplasms/pathology , Machine Learning , MicroRNAs/genetics , Neural Networks, Computer , RNA, Messenger/genetics , Sequence Analysis, RNA/methods
5.
Curr Genomics ; 20(1): 38-48, 2019 Jan.
Article in English | MEDLINE | ID: mdl-31015790

ABSTRACT

BACKGROUND: Cancer is a complex disease with a lucid etiology and in understanding the causation, we need to appreciate this complexity. OBJECTIVE: Here we aim to gain insights into the genetic associations of prostate cancer through a network-based systems approach using the BC3Net algorithm. METHODS: Specifically, we infer a prostate cancer Gene Regulatory Network (GRN) from a large-scale gene expression data set of 333 patient RNA-seq profiles obtained from The Cancer Genome Atlas (TCGA) database. RESULTS: We analyze the functional components of the inferred network by extracting subnetworks based on biological process information and interpret the role of known cancer genes within each process. Furthermore, we investigate the local landscape of prostate cancer genes and discuss pathological associations that may be relevant in the development of new targeted cancer therapies. CONCLUSION: Our network-based analysis provides a practical systems biology approach to reveal the collective gene interactions of prostate cancer. This allows a close interpretation of biological activity in terms of the hallmarks of cancer.

6.
J Math Biol ; 78(1-2): 441-463, 2019 01.
Article in English | MEDLINE | ID: mdl-30291366

ABSTRACT

We generalize chaos game representation (CGR) to higher dimensional spaces while maintaining its bijection, keeping the method sufficiently representative and mathematically rigorous compared to previous attempts. We first state and prove the asymptotic property of CGR and our generalized chaos game representation (GCGR) method. It follows that the dissimilarity of sequences which possess identical subsequences at distinct positions is lowered exponentially with the length of the identical subsequence; this effect has been at work unbeknownst to researchers. By highlighting it now, we show that the effect fundamentally supports (G)CGR as a similarity measure or feature extraction technique. We develop two feature extraction techniques: GCGR-Centroid and GCGR-Variance. We use GCGR-Centroid to analyze the similarity between protein sequences by using the datasets 9 ND5, 24 TF and 50 beta-globin proteins. We obtain results consistent with previous studies, which supports the significance of the approach. Finally, using support vector machines, we train an anticancer peptide prediction model with both GCGR-Centroid and GCGR-Variance, and achieve significantly higher prediction performance on 3 well-studied anticancer peptide datasets.
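The exponential effect described in this abstract can be illustrated with the classic two-dimensional CGR for DNA. The following is a minimal sketch, not the generalized GCGR method of the paper: the corner assignment, the starting point and the function names are assumptions chosen for illustration. Each symbol moves the current point halfway toward its corner, so two sequences that share a suffix of length k end up within 2^-k of each other.

```python
# Classic 2D chaos game representation (CGR) for DNA.
# The corner layout below is one common convention and is an assumption here.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence, start=(0.5, 0.5)):
    """Return the CGR trajectory: one point per symbol of the sequence."""
    x, y = start
    points = []
    for base in sequence:
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the symbol's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def endpoint_distance(seq_a, seq_b):
    """Maximum-norm distance between the final CGR points of two sequences."""
    ax, ay = cgr_points(seq_a)[-1]
    bx, by = cgr_points(seq_b)[-1]
    return max(abs(ax - bx), abs(ay - by))


if __name__ == "__main__":
    # Different prefixes, shared suffix "ACGT" of length 4.
    print(endpoint_distance("AAAA" + "ACGT", "TTTT" + "ACGT"))
```

Because every step halves the gap between two trajectories that read the same symbol, the distance after a shared suffix of length 4 cannot exceed 1/16 of the unit square's side.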


Subject(s)
Game Theory , Tumor Suppressor Proteins/genetics , Amino Acid Sequence , Animals , Base Sequence , Computational Biology , Databases, Protein/statistics & numerical data , Electron Transport Complex I/genetics , Humans , Mathematical Concepts , Mitochondrial Proteins/genetics , Models, Biological , NADH Dehydrogenase/genetics , Nonlinear Dynamics , Sequence Alignment/statistics & numerical data , Sequence Analysis, Protein/statistics & numerical data , Sequence Homology, Amino Acid , Support Vector Machine , Transferrin/genetics , Tumor Suppressor Proteins/classification , Tumor Suppressor Proteins/physiology , beta-Globins/genetics
7.
Entropy (Basel) ; 21(5)2019 May 10.
Article in English | MEDLINE | ID: mdl-33267196

ABSTRACT

In this paper, we study several distance-based entropy measures on fullerene graphs. These include the topological information content of a graph $I_a(G)$, a degree-based entropy measure, the eccentric-entropy $If_{\sigma}(G)$, the Hosoya entropy $H(G)$ and, finally, the radial centric information entropy $H_{ecc}$. We compare these measures on two infinite classes of fullerene graphs denoted by $A_{12n+4}$ and $B_{12n+6}$. We have chosen these measures as they are easily computable and capture meaningful graph properties. To demonstrate the utility of these measures, we investigate the Pearson correlation between them on the fullerene graphs.
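As a rough illustration of what a degree-based entropy measure can look like, the sketch below computes the Shannon entropy of the distribution p_i = d_i / sum_j d_j over vertex degrees. This particular form is an assumption and may differ from the exact definition used in the paper; the function name and the adjacency-dictionary input are likewise hypothetical.

```python
import math


def degree_entropy(adjacency):
    """Shannon entropy (in bits) of the normalized vertex degrees.

    `adjacency` maps each vertex to a list of its neighbors.
    Assumed form: p_i = deg(i) / sum of all degrees.
    """
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    total = sum(degrees)
    if total == 0:
        raise ValueError("graph has no edges")
    entropy = 0.0
    for d in degrees:
        if d > 0:  # isolated vertices contribute nothing
            p = d / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # K4 is 3-regular, like every fullerene graph, so the value is log2(4).
    k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
    print(degree_entropy(k4))
```

Since fullerene graphs are 3-regular, this form always evaluates to log2 of the number of vertices on them, which is one reason distance-based measures are more informative for this graph class.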

8.
BMC Bioinformatics ; 19(1): 396, 2018 Oct 29.
Article in English | MEDLINE | ID: mdl-30373514

ABSTRACT

BACKGROUND: Using knowledge-based interpretation to analyze omics data can not only obtain essential information regarding various biological processes, but also reflect the current physiological status of cells and tissue. The major challenge to analyze gene expression data, with a large number of genes and small samples, is to extract disease-related information from a massive amount of redundant data and noise. Gene selection, eliminating redundant and irrelevant genes, has been a key step to address this problem. RESULTS: The modified method was tested on four benchmark datasets with either two-class phenotypes or multiclass phenotypes, outperforming previous methods, with relatively higher accuracy, true positive rate, false positive rate and reduced runtime. CONCLUSIONS: This paper proposes an effective feature selection method, combining double RBF-kernels with weighted analysis, to extract feature genes from gene expression data, by exploring its nonlinear mapping ability.


Subject(s)
Algorithms , Computational Biology/methods , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic , Neoplasm Proteins/genetics , Neoplasms/classification , Neoplasms/genetics , Humans , Phenotype
9.
BMC Bioinformatics ; 18(1): 325, 2017 Jul 04.
Article in English | MEDLINE | ID: mdl-28676075

ABSTRACT

BACKGROUND: sgnesR (Stochastic Gene Network Expression Simulator in R) is an R package that provides an interface to simulate gene expression data from a given gene network using the stochastic simulation algorithm (SSA). The package allows various options for delay parameters, which can easily be included in reactions for promoter delay, RNA delay and protein delay. A user can tune these parameters to model various types of reactions within a cell. As examples, we present two network models to generate expression profiles. We also demonstrate the inference of networks and the evaluation of association measures of edge and non-edge components from the generated expression profiles. RESULTS: The purpose of sgnesR is to enable an easy-to-use and quick implementation for generating realistic gene expression data from biologically relevant networks that can be user selected. CONCLUSIONS: sgnesR is freely available for academic use. The R package has been tested for R 3.2.0 under Linux, Windows and Mac OS X.
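For readers unfamiliar with the stochastic simulation algorithm, the following minimal sketch shows the basic (Gillespie) SSA loop for a single species with constant production and first-order degradation. It is written in Python rather than R, it does not use or reproduce the sgnesR interface, and it omits the delay mechanism that the package supports; the function name and parameters are hypothetical.

```python
import random


def simulate_birth_death(k_prod, k_deg, n0, t_end, seed=0):
    """Gillespie SSA for: 0 -> X (rate k_prod), X -> 0 (rate k_deg * n)."""
    rng = random.Random(seed)
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_prod = k_prod          # propensity of production
        a_deg = k_deg * n        # propensity of degradation
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break                # no reaction can fire any more
        # Waiting time to the next reaction is exponentially distributed.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Choose which reaction fires, proportionally to its propensity.
        if rng.random() * a_total < a_prod:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts


if __name__ == "__main__":
    times, counts = simulate_birth_death(k_prod=10.0, k_deg=1.0, n0=0, t_end=5.0)
    print(len(times), counts[-1])
```

In a network setting each gene would contribute its own reactions and propensities, but the two steps shown here (draw a waiting time, then pick a reaction) stay the same.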


Subject(s)
Gene Regulatory Networks , User-Computer Interface , Algorithms , Gene Expression , Internet
10.
Bioinformatics ; 32(21): 3345-3347, 2016 11 01.
Article in English | MEDLINE | ID: mdl-27402900

ABSTRACT

MOTIVATION: Data from RNA-seq experiments provide us with many new possibilities to gain insights into biological and disease mechanisms of cellular functioning. However, the reproducibility and robustness of RNA-seq data analysis results are often unclear. This is in part attributed to the two counteracting goals of (i) a cost-efficient and (ii) an optimal experimental design, leading to a compromise, e.g. in the sequencing depth of experiments. RESULTS: We introduce an R package called samExploreR that allows the subsampling (m out of n bootstrapping) of short reads based on SAM files, facilitating the investigation of sequencing-depth-related questions for the experimental design. Overall, this provides a systematic way of exploring the reproducibility and robustness of general RNA-seq studies. We exemplify the usage of samExploreR by studying the influence of the sequencing depth and the annotation on the identification of differentially expressed genes. AVAILABILITY AND IMPLEMENTATION: samExploreR is available as an R package from Bioconductor. CONTACT: v@bio-complexity.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
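The idea of drawing m out of n reads to mimic a lower sequencing depth can be sketched in a few lines. This is an illustration in Python, not the samExploreR API: the function name, the `fraction` parameter and the choice between sampling with or without replacement are assumptions, and real SAM parsing is left out.

```python
import random


def subsample_reads(reads, fraction, replace=False, seed=0):
    """Draw m = round(fraction * n) reads from a list of n reads.

    `replace=False` samples without replacement; whether the original
    package samples with or without replacement is not assumed here.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not reads:
        return []
    rng = random.Random(seed)
    m = max(1, round(fraction * len(reads)))
    if replace:
        return [rng.choice(reads) for _ in range(m)]
    return rng.sample(reads, m)


if __name__ == "__main__":
    reads = [f"read_{i}" for i in range(100)]
    # Simulate a quarter of the original sequencing depth.
    subset = subsample_reads(reads, fraction=0.25)
    print(len(subset))
```

Repeating the draw with different seeds and rerunning the downstream analysis on each subset gives a simple picture of how stable the results are at a given depth.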


Subject(s)
RNA/genetics , Sequence Analysis, RNA , Reproducibility of Results , Research Design , Software
11.
J Biomed Inform ; 75: 63-69, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28958485

ABSTRACT

As therapeutic peptides have been taken into consideration in disease therapy in recent years, many biologists spent time and labor to verify various functional peptides from a large number of peptide sequences. In order to reduce the workload and increase the efficiency of identification of functional proteins, we propose a sequence-based model, q-FP (functional peptide prediction based on the q-Wiener Index), capable of recognizing potentially functional proteins. We extract three types of features by mixing graphic representation and statistical indices based on the q-Wiener index and physicochemical properties of amino acids. Our support-vector-machine-based model achieves an accuracy of 96.71%, 93.34%, 98.40%, and 91.40% for anticancer, virulent, and allergenic proteins datasets, respectively, by using 5-fold cross validation.


Subject(s)
Computational Biology , Computer Graphics , Peptides/chemistry , Algorithms , Databases, Protein , Humans , Support Vector Machine
12.
BMC Bioinformatics ; 17: 129, 2016 Mar 18.
Article in English | MEDLINE | ID: mdl-26987731

ABSTRACT

BACKGROUND: It is generally acknowledged that a functional understanding of a biological system can only be obtained by understanding the collective of molecular interactions in the form of biological networks. Protein networks are one network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, allowing insight into fundamental organizational principles of biological cells. RESULTS: In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods for simulated benchmark networks as well as 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules by utilizing the Gene Ontology (GO) database as gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, simulating in this way our incomplete knowledge of protein networks. CONCLUSIONS: Overall, our study reveals that there is a large heterogeneity among the different module prediction algorithms if one zooms in on the level of biological processes in the form of GO terms, and that all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.


Subject(s)
Algorithms , Gene Ontology , Protein Interaction Mapping , Animals , Cluster Analysis , Eukaryota/metabolism , Humans
13.
Bioinformatics ; 30(19): 2834-6, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24928209

ABSTRACT

SUMMARY: NetBioV (Network Biology Visualization) is an R package that allows the visualization of large network data in biology and medicine. The purpose of NetBioV is to enable an organized and reproducible visualization of networks by emphasizing or highlighting specific structural properties that are of biological relevance. AVAILABILITY AND IMPLEMENTATION: NetBioV is freely available for academic use. The package has been tested for R 2.14.2 under Linux, Windows and Mac OS X. It is available from Bioconductor.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks , Software , Algorithms , Arabidopsis/metabolism , Computer Graphics , Humans , Lymphoma, B-Cell/metabolism , Programming Languages , Reproducibility of Results
14.
BMC Bioinformatics ; 15 Suppl 6: S6, 2014.
Article in English | MEDLINE | ID: mdl-25079297

ABSTRACT

Cancer is a complex disease that has proven to be difficult to understand on the single-gene level. For this reason a functional elucidation needs to take interactions among genes on a systems-level into account. In this study, we infer a colon cancer network from a large-scale gene expression data set by using the method BC3Net. We provide a structural and a functional analysis of this network and also connect its molecular interaction structure with the chromosomal locations of the genes enabling the definition of cis- and trans-interactions. Furthermore, we investigate the interaction of genes that can be found in close neighborhoods on the chromosomes to gain insight into regulatory mechanisms. To our knowledge this is the first study analyzing the genome-scale colon cancer network.


Subject(s)
Colonic Neoplasms/genetics , Gene Regulatory Networks , Colonic Neoplasms/metabolism , Computational Biology , Gene Expression Profiling , Humans , Proteins/genetics , Proteins/metabolism
16.
BMC Genomics ; 14: 324, 2013 May 11.
Article in English | MEDLINE | ID: mdl-23663484

ABSTRACT

BACKGROUND: In recent years, various types of cellular networks have penetrated biology and are nowadays used omnipresently for studying eukaryote and prokaryote organisms. Still, the relation and the biological overlap among phenomenological and inferential gene networks, e.g., between the protein interaction network and the gene regulatory network inferred from large-scale transcriptomic data, is largely unexplored. RESULTS: We provide in this study an in-depth analysis of the structural, functional and chromosomal relationship between a protein-protein network, a transcriptional regulatory network and an inferred gene regulatory network, for S. cerevisiae and E. coli. Further, we study global and local aspects of these networks and their biological information overlap by comparing, e.g., the functional co-occurrence of Gene Ontology terms by exploiting the available interaction structure among the genes. CONCLUSIONS: Although the individual networks represent different levels of cellular interactions with global structural and functional dissimilarities, we observe crucial functions of their network interfaces for the assembly of protein complexes, proteolysis, transcription, translation, metabolic and regulatory interactions. Overall, our results shed light on the integrability of these networks and their interfacing biological processes.


Subject(s)
Computational Biology , Escherichia coli/cytology , Escherichia coli/genetics , Gene Regulatory Networks , Protein Interaction Maps , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/genetics , Chromosomes, Bacterial/genetics , Chromosomes, Fungal/genetics , Escherichia coli/metabolism , Saccharomyces cerevisiae/metabolism
17.
Bioinformatics ; 27(1): 140-1, 2011 Jan 01.
Article in English | MEDLINE | ID: mdl-21075747

ABSTRACT

MOTIVATION: Network-based representations of biological data have become an important way to analyze high-throughput data. To interpret the large amount of data that is produced by different high-throughput technologies, networks offer multifaceted aspects to analyze the data. As networks represent biological relationships within their structure, it turned out to be fruitful to analyze their topology. Therefore, we developed a freely available, open source R-package called Quantitative Analysis of Complex Networks (QuACN) to meet this challenge. QuACN contains different, information-theoretic and non-information-theoretic, topological network descriptors to analyze, classify and compare biological networks. AVAILABILITY: QuACN is freely available under LGPL via CRAN (http://cran.r-project.org/web/packages/QuACN/).


Subject(s)
Models, Biological , Software
18.
J Theor Biol ; 310: 216-22, 2012 Oct 07.
Article in English | MEDLINE | ID: mdl-22771628

ABSTRACT

The identification and interpretation of metabolic biomarkers is a challenging task. In this context, network-based approaches have increasingly become a key technology in systems biology, making it possible to capture complex interactions in biological systems. In this work, we introduce a novel network-based method to identify highly predictive biomarker candidates for disease. First, we infer two different types of networks: (i) correlation networks, and (ii) a new type of network called ratio networks. Based on these networks, we introduce scores to prioritize features using topological descriptors of the vertices. To evaluate our method, we use an example dataset in which quantitative targeted MS/MS analysis was applied to a total of 52 blood samples from 22 persons with obesity (BMI >30) and 30 healthy controls. Using our network-based feature selection approach, we identified highly discriminating metabolites for obesity (F-score >0.85, accuracy >85%), some of which could be verified by the literature.


Subject(s)
Algorithms , Metabolic Networks and Pathways , Metabolomics/methods , Obesity/metabolism , Adult , Case-Control Studies , Humans , Middle Aged , Models, Biological
19.
ScientificWorldJournal ; 2012: 278352, 2012.
Article in English | MEDLINE | ID: mdl-22654582

ABSTRACT

A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that meets best experimental objectives and budgetary conditions, including time constraints.
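Virtual pooling can be pictured as averaging the feature vectors of several samples from the same class into one pooled sample. The sketch below is an assumption about how such pooling might be simulated, not the procedure of the study: the function name is hypothetical, samples are grouped in their given order, and leftover samples that do not fill a pool are dropped.

```python
def pool_samples(samples, labels, pool_size):
    """Average same-class samples in consecutive groups of `pool_size`.

    `samples` is a list of equal-length feature lists, `labels` the class
    of each sample. Returns the pooled samples and their class labels.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # Group samples by class, keeping their original order.
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    pooled, pooled_labels = [], []
    for label, members in by_class.items():
        n_pools = len(members) // pool_size  # incomplete pools are dropped
        for p in range(n_pools):
            group = members[p * pool_size:(p + 1) * pool_size]
            # Feature-wise mean over the samples in this pool.
            pooled.append([sum(col) / pool_size for col in zip(*group)])
            pooled_labels.append(label)
    return pooled, pooled_labels


if __name__ == "__main__":
    x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0], [5.0, 6.0]]
    y = ["a", "a", "b", "b", "a"]
    print(pool_samples(x, y, pool_size=2))
```

The pooled data can then be passed to any classifier in place of the individual samples, which is how the effect of different pool sizes on misclassification rates could be compared.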


Subject(s)
Algorithms , Computational Biology/methods
20.
BMC Bioinformatics ; 12: 492, 2011 Dec 24.
Article in English | MEDLINE | ID: mdl-22195644

ABSTRACT

BACKGROUND: Structural measures for networks have been extensively developed, but many of them have not yet demonstrated their sustainability. That is, it often remains unclear whether a particular measure is useful and feasible for solving a particular problem in network biology. One example is the classification of complex biological networks, for which structural measures are used to minimize the classification error. Hence, there is a strong need to provide freely available software packages to calculate and demonstrate the appropriate usage of structural graph measures in network biology. RESULTS: Here, we discuss topological network descriptors that are implemented in the R package QuACN and demonstrate their behavior and characteristics by applying them to a set of example graphs. Moreover, we show a representative application to illustrate their capabilities for classifying biological networks. In particular, we infer gene regulatory networks from microarray data and classify them by methods provided by QuACN. Note that QuACN is the first freely available software written in R containing a large number of structural graph measures. CONCLUSION: The R package QuACN is under ongoing development and we continuously add promising groups of topological network descriptors. The package can be used to answer intriguing research questions in network biology, e.g., classifying biological data or identifying meaningful biological features, by analyzing the topology of biological networks.


Subject(s)
Gene Regulatory Networks , Software , Entropy , Protein Interaction Maps