Results 1 - 12 of 12
1.
Bioinformatics ; 36(16): 4389-4398, 2020 08 15.
Article in English | MEDLINE | ID: mdl-32227192

ABSTRACT

MOTIVATION: Recently, multiobjective swarm intelligence optimization (SIO) algorithms have attracted considerable attention as disease model-free methods for detecting high-order single nucleotide polymorphism (SNP) interactions. However, a strict Pareto optimal set may filter out some of the SNP combinations associated with disease status. Furthermore, the lack of heuristic factors for finding SNP interactions and the preference for discrimination approaches to disease models are considerable challenges for SIO. In this study, we propose a multipopulation harmony search (HS) algorithm dedicated to the detection of high-order SNP interactions (MP-HS-DHSI). This method consists of three stages. In the first stage, HS with multipopulation (multiharmony memories) is used to discover a set of candidate high-order SNP combinations having an association with disease status. In HS, multiple criteria [Bayesian network-based K2-score, Jensen-Shannon divergence, likelihood ratio and normalized distance with joint entropy (ND-JE)] are adopted by four harmony memories to improve the ability to discriminate diverse disease models. A novel evaluation criterion named ND-JE is proposed to guide HS to explore clues for high-order SNP interactions. In the second and third stages, the G-test statistical method and multifactor dimensionality reduction are employed to verify the authenticity of the candidate solutions, respectively. RESULTS: We compared MP-HS-DHSI with four state-of-the-art SIO algorithms for detecting high-order SNP interactions for 20 simulation disease models and a real dataset of age-related macular degeneration. The experimental results revealed that our proposed method can accelerate the search speed efficiently and enhance the discrimination ability of diverse epistasis models. AVAILABILITY AND IMPLEMENTATION: https://github.com/shouhengtuo/MP-HS-DHSI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genetic Epistasis , Single Nucleotide Polymorphism , Algorithms , Bayes Theorem , Multifactor Dimensionality Reduction
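
One of the criteria named in the abstract above is the Jensen-Shannon divergence. The following is a minimal sketch of how such a score could compare the genotype-combination distributions of cases and controls. The function names and the example counts are illustrative assumptions, not taken from MP-HS-DHSI.

```python
import math

def genotype_distribution(counts):
    """Turn raw counts per genotype combination into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a zero probability contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical counts of four genotype combinations in cases and controls.
case_counts = [30, 10, 5, 55]
control_counts = [25, 25, 25, 25]
score = js_divergence(genotype_distribution(case_counts),
                      genotype_distribution(control_counts))
```

With base-2 logarithms the score lies between 0 (identical distributions) and 1 (disjoint distributions), so a larger value suggests a stronger difference between cases and controls.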
2.
Biochem Biophys Res Commun ; 465(3): 437-42, 2015 Sep 25.
Article in English | MEDLINE | ID: mdl-26282201

ABSTRACT

The formation and progression of complex diseases are generally the joint effect of genetic and epigenetic disorders; thus, an integrative analysis of epigenetic and genetic data is essential for understanding the mechanisms of these diseases. In this study, we integrate Illumina 450k DNA methylation and gene expression data to calculate the weights of a gene network using Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA). The approach considers all methylation values of the CpG sites in a gene rather than averaging them, as other studies have done, which ignores the variability of the methylation sites. By comparing the topological features of the control network with those of the case network, including global and local features, candidate disease-associated genes and gene modules are identified. We apply the approach to real data for breast invasive carcinoma (BRCA). It successfully identifies breast cancer susceptibility genes, such as TP53, BRCA1, EP300, CDK2, MCM7 and so forth, most of which are already known to be associated with breast cancer. Also, GO and pathway enrichment analyses indicate that these genes are enriched in cell apoptosis and regulation of cell death, which are cancer-related biological processes. Importantly, by analyzing the functions and comparing the expression and methylation values of these genes between cases and controls, we find some genes, such as VASN and SNRPD3, and gene modules, targeted by POLR2C, CHMP1B and TAF9, which might be novel breast cancer-related biomarkers.


Subject(s)
Tumor Biomarkers/genetics , Breast Neoplasms/genetics , DNA Methylation/genetics , Neoplasm DNA/genetics , Gene Expression Profiling/methods , Neoplasm Proteins/genetics , Base Sequence , Breast Neoplasms/diagnosis , Computer Simulation , Female , Gene Expression Regulation/genetics , Genetic Association Studies , Genetic Predisposition to Disease/genetics , Humans , Genetic Models , Molecular Sequence Data , Oligonucleotide Array Sequence Analysis/methods , Reproducibility of Results , Sensitivity and Specificity , Systems Integration
3.
ScientificWorldJournal ; 2014: 637412, 2014.
Article in English | MEDLINE | ID: mdl-24574905

ABSTRACT

To enhance the performance of the harmony search (HS) algorithm in solving discrete optimization problems, this paper proposes a novel harmony search algorithm based on teaching-learning (HSTL) strategies to solve 0-1 knapsack problems. In the HSTL algorithm, first, a method is presented to dynamically adjust the dimension of the selected harmony vector during the optimization procedure. In addition, four strategies (harmony memory consideration, teaching-learning strategy, local pitch adjusting, and random mutation) are employed to improve the performance of the HS algorithm. Another improvement in the HSTL method is that dynamic strategies are adopted to change the parameters, which effectively maintains a proper balance between global exploration power and local exploitation power. Finally, simulation experiments with 13 knapsack problems show that the HSTL algorithm can be an efficient alternative for solving 0-1 knapsack problems.


Subject(s)
Algorithms , Artificial Intelligence , Search Engine/methods , Computer Simulation
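
For orientation, here is a minimal sketch of a plain binary harmony search applied to a 0-1 knapsack instance. It is not the HSTL algorithm: it has no teaching-learning strategy, no dynamic dimension adjustment and no dynamic parameters. The parameter names (`hms`, `hmcr`, `par`), their default values, the bit-flip pitch adjustment and the zero-fitness penalty for overweight selections are all assumptions made for illustration.

```python
import random

def knapsack_value(x, values, weights, capacity):
    """Total value of selection x, or 0 if the selection exceeds the capacity."""
    weight = sum(w for w, bit in zip(weights, x) if bit)
    if weight > capacity:
        return 0  # assumed penalty: infeasible selections are worth nothing
    return sum(v for v, bit in zip(values, x) if bit)

def harmony_search_knapsack(values, weights, capacity,
                            hms=10, hmcr=0.9, par=0.3, iters=2000, seed=0):
    """Basic harmony search over bit vectors; returns (best_vector, best_value)."""
    rng = random.Random(seed)
    n = len(values)
    # Harmony memory: hms random candidate selections and their fitness values.
    memory = [[rng.randint(0, 1) for _ in range(n)] for _ in range(hms)]
    fitness = [knapsack_value(h, values, weights, capacity) for h in memory]
    for _ in range(iters):
        new = []
        for j in range(n):
            if rng.random() < hmcr:
                bit = memory[rng.randrange(hms)][j]  # harmony memory consideration
                if rng.random() < par:
                    bit = 1 - bit                    # pitch adjustment as a bit flip
            else:
                bit = rng.randint(0, 1)              # random selection
            new.append(bit)
        new_fit = knapsack_value(new, values, weights, capacity)
        worst = min(range(hms), key=fitness.__getitem__)
        if new_fit > fitness[worst]:
            # Replace the worst harmony when the new one is better.
            memory[worst], fitness[worst] = new, new_fit
    best = max(range(hms), key=fitness.__getitem__)
    return memory[best], fitness[best]

# Hypothetical four-item instance.
best_x, best_value = harmony_search_knapsack([10, 40, 30, 50], [5, 4, 6, 3], 10)
```

The fixed seed makes a run reproducible. Harmony search is a heuristic, so the returned selection is not guaranteed to be optimal.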
4.
Interdiscip Sci ; 16(3): 688-711, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38954231

ABSTRACT

To elucidate the genetic basis of complex diseases, it is crucial to discover the single-nucleotide polymorphisms (SNPs) contributing to disease susceptibility. This is particularly challenging for high-order SNP epistatic interactions (HEIs), which exhibit small individual effects but potentially large joint effects. These interactions are difficult to detect due to the vast search space, encompassing billions of possible combinations, and the computational complexity of evaluating them. This study proposes a novel explicit-encoding-based multitasking harmony search algorithm (MTHS-EE-DHEI) specifically designed to address this challenge. The algorithm operates in three stages. First, a harmony search algorithm is employed, utilizing four lightweight evaluation functions, such as Bayesian network and entropy, to efficiently explore potential SNP combinations related to disease status. Second, a G-test statistical method is applied to filter out insignificant SNP combinations. Finally, two machine learning-based methods, multifactor dimensionality reduction (MDR) as well as random forest (RF), are employed to validate the classification performance of the remaining significant SNP combinations. This research aims to demonstrate the effectiveness of MTHS-EE-DHEI in identifying HEIs compared to existing methods, potentially providing valuable insights into the genetic architecture of complex diseases. The performance of MTHS-EE-DHEI was evaluated on twenty simulated disease datasets and three real-world datasets encompassing age-related macular degeneration (AMD), rheumatoid arthritis (RA), and breast cancer (BC). The results demonstrably indicate that MTHS-EE-DHEI outperforms four state-of-the-art algorithms in terms of both detection power and computational efficiency. The source code is available at https://github.com/shouhengtuo/MTHS-EE-DHEI.git .


Subject(s)
Algorithms , Bayes Theorem , Genetic Epistasis , Single Nucleotide Polymorphism , Single Nucleotide Polymorphism/genetics , Humans , Machine Learning , Multifactor Dimensionality Reduction , Computational Biology/methods , Genetic Predisposition to Disease
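
The size of the search space mentioned in the abstract above can be illustrated with a short calculation: the number of k-order SNP combinations among n SNPs is the binomial coefficient C(n, k). The figure of 1000 SNPs used below is an assumed example, not a number from the paper.

```python
import math

def search_space_size(n_snps, order):
    """Number of distinct SNP combinations of the given order (n choose k)."""
    return math.comb(n_snps, order)

# Assumed example: 1000 SNPs, interaction orders 2 to 5.
sizes = {k: search_space_size(1000, k) for k in range(2, 6)}
```

Under this assumption there are 166,167,000 third-order combinations and 41,417,124,750 fourth-order combinations, which is why exhaustive evaluation quickly becomes impractical as the order grows.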
5.
Interdiscip Sci ; 14(4): 814-832, 2022 Dec.
Article in English | MEDLINE | ID: mdl-35788965

ABSTRACT

MOTIVATION: Linear or nonlinear interactions of multiple single-nucleotide polymorphisms (SNPs) play an important role in understanding the genetic basis of complex human diseases. However, combinatorial analytics in high-dimensional space makes it extremely challenging to detect multiorder SNP interactions. Most classic approaches can only perform one task (detecting k-order SNP interactions) in each run. Since prior knowledge of a complex disease is usually not available, it is difficult to determine the value of k for detecting k-order SNP interactions. METHODS: A novel multitasking ant colony optimization algorithm (named MTACO-DMSI) is proposed to detect multiorder SNP interactions, and it is divided into two stages: searching and testing. In the searching stage, multiple multiorder SNP interaction detection tasks (from 2nd-order to kth-order) are executed in parallel, and two subpopulations that separately adopt the Bayesian network-based K2-score and Jensen-Shannon divergence (JS-score) as evaluation criteria are generated for each task to improve the global search capability and the discrimination ability for various disease models. In the testing stage, the G-test is adopted to further verify the authenticity of candidate solutions to reduce the error rate. RESULTS: Three multiorder simulated disease models with different interaction effects and three real datasets, for age-related macular degeneration (AMD), rheumatoid arthritis (RA) and type 1 diabetes (T1D), were used to investigate the performance of the proposed MTACO-DMSI. The experimental results show that MTACO-DMSI has a faster search speed and higher discriminatory power for diverse simulation disease models than traditional single-task algorithms. The results on the real AMD, RA and T1D datasets indicate that MTACO-DMSI has the ability to detect multiorder SNP interactions at a genome-wide scale. AVAILABILITY AND IMPLEMENTATION: https://github.com/shouhengtuo/MTACO-DMSI/.


Subject(s)
Type 1 Diabetes Mellitus , Single Nucleotide Polymorphism , Humans , Algorithms , Bayes Theorem , Type 1 Diabetes Mellitus/genetics , Genetic Epistasis , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism/genetics
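
The K2-score named in the abstract above is often used in the epistasis literature in a logarithmic form where a lower value indicates a stronger association. The sketch below assumes that form for a binary case/control phenotype: for each genotype combination i with r_i samples, of which r_i1 are cases and r_i2 are controls, it adds log((r_i + 1)!) - log(r_i1!) - log(r_i2!). Whether MTACO-DMSI uses exactly this expression is an assumption, and the function name and example counts are hypothetical.

```python
import math

def k2_score(case_counts, control_counts):
    """Log-form K2-score for a binary phenotype (lower suggests stronger association).

    case_counts[i] and control_counts[i] are the numbers of cases and controls
    carrying genotype combination i.
    """
    score = 0.0
    for r_case, r_ctrl in zip(case_counts, control_counts):
        r_i = r_case + r_ctrl
        score += math.lgamma(r_i + 2)     # log((r_i + 1)!)
        score -= math.lgamma(r_case + 1)  # log(r_case!)
        score -= math.lgamma(r_ctrl + 1)  # log(r_ctrl!)
    return score

# Hypothetical tables: one perfectly separating cases from controls, one uninformative.
separating = k2_score([2, 0], [0, 2])
uninformative = k2_score([1, 1], [1, 1])
```

Using `math.lgamma` avoids computing large factorials directly, which matters when genotype cells contain hundreds or thousands of samples.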
6.
Genes (Basel) ; 9(9)2018 Aug 29.
Article in English | MEDLINE | ID: mdl-30158504

ABSTRACT

Detecting high-order epistasis in genome-wide association studies (GWASs) is of importance when characterizing complex human diseases. However, the enormous number of possible single-nucleotide polymorphism (SNP) combinations and the diversity among diseases present a significant computational challenge. Herein, a fast method for detecting high-order epistasis based on an interaction weight (FDHE-IW) is evaluated in the detection of SNP combinations associated with disease. First, the symmetrical uncertainty (SU) value for each SNP is calculated. Then, the top-k SNPs are isolated as guiders to identify 2-way SNP combinations with significant interaction weight values. Next, a forward search is employed to detect high-order SNP combinations with significant interaction weight values as candidates. Finally, the findings are statistically evaluated using a G-test to isolate true positives. The developed algorithm was used to evaluate 12 simulated datasets and an age-related macular degeneration (AMD) dataset and was shown to perform robustly in the detection of some high-order disease-causing models.
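
The first step above relies on symmetrical uncertainty. A minimal sketch using its standard definition, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), is given below. Computing SU between one SNP's genotypes and the case/control label is an assumption about how the step is applied, and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (base 2) of a sequence of discrete observations."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), ranging from 0 to 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant; no information to share
    mutual_info = h_x + h_y - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (h_x + h_y)

# Hypothetical genotypes (0, 1, 2) for one SNP and case/control labels (1, 0).
genotypes = [0, 0, 1, 1, 2, 2, 0, 1]
labels = [0, 0, 1, 1, 1, 1, 0, 0]
su = symmetrical_uncertainty(genotypes, labels)
```

A value near 1 means the SNP alone almost determines the label, while a value near 0 means the SNP carries little marginal information.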

7.
Sci Rep ; 8(1): 6353, 2018 Apr 17.
Article in English | MEDLINE | ID: mdl-29662181

ABSTRACT

A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has not been fixed in the paper.

8.
Sci Rep ; 7(1): 11529, 2017 09 14.
Article in English | MEDLINE | ID: mdl-28912584

ABSTRACT

Detecting high-order disease-causing models in genome-wide association studies is especially challenging due to model diversity, the possibly low or even absent marginal effect of a model, and the extraordinary search and computation costs. In this paper, we propose a niche harmony search algorithm in which joint entropy is utilized as a heuristic factor to guide the search for models with low or no marginal effect, and two computationally lightweight scores are selected to evaluate and adapt to diverse disease models. In order to obtain all possible suspected pathogenic models, the niche technique is merged with HS and serves as a taboo region to prevent HS from becoming trapped in local search. From the resultant set of candidate SNP combinations, we use the G-test statistic to test for true positives. Experiments were performed on twenty typical simulation datasets, of which 12 models have marginal effects and eight have none. Our results indicate that the proposed algorithm has very high detection power for searching suspected disease models in the first stage and that it is superior to some typical existing approaches in both detection power and CPU runtime for all these datasets. Application to age-related macular degeneration (AMD) demonstrates that our method is promising in detecting high-order disease-causing models.


Subject(s)
Algorithms , Computational Biology/methods , Genetic Predisposition to Disease , Genome-Wide Association Study/methods , Single Nucleotide Polymorphism , Computer Simulation , Humans
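
The G-test used to screen candidate SNP combinations is based on the statistic G = 2 Σ O ln(O / E), where O is an observed count and E is the count expected under independence. The sketch below computes G and the degrees of freedom for a contingency table whose rows are genotype combinations and whose columns are cases and controls. The table layout and function name are assumptions, and the p-value step (a chi-square tail probability) is left out to keep the example within the standard library.

```python
import math

def g_statistic(table):
    """Return (G, degrees_of_freedom) for a contingency table.

    table[i][j] is the observed count for genotype combination i
    and phenotype class j (for example, j = 0 cases, j = 1 controls).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # empty cells contribute nothing to the sum
                expected = row_totals[i] * col_totals[j] / total
                g += observed * math.log(observed / expected)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return 2.0 * g, dof

# Hypothetical counts for three genotype combinations.
g_value, dof = g_statistic([[40, 20], [30, 30], [10, 50]])
```

The degrees of freedom here assume every row and column has a nonzero total. Genotype combinations that never occur would need to be dropped before counting them.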
9.
PLoS One ; 12(4): e0175114, 2017.
Article in English | MEDLINE | ID: mdl-28403224

ABSTRACT

Harmony Search (HS) and Teaching-Learning-Based Optimization (TLBO), as new swarm intelligence optimization algorithms, have received much attention in recent years. Both of them have shown outstanding performance in solving NP-hard optimization problems. However, they also suffer dramatic performance degradation on some complex high-dimensional optimization problems. Through extensive experiments, we find that HS and TLBO strongly complement each other. HS has strong global exploration power but a low convergence speed. Conversely, TLBO has a much faster convergence speed but is easily trapped in local search. In this work, we propose a hybrid search algorithm named HSTLBO that merges the two algorithms to synergistically solve complex optimization problems using a self-adaptive selection strategy. In HSTLBO, both HS and TLBO are modified with the aim of balancing the global exploration and exploitation abilities, where HS aims mainly to explore the unknown regions and TLBO aims to rapidly exploit high-precision solutions in the known regions. Our experimental results demonstrate better performance and faster speed than five state-of-the-art HS variants and show better exploration power than five good TLBO variants with similar run time, which illustrates that our method is promising in solving complex high-dimensional optimization problems. The experiment on portfolio optimization problems also demonstrates that HSTLBO is effective in solving complex real-world applications.


Subject(s)
Artificial Intelligence , Computer Simulation , Problem Solving
10.
PLoS One ; 12(5): e0177662, 2017.
Article in English | MEDLINE | ID: mdl-28520777

ABSTRACT

The stratification of cancer into subtypes that are significantly associated with clinical outcomes is beneficial for targeted prognosis and treatment. In this study, we integrated somatic mutation and gene expression data to identify clusters of patients. In contrast to previous studies, we constructed cancer-type-specific significant co-expression networks (SCNs) rather than using a fixed gene network across all cancers, such as the network-based stratification (NBS) method, which ignores cancer heterogeneity. For each type of cancer, the gene expression data were used to construct the SCN, while the gene somatic mutation data were mapped onto the network, propagated, and used for further clustering. For the clustering, we adopted an improved network-regularized non-negative matrix factorization (netNMF), named netNMF_HC, for a more precise classification. We applied our method to various datasets, including ovarian cancer (OV), lung adenocarcinoma (LUAD) and uterine corpus endometrial carcinoma (UCEC) cohorts derived from The Cancer Genome Atlas (TCGA) project. Based on the results, we evaluated the performance of our method in identifying survival-relevant subtypes and further compared it to the NBS method, which adopts a priori networks and the netNMF algorithm. The proposed algorithm outperformed the NBS method in identifying informative cancer subtypes that were significantly associated with clinical outcomes in most cancer types we studied. In particular, our method identified survival-associated UCEC subtypes that were not identified by the NBS method. Our analysis indicated that valid patient subtyping can be achieved for individual cancers using mutation data with cancer-type-specific SCNs and netNMF_HC, because of the specific cancer co-expression patterns and more precise clustering.


Subject(s)
Neoplastic Gene Expression Regulation , Gene Regulatory Networks , Mutation , Neoplasms/genetics , Transcriptome , Algorithms , Cluster Analysis , Computational Biology/methods , Nucleic Acid Databases , Gene Expression Profiling , Humans , Neoplasms/mortality , Prognosis , Survival Analysis , Workflow
11.
PLoS One ; 11(3): e0150669, 2016.
Article in English | MEDLINE | ID: mdl-27014873

ABSTRACT

MOTIVATION: The two-locus model is a typical significant disease model to be identified in genome-wide association studies (GWAS). Due to the intensive computational burden and the diversity of disease models, existing methods have drawbacks such as low detection power, high computation cost, and a preference for some types of disease models. METHOD: In this study, two scoring functions (Bayesian network-based K2-score and Gini-score) are used to characterize two SNP loci as a candidate model; the two criteria are adopted simultaneously to improve identification power and to tackle the preference problem for disease models. The harmony search algorithm (HSA) is improved to quickly find the most likely candidate models among all two-locus models, in which a local search algorithm with a two-dimensional tabu table is presented to avoid repeatedly evaluating some disease models that have a strong marginal effect. Finally, the G-test statistic is used to further test the candidate models. RESULTS: We investigate our method, named FHSA-SED, on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) that have been developed recently based on swarm intelligence search algorithms. The results of the simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may also be associated with disease in the AMD dataset.


Subject(s)
Bayes Theorem , Genetic Predisposition to Disease , Genome-Wide Association Study/statistics & numerical data , Algorithms , Computer Simulation , Genetic Epistasis , Genome-Wide Association Study/methods , Humans , Machine Learning , Single Nucleotide Polymorphism/genetics
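
The evaluation measures listed in the abstract above (TPR, SPC, PPV and ACC) follow their standard confusion-matrix definitions. The sketch below assumes that true and false positives and negatives have already been counted, and the example counts are hypothetical rather than results from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics; assumes every denominator is nonzero."""
    return {
        "TPR": tp / (tp + fn),                   # sensitivity
        "SPC": tn / (tn + fp),                   # specificity
        "PPV": tp / (tp + fp),                   # positive predictive value
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # accuracy
    }

# Hypothetical counts: 8 true positives, 2 false positives,
# 88 true negatives, 2 false negatives.
metrics = detection_metrics(tp=8, fp=2, tn=88, fn=2)
```

A caller working with datasets in which a denominator can be zero, for example no predicted positives, would need to handle that case separately.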
12.
Mol Biosyst ; 11(8): 2227-37, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26052692

ABSTRACT

MicroRNAs (miRNAs) play an indispensable role in cancer initiation and progression. Different cancers generally have some common hallmarks. Analyzing miRNAs that consistently contribute to different cancers can help us to discover the relationship between miRNAs and traits shared by cancers. Most previous works focus on analyzing a single miRNA. However, dysregulation of a single miRNA is generally not sufficient to contribute to complex cancer processes. In this study, we put emphasis on analyzing the cooperation of miRNAs across cancers. We assume that miRNAs can cooperatively regulate oncogenic pathways and contribute to cancer hallmarks. Such cooperation is modeled by a miRNA module referred to as a pan-cancer conserved miRNA module. The module consists of miRNAs which simultaneously regulate cancers and are significantly intra-correlated. A novel computational workflow for the module discovery is presented. Multiple modules are discovered from miRNA expression profiles using the method. The functions of the two top-ranked modules are analyzed using the mRNAs which correlate with all the miRNAs in a module across cancers, inferring that the two modules function in regulating the cell cycle, which relates to cancer hallmarks such as self-sufficiency in growth signals and insensitivity to antigrowth signals. Additionally, two novel miRNAs, mir-590 and mir-629, are found to cooperate with well-known onco-miRNAs in the modules to contribute to cancers. We also found that PTEN, which is a well-known tumor suppressor that regulates the cell cycle, is a common target of miRNAs in the top-one module, and cooperative control of PTEN can be a reason for the miRNAs' cooperation. We believe that analyzing the cooperative mechanism of the miRNAs in modules rather than focusing on only single miRNAs may help us know more about the complicated relationship between miRNAs and cancers and develop more effective treatment strategies for cancers.


Subject(s)
Conserved Sequence/genetics , Gene Regulatory Networks , MicroRNAs/genetics , Neoplasms/genetics , Gene Expression Profiling , Neoplastic Gene Expression Regulation , Humans , MicroRNAs/biosynthesis , Neoplasms/pathology , Messenger RNA/genetics