1.
Brief Bioinform ; 24(6), 2023 09 22.
Article in English | MEDLINE | ID: mdl-37889118

ABSTRACT

Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging due to the small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than the ones obtained by existing methods, while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics.
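The two-stage idea in this abstract (a filter run on many data subsets, then a wrapper that ranks the survivors) can be illustrated with a minimal sketch. This is not the authors' FWSE implementation: the Welch t-statistic filter, the nearest-centroid classifier, the greedy forward wrapper, the thresholds and the toy data are all assumptions chosen to keep the example short and dependency-free.

```python
import random
import statistics


def welch_t(values, labels):
    """Filter score (assumed): absolute Welch t-statistic between the two classes."""
    a = [v for v, y in zip(values, labels) if y == 1]
    b = [v for v, y in zip(values, labels) if y == 0]
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / (se + 1e-12)


def stratified_subset(labels, fraction, rng):
    """Draw the same fraction of sample indices from each class."""
    chosen = []
    for label in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == label]
        chosen += rng.sample(members, max(2, int(fraction * len(members))))
    return chosen


def filter_stage(X, y, n_subsets=25, top_k=2, min_freq=0.6, seed=0):
    """Keep features ranked in the filter's top_k on at least min_freq of the subsets."""
    rng = random.Random(seed)
    n_features = len(X[0])
    counts = [0] * n_features
    for _ in range(n_subsets):
        idx = stratified_subset(y, 0.8, rng)
        sub_y = [y[i] for i in idx]
        scores = [welch_t([X[i][j] for i in idx], sub_y) for j in range(n_features)]
        for j in sorted(range(n_features), key=scores.__getitem__, reverse=True)[:top_k]:
            counts[j] += 1
    return [j for j in range(n_features) if counts[j] / n_subsets >= min_freq]


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the given features."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [statistics.mean(r[j] for r in rows) for j in features]
        point = [X[i][j] for j in features]
        pred = min(centroids,
                   key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
        correct += pred == y[i]
    return correct / len(X)


def wrapper_stage(X, y, candidates):
    """Greedy forward selection; the order in which features are added is the ranking."""
    ranked, remaining = [], list(candidates)
    while remaining:
        best = max(remaining, key=lambda j: loo_accuracy(X, y, ranked + [j]))
        ranked.append(best)
        remaining.remove(best)
    return ranked


def make_toy_data(seed=1):
    """Hypothetical data: 20 samples, 6 features; only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    y = [0] * 10 + [1] * 10
    X = [[5 * label + rng.gauss(0, 0.1), -4 * label + rng.gauss(0, 0.1)]
         + [rng.gauss(0, 1) for _ in range(4)] for label in y]
    return X, y


X, y = make_toy_data()
candidates = filter_stage(X, y)
print("kept by filter:", candidates, "wrapper ranking:", wrapper_stage(X, y, candidates))
```

Counting how often a feature survives the filter across subsets is one simple way to favour stable features; a real ensemble would combine several filters and wrappers rather than one of each.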


Subjects
Mental Disorders; Neoplasms; Humans; Algorithms; Artificial Intelligence; Biomarkers; Neoplasms/diagnosis; Neoplasms/genetics
2.
Sci Rep ; 13(1): 456, 2023 01 09.
Article in English | MEDLINE | ID: mdl-36624117

ABSTRACT

Interpretable machine learning models for gene expression datasets are important for understanding the decision-making process of a classifier and gaining insights into the underlying molecular processes of genetic conditions. Interpretable models can potentially support early diagnosis before full disease manifestation. This is particularly important, yet challenging, for mental health. We hypothesise that this is due to extreme heterogeneity, which may be overcome and explained by personalised modelling techniques. Thus far, most machine learning methods applied to gene expression datasets, including deep neural networks, lack personalised interpretability. This paper proposes a new methodology named personalised constrained neuro fuzzy inference (PCNFI) for learning personalised rules from high-dimensional datasets which are structurally and semantically interpretable. Case studies on two mental-health-related datasets (schizophrenia and bipolar disorders) have shown that the relatively short and simple personalised fuzzy rules provided enhanced interpretability as well as better classification performance compared to other commonly used machine learning methods. A performance test on a cancer dataset also showed that PCNFI matches previous benchmarks. Insights from our approach also indicated the importance of two genes (ATRX and TSPAN2) as possible biomarkers for early differentiation of ultra-high risk, bipolar and healthy individuals. These genes are linked to cognitive ability and impulsive behaviour. Our findings suggest a significant starting point for further research into the biological role of cognitive and impulsivity-related differences. With potential applications across bio-medical research, the proposed PCNFI method is promising for diagnosis, prognosis, and the design of personalised treatment plans for better outcomes in the future.


Subjects
Bipolar Disorder; Fuzzy Logic; Humans; Early Detection of Cancer; Neural Networks, Computer; Gene Expression; Algorithms
3.
Prog Brain Res ; 260: 129-165, 2021.
Article in English | MEDLINE | ID: mdl-33637215

ABSTRACT

Masking has been widely used as a tinnitus therapy, with large individual differences in its effectiveness. The basis of this variation is unknown. We examined individual tinnitus and psychological responses to three masking types: energetic masking (bilateral broadband static or rain noise [BBN]), informational masking (BBN with a notch at tinnitus pitch and 3-dimensional cues) and a masker combining both effects (BBN with spatial cues). Eleven participants with chronic tinnitus were followed for 12 months; each person used each masking approach for 3 months, with a 1-month washout-baseline. The Tinnitus Functional Index (TFI), Tinnitus Rating Scales, Positive and Negative Affect Scale and Depression Anxiety Stress Scales were measured every month of treatment. Electroencephalography (EEG) and psychoacoustic assessment were undertaken at baseline and following 3 months of each masking sound. The computational modeling of EEG data was based on a brain-inspired Spiking Neural Network (SNN) architecture called NeuCube, designed for this study for mapping, learning, visualizing and classifying brain activity patterns. EEG was related to clinically significant change in the TFI using the SNN model. The SNN framework was able to distinguish sound therapy responders (93% accuracy) from non-responders (100% accuracy) using baseline EEG recordings. The combination of energetic and informational masking was an effective treatment sound in more individuals than the other sounds used. Although the findings are promising, they are preliminary and require confirmation in independent and larger samples.


Subjects
Tinnitus; Electroencephalography; Humans; Neural Networks, Computer; Perceptual Masking; Sound; Tinnitus/therapy
4.
IEEE/ACM Trans Comput Biol Bioinform ; 13(6): 1036-1044, 2016.
Article in English | MEDLINE | ID: mdl-26915128

ABSTRACT

Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several temporally ordered samples from one patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a network-based method (NetInf) to infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with different levels of added noise. To verify the performance of NetInf, we apply it to somatic mutation data from three real cancer studies with large numbers of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without requiring the number of pathways to be assigned in advance, and it provides new insights into the temporal order of somatic mutations at the pathway level rather than at the gene level.


Subjects
DNA Mutational Analysis/methods; Disease Progression; Genes, Neoplasm/genetics; Neoplasm Proteins/genetics; Neoplasms/genetics; Signal Transduction/genetics; Algorithms; Gene Expression Regulation, Neoplastic/genetics; Genetic Markers/genetics; Humans
5.
BMC Med Res Methodol ; 15: 45, 2015 May 12.
Article in English | MEDLINE | ID: mdl-25962444

ABSTRACT

BACKGROUND: Comparing the relative utility of diagnostic tests is challenging when available datasets are small, partial or incomplete. The analytical leverage associated with a large sample size can be gained by integrating several small datasets to enable effective and accurate across-dataset comparisons. Accordingly, we propose a methodology for a holistic comparative analysis and ranking of cancer diagnostic tests through dataset integration and imputation of missing values, using urothelial carcinoma (UC) as a case study. METHODS: Five datasets comprising samples from 939 subjects, including 89 with UC, in which up to four diagnostic tests (cytology, NMP22®, UroVysion® Fluorescence In-Situ Hybridization (FISH) and Cxbladder Detect) had been performed, were integrated into a single dataset containing all measured records and missing values. The tests were first ranked using three criteria: sensitivity, specificity and a standard variable (feature) ranking method popularly known as the signal-to-noise ratio (SNR) index, derived from the mean values for all subjects clinically known to have UC versus healthy subjects. Second, step-wise unsupervised and supervised imputation (the latter accounting for the 'clinical truth' as determined by cystoscopy) was performed using personalized modelling, k-nearest-neighbour methods, multiple logistic regression and multilayer perceptron neural networks. All imputation models were cross-validated by comparing their post-imputation predictive accuracy for UC with their pre-imputation accuracy. Finally, the post-imputation tests were re-ranked using the same three criteria. RESULTS: In both measured and imputed datasets, Cxbladder Detect ranked higher for sensitivity, and urine cytology ranked higher for specificity, than the other UC tests. Cxbladder Detect consistently ranked higher than FISH and all other tests when SNR analyses were performed on measured, unsupervised and supervised imputed datasets. Supervised imputation resulted in a smaller cross-validation error. Cxbladder Detect was robust to imputation, showing a 2% difference in its predictive versus clinical accuracy and outperforming FISH, NMP22 and cytology. CONCLUSION: All data analysed, pre- and post-imputation, showed that Cxbladder Detect had a higher SNR and outperformed all other comparator tests, including FISH. The methodology developed and validated for comparative ranking of the diagnostic tests for detecting UC may be further applied to other cancer diagnostic datasets across population groups and multiple datasets.
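The three ranking criteria named in this abstract can be made concrete with a small sketch. The abstract does not state the exact SNR formula; the version below, the absolute difference of the class means divided by the sum of the class standard deviations, is a commonly used definition and is an assumption here. The test calls and scores are hypothetical toy values, not data from the study.

```python
import statistics


def sensitivity_specificity(calls, truth):
    """Return (sensitivity, specificity) of binary test calls against the clinical truth."""
    tp = sum(1 for c, t in zip(calls, truth) if c == 1 and t == 1)
    fn = sum(1 for c, t in zip(calls, truth) if c == 0 and t == 1)
    tn = sum(1 for c, t in zip(calls, truth) if c == 0 and t == 0)
    fp = sum(1 for c, t in zip(calls, truth) if c == 1 and t == 0)
    return tp / (tp + fn), tn / (tn + fp)


def snr_index(scores, truth):
    """Assumed SNR: |mean(diseased) - mean(healthy)| / (sd(diseased) + sd(healthy))."""
    diseased = [s for s, t in zip(scores, truth) if t == 1]
    healthy = [s for s, t in zip(scores, truth) if t == 0]
    gap = abs(statistics.mean(diseased) - statistics.mean(healthy))
    return gap / (statistics.stdev(diseased) + statistics.stdev(healthy))


# Hypothetical toy data: 1 = disease present (truth) or test positive (calls).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [8, 9, 10, 9, 1, 2, 3, 2]  # continuous test read-out

sens, spec = sensitivity_specificity(calls, truth)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} SNR={snr_index(scores, truth):.2f}")
```

Ranking several tests then amounts to computing these three numbers per test on the same subjects and sorting; how missing results are handled before that step is the imputation question the abstract addresses.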


Subjects
Algorithms; Carcinoma, Transitional Cell/diagnosis; Diagnostic Tests, Routine/methods; Urinary Bladder Neoplasms/diagnosis; Carcinoma, Transitional Cell/genetics; Cytodiagnosis; Databases, Factual/statistics & numerical data; Diagnostic Tests, Routine/standards; Diagnostic Tests, Routine/statistics & numerical data; Humans; In Situ Hybridization, Fluorescence; Reproducibility of Results; Sensitivity and Specificity; Urinary Bladder Neoplasms/genetics
6.
BMC Bioinformatics ; 16 Suppl 5: S3, 2015.
Article in English | MEDLINE | ID: mdl-25859819

ABSTRACT

BACKGROUND: Large-scale cancer genomic projects are providing large amounts of data on genomic, epigenomic and gene expression aberrations in many cancer types. One key challenge is to detect functional driver pathways and to filter out nonfunctional passenger genes in cancer genomics. Vandin et al. introduced the Maximum Weight Sub-matrix Problem to find driver pathways and showed that it is an NP-hard problem. METHODS: To find a better solution and solve the problem more efficiently, we present a network-based method (NBM) to detect overlapping driver pathways automatically. This algorithm can directly find driver pathways or gene sets de novo from somatic mutation data utilizing two combinatorial properties, high coverage and high exclusivity, without any prior information. We first construct gene networks based on the approximate exclusivity between each pair of genes using somatic mutation data from many cancer patients. Second, we present a new greedy strategy to add or remove genes for obtaining overlapping gene sets with driver mutations according to the properties of high exclusivity and high coverage. RESULTS: To assess the efficiency of the proposed NBM, we apply the method to simulated data and compare the results obtained from NBM, RME, Dendrix and Multi-Dendrix. NBM obtains optimal results in less than nine seconds on a conventional computer, and its time complexity is much lower than that of the other three methods. To further verify the performance of NBM, we apply the method to somatic mutation data from five real biological data sets, including the mutation profiles of 90 glioblastoma tumor samples and 163 lung carcinoma samples. NBM detects groups of genes which overlap with known pathways, including the P53, RB and RTK/RAS/PI(3)K signaling pathways. New gene sets with p-values below 1e-3 are found from the somatic mutation data. CONCLUSIONS: NBM can detect more biologically relevant gene sets. Results show that NBM outperforms other algorithms for detecting driver pathways or gene sets. Further research will be conducted with the use of novel machine learning techniques.
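The two combinatorial properties mentioned in this abstract, coverage and exclusivity, can be scored on a toy mutation table as sketched below. The `weight` function follows the Maximum Weight Sub-matrix objective of Vandin et al. (twice the number of covered patients minus the total number of mutations in the gene set). The `coverage` and `exclusivity` fractions are simple assumed definitions for illustration and are not necessarily the ones NBM uses; the patient and gene labels are hypothetical.

```python
def hits_per_patient(mutations, gene_set):
    """Number of genes from gene_set that are mutated in each patient."""
    return [len(genes & gene_set) for genes in mutations.values()]


def coverage(mutations, gene_set):
    """Fraction of patients with at least one mutated gene in gene_set."""
    hits = hits_per_patient(mutations, gene_set)
    return sum(1 for h in hits if h > 0) / len(hits)


def exclusivity(mutations, gene_set):
    """Among covered patients, the fraction with exactly one mutated gene from the set."""
    covered = [h for h in hits_per_patient(mutations, gene_set) if h > 0]
    return sum(1 for h in covered if h == 1) / len(covered) if covered else 0.0


def weight(mutations, gene_set):
    """Maximum Weight Sub-matrix objective: 2 * covered patients - total mutations in set."""
    hits = hits_per_patient(mutations, gene_set)
    return 2 * sum(1 for h in hits if h > 0) - sum(hits)


# Hypothetical toy data: patient -> set of mutated genes.
mutations = {
    "p1": {"g1"},
    "p2": {"g2"},
    "p3": {"g1", "g3"},
    "p4": {"g3"},
    "p5": set(),
}

for candidate in ({"g1", "g2"}, {"g1", "g3"}):
    print(sorted(candidate), coverage(mutations, candidate),
          exclusivity(mutations, candidate), weight(mutations, candidate))
```

In this toy table both candidate sets cover three of five patients, but {g1, g2} is perfectly exclusive and therefore scores a higher weight than {g1, g3}, where patient p3 carries two mutations.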


Subjects
Algorithms; Gene Regulatory Networks; Mutation/genetics; Neoplasm Proteins/genetics; Neoplasms/genetics; Signal Transduction/genetics; Female; Genomics/methods; Glioblastoma/genetics; Head and Neck Neoplasms/genetics; Humans; Lung Neoplasms/genetics; Ovarian Neoplasms/genetics
7.
Clin Cancer Res ; 13(2 Pt 1): 498-507, 2007 Jan 15.
Article in English | MEDLINE | ID: mdl-17255271

ABSTRACT

PURPOSE: This study aimed to develop gene classifiers to predict colorectal cancer recurrence. We investigated whether gene classifiers derived from two tumor series using different array platforms could be independently validated by application to the alternate series of patients. EXPERIMENTAL DESIGN: Patients with colorectal tumors from New Zealand (n = 149) and Germany (n = 55) had a minimum follow-up of 5 years. RNA was profiled using oligonucleotide printed microarrays (New Zealand samples) and Affymetrix arrays (German samples). Classifiers based on clinical data, gene expression data, and a combination of the two were produced and used to predict recurrence. The use of gene expression information was found to improve the predictive ability in both data sets. The New Zealand and German gene classifiers were cross-validated on the German and New Zealand data sets, respectively, to validate their predictive power. Survival analyses were done to evaluate the ability of the classifiers to predict patient survival. RESULTS: The prediction rates for the New Zealand and German gene-based classifiers were 77% and 84%, respectively. Despite significant differences in study design and technologies used, both classifiers retained prognostic power when applied to the alternate series of patients. Survival analyses showed that both classifiers gave a better stratification of patients than the traditional clinical staging. One classifier contained genes associated with cancer progression, whereas the other had a large immune response gene cluster concordant with the role of a host immune response in modulating colorectal cancer outcome. CONCLUSIONS: The successful reciprocal validation of gene-based classifiers on different patient cohorts and technology platforms supports the power of microarray technology for individualized outcome prediction of colorectal cancer patients. Furthermore, many of the genes identified have known biological functions congruent with the predicted outcomes.


Subjects
Colorectal Neoplasms/genetics; Colorectal Neoplasms/pathology; Gene Expression Profiling/instrumentation; Gene Expression Profiling/methods; Gene Expression Regulation, Neoplastic; Aged; Disease-Free Survival; Female; Germany; Humans; Male; Middle Aged; Neoplasm Metastasis; New Zealand; Oligonucleotide Array Sequence Analysis; Prognosis; Recurrence; Time Factors; Treatment Outcome
8.
J Bioinform Comput Biol ; 3(5): 1107-36, 2005 Oct.
Article in English | MEDLINE | ID: mdl-16278950

ABSTRACT

This paper introduces a novel generic approach for classification problems with the objective of achieving maximum classification accuracy with a minimum number of features selected. The method is illustrated with several case studies of gene expression data. Our approach integrates filter and wrapper gene selection methods with an added objective of selecting a small set of non-redundant genes that are most relevant for classification, with the provision of bins for genes to be swapped in the search for their biological relevance. It is capable of selecting relatively few marker genes while giving comparable or better leave-one-out cross-validation accuracy when compared with gene ranking selection approaches. Additionally, gene profiles can be extracted from the evolving connectionist system, which provides a set of rules that can be further developed into expert systems. The approach uses an integration of Pearson correlation coefficient and signal-to-noise ratio methods with an adaptive evolving classifier applied through the leave-one-out method for validation. Datasets of gene expression from four case studies are used to illustrate the method. The results show that the proposed approach leads to an improved feature selection process in terms of reducing the number of variables required and an increase in classification accuracy.
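One way to read "a small set of non-redundant genes" is sketched below: genes are visited in descending signal-to-noise order, and a gene is skipped if its expression is highly correlated (Pearson) with a gene already kept. The abstract does not say how the two measures are integrated, so this particular combination, the 0.9 correlation cut-off and the toy expression values are all assumptions, and the evolving classifier itself is not modelled.

```python
import statistics


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def snr(values, labels):
    """Signal-to-noise ratio of one gene: class-mean gap over summed class std devs."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pstdev(pos) + statistics.pstdev(neg)
    return abs(statistics.mean(pos) - statistics.mean(neg)) / (spread + 1e-12)


def select_non_redundant(columns, labels, n_genes, max_corr=0.9):
    """Visit genes by descending SNR; skip genes too correlated with one already kept."""
    order = sorted(columns, key=lambda g: snr(columns[g], labels), reverse=True)
    kept = []
    for gene in order:
        if all(abs(pearson(columns[gene], columns[k])) < max_corr for k in kept):
            kept.append(gene)
        if len(kept) == n_genes:
            break
    return kept


# Hypothetical expression values: gene -> one value per sample.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "g1": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],  # strong class signal
    "g2": [1.1, 1.2, 0.7, 5.0, 5.3, 4.7],  # near-duplicate of g1
    "g3": [3.0, 1.0, 2.0, 4.0, 2.5, 3.5],  # weaker, less correlated signal
    "g4": [2.0, 3.0, 2.5, 2.5, 3.0, 2.0],  # no class signal
}

print(select_non_redundant(columns, labels, n_genes=2))
```

On these toy values g2 is dropped because it nearly duplicates g1, so the second slot goes to g3 even though g3 has a lower SNR than g2.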


Subjects
Artificial Intelligence; Biomarkers, Tumor/analysis; Gene Expression Profiling/methods; Neoplasm Proteins/analysis; Neoplasms/classification; Neoplasms/metabolism; Oligonucleotide Array Sequence Analysis/methods; Pattern Recognition, Automated/methods; Cluster Analysis; Diagnosis, Computer-Assisted/methods; Humans; Neoplasms/diagnosis; Reproducibility of Results; Sensitivity and Specificity; Systems Integration
9.
Artif Intell Med ; 28(2): 165-89, 2003 Jun.
Article in English | MEDLINE | ID: mdl-12893118

ABSTRACT

Microarray techniques have made it possible to observe the expression of thousands of genes simultaneously. They have recently been applied to study gene expression patterns in tissue samples. This may lead to highly desirable improvements in the diagnosis and treatment of human diseases. Statistical and machine learning methods have recently been used to classify cancer tissue based on gene expression data. Although some of these methods have achieved a high degree of accuracy, they generally lack transparency in their classification process. This, however, is crucial for the application in the medical field. In order to overcome this obstacle, we used knowledge-based neurocomputing (KBN), since KBN seeks to gain knowledge that is comprehensible to humans. In particular, we applied evolving fuzzy neural networks (EFuNNs) to classify cancer tissue, which is illustrated on the case studies of leukaemia and colon cancer. EFuNNs belong to the evolving connectionist system paradigm (ECOS) that has been recently introduced. They are well suited for adaptive learning and knowledge discovery. Fuzzy logic rules can be extracted from the trained networks and offer knowledge about the classification process in an easily accessible form. These rules point to genes that are strongly associated with specific types of cancer and may be used for the development of new tests and treatment discoveries.


Subjects
Colonic Neoplasms/genetics; Gene Expression Profiling; Leukemia/genetics; Neural Networks, Computer; Oligonucleotide Array Sequence Analysis; Algorithms; Computational Biology; Fuzzy Logic; Humans
10.
Appl Bioinformatics ; 2(3 Suppl): S53-8, 2003.
Article in English | MEDLINE | ID: mdl-15130817

ABSTRACT

Prediction of clinical behaviour and treatment for cancers is based on the integration of clinical and pathological parameters. Recent reports have demonstrated that gene expression profiling provides a powerful new approach for determining disease outcome. If clinical and microarray data each contain independent information then it should be possible to combine these datasets to gain more accurate prognostic information. Here, we have used existing clinical information and microarray data to generate a combined prognostic model for outcome prediction for diffuse large B-cell lymphoma (DLBCL). A prediction accuracy of 87.5% was achieved. This constitutes a significant improvement compared to the previously most accurate prognostic model with an accuracy of 77.6%. The model introduced here may be generally applicable to the combination of various types of molecular and clinical data for improving medical decision support systems and individualising patient care.


Subjects
Algorithms; Antineoplastic Agents/therapeutic use; Artificial Intelligence; Diagnosis, Computer-Assisted/methods; Lymphoma, B-Cell/diagnosis; Lymphoma, B-Cell/drug therapy; Oligonucleotide Array Sequence Analysis/methods; Risk Assessment/methods; Fuzzy Logic; Humans; Lymphoma, B-Cell/genetics; Neural Networks, Computer; Pattern Recognition, Automated; Prognosis; Reproducibility of Results; Sensitivity and Specificity