Search | BVS (Virtual Health Library) Search Portal

ITree: a user-driven tool for interactive decision-making with classification trees.

Sokolowski, Hubert; Czajkowski, Marcin; Czajkowska, Anna; Jurczuk, Krzysztof; Kretowski, Marek.

Bioinformatics ; 40(5)2024 05 02.

Article in English | MEDLINE | ID: mdl-38640482

ABSTRACT

MOTIVATION: ITree is an intuitive web tool for the manual, semi-automatic, and automatic induction of decision trees. It enables interactive modifications of tree structures and incorporates Relative Expression Analysis for detecting complex patterns in high-throughput molecular data. This makes ITree a versatile tool for both research and education in biomedical data analysis. RESULTS: The tool allows users to instantly see the effects of modifications on decision trees, with updates to predictions and statistics displayed in real time, facilitating a deeper understanding of data classification processes. AVAILABILITY AND IMPLEMENTATION: Available online at https://itree.wi.pb.edu.pl. Source code and documentation are hosted on GitHub at https://github.com/hsokolowski/iTree and in supplement.

Subjects

Decision Trees , Software , Computational Biology/methods , Algorithms

Testing the Utility of Polygenic Risk Scores for Type 2 Diabetes and Obesity in Predicting Metabolic Changes in a Prediabetic Population: An Observational Study.

Padilla-Martinez, Felipe; Szczerbinski, Lukasz; Citko, Anna; Czajkowski, Marcin; Konopka, Paulina; Paszko, Adam; Wawrusiewicz-Kurylonek, Natalia; Górska, Maria; Kretowski, Adam.

Int J Mol Sci ; 23(24)2022 Dec 16.

Article in English | MEDLINE | ID: mdl-36555722

ABSTRACT

Prediabetes is an intermediate state of hyperglycemia during which glycemic parameters are above normal levels but below the T2D threshold. T2D and its precursor prediabetes affect 6.28% and 7.3% of the world's population, respectively. The main objective of this paper was to create and compare two polygenic risk scores (PRSs) versus changes over time (Δ) in metabolic parameters related to prediabetes and metabolic complications. The genetics of 446 prediabetic patients from the Polish Registry of Diabetes cohort were investigated. Seventeen metabolic parameters were measured and compared at baseline and after five years using statistical analysis. Subsequently, genetic polymorphisms present in patients were determined to build a T2D PRS (68 SNPs) and an obesity PRS (21 SNPs). Finally, the association between the two PRSs and the Δ of the metabolic traits was assessed. After a multiple linear regression with adjustment for age, sex, and BMI at a nominal significance level of p < 0.05 and adjustment for multiple testing, the T2D PRS was found to be positively associated with Δ fat mass (FM) (p = 0.025). The obesity PRS was positively associated with Δ FM (p = 0.023) and Δ 2 h glucose (p = 0.034). The comparison of genotype frequencies showed that AA genotype carriers of rs10838738 had significantly higher Δ 2 h glucose and Δ 2 h insulin. Our findings suggest that prediabetic individuals with a higher risk of developing T2D experience increased Δ FM, and those with a higher risk of obesity experience increased Δ FM and Δ two-hour postprandial glucose. The associations found in this research could be a powerful tool for identifying prediabetic individuals with an increased risk of developing T2D and obesity.
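
The abstract does not say how the two PRSs were computed. A common construction is a weighted sum of risk-allele counts per SNP, and the sketch below illustrates only that general idea. The SNP names, the weights, and the choice to count a missing SNP as zero are all hypothetical and are not taken from the study.

```python
from typing import Mapping


def polygenic_risk_score(allele_counts: Mapping[str, int],
                         weights: Mapping[str, float]) -> float:
    """Weighted sum of risk-allele counts (0, 1, or 2 copies per SNP).

    Assumption: a SNP missing from allele_counts contributes 0.
    """
    score = 0.0
    for snp, weight in weights.items():
        count = allele_counts.get(snp, 0)
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {snp}: {count}")
        score += weight * count
    return score


# Hypothetical two-SNP panel; real panels here had 68 and 21 SNPs.
example_weights = {"snp_a": 0.10, "snp_b": 0.25}
example_patient = {"snp_a": 2, "snp_b": 1}
print(polygenic_risk_score(example_patient, example_weights))
```

Setting every weight to 1.0 gives the unweighted variant, which simply counts risk alleles.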

Subjects

Type 2 Diabetes Mellitus , Obesity , Prediabetic State , Humans , Body Mass Index , Type 2 Diabetes Mellitus/complications , Type 2 Diabetes Mellitus/genetics , Glucose , Obesity/complications , Obesity/genetics , Prediabetic State/complications , Prediabetic State/genetics , Risk Factors , Multifactorial Inheritance

Evolutionary approach for relative gene expression algorithms.

Czajkowski, Marcin; Kretowski, Marek.

ScientificWorldJournal ; 2014: 593503, 2014.

Article in English | MEDLINE | ID: mdl-24790574

ABSTRACT

Relative Expression Analysis (RXA) uses ordering relationships in a small collection of genes and has been applied successfully to classification using microarray data. As checking all possible subsets of genes is computationally infeasible, the RXA algorithms require feature selection and multiple restrictive assumptions. Our main contribution is a specialized evolutionary algorithm (EA) for top-scoring pairs called EvoTSP, which allows finding more advanced gene relations. We managed to unify the major variants of relative expression algorithms through EA and introduce weights to the top-scoring pairs. Experimental validation of EvoTSP on publicly available microarray datasets showed that the proposed solution significantly outperforms other relative expression algorithms in terms of accuracy and allows exploring a much larger solution space.
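
For orientation, the classic top-scoring pair criterion that EvoTSP builds on can be sketched as an exhaustive search over gene pairs. This is not EvoTSP itself (no evolutionary search, no weights), and the function names and toy data are illustrative assumptions. The sketch assumes two classes labelled 0 and 1, each with at least one sample.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def pair_score(samples: Sequence[Sequence[float]], labels: Sequence[int],
               i: int, j: int) -> float:
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for genes i and j."""
    freq = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        freq[cls] = sum(r[i] < r[j] for r in rows) / len(rows)
    return abs(freq[0] - freq[1])


def top_scoring_pair(samples: Sequence[Sequence[float]],
                     labels: Sequence[int]) -> Tuple[int, int]:
    """Exhaustively score every gene pair and return the best one."""
    n_genes = len(samples[0])
    return max(combinations(range(n_genes), 2),
               key=lambda pair: pair_score(samples, labels, *pair))


# Toy data: 4 samples x 3 genes; the order of genes 0 and 1 flips between classes.
toy_samples: List[List[float]] = [[1, 2, 5], [2, 3, 4], [3, 1, 6], [4, 2, 7]]
toy_labels = [0, 0, 1, 1]
print(top_scoring_pair(toy_samples, toy_labels))
```

The number of pairs grows quadratically with the number of genes, and larger gene groups grow faster still, which is the cost that motivates heuristic search.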

Subjects

Algorithms , Computational Biology/methods , Molecular Evolution , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Gene Expression Profiling/classification , Gene Expression Profiling/statistics & numerical data , Genetic Fitness , Genetic Variation , Mutation , Oligonucleotide Array Sequence Analysis/classification , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Genetic Recombination , Genetic Selection

Exploring protein relative relations in skeletal muscle proteomic analysis for insights into insulin resistance and type 2 diabetes.

Czajkowska, Anna; Czajkowski, Marcin; Szczerbinski, Lukasz; Jurczuk, Krzysztof; Reska, Daniel; Kwedlo, Wojciech; Kretowski, Marek; Zabielski, Piotr; Kretowski, Adam.

Sci Rep ; 14(1): 17631, 2024 07 31.

Article in English | MEDLINE | ID: mdl-39085321

ABSTRACT

The escalating prevalence of insulin resistance (IR) and type 2 diabetes mellitus (T2D) underscores the urgent need for improved early detection techniques and effective treatment strategies. In this context, our study presents a proteomic analysis of post-exercise skeletal muscle biopsies from individuals across a spectrum of glucose metabolism states: normal, prediabetes, and T2D. This enabled the identification of significant protein relationships indicative of each specific glycemic condition. Our investigation primarily leveraged the machine learning approach, employing the white-box algorithm relative evolutionary hierarchical analysis (REHA), to explore the impact of regulated, mixed mode exercise on skeletal muscle proteome in subjects with diverse glycemic status. This method aimed to advance the diagnosis of IR and T2D and elucidate the molecular pathways involved in its development and the response to exercise. Additionally, we used proteomics-specific statistical analysis to provide a comparative perspective, highlighting the nuanced differences identified by REHA. Validation of the REHA model with a comparable external dataset further demonstrated its efficacy in distinguishing between diverse proteomic profiles. Key metrics such as accuracy and the area under the ROC curve confirmed REHA's capability to uncover novel molecular pathways and significant protein interactions, offering fresh insights into the effects of exercise on IR and T2D pathophysiology of skeletal muscle. The visualizations not only underscored significant proteins and their interactions but also showcased decision trees that effectively differentiate between various glycemic states, thereby enhancing our understanding of the biomolecular landscape of T2D.

Subjects

Type 2 Diabetes Mellitus , Insulin Resistance , Skeletal Muscle , Proteomics , Type 2 Diabetes Mellitus/metabolism , Humans , Skeletal Muscle/metabolism , Skeletal Muscle/pathology , Proteomics/methods , Male , Female , Proteome/metabolism , Proteome/analysis , Exercise/physiology , Adult , Middle Aged , Machine Learning

The FCGR2A Is Associated with the Presence of Atherosclerotic Plaques in the Carotid Arteries-A Case-Control Study.

Szpakowicz, Anna; Szum-Jakubowska, Aleksandra; Lisowska, Anna; Dubatówka, Marlena; Raczkowski, Andrzej; Czajkowski, Marcin; Szczerbinski, Lukasz; Chlabicz, Malgorzata; Kretowski, Adam; Kaminski, Karol Adam.

J Clin Med ; 12(20)2023 Oct 12.

Article in English | MEDLINE | ID: mdl-37892617

ABSTRACT

BACKGROUND: Atherosclerotic plaques in carotid arteries (APCA) are a prevalent condition with severe potential complications. Studies continuously search for innovative biomarkers for APCA, including those participating in cellular metabolic processes, cell adhesion, immune response, and complement activation. This study aimed to assess the relationship between APCA presence and a broad range of cardiometabolic biomarkers in the general population. METHODS: The study group consisted of consecutive participants of the population study Bialystok PLUS. The proximity extension assay (PEA) technique from the Olink Laboratory (Uppsala, Sweden) was used to measure the levels of 92 cardiometabolic biomarkers. RESULTS: The study comprised 693 participants (mean age 48.78 ± 15.27 years, 43.4% males, N = 301). APCA was identified in 46.2% of the participants (N = 320). Of the 92 biomarkers that were investigated, 54 were found to be significantly linked to the diagnosis of APCA. After adjusting for the traditional risk factors for atherosclerosis in multivariate analysis, the only biomarker that remained significantly associated with APCA was FCGR2A. CONCLUSION: In the general population, the prevalence of APCA is very high. A range of biomarkers are linked with APCA. Nonetheless, the majority of these associations are explained by traditional risk factors for atherosclerosis. The only biomarker that was independently associated with APCA was the FCGR2A.

A comparison of different machine-learning techniques for the selection of a panel of metabolites allowing early detection of brain tumors.

Godlewski, Adrian; Czajkowski, Marcin; Mojsak, Patrycja; Pienkowski, Tomasz; Gosk, Wioleta; Lyson, Tomasz; Mariak, Zenon; Reszec, Joanna; Kondraciuk, Marcin; Kaminski, Karol; Kretowski, Marek; Moniuszko, Marcin; Kretowski, Adam; Ciborowski, Michal.

Sci Rep ; 13(1): 11044, 2023 07 08.

Article in English | MEDLINE | ID: mdl-37422554

ABSTRACT

Metabolomics combined with machine learning methods (MLMs) is a powerful tool for searching for novel diagnostic panels. This study was intended to use targeted plasma metabolomics and advanced MLMs to develop strategies for diagnosing brain tumors. Measurement of 188 metabolites was performed on plasma samples collected from 95 patients with gliomas (grade I-IV), 70 with meningioma, and 71 healthy individuals as a control group. Four predictive models to diagnose glioma were prepared using 10 MLMs and a conventional approach. Based on the cross-validation results of the created models, the F1-scores were calculated and the obtained values were compared. Subsequently, the best algorithm was applied to perform five comparisons involving gliomas, meningiomas, and controls. The best results were obtained using the newly developed hybrid evolutionary heterogeneous decision tree (EvoHDTree) algorithm, which was validated using Leave-One-Out Cross-Validation, resulting in an F1-score for all comparisons in the range of 0.476-0.948 and the area under the ROC curves ranging from 0.660 to 0.873. Brain tumor diagnostic panels were constructed with unique metabolites, which reduces the likelihood of misdiagnosis. This study proposes a novel interdisciplinary method for brain tumor diagnosis based on metabolomics and EvoHDTree, exhibiting significant predictive coefficients.
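
As a reminder of how the reported metric and validation scheme fit together, the sketch below computes an F1-score from leave-one-out predictions. The nearest-class-mean classifier on one feature is only a stand-in (EvoHDTree is not reproduced here), and the data values are invented for illustration.

```python
from typing import Callable, List, Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the denominator is zero."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def leave_one_out_predictions(X: List[float], y: List[int],
                              fit_predict: Callable[[List[float], List[int], float], int]
                              ) -> List[int]:
    """Hold out each sample once, train on the rest, and predict the held-out one."""
    return [fit_predict(X[:k] + X[k + 1:], y[:k] + y[k + 1:], X[k])
            for k in range(len(X))]


def nearest_mean(train_X: List[float], train_y: List[int], x: float) -> int:
    """Stand-in classifier: pick the class whose training mean is closest to x."""
    means = {}
    for cls in set(train_y):
        values = [v for v, label in zip(train_X, train_y) if label == cls]
        means[cls] = sum(values) / len(values)
    return min(means, key=lambda cls: abs(x - means[cls]))


# Invented single-feature data with two well-separated classes.
X = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
y = [0, 0, 0, 1, 1, 1]
predictions = leave_one_out_predictions(X, y, nearest_mean)
print(f1_score(y, predictions))
```

Leave-one-out uses every sample for testing exactly once, which suits small cohorts at the cost of training one model per sample.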

Subjects

Brain Neoplasms , Glioma , Meningeal Neoplasms , Meningioma , Humans , Brain Neoplasms/diagnosis , Brain Neoplasms/pathology , Glioma/pathology , Brain/metabolism , Meningioma/diagnosis , Meningioma/pathology , Machine Learning

Top scoring pair decision tree for gene expression data analysis.

Adv Exp Med Biol ; 696: 27-35, 2011.

Article in English | MEDLINE | ID: mdl-21431543

ABSTRACT

Classification of microarray data can be performed successfully with approaches that human experts find easy to understand and interpret, such as decision trees or Top Scoring Pairs algorithms. In this chapter, we propose a hybrid solution that combines the above-mentioned methods. The presented decision trees, which split instances based on pairwise comparisons of gene expression values, may have considerable potential for genomic research and scientific modeling of underlying processes. We have compared the proposed solution with the TSP-family methods and decision trees on 11 public domain microarray datasets, and the results are promising.

Subjects

Algorithms , Decision Trees , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Breast Neoplasms/genetics , Neoplasm DNA/genetics , Statistical Data Interpretation , Nucleic Acid Databases , Female , Humans

Multi-test decision tree and its application to microarray data classification.

Czajkowski, Marcin; Grzes, Marek; Kretowski, Marek.

Artif Intell Med ; 61(1): 35-44, 2014 May.

Article in English | MEDLINE | ID: mdl-24630712

ABSTRACT

OBJECTIVE: A desirable property of tools used to investigate biological data is that their models and predictive decisions are easy to understand. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have a tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. METHODS: We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. RESULTS: Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average of 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. CONCLUSION: This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts.
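
One way to picture a node that applies several univariate tests is to let the tests vote on the branch. The abstract does not specify how the tests are combined, so the majority vote below is an assumption, as are the function name, the "value <= threshold goes left" convention, and the example thresholds.

```python
from typing import List, Sequence, Tuple

# Each univariate test is a hypothetical (feature index, threshold) pair.
UnivariateTest = Tuple[int, float]


def multi_test_goes_left(x: Sequence[float], tests: List[UnivariateTest]) -> bool:
    """Route an instance left when a strict majority of tests vote 'left'.

    Assumption: a single test votes 'left' when x[feature] <= threshold.
    """
    left_votes = sum(1 for feature, threshold in tests if x[feature] <= threshold)
    return left_votes * 2 > len(tests)


# Hypothetical node with three tests on features 0, 2, and 3.
node_tests = [(0, 5.0), (2, 1.5), (3, 0.7)]
sample = [4.2, 9.9, 2.0, 0.3]
print(multi_test_goes_left(sample, node_tests))
```

With an odd number of tests a tie cannot occur, and a single noisy feature cannot flip the routing on its own, which is in the spirit of the stability goal stated in the abstract.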

Subjects

Computational Biology/methods , Decision Trees , Gene Expression Profiling , Microarray Analysis , Algorithms , Datasets as Topic , Humans
