Results 1 - 5 of 5
1.
J Transl Med ; 22(1): 140, 2024 02 07.
Article in English | MEDLINE | ID: mdl-38321494

ABSTRACT

Building Single Sample Predictors (SSPs) from gene expression profiles presents challenges, notably due to the lack of calibration across diverse gene expression measurement technologies. However, recent research indicates the viability of classifying phenotypes based on the order of expression of multiple genes. Existing SSP methods often rely on Top Scoring Pairs (TSP), which are platform-independent and easy to interpret through the concept of "relative expression reversals". Nevertheless, TSP methods face limitations in classifying complex patterns involving comparisons of more than two gene expressions. To overcome these constraints, we introduce a novel approach that extends TSP rules by constructing rank-based trees capable of encompassing extensive gene-gene comparisons. This method is bolstered by incorporating two ensemble strategies, boosting and random forest, to mitigate the risk of overfitting. Our implementation of ensemble rank-based trees employs boosting with LogitBoost cost and random forests, addressing both binary and multi-class classification problems. In a comparative analysis across 12 cancer gene expression datasets, our proposed methods demonstrate superior performance over both the k-TSP classifier and nearest template prediction methods. We have further refined our approach to facilitate variable selection and the generation of clear, precise decision rules from rank-based trees, enhancing interpretability. The cumulative evidence from our research underscores the significant potential of ensemble rank-based trees in advancing disease classification via gene expression data, offering a robust, interpretable, and scalable solution. Our software is available at https://CRAN.R-project.org/package=ranktreeEnsemble .
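The "relative expression reversal" idea behind Top Scoring Pairs can be illustrated with a minimal sketch. It is not the ranktreeEnsemble implementation: the function names, the toy expression values and the binary 0/1 labels are assumptions made for illustration. A pair of genes (i, j) is scored by how differently the event "gene i is expressed lower than gene j" occurs in the two classes, and the best pair becomes a one-comparison classifier.

```python
from itertools import combinations

def fit_tsp(samples, labels):
    """Pick the gene pair whose within-sample ordering best separates two classes.

    samples: list of expression vectors (one list of floats per sample)
    labels:  list of 0/1 class labels (illustrative assumption: binary problem)
    Returns ((i, j, class_if_less), score).
    """
    n_genes = len(samples[0])
    pos = [s for s, l in zip(samples, labels) if l == 1]
    neg = [s for s, l in zip(samples, labels) if l == 0]
    best_rule, best_score = None, -1.0
    for i, j in combinations(range(n_genes), 2):
        # Frequency of the ordering "gene i < gene j" in each class
        p1 = sum(s[i] < s[j] for s in pos) / len(pos)
        p0 = sum(s[i] < s[j] for s in neg) / len(neg)
        score = abs(p1 - p0)
        if score > best_score:
            # Remember which class the ordering "i < j" points to
            best_rule, best_score = (i, j, 1 if p1 > p0 else 0), score
    return best_rule, best_score

def predict_tsp(sample, rule):
    """Classify one sample using only the order of two of its own genes."""
    i, j, class_if_less = rule
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less

# Toy data (hypothetical values): 4 samples, 3 genes
train = [
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 1.0],
    [7.0, 2.0, 4.0],
    [9.0, 3.0, 5.0],
]
labels = [1, 1, 0, 0]

rule, score = fit_tsp(train, labels)
print(rule, score)
print(predict_tsp([10.0, 200.0, 50.0], rule))
```

Because the rule only compares two values inside the same sample, it needs no cross-sample normalization. A single pair cannot express patterns that depend on more than two genes, which is the limitation the abstract says rank-based trees are meant to address.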


Subjects
Neoplasms ; Transcriptome ; Humans ; Software ; Neoplasms/genetics ; Oncogenes ; Algorithms
2.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-33971670

ABSTRACT

Gene-expression profiling can be used to classify human tumors into molecular subtypes or risk groups, representing potential future clinical tools for treatment prediction and prognostication. However, it is less well known how prognostic gene signatures derived in one malignancy perform in a pan-cancer context. In this study, a gene-rule-based single sample predictor (SSP) called classifier for lung adenocarcinoma molecular subtypes (CLAMS) associated with proliferation was tested in almost 15 000 samples from 32 cancer types to classify samples into better or worse prognosis. Of the 14 malignancies that presented both CLAMS classes in sufficient numbers, survival outcomes were significantly different for breast, brain, kidney and liver cancer. Patients with samples classified as better prognosis by CLAMS were generally of lower tumor grade and disease stage, and had improved prognosis according to other type-specific classifications (e.g. PAM50 for breast cancer). In all, 99.1% of non-lung cancer cases classified as better outcome by CLAMS fell within the range of proliferation scores of lung adenocarcinoma cases with a predicted better prognosis by CLAMS. This finding demonstrates the potential of tuning SSPs to identify specific levels of, for instance, tumor proliferation or other transcriptional programs through predictor training. Together, pan-cancer studies such as this may take us one step closer to understanding how gene-expression-based SSPs act, which gene-expression programs might be important in different malignancies, and how to derive tools useful for prognostication that are efficient across organs.


Subjects
Adenocarcinoma of Lung/genetics ; Adenocarcinoma of Lung/mortality ; Biomarkers, Tumor ; Computational Biology/methods ; Gene Expression Regulation, Neoplastic ; Adenocarcinoma of Lung/diagnosis ; Adenocarcinoma of Lung/therapy ; Databases, Genetic ; Disease Management ; Female ; Gene Expression Profiling/methods ; Humans ; Kaplan-Meier Estimate ; Male ; Neoplasm Grading ; Neoplasm Staging ; Organ Specificity/genetics ; Prognosis ; Survival Analysis ; Transcriptome ; Treatment Outcome ; Web Browser
3.
Brief Bioinform ; 21(2): 729-740, 2020 03 23.
Article in English | MEDLINE | ID: mdl-30721923

ABSTRACT

The development of multigene classifiers for cancer prognosis, treatment prediction, molecular subtypes or clinicopathological groups has been a cornerstone in transcriptomic analyses of human malignancies for nearly two decades. However, many reported classifiers are critically limited by different preprocessing needs like normalization and data centering. In response, a new breed of classifiers, single sample predictors (SSPs), has emerged. SSPs classify samples in an N-of-1 fashion, relying on, e.g. gene rules comparing expression values within a sample. To date, several methods have been reported, but there is a lack of head-to-head performance comparison for typical cancer classification problems, representing an unmet methodological need in cancer bioinformatics. To resolve this need, we performed an evaluation of two SSPs [k-top-scoring pair classifier (kTSP) and absolute intrinsic molecular subtyping (AIMS)] for two case examples of different magnitude of difficulty in non-small cell lung cancer: gene expression-based classification of (i) tumor histology and (ii) molecular subtype. Through the analysis of ~2000 lung cancer samples for each case example (n = 1918 and n = 2106, respectively), we compared the performance of the methods for different sample compositions, training data set sizes, gene expression platforms and gene rule selections. Three main conclusions are drawn from the comparisons: both methods are platform independent, they select largely overlapping gene rules associated with actual underlying tumor biology and, for large training data sets, they behave interchangeably performance-wise. While SSPs like AIMS and kTSP offer new possibilities to move gene expression signatures/predictors closer to a clinical context, they are still importantly limited by the difficulty of the classification problem at hand.
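The N-of-1 behaviour and platform independence described above can be illustrated with a short sketch of a kTSP-style majority vote. The gene indices, the rules and the expression values below are hypothetical, and the code is not the evaluated kTSP or AIMS software. Each rule compares two genes inside one sample, so any strictly increasing rescaling of that sample (a log transform, or a different measurement scale) leaves every vote unchanged.

```python
import math

def predict_ktsp(sample, rules):
    """Majority vote over k gene-pair rules for a single sample.

    Each rule is (i, j, cls): vote for class `cls` if sample[i] < sample[j],
    otherwise vote for the other class (binary 0/1 labels assumed).
    """
    votes_for_1 = sum(
        cls if sample[i] < sample[j] else 1 - cls
        for i, j, cls in rules
    )
    return 1 if 2 * votes_for_1 > len(rules) else 0

# Hypothetical rules; k = 3 is odd so the vote cannot tie
rules = [(0, 1, 1), (2, 3, 1), (4, 1, 0)]

# One sample measured on a raw intensity scale (made-up numbers)
raw = [12.0, 80.0, 3.0, 5.0, 40.0]

# The same sample after two strictly increasing transformations
log_scaled = [math.log2(v + 1.0) for v in raw]
rescaled = [1000.0 * v + 7.0 for v in raw]

print(predict_ktsp(raw, rules))
print(predict_ktsp(log_scaled, rules))
print(predict_ktsp(rescaled, rules))
```

All three calls return the same class because only the ordering of values within the sample matters. This invariance covers monotone rescaling of a whole sample only; it does not protect against platform effects that change the relative order of two genes.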


Subjects
Carcinoma, Non-Small-Cell Lung/pathology ; Gene Expression Regulation, Neoplastic ; Lung Neoplasms/pathology ; Biomarkers, Tumor/genetics ; Carcinoma, Non-Small-Cell Lung/classification ; Carcinoma, Non-Small-Cell Lung/genetics ; Case-Control Studies ; Gene Expression Profiling/methods ; Humans ; Lung Neoplasms/classification ; Lung Neoplasms/genetics
4.
Int J Cancer ; 148(1): 238-251, 2021 01 01.
Article in English | MEDLINE | ID: mdl-32745259

ABSTRACT

Disease recurrence in surgically treated lung adenocarcinoma (AC) remains high. New approaches for risk stratification beyond tumor stage are needed. Gene expression-based AC subtypes such as the Cancer Genome Atlas Network (TCGA) terminal-respiratory unit (TRU), proximal-inflammatory (PI) and proximal-proliferative (PP) subtypes have been associated with prognosis, but show methodological limitations for robust clinical use. We aimed to derive a platform independent single sample predictor (SSP) for molecular subtype assignment and risk stratification that could function in a clinical setting. Two-class (TRU/nonTRU=SSP2) and three-class (TRU/PP/PI=SSP3) SSPs using the AIMS algorithm were trained in 1655 ACs (n = 9659 genes) from public repositories vs TCGA centroid subtypes. Validation and survival analysis were performed in 977 patients using overall survival (OS) and distant metastasis-free survival (DMFS) as endpoints. In the validation cohort, SSP2 and SSP3 showed accuracies of 0.85 and 0.81, respectively. SSPs captured relevant biology previously associated with the TCGA subtypes and were associated with prognosis. In survival analysis, OS and DMFS for cases discordantly classified between TCGA and SSP2 favored the SSP2 classification. In resected Stage I patients, SSP2 identified TRU-cases with better OS (hazard ratio [HR] = 0.30; 95% confidence interval [CI] = 0.18-0.49) and DMFS (TRU HR = 0.52; 95% CI = 0.33-0.83) independent of age, Stage IA/IB and gender. SSP2 was transformed into a NanoString nCounter assay and tested in 44 Stage I patients using RNA from formalin-fixed tissue, providing prognostic stratification (relapse-free interval, HR = 3.2; 95% CI = 1.2-8.8). In conclusion, gene expression-based SSPs can provide molecular subtype and independent prognostic information in early-stage lung ACs. SSPs may overcome critical limitations in the applicability of gene signatures in lung cancer.


Subjects
Adenocarcinoma of Lung/diagnosis ; Biomarkers, Tumor/genetics ; Lung Neoplasms/diagnosis ; Lung/pathology ; Neoplasm Recurrence, Local/epidemiology ; Adenocarcinoma of Lung/genetics ; Adenocarcinoma of Lung/mortality ; Adenocarcinoma of Lung/surgery ; Algorithms ; Datasets as Topic ; Disease-Free Survival ; Female ; Gene Expression Profiling ; Gene Expression Regulation, Neoplastic ; Humans ; Lung/surgery ; Lung Neoplasms/genetics ; Lung Neoplasms/mortality ; Lung Neoplasms/surgery ; Male ; Models, Genetic ; Neoplasm Recurrence, Local/genetics ; Neoplasm Staging ; Predictive Value of Tests ; Prognosis ; Risk Assessment/methods ; Risk Factors
5.
BMC Med Genomics ; 9(1): 26, 2016 06 03.
Article in English | MEDLINE | ID: mdl-27259591

ABSTRACT

BACKGROUND: At the molecular level breast cancer comprises a heterogeneous set of subtypes associated with clear differences in gene expression and clinical outcomes. Single sample predictors (SSPs) are built via a two-stage approach consisting of clustering and subtype predictor construction based on the cluster labels of individual cases. SSPs have been criticized because their subtype assignments for the same samples were only moderately concordant (Cohen's κ<0.6). METHODS: We propose a semi-supervised approach where for five datasets, consensus sets were constructed consisting of those samples that were concordantly subtyped by a number of different predictors. Next, nine subtype predictors - three SSPs, three subtype classification models (SCMs) and three novel rule-based predictors based on the St. Gallen surrogate intrinsic subtype definitions (STGs) - were constructed on the five consensus sets and their associated consensus subtype labels. The predictors were validated on a compendium of over 4,000 uniformly preprocessed Affymetrix microarrays. Concordance between subtype predictors was assessed using Cohen's kappa statistic. RESULTS: In this standardized setup, subtype predictors of the same type (either SCM, SSP, or STG) but with a different gene list and/or consensus training set were associated with almost perfect levels of agreement (median κ>0.8). Interestingly, for a given predictor type a change in consensus set led to higher concordance than a change to another gene list. The more challenging scenario where the predictor type, gene list and training set were all different resulted in predictors with only substantial levels of concordance (median κ=0.74) on independent validation data. CONCLUSIONS: Our results demonstrate that for a given subtype predictor type stringent standardization of the preprocessing stage, combined with carefully devised consensus training sets, leads to predictors that show almost perfect levels of concordance. However, predictors of a different type are only substantially concordant, despite reaching almost perfect levels of concordance on training data.
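Cohen's kappa, the concordance measure used above, corrects the observed agreement between two predictors for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). The sketch below is a minimal illustration of that formula, not the study's code; the subtype calls are made-up values for eight hypothetical samples. The wording "substantial" and "almost perfect" matches the commonly used Landis and Koch bands (0.61-0.80 and 0.81-1.00).

```python
from collections import Counter

def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two lists of categorical calls on the same samples."""
    n = len(calls_a)
    # Observed agreement: fraction of samples given the same label by both
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement: product of the marginal label frequencies, summed over labels
    count_a, count_b = Counter(calls_a), Counter(calls_b)
    expected = sum(
        count_a[label] * count_b[label]
        for label in set(calls_a) | set(calls_b)
    ) / (n * n)
    # Undefined (division by zero) if both predictors always emit one identical label
    return (observed - expected) / (1 - expected)

# Hypothetical subtype calls from two predictors on the same eight samples
predictor_a = ["LumA", "LumA", "LumB", "Basal", "Basal", "Her2", "LumA", "Basal"]
predictor_b = ["LumA", "LumB", "LumB", "Basal", "Basal", "Her2", "LumA", "Her2"]

kappa = cohens_kappa(predictor_a, predictor_b)
print(round(kappa, 3))
```

In this toy case the two predictors agree on 6 of 8 samples (p_o = 0.75) while chance agreement is p_e = 0.25, so kappa is noticeably lower than the raw agreement rate.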


Subjects
Breast Neoplasms/classification ; Computational Biology ; Consensus ; Breast Neoplasms/genetics ; Cluster Analysis ; Humans ; Oligonucleotide Array Sequence Analysis